We start with a bivariate normal distribution pinned down by

$$
\mu=\left[\begin{array}{c}
.5 \\
1.0
\end{array}\right],\quad\Sigma=\left[\begin{array}{cc}
1 & .5\\
.5 & 1
\end{array}\right]
$$

```{code-cell} python3
μ = np.array([.5, 1.])
Σ = np.array([[1., .5], [.5, 1.]])

# construct the multivariate normal instance
multi_normal = MultivariateNormal(μ, Σ)
```

```{code-cell} python3
k = 1 # choose partition

# partition and compute regression coefficients
multi_normal.partition(k)
multi_normal.βs[0], multi_normal.βs[1]
```

Let's illustrate the fact that you _can regress anything on anything else_.

We have computed everything we need for two regression lines, one of $z_2$ on $z_1$, the other of $z_1$ on $z_2$; the coefficients in `multi_normal.βs` displayed above will serve as their slopes.

We'll represent these regressions as

$$
z_1 = a_1 + b_1 z_2 + \epsilon_1
$$

and

$$
z_2 = a_2 + b_2 z_1 + \epsilon_2
$$

where we have the population least squares orthogonality conditions

$$
E \epsilon_1 z_2 = 0
$$

and

$$
E \epsilon_2 z_1 = 0
$$
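
Because each regression includes an intercept, the errors also satisfy $E \epsilon_1 = 0$ and $E \epsilon_2 = 0$. Combining these with the orthogonality conditions above yields the standard population least squares formulas

$$
b_1 = \frac{\Sigma_{12}}{\Sigma_{22}}, \quad a_1 = \mu_1 - b_1 \mu_2, \qquad
b_2 = \frac{\Sigma_{12}}{\Sigma_{11}}, \quad a_2 = \mu_2 - b_2 \mu_1
$$

where $\Sigma_{ij}$ denotes the $(i,j)$ element of $\Sigma$.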

Let's compute $a_1, a_2, b_1, b_2$.

```{code-cell} python3
beta = multi_normal.βs

a1 = μ[0] - beta[0]*μ[1]
b1 = beta[0]

a2 = μ[1] - beta[1]*μ[0]
b2 = beta[1]
```

Let's print out the intercepts and slopes.

For the regression of $z_1$ on $z_2$ we have

```{code-cell} python3
print("a1 = ", a1)
print("b1 = ", b1)
```

For the regression of $z_2$ on $z_1$ we have

```{code-cell} python3
print("a2 = ", a2)
print("b2 = ", b2)
```
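
As a sanity check, we can recover the same intercepts and slopes directly from $\mu$ and $\Sigma$ using the formulas above (a quick sketch, assuming the arrays `μ` and `Σ` defined earlier are still in scope):

```{code-cell} python3
# recover the slopes and intercepts directly from μ and Σ
b1_check = Σ[0, 1] / Σ[1, 1]      # Cov(z1, z2) / Var(z2)
a1_check = μ[0] - b1_check * μ[1]

b2_check = Σ[0, 1] / Σ[0, 0]      # Cov(z1, z2) / Var(z1)
a2_check = μ[1] - b2_check * μ[0]

print("a1, b1 = ", a1_check, b1_check)
print("a2, b2 = ", a2_check, b2_check)
```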

Now let's plot the two regression lines and stare at them.

```{code-cell} python3
# grid of z2 values on which to evaluate both lines
z2 = np.linspace(-4, 4, 100)

a1 = np.squeeze(a1)
b1 = np.squeeze(b1)

a2 = np.squeeze(a2)
b2 = np.squeeze(b2)

# regression of z1 on z2: z1 = a1 + b1 * z2
z1 = b1*z2 + a1

# regression of z2 on z1, solved for z1 so both lines share the same axes
z1h = z2/b2 - a2/b2

fig = plt.figure(figsize=(12, 12))
ax = fig.add_subplot(1, 1, 1)
ax.set(xlim=(-4, 4), ylim=(-4, 4))
ax.spines['left'].set_position('center')
ax.spines['bottom'].set_position('zero')
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.xaxis.set_ticks_position('bottom')
ax.yaxis.set_ticks_position('left')
plt.ylabel('$z_1$', loc='top')
plt.xlabel('$z_2$', loc='right')
plt.title('two regressions')
plt.plot(z2, z1, 'r', label="$z_1$ on $z_2$")
plt.plot(z2, z1h, 'b', label="$z_2$ on $z_1$")
plt.legend()
plt.show()
```
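
Both lines pass through the point $(z_2, z_1) = (\mu_2, \mu_1)$, since a population regression line always goes through the means. A quick check, using the `μ` defined above:

```{code-cell} python3
# both lines should equal μ[0] when evaluated at z2 = μ[1]
print("red line at z2 = μ[1]:  ", a1 + b1 * μ[1])
print("blue line at z2 = μ[1]: ", μ[1] / b2 - a2 / b2)
```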

The red line is the expectation of $z_1$ conditional on $z_2$.

The intercept and slope of the red line are

```{code-cell} python3
print("a1 = ", a1)
print("b1 = ", b1)
```

The blue line is the expectation of $z_2$ conditional on $z_1$.

When the blue line is drawn with $z_2$ on the horizontal axis, its intercept and slope are

```{code-cell} python3
print("-a2/b2 = ", -a2/b2)
print("1/b2 = ", 1/b2)
```

We can use these regression lines, or our code, to compute conditional expectations.
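
For instance, reading the regression line of $z_2$ on $z_1$ gives the conditional mean at $z_1 = 5$ directly (a quick sketch using the `a2` and `b2` computed above):

```{code-cell} python3
# conditional mean of z2 given z1 = 5, read off the regression line
print("E[z2 | z1=5] = ", a2 + b2 * 5.)
```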

Let's compute the mean and variance of the distribution of $z_2$ conditional on $z_1=5$.

After that, we'll reverse what is on the left and right sides of the regression.

```{code-cell} python3
# compute the conditional distribution of z2 given z1 = 5
ind = 1