ISI MSTAT PSB 2011 Problem 4 | Digging deep into Multivariate Normal

This is an interesting problem from ISI MSTAT PSB 2011 Problem 4 that tests how well a student can visualize the normal distribution in higher dimensions.

The Problem: ISI MSTAT PSB 2011 Problem 4

Suppose that \( X_1,X_2,\ldots \) are independent and identically distributed \(d\)-dimensional normal random vectors. Consider a fixed \( x_0 \in \mathbb{R}^d \) and for \(i=1,2,\ldots\) define \(D_i = \| X_i - x_0 \| \), the Euclidean distance between \( X_i \) and \(x_0\). Show that for every \( \epsilon > 0 \), \(P[\min_{1 \le i \le n} D_i > \epsilon] \rightarrow 0 \) as \( n \rightarrow \infty \).

Prerequisites:

  1. Finding the distribution of the minimum order statistic
  2. Multivariate Gaussian properties

Solution:

First of all, see that, since the \(D_i\) are i.i.d., \( P(\min_{1 \le i \le n} D_i > \epsilon)=P(D_1 > \epsilon)^n \). (Verify this yourself!)
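As a quick sanity check (not part of the original problem), here is a Monte Carlo sketch of this identity in Python; the dimension, \(x_0\), and \(\epsilon\) are arbitrary choices for illustration:

```python
import numpy as np

# Arbitrary illustrative choices: d = 2 standard Gaussian vectors, some x0 and epsilon.
rng = np.random.default_rng(0)
d, n, trials = 2, 5, 20000
x0 = np.array([1.0, -0.5])
eps = 1.5

# Each trial draws n i.i.d. N(0, I_d) vectors and records min_i ||X_i - x0||.
X = rng.standard_normal((trials, n, d))
D = np.linalg.norm(X - x0, axis=2)          # shape (trials, n)
p_min = np.mean(D.min(axis=1) > eps)        # estimate of P(min_i D_i > eps)

# Compare with P(D_1 > eps)^n, estimated from a separate batch of single draws.
D1 = np.linalg.norm(rng.standard_normal((trials, d)) - x0, axis=1)
p_single = np.mean(D1 > eps)
print(p_min, p_single ** n)                 # the two estimates should be close
```

The two printed estimates should agree up to Monte Carlo error, confirming the product structure coming from independence.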

But we are really more interested in the complementary event \( \{D_1 < \epsilon \} \).

Let me elaborate why this makes sense!

Let \( \phi \) denote the \( d \) dimensional Gaussian density, and let \( B(x_0, \epsilon) \) be the Euclidean ball around \( x_0 \) of radius \( \epsilon \) . Note that \( \{D_i < \epsilon\} \) is the event that the gaussian \( X_i \) will land in this Euclidean ball.

So, if we can show that this event has positive probability for any given pair \( (x_0, \epsilon) \), we will be done, since then in the limit we will be raising a number strictly less than 1 to a power that grows without bound.

In particular, assuming for concreteness that the \( X_i \) are standard normal (the general non-degenerate case reduces to this after an affine change of variables), we have \( P(D_1 < \epsilon)= \int_{B(x_0, \epsilon)} \phi(x)\, dx \geq |B(x_0, \epsilon)| \inf_{x \in B(x_0, \epsilon)} \phi(x) \). By rotational symmetry, the Gaussian density decays as we move away from the centre, so for \( x_0 \neq 0 \) this infimum is attained at the point of the ball farthest from the origin and equals \( \phi\left(x_0 + \epsilon \frac{x_0}{\|x_0\|}\right) \). (To see that this is indeed a lower bound, note that \( B(x_0, \epsilon) \subset B(0, \epsilon + \|x_0\|) \).)

So, basically what we have shown here is that there exists a \( \delta > 0 \) such that \( P(D_1 < \epsilon ) \geq \delta \).

Since \( \delta > 0 \), it follows that \( P(D_1 > \epsilon) \leq 1-\delta \), a number strictly below 1.

Thus, we have \( \lim_{n \rightarrow \infty} P(D_1 > \epsilon)^n \leq \lim_{n \rightarrow \infty} (1-\delta)^n = 0 \).
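The geometric decay above can also be watched numerically. A minimal sketch, with an arbitrary choice of dimension, \(x_0\), and \(\epsilon\):

```python
import numpy as np

# Watch P(min_{1<=i<=n} D_i > eps) shrink as n grows.
# d, x0, eps are arbitrary illustrative choices.
rng = np.random.default_rng(1)
d, trials = 3, 10000
x0, eps = np.ones(3), 1.0

probs = []
for n in (1, 5, 25, 125):
    X = rng.standard_normal((trials, n, d))
    D = np.linalg.norm(X - x0, axis=2)      # D_i for each of n draws, per trial
    probs.append(np.mean(D.min(axis=1) > eps))
    print(n, probs[-1])
```

Each fivefold increase in \(n\) multiplies the probability by roughly the same factor, exactly the \((1-\delta)^n\) behaviour in the bound.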

Hence we are done.

Food for thought:

There is a fantastic amount of statistical literature on the equi-density contours of the multivariate Gaussian distribution.

Try to visualize them separately for a non-singular and a singular Gaussian distribution. They are covered extensively in the books of Kotz and Anderson. Do give them a read!

Some Useful Problems:

ISI MStat PSB 2008 Problem 8 | Bivariate Normal Distribution

This is a very beautiful sample problem from ISI MStat PSB 2008 Problem 8. It's a very simple problem, based on the bivariate normal distribution, which again teaches us that observing the right thing makes a seemingly laborious problem beautiful. Fun to think about, go for it!

Problem- ISI MStat PSB 2008 Problem 8


Let \( \vec{Y} = (Y_1,Y_2)' \) have the bivariate normal distribution \( N_2( \vec{0}, \Sigma ) \),

where \( \Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_2\sigma_1 & \sigma_2^2 \end{pmatrix} \).

Obtain the mean and variance of \( U= \vec{Y}' {\Sigma}^{-1}\vec{Y} - \frac{Y_1^2}{\sigma_1^2} \).

Prerequisites


Bivariate Normal

Conditional Distribution of Normal

Chi-Squared Distribution

Solution :

This is a very simple and cute problem; all the labour reduces once you see what you need to see!

Remember the pdf of \(N_2( \vec{0}, \Sigma)\)?

Isn't \( \vec{Y}'\Sigma^{-1}\vec{Y}\), up to the factor \(-\frac{1}{2}\), the exponent of \(e\) in the pdf of the bivariate normal?

So, we can say \(\vec{Y}'\Sigma^{-1}\vec{Y} \sim {\chi_2}^2 \). Can we? Verify it!!
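If you'd rather see it than prove it first, here is a small Monte Carlo sketch (the parameter values \(\sigma_1, \sigma_2, \rho\) are arbitrary choices):

```python
import numpy as np

# Check that Y' Sigma^{-1} Y behaves like a chi-squared with 2 df
# (mean 2, variance 4). Parameters are arbitrary illustrative values.
rng = np.random.default_rng(2)
s1, s2, rho, n = 2.0, 1.5, 0.6, 100000
Sigma = np.array([[s1**2, rho*s1*s2],
                  [rho*s1*s2, s2**2]])

Y = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
# Quadratic form Y_i' Sigma^{-1} Y_i for every draw i at once.
Q = np.einsum('ij,jk,ik->i', Y, np.linalg.inv(Sigma), Y)

print(Q.mean(), Q.var())   # should be near 2 and 4
```

The sample mean and variance land near 2 and 4, the moments of \(\chi_2^2\), whatever valid \((\sigma_1, \sigma_2, \rho)\) you plug in.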

Also, clearly \( \frac{Y_1^2}{\sigma_1^2} \sim {\chi_1}^2 \), since \(Y_1\) follows a univariate normal \( N(0, \sigma_1^2) \).

So, the expectation is easy to find by accumulating the above deductions; I'm leaving it as an exercise.

Calculating the variance may seem a laborious job at first, but now let's look at the pdf of the conditional distribution of \( Y_2 \mid Y_1=y_1 \). What is the exponent of \(e\) in this pdf? Up to the factor \(-\frac{1}{2}\), it is exactly \( U = \vec{Y}' {\Sigma}^{-1}\vec{Y} - \frac{Y_1^2}{\sigma_1^2} \), right!!

and also \( U \sim \chi_1^2 \). Now for the last piece of subtle deduction: claim that \(U\) and \( \frac{Y_1^2}{\sigma_1^2} \) are independently distributed. Can you argue why? Go ahead. So, \( U+ \frac{Y_1^2}{\sigma_1^2} \sim \chi_2^2 \).

So, \( Var( U + \frac{Y_1^2}{\sigma_1^2})= Var( U) + Var( \frac{Y_1^2}{\sigma_1^2}) \)

\( \Rightarrow Var(U)= 4-2=2 \), [since the variance of a random variable following \(\chi_n^2\) is \(2n\)].
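The whole chain of deductions, \(E(U)=1\) and \(Var(U)=2\), can be sanity-checked by simulation; the parameter values below are arbitrary:

```python
import numpy as np

# Monte Carlo check of E(U) = 1 and Var(U) = 2, i.e. U ~ chi-squared with 1 df.
# sigma_1, sigma_2, rho are arbitrary illustrative values.
rng = np.random.default_rng(3)
s1, s2, rho, n = 1.0, 2.0, -0.4, 200000
Sigma = np.array([[s1**2, rho*s1*s2],
                  [rho*s1*s2, s2**2]])

Y = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
Q = np.einsum('ij,jk,ik->i', Y, np.linalg.inv(Sigma), Y)  # Y' Sigma^{-1} Y
U = Q - Y[:, 0]**2 / s1**2                                # subtract Y_1^2 / sigma_1^2

print(U.mean(), U.var())   # should be near 1 and 2
```

Changing the sign of \(\rho\) or the scales leaves the two printed numbers essentially unchanged, which is the point of the problem.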

Hence the solution concludes.


Food For Thought

Before leaving, let's broaden our minds and deal with the multivariate normal!

Let \(\vec{X}\) be a \(4 \times 1\) random vector such that \( \vec{X} \sim N_4(\vec{\mu}, \Sigma ) \), where \(\Sigma\) is a positive definite matrix. Then can you show that,

\( P( f_{\vec{X}}(\vec{X}) \ge c) = \begin{cases} 0 & c \ge \frac{1}{4\pi^2\sqrt{|\Sigma|}} \\ 1-(\frac{k+2}{2})e^{-\frac{k}{2}} & c < \frac{1}{4\pi^2\sqrt{|\Sigma|}} \end{cases}\)

where \( k=-2\ln(4\pi^2c \sqrt{|\Sigma|}) \).
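A Monte Carlo sketch of this claim, taking \(\vec{\mu} = \vec{0}\) and \(\Sigma = I_4\) purely for convenience (so \(|\Sigma| = 1\)):

```python
import numpy as np

# Check the closed form for P(f_X(X) >= c) in the N_4 case.
# We take mu = 0 and Sigma = I_4 for simplicity, so |Sigma| = 1.
rng = np.random.default_rng(4)
n = 200000
X = rng.standard_normal((n, 4))

f_max = 1 / (4 * np.pi**2)                # peak density of N_4(0, I_4)
c = 0.5 * f_max                           # any c strictly below the peak works
dens = np.exp(-0.5 * np.sum(X**2, axis=1)) / (4 * np.pi**2)

k = -2 * np.log(4 * np.pi**2 * c)         # k = -2 ln(4 pi^2 c sqrt(|Sigma|))
closed_form = 1 - ((k + 2) / 2) * np.exp(-k / 2)
print(np.mean(dens >= c), closed_form)
```

The two printed values agree up to Monte Carlo error; the hint hiding in the code is that \(\{f_{\vec{X}}(\vec{X}) \ge c\}\) is the event \(\{(\vec{X}-\vec{\mu})'\Sigma^{-1}(\vec{X}-\vec{\mu}) \le k\}\), a \(\chi_4^2\) probability.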

Keep your thoughts alive!!


Similar Problems and Solutions



ISI MStat PSB 2008 Problem 10