ISI MSTAT PSB 2011 Problem 4 | Digging deep into Multivariate Normal

This is an interesting problem from ISI MSTAT PSB 2011 Problem 4 that tests how well a student can visualize the normal distribution in higher dimensions.

The Problem: ISI MSTAT PSB 2011 Problem 4

Suppose that \( X_1,X_2,... \) are independent and identically distributed \(d\) dimensional normal random vectors. Consider a fixed \( x_0 \in \mathbb{R}^d \) and for \(i=1,2,...,\) define \(D_i = \| X_i - x_0 \| \), the Euclidean distance between \( X_i \) and \(x_0\). Show that for every \( \epsilon > 0 \), \(P[\min_{1 \le i \le n} D_i > \epsilon] \rightarrow 0 \) as \( n \rightarrow \infty \).

Prerequisites:

  1. Finding the distribution of the minimum order statistic
  2. Multivariate Gaussian properties

Solution:

First of all, observe that since the \( D_i \) are independent and identically distributed, \( P(\min_{1 \le i \le n} D_i > \epsilon)=P(D_1 > \epsilon)^n \). (Verify this yourself!)
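As a sanity check, this identity can be verified numerically. The sketch below is a minimal Monte Carlo experiment in Python/NumPy; the specific choices \( d = 2 \), \( n = 5 \), \( x_0 = (3, 0) \), \( \epsilon = 1 \) and the standard Gaussian are illustrative assumptions, not part of the problem statement.

```python
import numpy as np

# Monte Carlo check of P(min_i D_i > eps) = P(D_1 > eps)^n
# (illustrative choices: d = 2, n = 5, x0 = (3, 0), eps = 1, standard Gaussian)
rng = np.random.default_rng(0)
d, n, eps = 2, 5, 1.0
x0 = np.array([3.0, 0.0])
trials = 200_000

X = rng.standard_normal((trials, n, d))   # n i.i.d. d-dimensional Gaussians per trial
D = np.linalg.norm(X - x0, axis=2)        # Euclidean distances D_i = ||X_i - x0||
lhs = np.mean(D.min(axis=1) > eps)        # direct estimate of P(min_i D_i > eps)
rhs = np.mean(D[:, 0] > eps) ** n         # estimate of P(D_1 > eps), raised to n
print(lhs, rhs)                           # the two estimates agree up to Monte Carlo error
```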

However, it turns out to be more convenient to study the complementary event \( \{D_i < \epsilon \} \).

Let me elaborate why this makes sense!

Let \( \phi \) denote the \( d \) dimensional Gaussian density, and let \( B(x_0, \epsilon) \) be the Euclidean ball around \( x_0 \) of radius \( \epsilon \). Note that \( \{D_i < \epsilon\} \) is the event that the Gaussian \( X_i \) lands in this Euclidean ball.

So, if we can show that this event has positive probability for any given \( x_0, \epsilon \) pair, we will be done: in the limit, we will be raising a number strictly less than 1 to a power that grows without bound.

In particular, we have \( P(D_i < \epsilon)= \int_{B(x_0, \epsilon)} \phi(x) \, dx \geq |B(x_0, \epsilon)| \inf_{x \in B(x_0, \epsilon)} \phi(x) \), where \( |B(x_0, \epsilon)| \) denotes the volume of the ball. By rotational symmetry, the Gaussian density decays as we move away from the centre, so this infimum is attained at the point of the ball farthest from the origin and equals \( \phi \left( x_0 + \epsilon \frac{x_0}{\|x_0\|} \right) \) when \( x_0 \neq 0 \) (when \( x_0 = 0 \), the infimum is \( \phi \) evaluated at any point of norm \( \epsilon \)). To see that this point is indeed extremal, note that \( B(x_0, \epsilon) \subset B(0, \epsilon + \|x_0\|) \).

So, what we have shown is that there exists a \( \delta > 0 \) such that \( P(D_i < \epsilon) > \delta \).

Consequently, \( P(D_i > \epsilon) = 1 - P(D_i < \epsilon) < 1 - \delta \), a number strictly below 1.

Thus, we have \( \lim_{n \rightarrow \infty} P(D_i > \epsilon)^n \leq \lim_{n \rightarrow \infty} (1-\delta)^n = 0 \).

Hence we are done.
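The convergence can also be seen concretely: estimate \( p = P(D_1 > \epsilon) \) once by simulation, then use the identity \( P(\min_{1 \le i \le n} D_i > \epsilon) = p^n \). A minimal sketch in Python/NumPy, again with the illustrative assumptions \( d = 2 \), \( x_0 = (3, 0) \), \( \epsilon = 1 \) and a standard Gaussian:

```python
import numpy as np

# Estimate p = P(D_1 > eps) once; by independence, P(min_i D_i > eps) = p^n.
# Illustrative assumptions: d = 2, x0 = (3, 0), eps = 1, standard Gaussian.
rng = np.random.default_rng(1)
d, eps = 2, 1.0
x0 = np.array([3.0, 0.0])

X = rng.standard_normal((1_000_000, d))
p = np.mean(np.linalg.norm(X - x0, axis=1) > eps)
for n in (1, 100, 1_000, 10_000):
    print(n, p ** n)   # p < 1, so p^n decays geometrically towards 0
```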

Food for thought:

There is a fantastic amount of statistical literature on the equi-density contours of a multivariate Gaussian distribution.

Try to visualize them for nonsingular and singular Gaussian distributions separately. They are covered extensively in the books of Kotz and Anderson. Do give them a read!

Some Useful Problems:

Size, Power, and Condition | ISI MStat 2019 PSB Problem 9

This is a problem from the ISI MStat Entrance Examination, 2019. It primarily tests one's familiarity with the size and power of a test, and the ability to condition on events properly.

The Problem:

Let Z be a random variable with probability density function

\( f(z)=\frac{1}{2} e^{-|z- \mu|} , z \in \mathbb{R} \) with parameter \( \mu \in \mathbb{R} \). Suppose we observe \( X = \max(0, Z) \).

(a) Find the constant \( c \) such that the test that "rejects when \( X>c \)" has size 0.05 for the null hypothesis \(H_0 : \mu=0 \).

(b) Find the power of this test against the alternative hypothesis \(H_1: \mu =2 \).

Prerequisites:

  1. Size and power of a test
  2. Conditioning on events

And believe me, as Joe Blitzstein says: "Conditioning is the soul of statistics."

Solution:

(a) If you know what the size of a test means, then you can easily write down the condition mentioned in part (a) in mathematical terms.

It simply means \( P_{H_0}(X>c)=0.05 \)

Now, under \( H_0 \), \( \mu=0 \).

So, we have the pdf of Z as \( f(z)=\frac{1}{2} e^{-|z|} \)

As the support of Z is \( \mathbb{R} \), we can partition it in \( \{Z \ge 0,Z <0 \} \).

Now, let's condition based on this partition. So, we have:

\( P_{H_0}(X > c)=P_{H_0}(X>c , Z \ge 0)+ P_{H_0}(X>c, Z<0) =P_{H_0}(X>c , Z \ge 0) =P_{H_0}(Z > c) \)

Do you understand the last equality? (Try to convince yourself why: note that \( c \) must be positive here, that \( X = 0 \) when \( Z < 0 \), and that \( X = Z \) when \( Z \ge 0 \).)

So, \( P_{H_0}(X >c)=P_{H_0}(Z > c)=\int_{c}^{\infty} \frac{1}{2} e^{-|z|} dz = \frac{1}{2}e^{-c} \)

Equating \(\frac{1}{2}e^{-c} \) with 0.05, we get \( c= \ln{10} \)
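A quick numerical check of this size calculation. The sketch below (Python/NumPy) samples a standard Laplace variable as the difference of two independent Exponential(1) variables, which is a standard representation:

```python
import numpy as np

# Size check for "reject when X > c" under H0: mu = 0.
# A standard Laplace variable equals the difference of two
# independent Exponential(1) variables.
rng = np.random.default_rng(2)
c = np.log(10)                             # c = ln 10
E1, E2 = rng.exponential(size=(2, 1_000_000))
Z = E1 - E2                                # Z ~ standard Laplace
X = np.maximum(0.0, Z)                     # the observed variable X = max(0, Z)
size = np.mean(X > c)
print(size)                                # close to 0.05, up to Monte Carlo error
```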

(b) The second part is mere calculation once you know the value of \( c \).

Power of test against \(H_1 \) is given by:

\(P_{H_1}(X>\ln{10})=P_{H_1}(Z > \ln{10})=\int_{\ln{10}}^{\infty} \frac{1}{2} e^{-|z-2|} dz = \frac{e^2}{20} \)
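The power can be checked the same way, shifting the Laplace variable to centre \( \mu = 2 \). A sketch in Python/NumPy, with the same Exponential-difference construction:

```python
import numpy as np

# Power check under H1: mu = 2, using c = ln 10 from part (a).
rng = np.random.default_rng(3)
c = np.log(10)
E1, E2 = rng.exponential(size=(2, 1_000_000))
Z = 2.0 + (E1 - E2)                        # Laplace centred at mu = 2
power = np.mean(np.maximum(0.0, Z) > c)
print(power, np.exp(2) / 20)               # Monte Carlo estimate vs. exact e^2 / 20
```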

Try out this one:

The pdf occurring in this problem is an example of a Laplace distribution. Look it up if you are not familiar with it, and go through its properties.

Suppose you have a random variable V which follows Exponential Distribution with mean 1.

Let \( I \) be a Bernoulli(\(\frac{1}{2} \)) random variable. It is given that \( I \) and \( V \) are independent.

Can you find a continuous function \( h = h(I,V) \) of \( I \) and \( V \) (which is then itself a random variable) such that \( h \) has the standard Laplace distribution?