Cycles, Symmetry, and Counting | ISI MStat 2016 PSB | Problem 2

This problem is a beautiful and elegant application of basic counting principles, symmetry, and double counting in combinatorics. This is Problem 2 from ISI MStat 2016 PSB.

Problem

Determine the average value of
$$
i_{1} i_{2}+i_{2} i_{3}+\cdots+i_{9} i_{10}+i_{10} i_{1}
$$ taken over all permutations \(i_{1}, i_{2}, \ldots, i_{10}\) of \(1,2, \ldots, 10\).

Prerequisites

Solution

The problem may seem mind-boggling at first, even if you try it for \(n = 4\) instead of \(n = 10\).

But in mathematics, symmetry is really intriguing. Let's see how a symmetry argument works here; it begins simply by counting. Let's look at the problem geometrically.

\( i_{1}\to i_{2} \to i_{3} \to \cdots \to i_{9} \to i_{10} \to i_{1}\) is a cycle, right?

Figure: the cycle drawn as a regular hexagon (the case \(n = 6\)).

Now, the symmetry argument starts from this symmetric figure.

We will do the problem for general \(n\).

Central Idea: Let's fix a pair, say [ 4 - 5 ], and count in how many of the permutations the edge [ 4 - 5 ] occurs in the cycle.

We will see that there is nothing special about [ 4 - 5 ]; this is the symmetry argument. Therefore, the count is the same for every such pair.

Observe that to build any such cycle containing the edge [ 4 - 5 ], there are three things to choose: the position of the edge [ 4 - 5 ] among the \(n\) edges of the cycle, the order of 4 and 5 within that edge, and the arrangement of the remaining \(n - 2\) numbers in the remaining places.

The numbers of choices corresponding to these are \(n\), \(2!\), and \((n-2)!\) respectively. For instance, for \(n = 6\) (the hexagon), the edge [ 4 - 5 ] occurs in \(6 \times 2! \times 4! = 288\) of the \(6! = 720\) permutations.

Figure: the basic counting principle illustrated on the hexagon.

So, in total, the edge [ 4 - 5 ] occurs \( n \times 2! \times (n-2)! \) times across all the permutations.

By the symmetry argument, every edge [ \( i - j \) ] occurs \( n \times 2! \times (n-2)! \) times.
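As a quick numerical sanity check (not part of the original argument), here is a minimal Python sketch that brute-forces this count for a small \(n\); the variable names are illustrative, and full enumeration is feasible only for small \(n\).

```python
from itertools import permutations
from math import factorial

# Check, by brute force for a small n, that every unordered pair {i, j}
# appears as an adjacent pair of the cycle i1 -> i2 -> ... -> in -> i1
# in exactly n * 2! * (n-2)! of the n! permutations.
n = 6
expected = n * 2 * factorial(n - 2)

counts = {}
for perm in permutations(range(1, n + 1)):
    for k in range(n):
        a, b = perm[k], perm[(k + 1) % n]   # an edge of the cycle (wraps around)
        pair = frozenset((a, b))            # unordered pair {a, b}
        counts[pair] = counts.get(pair, 0) + 1

print(all(c == expected for c in counts.values()))  # True
print(expected)                                     # 288 when n = 6
```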

Thus, summing the expression over all \( n! \) permutations, we get

$$ \sum_{1 \leq i < j \leq n} ij \times \text{(number of permutations in which the edge } [i - j] \text{ occurs)} $$

$$ = \sum_{1 \leq i < j \leq n} ij \times n \times 2! \times (n-2)! = n \times (n-2)! \times \sum_{1 \leq i < j \leq n} 2ij $$

Now, there are \( n!\) permutations in total. So, to take the average, we divide by \( n!\) to get $$ \frac{\sum_{1 \leq i < j \leq n} 2ij}{n-1} $$

$$ = \frac{\left(\sum_{i=1}^{n} i\right)^2 - \sum_{i=1}^{n} i^2 }{n-1} \quad \left( \text{since } 2\sum_{1 \leq i < j \leq n} ij = \left(\sum_{i=1}^{n} i\right)^2 - \sum_{i=1}^{n} i^2 \right) $$

$$ = \frac{\frac{n^2(n+1)^2}{4} - \frac{n(n+1)(2n+1)}{6}}{n-1} = \frac{n(n+1)(3n+2)}{12} $$

In particular, for \( n = 10 \), the required average is \( \frac{10 \times 11 \times 32}{12} = \frac{880}{3} \).
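The closed form can also be checked by brute force for small \(n\). The sketch below (standard library only; `brute_force_average` and `closed_form` are helper names introduced here for illustration) compares the enumerated average with \( \frac{n(n+1)(3n+2)}{12} \) and prints the \(n = 10\) value.

```python
from itertools import permutations
from fractions import Fraction
from math import factorial

# Average of i1*i2 + i2*i3 + ... + i_n*i1 over all n! permutations,
# compared with the closed form n(n+1)(3n+2)/12. Only small n are feasible.
def brute_force_average(n):
    total = 0
    for perm in permutations(range(1, n + 1)):
        total += sum(perm[k] * perm[(k + 1) % n] for k in range(n))
    return Fraction(total, factorial(n))

def closed_form(n):
    return Fraction(n * (n + 1) * (3 * n + 2), 12)

for n in range(3, 8):
    assert brute_force_average(n) == closed_form(n)

print(closed_form(10))  # 880/3 -- the required average for n = 10
```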

Edit 1:

One of the readers, Vishal Routh, has shared his solution using conditional expectation; his solution is shared below in picture format.

Video Solution:

Restricted Regression Problem | ISI MStat 2017 PSB Problem 7

This is a regression problem, where we use the ordinary least squares method to estimate the parameters in a restricted scenario. This is ISI MStat 2017 PSB Problem 7.

Problem

Consider independent observations \(\left\{\left(y_{i}, x_{1 i}, x_{2 i}\right): 1 \leq i \leq n\right\}\) from the regression model
$$
y_{i}=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}+\epsilon_{i}, i=1, \ldots, n
$$ where \(x_{1 i}\) and \(x_{2 i}\) are scalar covariates, \(\beta_{1}\) and \(\beta_{2}\) are unknown scalar
coefficients, and \(\epsilon_{i}\) are uncorrelated errors with mean 0 and variance \(\sigma^{2}>0\). Instead of using the correct model, we obtain an estimate \(\hat{\beta_{1}}\) of \(\beta_{1}\) by minimizing
$$
\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}
$$ Find the bias and mean squared error of \(\hat{\beta}_{1}\).

Prerequisites

Solution

It is sort of a restricted regression problem: perhaps we have already tested and accepted that \(\beta_2 = 0\), so we are interested in the estimate of \(\beta_1\) under the restriction \(\beta_2 = 0\). This is the statistical significance of the problem, and we will see how it shows up in the estimate of \(\beta_1\).

Let's start with some notation: write
\( s_{a,b} = \sum_{i=1}^{n} a_{i} b_{i} \)

Let's minimize \( L(\beta_1) = \sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}\) by differentiating w.r.t \(\beta_1\) and equating to 0.

\( \frac{dL(\beta_1)}{d\beta_1} = \frac{d}{d\beta_1}\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2} = 0\)

\( \Rightarrow \sum_{i=1}^{n} x_{1 i} \left(y_{i}-\beta_{1} x_{1 i}\right) = 0 \)

\( \Rightarrow \hat{\beta_1} = \frac{s_{x_{1},y}}{s_{x_{1},x_{1}}} \)
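As a sanity check on this closed form (a sketch assuming NumPy is available, with arbitrary made-up data), it agrees with NumPy's least-squares solver applied to the single-covariate design matrix.

```python
import numpy as np

# The closed form beta1_hat = s_{x1,y} / s_{x1,x1} should match
# numpy's least-squares solution for the model y ~ beta1 * x1.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
y = 2.0 * x1 + rng.normal(size=n)   # arbitrary data, only for the check

beta1_closed_form = np.sum(x1 * y) / np.sum(x1 * x1)
beta1_lstsq = np.linalg.lstsq(x1.reshape(-1, 1), y, rcond=None)[0][0]

print(np.isclose(beta1_closed_form, beta1_lstsq))  # True
```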

From the given conditions, \( E(Y_{i})=\beta_{1} X_{1 i}+\beta_{2} X_{2 i}\).

\( \Rightarrow E(s_{X_{1},Y}) = \beta_{1}s_{X_{1},X_{1}} +\beta_{2} s_{X_{1},X_{2}} \).

Since the \(x\)'s are constants, \( E(\hat{\beta_1}) = \beta_{1} +\beta_{2} \frac{s_{X_{1},X_{2}}}{s_{X_{1},X_{1}}} \).

\( Bias(\hat{\beta_1}) = \beta_{2} \frac{s_{X_{1},X_{2}}}{s_{X_{1},X_{1}}} \).

Thus, observe that the closer \( \beta_2 \) is to 0, the closer the bias is to 0. The bias also vanishes when the covariates are orthogonal, i.e., when \( s_{X_{1},X_{2}} = 0 \).
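Since the covariates are constants, \( E(\hat{\beta_1}) = \frac{s_{X_{1}, E(Y)}}{s_{X_{1},X_{1}}} \), so plugging the noiseless responses \( E(Y_i) = \beta_1 X_{1i} + \beta_2 X_{2i} \) into the formula for \( \hat{\beta_1} \) reproduces the bias exactly. The sketch below (assuming NumPy; all covariates and coefficients are made up for illustration) verifies the bias formula this way.

```python
import numpy as np

# E(beta1_hat) = s_{x1, E(Y)} / s_{x1, x1}, so the noiseless responses
# E(Y_i) = b1*x1_i + b2*x2_i reproduce the bias formula exactly.
rng = np.random.default_rng(1)
n, b1, b2 = 30, 1.5, 0.7
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

mean_y = b1 * x1 + b2 * x2                                # E(Y_i)
expected_beta1_hat = np.sum(x1 * mean_y) / np.sum(x1 * x1)
bias_formula = b2 * np.sum(x1 * x2) / np.sum(x1 * x1)

print(np.isclose(expected_beta1_hat - b1, bias_formula))  # True
```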

From the given conditions, \( Y_{i} - \beta_{1} X_{1 i} - \beta_{2} X_{2 i} = \epsilon_{i} \) has mean \(0\) and variance \(\sigma^2\), and the \(Y_{i}\) are uncorrelated; no particular distribution is assumed.

Hence \( \hat{\beta_1} = \frac{s_{X_{1},Y}}{s_{X_{1},X_{1}}} \) has mean \( E(\hat{\beta_{1}}) \), computed above, and variance \( Var(\hat{\beta_1}) \), which we now compute.

\(Var(\hat{\beta_1}) = \frac{\sum_{i=1}^{n} x_{1i}^2 Var(Y_{i})}{s_{X_1, X_1}^2} = \frac{\sigma^2}{s_{X_1, X_1}} \)

\( MSE(\hat{\beta_1}) = \text{Variance} + \text{Bias}^2 = \frac{\sigma^2}{s_{X_1, X_1}} + \beta_{2}^2\left(\frac{s_{X_{1},X_{2}}}{s_{X_{1},X_{1}}}\right)^2\)

Observe that the MSE, too, is minimized when \(\beta_2 = 0\).
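Finally, here is a small Monte Carlo sketch (assuming NumPy, with arbitrary made-up parameters) that checks the variance and MSE formulas numerically.

```python
import numpy as np

# Monte Carlo check of Var(beta1_hat) = sigma^2 / s11 and
# MSE(beta1_hat) = sigma^2 / s11 + (b2 * s12 / s11)^2.
rng = np.random.default_rng(2)
n, b1, b2, sigma = 40, 1.0, 0.5, 2.0
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
s11, s12 = np.sum(x1 * x1), np.sum(x1 * x2)

reps = 200_000
eps = rng.normal(scale=sigma, size=(reps, n))   # uncorrelated errors, mean 0, variance sigma^2
y = b1 * x1 + b2 * x2 + eps                     # broadcasts over the replications
beta1_hat = y @ x1 / s11                        # restricted OLS estimate, one per replication

print(np.var(beta1_hat), sigma**2 / s11)             # variance: simulated vs formula
print(np.mean((beta1_hat - b1) ** 2),
      sigma**2 / s11 + (b2 * s12 / s11) ** 2)        # MSE: simulated vs formula
```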