ISI MStat PSB 2004 Problem 1 | Games and Probability

This is a very beautiful sample problem from ISI MStat PSB 2004 Problem 1. Games are best ways to understand the the role of chances in life, solving these kind of problems always indulges me to think and think more on the uncertainties associated with the system. Think it over !!

Problem- ISI MStat PSB 2004 Problem 1


Suppose two teams play a series of games, each producing a winner and a loser, until one team has won two or more games than the other. Let G be the number of games played. Assume echs team has a chance of 0.5 to win each game, independent of the results of the previous games.

(a) Find the probability distribution of G.

(b) Find the expected value of G.

Prerequisites


Naive Probability

Counting priciples

Geometric distribution.

Conditional expectation .

Solution :

While solving this kind of problem, First what we should do is to observe those things which remains invariant.

Here, observe that the game will always terminate, with consecutive wins of one team. Imagine there are two teams \(T_1\) and \( T_2\) . If the first two matches are won by any of the two teams we are done, but say \(T_1\) won the first match, but \(T_2\) won the 2nd match, so its a draw and the 2 matches played are basically lost, and we should renew the counting a fresh.

So, can I claim that G (as defined in the question), will always be even !! verify this claim yourself !

so, considering the event G=g , g is even. So, if the game terminates at the gth game, then it is evident from the logic we established above, then both the (g-1)th and gth game was won by the winning team. So, among the first (g-2) games, both the team won equal number of games, and ended in a draw. So, after the (g-2)th game, the teams can be at a draw in \( 2^( \frac{g-2}{2}) \) ways, and the last two matches can be won by any of the two teams in 2 ways. And the g matches can result in \(2^g\) different arrangements of wins and loss (from a perspective of any of the teams ).

(a) So, P(G=g)= \( \frac{2* 2^( \frac{g-2}{2})}{2^g}= \frac{1}{2^{\frac{g}{2}}} \) ; g=2,4,6,.......

Hence the distribution of G. Hold on ! is is looking like geometric somehow ?? Find it out!!

(b) While finding expectation of G, we can use the conventional definition of expectation and, and since I said that distribution of G is (somewhat) geometric, basically to be precise \( \frac{G}{2} \sim Geo(0.5) \), so, clearly expectation will be 4. But I will take this chance to show another beautiful and elegant method to find out expectation, using conditional expectation. So, we will first condition on the win over first match and develop a recursion, which I'm obsessed with. Though one may not find this method useful in this problem since the distribution of G is known to us, but life is not about this problem, isn't it ! What if the distribution is not known but the pattern is visible, only if you are ready to see it. Lets proceed,

Suppose, without loss of generality, let us assume, \(T_1\) won the first game, so 1 game is gone with probability 0.5 and with same probability it is expected that E(G|\(T_1\) is leading by 1 game ) number of games is to be played, similarly( OR) if \(T_2\) wins the first game, then with probability 0.5 , 1 game is gone and an expected E(G|\(T_2\) is leading by 1 game) number of games is to be played.

So, if we right the above words mathematically, it looks like,

E(G)= P(\(T_1\) wins the 1rst game )( 1+E(G|\(T_1\) is leading by 1 game))+P(\(T_4\) wins the 1rst game)(1+E(G|\(T_2\) is leading by 1 game)),......................(*)

So, now we need to find out E(G|\(T_1\) is leading by 1 game), the other is same due to symmetry!

so, expected number of games to be played when we know that \(T_1\) is leading the by 1 game, is the next game can be a win for \(T_1\), with probability 0.5, OR, \(T_1\) can lose the game with probability 0.5, and reach the stage of draw, and from here, all the games played is wasted and we start counting afresh (Loss of memory, remember !! ) , so 1 game is lost and still a expected E(G) numbers of games to follow before it terminates. So, mathematically,

E(G|\(T_1\) is leading by 1 game)= P(\(T_1\) wins again)x1 + P(\(T_1\) looses and draws)(1+E(G))=0.5+0.5+(0.5)E(G)=1+(0.5)E(G).

plugging in this, in (*), one will obtain a recursion of E(G), and calculate is as 4. So, on an average, the game terminates after 4 matches.


Food For Thought

Can you generalize the above problem, when I say, that the chance of winning each match, for one of the two team is some p, ( 0<p<1) ? Try it !

Wait!! Before leaving, lets toss some coins,

You toss a fair coin repeatedly. What is the expected number of tosses you think, you have to perform to get the pattern of HH (heads and heads) for the first time? What about the expected number of tosses when your pattern of interest is TH (or HT) ?? for which pattern you think you need to spend more time?? Does your intuition corroborates, the mathematical conclusions?? if not what's the reason you think, you are misled by your intuition ??

Think it over and over !! You are dealing with one of the most beautiful perspective of uncertainty !!


Similar Problems and Solutions



ISI MStat PSB 2008 Problem 10
Outstanding Statistics Program with Applications

Outstanding Statistics Program with Applications

Subscribe to Cheenta at Youtube


ISI MStat PSB 2013 Problem 5 | Simple Random Sampling

This is a sample problem from ISI MStat PSB 2013 Problem 5. It is based on the simple random sampling model, finding the unbiased estimates of the population size. But think over the "Food for Thought" any kind of discussion will be appreciated. Give it a try!

Problem- ISI MStat PSB 2013 Problem 5


A box has a unknown number of tickets serially numbered 1,2,.....,N. Two tickets are drawn using simple random sampling without replacement (SRSWOR) from the box. If X and Y are the numbers on the tickets and Z=max(X,Y), show that

(a) Z is not ubiased for N.

(b) \( aX+ bY+ c \) is unbiased for N if and only if \(a+b=2 \) and \( c=-1 \).

Prerequisites


Naive Probability

Counting priciples

Unbiased estimators.

Simple random sampling .

Solution :

For this problem, first let us find the pmf of Z, where we will need some counting techniques.

Since we are drawing balls at a random and not replacing the drawn ball after each draw (SRSWOR), so, clearly its about choosing two numbers from the se of N elements {1,.....,N}. So, all possible sample of size 2 , that than be drawn from the population of N units is \( { N \choose 2}\) .

now Z defined here as the maximum of the two chosen numbers, so, all possible values of Z are 2,3,....,N.

Now lets assume that Z=k, so now we just need to find out what are the possible pairs, such that k comes the max among both, or in other words if k is the maximum of the drawn numbers, what are the possible values that the other number can take ? Well, its simple the other ticket can carry any number less than k, an since there are k-1 such numbers. So there are (k-1) such pairs where the maximum numbered ticket is k. (not concerned on the ordering, of the two observation)

So, the pmf of Z=max(X,Y) , i.e. \( P(Z=k) = \begin{cases} \frac{k-1}{{N \choose 2}} & k=2,3,....,N \\ 0 & otherwise \end{cases} \)

(a) So, now to check whether Z is unbiased for N, we need to check E(Z),

\(E(Z)= \sum_{k=2}^N{k}{\frac{k-1}{{N \choose 2}}} =\frac{1}{{N \choose 2}}\sum_{k=2}^N{k(k-1)}=(\frac{2}{3}) (N+1) \).

so, \( E(Z)=\frac{2}{3} (N+1) \neq N \). Hence Z is not Unbiased for the population size, N.

(b) Similarly, we find the expectation of T=aX+bY+c,

\( E(T)=aE(X)+bE(Y)+c= a \sum_{i=1}^N i P(X=i) + b \sum_{j=1}^N j P(Y=j) + c,\)

now here \( P(X=i)=P(Y=i)= \frac{1}{N} \), so, \( E(T) = a \frac{N+1}{2}+ b\frac{N+1}{2}+c = (a+b) \frac{N+1}{2} +c,\)

clearly, E(T) = N, i.e T will be unbiased for N, iff a+b=2 and c=-1.

Hence we are done !


Food For Thought

Now, suppose that the numbers on the tickets are random, that is it can be any positive integer, ( like say 220 or 284), but thankfully you now the total number of tickets .i.e. N is known . Now you are collecting tickets for yourself and k-1 of your friends, and the number c is lucky for you and you wish to keep it in your collection, and select the remaining k-1 tickets out of N-1 tickets, and you calculate a sample mean(of the collected numbers) \( \bar{y'}\), Can I claim that \(c+ (N-1)\bar{y'}\) is an unbiased estimator of the population total ? Do you know this estimator shows less variance than the conventional unbiased estimator of the population total? Can you show that too?? why do you think the variance minimizes??

By the Way, Do you know, in mathematics 220 and 284 are quite special ?? They are the first "amicable numbers". One can obtain the other by summing over its own divisors !! So, to become amicable one needs to increase the size of their mind and heart !! Keep increasing both!! Till then.... bye.


Similar Problems and Solutions



ISI MStat PSB 2008 Problem 10
Outstanding Statistics Program with Applications

Outstanding Statistics Program with Applications

Subscribe to Cheenta at Youtube


Cycles, Symmetry, and Counting | ISI MStat 2016 PSB | Problem 2

This problem is a beautiful and elegant application of basic counting principles, symmetry and double counting principles in combinatorics. This is Problem 2 from ISI MStat 2016 PSB.

Problem

Determine the average value of
$$
i_{1} i_{2}+i_{2} i_{3}+\cdots+i_{9} i_{10}+i_{10} i_{1}
$$ taken over all permutations \(i_{1}, i_{2}, \ldots, i_{10}\) of \(1,2, \ldots, 10\).

Prerequisites

Solution

The problem may seem mind boggling at first, when you will even try to do it for \(n = 4\), instead of \(n = 10\).

But, in mathematics, symmetry is really intriguing. Let's see how a symmetry argument holds here. It is just by starting to count. Let's see this problem in a geometrical manner.

\( i_{1}\to i_{2} \to i_{3} \to \cdots \to i_{9} \to i_{10} \to i_{1}\) is sort of a cycle right?

A hexagon
n = 6

Now, the symmetry argument starts from this symmetric figure.

We will do the problem for general \(n\).

Central Idea: Let's fix a pair say [ 4 - 5 ], we will see in all the permutations, in how many times, [ 4 - 5 ] can occur.

We will see that there is nothing particular about [ 4 - 5 ], and this is the symmetry argument. Therefore, the number is symmetric along with all such pairs.

Observe, along every such cycle containing [ 4 - 5 ], there are three parameters:

The number corresponding to the above questions are the following:

Basic Counting Principle in a Hexagon

So, in total [ 4 - 5 ] edge will occur \( n \times 2! \times (n-2)! \) times.

By the symmetry argument, every edge [\( i - j\)], will occur \( n \times 2! \times (n-2)! \) times.

Thus, when we sum over all such permutations, we get the following

$$ \sum_{i, j = 1; i \neq j}^{n} {ij} \times \text{number of times [i - j] pair occur} $$

$$ = \sum_{i, j = 1; i \neq j}^{n} {ij} \times n \times 2! \times (n-2)! = n \times (n-2)! \times \sum_{i, j = 1; i \neq j}^{n} {2ij} $$

Now, there are \( n!\) permutations in total. So, to take the average, we divide by \( n!\) to get $$ \frac{\sum_{i, j = 1; i \neq j}^{n} {2ij}}{n-1} $$

$$ = \frac{(\sum_{i}^{n} i)^2 - \sum_{i}^{n} i^2 }{n-1} $$

$$ = \frac{\frac{n^2(n+1)^2}{4} - \frac{n(n+1)(2n+1)}{6}}{n-1} = \frac{n(n+1)(3n+2)}{12} $$

Edit 1:

One of the readers, Vishal Routh has shared his solution using Conditional Expectation, I am sharing his solution in picture format.

Video Solution:

Restricted Regression Problem | ISI MStat 2017 PSB Problem 7

This problem is a regression problem, where we use the ordinary least square methods, to estimate the parameters in a restricted case scenario. This is ISI MStat 2017 PSB Problem 7.

Problem

Consider independent observations \({\left(y_{i}, x_{1 i}, x_{2 i}\right): 1 \leq i \leq n}\) from the regression model
$$
y_{i}=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}+\epsilon_{i}, i=1, \ldots, n
$$ where \(x_{1 i}\) and \(x_{2 i}\) are scalar covariates, \(\beta_{1}\) and \(\beta_{2}\) are unknown scalar
coefficients, and \(\epsilon_{i}\) are uncorrelated errors with mean 0 and variance \(\sigma^{2}>0\). Instead of using the correct model, we obtain an estimate \(\hat{\beta_{1}}\) of \(\beta_{1}\) by minimizing
$$
\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}
$$ Find the bias and mean squared error of \(\hat{\beta}_{1}\).

Prerequisites

Solution

It is sort of a restricted regression problem because maybe we have tested the fact that \(\beta_2 = 0\). Hence, we are interested in the estimate of \(\beta_1\) given \(\beta_2 = 0\). This is essentially the statistical significance of this problem, and we will see how it turns out in the estimate of \(\beta_1\).

Let's start with some notational nomenclature.
\( \sum_{i=1}^{n} a_{i} b_{i} = s_{a,b} \)

Let's minimize \( L(\beta_1) = \sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}\) by differentiating w.r.t \(\beta_1\) and equating to 0.

\( \frac{dL(\beta_1)}{d\beta_1}\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2} = 0\)

\( \Rightarrow \sum_{i=1}^{n} x_{1 i} \left(y_{i}-\beta_{1} x_{1 i}\right) = 0 \)

\( \Rightarrow \hat{\beta_1} = \frac{s_{x_{1},y}}{s_{x_{1},x_{1}}} \)

From, the given conditions, \( E(Y_{i})=\beta_{1} X_{1 i}+\beta_{2} X_{2 i}\).

\( \Rightarrow E(s_{X_{1},Y}) = \beta_{1}s_{X_{1},X_{1}} +\beta_{2} s_{X_{1},X_{2}} \).

Since, \(x's\) are constant, \( E(\hat{\beta_1}) = \beta_{1} +\beta_{2} \frac{s_{X_{1},X_{2}}}{s_{X_{1},X_{1}}} \).

\( Bias(\hat{\beta_1}) = \beta_{2} \frac{s_{X_{1},X_{2}}}{s_{X_{1},X_{1}}} \).

Thus, observe that the more \( \beta_2 \) is close to 0, the more bias is close to 0.

From, the given conditions,

\( Y_{i} - \beta_{1} X_{1 i} - \beta_{2} X_{2 i}\) ~ Something\(( 0 , \sigma^2\)).

\( \hat{\beta_1} = \frac{s_{x_{1},y}}{s_{x_{1},x_{1}}}\) ~ Something\(( E(\hat{\beta_{1}}) , Var(\hat{\beta_1}))\).

\(Var(\hat{\beta_1}) = \frac{\sum_{i=1}^{n} x_{1i}^2 Var(Y_{i})}{s_{X_1, X_1}^2} = \frac{\sigma^2}{s_{X_1, X_1}} \)

\( MSE(\hat{\beta_1}) = Variance + \text{Bias}^2 = \frac{\sigma^2}{s_{X_1, X_1}} + \beta_{2}^2(\frac{s_{X_{1},X_{2}}}{s_{X_{1},X_{1}}})^2\)

Observe, that even the MSE is minimized if \(\beta_2 = 0\).

Counting Principle - Concept with Problem | Combinatorics

Learn the concept of the Counting Principle and make algorithms to count complex things in a simpler way with the help of a beautiful problem of Combinatorics.

Concept - Counting Principle.


Often we come across such problems that involves very complex counting which are nearly impossible to do in a trivial way. To encounter this kind of complexity of counting we use some $\textbf{tricks and tools}$; one of them is to find a algorithm or a pattern.

We try to obtain result for a similar but a simpler problem and then by the algorithm achieve the final goal.

One Handy Tool - Multiplication Principle and Its Application In Set Theory

Suppose we have two sets $X=\{x_1,x_2,x_3\}$ and $Y=\{y_1,y_2\}$ ,then number of order pairs $(x_i,y_j);\text{ such that } i=1,2,3 ; j=1,2$ is equal to $3\times2=6$

That is , If a set $S_1$ has $n$ elements and the set $S_2$ has $m$ element then the set $S_1\times S_2=\{(a,b) | a \in S_1 , b \in S_2\}$ has $nm$ elements.

Face the problem Now :

Let $S=\{1,2,3,\ldots ,10\}$ Find the number of subsets $A$ of $S$ such that $x\in A, $ and $2x \in S \Rightarrow 2x \in A$

Key Concepts


Combinatorics

Counting

Set Theory

Check the Answer


Answer: $180$

Try with Hints


Understand The Problem :

What is the set $A$? Let us see some feasible options for $A$

$\{1,2,4,8\}$, $\{2,4,8\}$, $\{1,3,2,4,8,6\} \ldots \text{etc.}$Is there any pattern ?

Ok wait !!!! There is one thing that if we take $1$ in $A$ then $2,4,8$ must belongs to $A$, similarly, if we take $3 \in A$ then $6\in A$ .

Good.... we are surely getting something interesting. Do you want to give it a thought ?

Let's Take a Simpler Example :

Let us see what happens if $S$ was given as $\{1,2,3\}$ , then all possible options of $A$ would be $\{1,2\}$, $\{2\}$, $\{3\}$, $\{2,3\}$, $\phi$. i.e, $6$ such possible options.

We can notice a very interesting thing here that the elements $1,2$ are making a group because twice of $1$ is $2$ let call them elements of $1^{st}$ kind and $3$ is not related to the elements by this kind of relation, let's call it element of 2^{nd} kind.

Then we can make two collection of sets $X=\{\{1,2\},\{2\},\phi \}$ and $Y=\{\{3\},\phi\}$ [That is $X$ contains all the subsets made by the elements of $1^{st}$ kind following the rule and $\phi$, similarly $Y$ contains the sets made by the elements of $2^{nd}$ kind following the rule and $\phi$ ].

Now , Let $P$ be the set $\{X'\cup Y' | \text{where} X' \in X \text{ and } Y' \in Y \}$ . Then member of $P$ are the possible options for $A$.

Therefore, $|P|=$ number of possible subset $A$ [where $|| \to $ number of elements in a set]. By the Multiplication Principle

$|P|=|X|\times |Y|=3\times 2=6$.

Finally, we have got a pattern. But !!! it will be better to check it for one more simpler example then we can definitely apply it to solve the original problem. Can you verify it for $S=\{1,2,3,4\}$

See!! we are getting a really good result.

For $S=\{1,2,3,4\} $

The element(s) of $1^{st}$ kind is/are : $1,2,4$.

The element(s) of $2^{nd}$ kind is/are : $3$ .

Then $X$=$\{\{1,2,4\}$, $\{2,4\}$, $\{4\}$, $\phi\}$; $\quad Y$= $\{\{3\},\phi\}$ .

Then $|P|=|X|\times |Y|=4\times 2=8$.

Now, if you count the possible subsets $A$ , you will get the same result. [They are $\{1,2,4\}$, $\{2,4\}$, $\{4\}$, $\{3\}$, $\{1,2,3,4\}$, $\{2,3,4\}$, $\{3,4\}$, $\phi $]. So it works for $S$=$\{1,2,3,4\}$ .

Now you can easily apply the algorithm to the original problem.

Applying the algorithm for the original problem :

$S={1,2,3,\ldots,10}$..

Elements of $1^{st}$ kind : $1,2,4,8$.

Elements of $2^{nd}$ kind : $3,6$.

Elements of $3^{rd}$ kind : $5,10$.

Elements of $4^{th}$ kind : $7$

Elements of $5^{th}$ kind : $9$

Here $X$=$\{\{1,2,4,8\}$, $\{2,4,8\}$, $\{4,8\}$, $\{8\}$, $\phi\}$

$Y$=$\{\{3,6\}$, $\{6\}$, $\phi\}$

$Z$=$\{\{5,10\}$, $\{10\}$, $\phi\}$

$W=\{\{7\},\phi\}$

$U=\{\{9\},\phi\}$

$|P|=|X|\times|Y|\times|Z|\times|W|\times |U|$

$=5\times 3\times 3\times 2\times 2=180$ [ANS]

Brain_Teaser : Generalize the concept

What if the set $S=\{1,2,\ldots,n\}$ i.e., Find the number of subsets $A$ of $S$=$\{1,2,\ldots,n\}$ such that $x\in A, $ and $2x \in S \Rightarrow 2x \in A$



Subscribe to Cheenta at Youtube