Bayes comes to the rescue | ISI MStat PSB 2007 Problem 7

This is a very beautiful sample problem from ISI MStat PSB 2007 Problem 7. It's a simple problem that relies heavily on conditioning, and if you don't take it seriously, you will make things complicated. Fun to think about, so go for it !!

Problem- ISI MStat PSB 2007 Problem 7


Let \(X\) and \(Y\) be i.i.d. exponentially distributed random variables with mean \(\lambda >0 \). Define \(Z\) by :

\( Z = \begin{cases} 1 & \text{if } X < Y \\ 0 & \text{otherwise} \end{cases} \)

Find the conditional mean, \( E(X|Z=1) \).

Prerequisites


Conditional Distribution

Bayes Theorem

Exponential Distribution

Solution:

This is a very simple but elegant problem that illustrates a unique and efficient technique for solving a class of problems which may seem analytically difficult.

Here, for \(X\), \(Y\) and \(Z\) as defined in the question, let us first figure out what we need.

Sometimes, breaking a seemingly complex problem into simpler sub-problems makes the way towards the final solution easier. In this problem, the sub-problems that I think will help us are: "What is the value of \(P(X<Y)\) (or equivalently \(P(Z=1)\))?", "What is the pdf of \(X|X<Y\) (or equivalently \(X|Z=1\))?", and finally "What is the conditional mean \(E(X|Z=1)\)?". We will attack these questions one by one.

For the very first question, "What is the value of \(P(X<Y)\) (or equivalently \(P(Z=1)\))?", the answer is relatively simple, and I leave it as an exercise !! The probability one finds, if done correctly, is \( \frac{1}{2}\). Verify it, and only then move forward !!
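If you want to sanity-check your answer before moving on, here is a quick Monte Carlo sketch in Python (the function name and the choice \(\lambda = 2\) are my own, purely illustrative); the estimate should hover around \(\frac{1}{2}\):

```python
import random

def prob_x_less_y(lam=2.0, n=100_000, seed=0):
    """Monte Carlo estimate of P(X < Y) for i.i.d. Exponential X, Y with mean lam."""
    rng = random.Random(seed)
    # random.expovariate takes the *rate* 1/lam, so each draw has mean lam
    hits = sum(rng.expovariate(1 / lam) < rng.expovariate(1 / lam) for _ in range(n))
    return hits / n

print(prob_x_less_y())  # should hover around 0.5
```

By symmetry the answer does not depend on \(\lambda\), which you can confirm by changing `lam`.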

The second question is the most vital and beautiful part of the problem. We generally do this kind of problem using the general definition of conditional probability, which you can certainly try, but you will face some difficulties that can be easily avoided by using the continuous form of Bayes' rule, which we are not often encouraged to use !! I don't really know why, though !

Let us find the conditional CDF of \(X|Z=1\):

\( P(X \le x \mid Z=1) = \int^x_0 f_{X|X<Y}(t)\, dt, \qquad x>0 \)

where \( f_{X|X<Y}(x)\) is the conditional pdf we are interested in. Now we can use Bayes' rule on \(f_{X|X<Y}(x)\); we have

\( f_{X|X<Y}(x) = \frac{P(Z=1 \mid X=x)\,f_X(x)}{P(Z=1)} = \frac{P(Y>x)\,f_X(x)}{P(X<Y)} = \frac{e^{-\frac{x}{\lambda}} \cdot \frac{1}{\lambda} e^{-\frac{x}{\lambda}}}{\frac{1}{2}} = \frac{2}{\lambda}e^{-\frac{2x}{\lambda}} \)

Plugging this into the form of the CDF, we can easily verify that \(X|Z=1\) is exponentially distributed with mean \(\frac{\lambda}{2}\). (We can't conclude this directly from the pdf alone, because pdfs are not unique. Can you give such an example? Think about it!)

So, now that we have successfully answered the first two questions, it is easy to answer the last and final one: since \(X|Z=1\) is exponential with mean \(\frac{\lambda}{2}\), its conditional mean is

\(E(X|Z=1)=\frac{\lambda}{2}.\)

Hence the solution concludes.
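As a sanity check on the whole derivation (not part of the formal solution), a short simulation in Python, with function name and parameters chosen by me for illustration, estimates \(E(X \mid Z=1)\) directly and should land near \(\frac{\lambda}{2}\):

```python
import random

def cond_mean_given_z1(lam=2.0, n=200_000, seed=1):
    """Monte Carlo estimate of E(X | Z = 1), i.e. E(X | X < Y)."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        x = rng.expovariate(1 / lam)  # Exponential with mean lam
        y = rng.expovariate(1 / lam)
        if x < y:  # the event Z = 1
            total += x
            count += 1
    return total / count

# With lam = 2 the derivation predicts E(X | Z = 1) = lam / 2 = 1
print(cond_mean_given_z1())
```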


Food For Thought

Let us pose an interesting problem before concluding.

There are \(k+1\) machines in a shop, all engaged in the mass production of an item. The \(i\)th machine produces defectives with probability \(\frac{i}{k}\), \(i=0,1,2,\ldots,k\). A machine is selected at random and the items it produces are then repeatedly sampled. If the first \(n\) products are all defective, show that the conditional probability that the \((n+1)\)th sampled product is also defective is approximately equal to \(\frac{n+1}{n+2}\) when \(k\) is large.

Can you show it? Give it a try !!
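If you would like to see the claimed limit numerically before proving it: by Bayes' theorem the conditional probability works out to a ratio of two sums over the machines (the uniform prior \(\frac{1}{k+1}\) cancels), and a direct computation, sketched below in Python with a function name of my own, shows it approaching \(\frac{n+1}{n+2}\) as \(k\) grows:

```python
def cond_defective_prob(n, k):
    """P((n+1)th item defective | first n all defective), machine i defect prob i/k.
    By Bayes' theorem this equals sum_i (i/k)^(n+1) / sum_i (i/k)^n."""
    num = sum((i / k) ** (n + 1) for i in range(k + 1))
    den = sum((i / k) ** n for i in range(k + 1))
    return num / den

for k in (10, 100, 10_000):
    print(k, cond_defective_prob(5, k))  # should approach 6/7 as k grows
```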


Similar Problems and Solutions



ISI MStat PSB 2008 Problem 10
Outstanding Statistics Program with Applications

Subscribe to Cheenta at Youtube


ISI MStat PSB 2004 Problem 1 | Games and Probability

This is a very beautiful sample problem from ISI MStat PSB 2004 Problem 1. Games are among the best ways to understand the role of chance in life; solving this kind of problem always pushes me to think more and more about the uncertainties associated with the system. Think it over !!

Problem- ISI MStat PSB 2004 Problem 1


Suppose two teams play a series of games, each producing a winner and a loser, until one team has won two more games than the other. Let G be the number of games played. Assume each team has a chance of 0.5 of winning each game, independent of the results of the previous games.

(a) Find the probability distribution of G.

(b) Find the expected value of G.

Prerequisites


Naive Probability

Counting principles

Geometric distribution

Conditional expectation

Solution:

While solving this kind of problem, the first thing we should do is observe the things that remain invariant.

Here, observe that the game always terminates with consecutive wins by one team. Imagine two teams \(T_1\) and \(T_2\). If the first two matches are won by the same team we are done; but say \(T_1\) wins the first match and \(T_2\) wins the second. Then it is a draw, the two matches played are effectively lost, and we must start the counting afresh.

So, can I claim that G (as defined in the question) will always be even?! Verify this claim yourself!

So, consider the event G=g, where g is even. If the game terminates at the gth game, then it is evident from the logic we established above that both the (g-1)th and the gth games were won by the winning team. So, among the first (g-2) games, both teams won an equal number of games and ended in a draw. The teams can be at a draw after the (g-2)th game in \( 2^{\frac{g-2}{2}} \) ways, and the last two matches can be won by either of the two teams, i.e. in 2 ways. Finally, the g matches can result in \(2^g\) different arrangements of wins and losses (from the perspective of either team).

(a) So, \( P(G=g) = \frac{2 \cdot 2^{\frac{g-2}{2}}}{2^g} = \frac{1}{2^{\frac{g}{2}}} \); \( g=2,4,6,\ldots \)

Hence the distribution of G. Hold on! Is it looking like a geometric distribution somehow?? Find out!!
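The derived pmf can be checked empirically. Here is a minimal simulation sketch in Python (seed, sample size, and names are arbitrary choices of mine) comparing the observed frequencies of \(G=2\) and \(G=4\) against \(\frac{1}{2}\) and \(\frac{1}{4}\):

```python
import random

def sample_g(rng):
    """Play fair games until one team leads by 2; return the number of games G."""
    lead = games = 0
    while abs(lead) < 2:
        lead += 1 if rng.random() < 0.5 else -1
        games += 1
    return games

rng = random.Random(2)
draws = [sample_g(rng) for _ in range(100_000)]
# Compare empirical frequencies with the derived pmf 1/2^(g/2)
print(draws.count(2) / len(draws))  # pmf predicts 1/2
print(draws.count(4) / len(draws))  # pmf predicts 1/4
```

Note that every sampled value of G is even, in line with the claim above.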

(b) To find the expectation of G, we could use the conventional definition of expectation: since the distribution of G is (somewhat) geometric, to be precise \( \frac{G}{2} \sim \text{Geo}(0.5) \), the expectation is clearly 4. But I will take this chance to show another beautiful and elegant method, using conditional expectation: we first condition on the winner of the first match and develop a recursion (which I'm obsessed with). One may not find this method useful in this problem, since the distribution of G is already known, but life is not only about this problem, is it? What if the distribution is not known, but the pattern is visible, if only you are ready to see it? Let's proceed.

Suppose, without loss of generality, that \(T_1\) wins the first game: with probability 0.5, one game is gone and an expected E(G|\(T_1\) is leading by 1 game) further games are to be played. Similarly, if \(T_2\) wins the first game, then with probability 0.5 one game is gone and an expected E(G|\(T_2\) is leading by 1 game) further games are to be played.

So, if we write the above words mathematically, it looks like

E(G) = P(\(T_1\) wins the 1st game)(1 + E(G|\(T_1\) is leading by 1 game)) + P(\(T_2\) wins the 1st game)(1 + E(G|\(T_2\) is leading by 1 game)) ..........(*)

So, now we only need to find E(G|\(T_1\) is leading by 1 game); the other term is the same by symmetry!

So, consider the expected number of games still to be played when we know that \(T_1\) is leading by 1 game: the next game can be a win for \(T_1\) with probability 0.5, OR \(T_1\) can lose the game with probability 0.5 and bring the score back to a draw. From there, all the games played so far are wasted and we start counting afresh (loss of memory, remember !!), so 1 game is lost and an expected E(G) further games are to follow before the series terminates. Mathematically,

E(G|\(T_1\) is leading by 1 game) = P(\(T_1\) wins again) x 1 + P(\(T_1\) loses and it is a draw)(1 + E(G)) = 0.5 + 0.5(1 + E(G)) = 1 + (0.5)E(G).

Plugging this into (*), one obtains a recursion in E(G) and calculates E(G) = 4. So, on average, the series terminates after 4 matches.
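The recursion from (*) can also be solved numerically by fixed-point iteration, which is overkill here but illustrates the pattern; a tiny Python sketch (entirely illustrative, names mine):

```python
def solve_recursion(tol=1e-12):
    """Fixed-point iteration for E(G) = 1 + E(G | leading by 1) = 1 + (1 + 0.5*E(G))."""
    e = 0.0
    while True:
        nxt = 2.0 + 0.5 * e  # one step of the recursion E <- 2 + 0.5*E
        if abs(nxt - e) < tol:
            return nxt
        e = nxt

print(solve_recursion())  # converges to 4
```

The iteration converges because the map \(e \mapsto 2 + 0.5e\) is a contraction.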


Food For Thought

Can you generalize the above problem to the case where the chance of winning each match, for one of the two teams, is some p, \(0<p<1\)? Try it!
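Once you derive a candidate closed form for E(G) in terms of p, a simulation harness like the sketch below (parameters and names are my own choices) lets you test it against empirical averages:

```python
import random

def expected_games_p(p, n=100_000, seed=4):
    """Simulated E(G) when one fixed team wins each game with probability p."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        lead = games = 0
        while abs(lead) < 2:
            lead += 1 if rng.random() < p else -1
            games += 1
        total += games
    return total / n

print(expected_games_p(0.5))  # the fair case, close to 4
print(expected_games_p(0.7))  # compare with your closed form
```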

Wait!! Before leaving, let's toss some coins.

You toss a fair coin repeatedly. What is the expected number of tosses you need to perform to see the pattern HH (heads, heads) for the first time? What about the expected number of tosses when your pattern of interest is TH (or HT)?? For which pattern do you think you need to wait longer?? Does your intuition corroborate the mathematical conclusion?? If not, what do you think misleads your intuition??
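Before trusting intuition, you can measure it. The Python sketch below (names and parameters mine) estimates both waiting times by brute force; run it and compare against your guess before doing the math:

```python
import random

def expected_wait(pattern, n=100_000, seed=5):
    """Average number of fair-coin tosses until `pattern` first appears."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        history = ""
        while not history.endswith(pattern):
            history += "H" if rng.random() < 0.5 else "T"
        total += len(history)
    return total / n

# Run these and compare with your intuition before doing the math
print(expected_wait("HH"))
print(expected_wait("HT"))
```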

Think it over and over !! You are dealing with one of the most beautiful perspectives on uncertainty !!

