A Probability Birthday problem along with Julia Programming

Probability theory is nothing but common sense reduced to calculation.

Pierre-Simon Laplace

Today we will be discussing a problem from the second chapter of A First Course in Probability(Eighth Edition) by Sheldon Ross.

Let's see what the problem says:

Describing the Problem

The problem(prob-48) says:

Given 20 people, what is the probability that among the 12 months in the year, there are 4 months containing exactly 2 birthdays and 4 containing exactly 3 birthdays?

So, what do we mean by this? Let's suppose we have $20$ people, named $p_1,p_2,\cdots , p_{20}$ and we will be representing the months by numbers like, January$\to 1$, February $\to 2$ , March $\to 3$ ,$\cdots $,December $\to 12$.

Then, One possible combination of birthday can be,

People Name Birthday Month
$p_1$$01$
$p_2$$05$
$p_3$$09$
$p_4$$04$
$p_5$$07$
$p_6$$12$
$p_7$$03$
$p_8$$09$
$p_9$$06$
$p_{10}$$11$
$p_{11}$$12$
$p_{12}$$01$
$p_{13}$ $07$
$p_{14}$ $02$
$p_{15}$ $09$
$p_{16}$ $12$
$p_{17}$ $02$
$p_{18}$ $11$
$p_{19}$ $06$
$p_{20}$ $08$

In this case, January ($01$) has $2$ birthdays, February($02$) has $2$ birthdays. Similarly we can find the number of birthday in each month. But in our question it is said that one possible arrangement exist where any four months (suppose January, February, March and April) each holds $2$ birthday (in total in this group there are $4\times 2 = 8$ birthday) and another four months (suppose June, July, September and December) each holds $3$ birthday (in total this group has $4\times 3 = 12$ birthdays). We have to find the probability of existence of such groups.

I hope the problem is clear. Let's see how to solve this.

Solution - 1

To solve this problem one must notice the most important point. This problem treats each people in same footing, means they are indistinguishable. Normally, we humans are distinguishable as you can see people's face or maybe use their name to make difference.

So, The number of all possible combinations (outcome) = $12^{20}$.

Now, for our particular case, we must group $4$ months from $12$ months where each contains $2$ birthdays. This can be done in ${12 \choose 4}$ ways. After this again we have to choose $4$ months from remaining $8$, where each contains $3$ birthdays. This can be done in ${8 \choose 4}$ ways.

So, we can choose months (or the month combination is ) in ${12 \choose 4}{8 \choose 4}$ ways. Using multinomial theorem, we can write this number equals to ${12 \choose 4 4 4}$ (we will not use this).

Now, we have $20$ people and we have to to assign months with them.

Remember we have to take $3$ people and give them one single month. This can be done in ${20 \choose 3}$ ways. Again we have to take $3$ people and group them. This time the possible ways are ${17 \choose 3}$. Again doing the same gives us ${14 \choose 3}$ ways (this is done 4 times as we have $4$ such months which contains $3$ birthdays). So, for this group the number of possible combinations is ${20 \choose 3} {17 \choose 3} {14 \choose 3} {11 \choose 3} $.

Now, we still have $4$ months where each of those contains 2 birthdays. This gives a total of $ {8 \choose 2} {6 \choose 2} {4 \choose 2} {2 \choose 2} $ ways.

So, the total numbers of ways of assigning $20$ people to those months is $$ {20 \choose 3} {17 \choose 3} {14 \choose 3 } {11 \choose 3} {8 \choose 2} {6 \choose 2} {4 \choose 2} {2 \choose 2} = \frac{20!}{3!\cdot 17!} \frac{17!}{3!\cdot 14!} \frac{14!}{3!\cdot 11!} \frac{11!}{3!\cdot 8!} \frac{8!}{2!\cdot 6!} \frac{6!}{2!\cdot 4!} \frac{4!}{2!\cdot 2!} \frac{2!}{2!\cdot 0!} = \frac{20!}{12^4} $$

So, The probability = ${{12\choose 4}{8\choose 4}\frac{20!}{12^4}\over 12^{20}} = 0.00106042009902$

Easy right? This can also be solved using a analogy of dice with $12$ sides.

Solution-2

In this method we consider a dice with $12$ sides. Each person's birthday is within 12 months and each of them are independent. So, If that dice is rolled 20 times, then what is the probability to get same number $3$ times for $4$ different number and same number $2$ times for other $4$ different number, i.e., what is the probability to get a state like $11223344666777888999$?

This can be solved in $4$ simple steps.

  1. Among $12$ faces, $8$ of them can be taken in $a={12 \choose 8} = 495$ ways.
  2. Among those $8$ faces, $4$ are groups of $2$'s and $4$ are groups of $3$'s. The number of ways = $b = {8 \choose 4}{4 \choose 4} = 70$.
  3. If we roll it $20$ times or take $20$ dices, then there are $4$ sets of $3$ and $4$ sets of $2$. In this case, the possibilities $c = \frac{20!}{(3!)^4 (2!)^4}$.
  4. The sample space's element number $d = 12^{20}$

Hence, probability $=\frac{a\times b \times c}{d} = 0.0010604200990$

Whenever I solve this type of problems, It always feel like may be !! maybe!! I have made some mistake. To be sure that we are correct, we should perform experiment with a dice or may be take many groups of $20$ people and calculate probability. But It is practically impossible!!

Not quite... We can always use programming language to do the experiments and verify.

Verifying using Julia

First we have to know few tricks.

arr = [2,6,2,10]
#Creates an array

Now, how can we know if there are two 2's inside of this array?, an easy solution will be

arr.==2
#Check if any element equal to 2
#This gives an output as:
#BitVector: [true,false,true,false]

And then we use sum

sum(arr.==2)
#return number of times 2 is there
#output: 2

Using this we write a simple code:

begin
    function prob(n::Int)
	favour = 0
	for ex_num=1:n
		months = rand(1:12,20)
		nums = [sum(months.==i) for i=1:12]
		if sum(nums.==3)==4 && sum(nums.==2)==4
			favour += 1
		end
	end
	probability = favour/n
    end
end

This function beautifully give you the value. We need more sample, so I have used $n=1,00,00,000$.

This gives the result:3

prob_10_000_000 = prob(10_000_000)
#output: 0.0010616

Beautiful isn't it?, just using few lines of code we have verified our answer.

I encourage you to play with the code.

This is all for today. I hope you have learnt something new.

Monte Carlo Method to calculate Pi

Author: Kazi Abu Rousan

Pi is not merely the ubiquitous factor in high school geometry problems; it is stitched across the whole tapestry of mathematics, not just geometry’s little corner of it.

$\pi$ is truly one of the most fascinating things exist in mathematics. It's not just there in geometry, but it's also there in pendulum, waves, quantum particles, integrations, probability and even in biology.

Today we will be seeing one of the places where it arises in probability.

The name "Monte Carlo" refers to the city in Monaco which is known for it's casinos and gambling. So, Monte Carlo is basically used as a symbol of randomness.

Monte Carlo's Method or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle.

We will be using one of those experiment to find value $\pi$.

Probability and $\pi $

Let's start from the basics.

Suppose, we have a line from $0$ to $5$ unit and we throw a dot at random. The question is: What is the probability that the dot will be on the line between $2$ to $3$ unit?

The answer is : $\frac{1}{5}$

Why?, because the probability is simply $\frac{length\ of\ favorable\ region}{length\ of\ whole\ region}$.

Now, suppose we have a circle inside a square as shown in the figure.

The circular and Square region

We now throw a ball at random inside the square. What is the probability that the ball will land inside the circle?, The answer is simply the ratio of the areas of the favorable region and the total region (here we are using the area as it is a two-dimensional region).

This simply means;

$$ P = \frac{Area\ of\ Green\ Region}{Area\ of\ Blue\ Region} $$

So, $$P = \frac{\pi R^2}{4\times R^2} = \frac{\pi}{4}\to \pi = 4\times P $$

Now, the question is how to find the value of $P$?

Well, we can just use the Experimental definition of probability. This definition simply says "The probability is equals to the ratio of number of events whose results are favorable and the total number of experiments done.

In our case, It simply means $P = \frac{No. \ of\ balls\ which\ are\ fallen\ inside\ circle}{Total\ No.\ of\ balls\ thrown}$

This exact idea can be used to write a program to find the value of $\pi $.

Python Code to find $\pi $

Let's begin our coding.

We will be taking a unit circle. Python's numpy library has many functions to generate random number, we will be using that.

import numpy as np
n = 4
x_vals = np.random.uniform(-1,1,n)
y_vals = np.random.uniform(-1,1,n)
print(x_vals,y_vals)
#Output:[0.83083661 0.40144553 0.07297935 0.21236558] [ 0.30707802 -0.42705451 -0.70296805 -0.44182543]

This block of code generates two arraies (list) containing values between $-1 to 1$. We will use each of those values from x_vals as x-coordinate of each ball and y_vals as y-coordinates of each ball. There $n$ represent number of balls.

Now, we need a function which will check if a ball is inside the circle or not. This is quite easy. We can just find $x^2+y^2$, if it's less than or equals to $1$, then it's inside the circle or else, it's outside.

def check_in_circle(x,y):
	if x**2+y**2<=1:
		return True
	else:
		return False
print(check_in_circle(0.5,0.5))
print(check_in_circle(1,1.2))
#Output: True
False

Now, we will combine all this to write a code and calculate the value of $\pi $.

import numpy as np
import matplotlib.pyplot as plt

n = 100000#also used 4, 10,100etc
x_vals = np.random.uniform(-1,1,n)
y_vals = np.random.uniform(-1,1,n)
coords = np.vstack((x_vals,y_vals)).T

def check_in_circle(coord):
	if coord[0]**2+coord[1]**2<=1:
		return True
	else:
		return False
inside_count = 0
for i in range(n):
	if check_in_circle(coords[i]):
		inside_count += 1
print("Value of pi = ",4*(inside_count/n)," for n = ",n)
#Output: Value of pi =  3.13572  for n =  100000
Value of pi =  3.0  for n =  4
Value of pi =  3.08  for n =  100
Value of pi =  3.1324  for n =  10000

As you can see, the value to $\pi $ from this method approaches it's real value as we increase $n$. Let's see a plot for few values of $n$ (along with code)

The code to generate this is here:

import numpy as np
import matplotlib.pyplot as plt

n = 100000
x_vals = np.random.uniform(-1,1,n)
y_vals = np.random.uniform(-1,1,n)
coords = np.vstack((x_vals,y_vals)).T
inside_x = []; inside_y = []
def check_in_circle(coord):
	if coord[0]**2+coord[1]**2<=1:
		return True
	else:
		return False
for i in range(n):
	if check_in_circle(coords[i]):
		inside_x.append(coords[i][0])
		inside_y.append(coords[i][1])
inside_count = len(inside_x)
pi_val = 4*inside_count/n
figure, axes = plt.subplots()
plt.scatter(x_vals,y_vals,color="BLUE",s=0.1)
plt.scatter(inside_x,inside_y,color="GREEN",s=0.1)
qa = plt.Circle((0,0),1,fill=False, color="RED", linewidth=2)
plt.xlim(-1.01,1.01)
plt.ylim(-1.01,1.01)
axes.set_aspect(1)
axes.add_artist(qa)
plt.title("n=%s and points_inside = %s, $\\pi$ = %s"%(n,inside_count,pi_val))
plt.show()

We can make a video of $\pi $ vs $n$ to see how it converges.

This is all for today. The code for the video can be found here: Code of Animation

I hope you learn something new.

ISI MStat Entrance 2021 Problems and Solutions PSA & PSB

This post contains ISI MStat Entrance PSA and PSB 2021 Problems and Solutions that can be very helpful and resourceful for your ISI MStat Preparation.

Download Paper
PSA Paper
PSB Paper
ISI MStat 2021 PSA Answer Key and Solutions

Click on the links to learn about the detailed solution. (Coming Soon)

  1. 49 (Rolle's Theorem)

2. 2 (4 - number of linear constraints)

3. k = 2 (a = -d, and form a biquadratic which has two real solutions)

4. 0 (divide by $x^4$, use $\frac{sinx}{x}$ limit result)

5. $\frac{p}{q}$ must be a rational number. (The product must be a rational number.)

6. $\alpha = 1, \beta =1$ (Use sandwich theorem on an easy inequality on ceiling of x)

7. $\frac{2n}{n+1}$ (Use geometry and definite integration)

8. $2+ \sqrt{5}$ (Just write down the pythagoras theorem in terms of the variables and solve)

9. 10 (Use the roots of unity)

10. $\frac{3}{8}$ (Find out the cases when it is non zero, and use classical probability)

11. $\frac{(n+1)^n}{n!}$ (Use ${{n} \choose {r}}={{n-1} \choose {r-1}}+{{n-1} \choose r}$)

12. $P(\pi)$ is even for all $\pi$. (Observe that there is one more odd than number of evens, so there will be one odd-odd match)

13. is equal to 12. (The $i,j$th element is $a_{ii}b{ij}c{jj}$. Use gp series then.)

14. 160 (Use the fact any permutation can be written as compositions of transpositions. Observe that the given condition is equivalent to that 2 transpositions are not possible)

15. $m_t < \infty$ for all $t \geq 0$ (All monotone functions are bounded on [a,b])

16.$H(x) = \frac{1-F(-x)+ F(x)}{2}$ (If $F(x)$ is right continuous, $F(-x)$ is left continuous.).

17. $\frac{1}{25}$ (Use the distribution function of $\frac{X}{Y}$)

18. 3 (Find the distribution of order statistic, and find the expectation)

19. (II) but not (I) (If $F(x)$ is right continuous, $F(-x)$ is left continuous.).

20. $20\lambda^4$ (Use gamma integral to find the $E(X_{1}^4)$.)

21. The two new observations are 15 and 5. (Use the condition to find two linear equations to find the observations).

22. It is less than 2. (Use the beta coefficients in terms of sample covariance and sample variance, and compare)

23. 4:3 (Use Bayes' Theorem)

24. The two-sample t-test statistic and the ANOVA statistics yield the same power for any non-zero value of $\mu_1 - \mu_2$ and for any $n,m$. (Both the test statistic are one to one function of one another)

25. t³-1 - 2(t-1)

26. $\frac{2 \sum_{i=1}^{n} X_i}{n(n+1)}$ (Use the invariance property of MLE)

27. $Y_1^2 + Y_2^2 + Y_1Y_2$ (Write the bivariate normal distribution in terms of $Y_1, Y_2$ and use Neyman Factorization Theorem.)

28. can be negative (Simson's Paradox)

29. $2z$ (There are three random variables, $N$ = stopping time to get $Y=1$, $Y$ and $X$. Use the conditioning properly. Take your time)

30. $\frac{40}{3}$ (Use the property that Poisson | Poisson in the given problem follows Binomial)


ISI MStat 2021 PSB Solutions
Coming soon.

ISI MStat PSB 2021 Problem 1

Solution

ISI MStat PSB 2021 Problem 2

Solution

ISI MStat PSB 2021 Problem 3

Solution

ISI MStat PSB 2021 Problem 4

Solution

ISI MStat PSB 2021 Problem 5

Solution

ISI MStat PSB 2021 Problem 6

Solution

ISI MStat PSB 2021 Problem 7

Solution

ISI MStat PSB 2021 Problem 8

Solution

ISI MStat PSB 2021 Problem 9

Solution

Please suggest changes in the comment section.

Cheena Statistics Logo
Cheenta Statistics Department
ISI MStat and IIT JAM Training Program

Can Two or more Events be Exhaustive and Independent?

Can Two or more Events be Exhaustive and Independent?

Two Event Problem Formulation

Let $S$ be the sample space.

Let $A,B \subset S$ be two strict events with probability $p,q$ respectively.

The conditions are

Do there exist such $A,B$? If yes, how do they look like?

Hints

Hint 1

Using both the conditions and the identity $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ to an equation in $p,q$.

Hint 2

Solve for $p,q$ to get that $p=1$ or $q = 1$. But, $A,B$ are strict events. Hence, not possible.

Food For Thoughts

Full Solution (Food for Thoughts)

Is Multivariate Limit = Iterated Limit? Multivariate Limit Demystified

Is Multivariate Limit equal to Iterated Limit?

The multivariate limit is really akin to the univariate limit. But, how can we explain that?

However, We discuss the following aspects in this regard.

📌 Firstly, we discuss the ideas of proving and disprove Univariate Limits.
📌 Then, come Multivariate Limits - How to prove and disprove?
📌 Thereafter, Iterated Limits appear - Understanding and Geometry.
📌 Hence, we discover Relationship between Multivariate Limits and Iterated Limits.
📌 We end with Food for Thought.

Iterated Limits are a bypass. Do they really explain the Multivariate Limit?

We discover a rich relationship between the two. We give all the cases possible between multivariate limits and iterated limits.

Hints, Solution, and More

Enjoy the video

Build your foundations.

Ace your Exams.

Learn. Enjoy. Practice. Repeat.

Some Useful Links:

Stay Tuned!

When Maximum Likelihood = Method of Moments?

Is Maximum Likelihood = Method of Moments?

Maximum Likelihood Estimation is an algorithm to find a reasonable estimator. Personally, it really woos my mind - simple and yet so beautiful. Method of Moments is simpler. It doesn't woo me :p. However, still, they have a lot of similarities. Thus, we have set off to explore them. Finally, we ask for a lot of food for thought. After all, we are all explorers at heart.

We ask "Is MLE = MOM? If not, when?"

We discover a rich relationship between the two. We discover the score function and so much more exciting.

Hints, Solution, and More

Enjoy the video

Build your foundations.

Ace your Exams.

Learn. Enjoy. Practice. Repeat.

Some Useful Links:

ISI MStat 2020 PSB Problem 9 | Discussion & Solution

ISI MStat 2020 PSB Problem 9

This post discuses the problem 9 of the ISI MStat 2020 PSB Entrance Exam.

A finite population has \(N\) units, with \(x_{i}\) being the value associated with the \(i^{\text {th }}\) unit, \(i=1,2, \ldots, N\). Let \(\bar{x}{N}\) be the population mean.

A statistician carries out the following experiment.

Step 1: Draw a SRSWOR of size \(n({1}\) and denote the sample mean by \(\bar{X}{n}\).

Step 2: Draw a SRSWR of size \(m\) from \(S{1}\). The \(x\) -values of the sampled units are denoted by {\(Y_{1}, \cdots, Y_{m} \)}.

Hints, Solution, and More

Do subscribe to our channel to get instant notification of Live Session, so that you can join us live in the master classes!

Build your foundations.

Ace your Exams.

Learn. Enjoy. Practice. Repeat.

Some Useful Links:

Is MLE always a function of a Sufficient Statistic?

Is MLE always a function of a Sufficient Statistic?

MLE is an algorithm to find a reasonable estimator (personally, it really woos my mind - simple and yet so beautiful.).

Now, well - Life is hard. People has devised ways to check if an esimator is good or not - why will they care I like it or not.

So, they have developed Small Sample Properties and Large Sample Properties to do the quality control of MLE.

This post tests the flamboyancy of MLE is terms of the idea of "Sufficiency".

We ask "Is MLE sufficient? How is MLE and Sufficiency related?"

We discover a rich relationship between the two. Again, MLE wins my heart. Does it win yours? Check with the hints and the solution.

Hints, Solution, and More

Do subscribe to our channel to get instant notification of Live Session, so that you can join us live in the master classes!

Build your foundations.

Ace your Exams.

Learn. Enjoy. Practice. Repeat.

Some Useful Links:

ISI MStat 2020 PSB Problem 6 Problem & Solution

ISI MStat 2020 PSB Problem 6

This post discuses the problem 6 of the ISI MStat 2020 PSB Entrance Exam.

Suppose individuals are classified into three categories C1,C2 and C3.

Let p2,(1p)2 and 2p(1p) be the respective population proportions, where p∈(0,1). A random sample of N individuals is selected from the population and the category of each selected individual recorded.

For i=1,2,3, let Xi denote the number of individuals in the sample belonging to category Ci. Define U=X1+X32.

Hints, Solution, and More

Do subscribe to our channel to get instant notification of Live Session, so that you can join us live in the master classes!

Build your foundations.

Ace your Exams.

Learn. Enjoy. Practice. Repeat.

Some Useful Links:

Pigeonhole Principle

“The Pigeonhole principle” ~ Students who have never heard may think that it is a joke. The pigeonhole principle is one of the simplest but most useful ideas in mathematics. Let’s learn the Pigeonhole Principle with some applications.

Pigeonhole Principle Definition:

In Discrete Mathematics, the pigeonhole principle states that if we must put $N + 1$ or more pigeons into N Pigeon Holes, then some pigeonholes must contain two or more pigeons.

Example:

If $Kn+ 1$ (where k is a positive integer) pigeons are distributed among n holes than some hole contains at least $k + 1$ pigeons.

Applications of Pigeonhole Principle:

This principle is applicable in many fields like Number Theory, Probability, Algorithms, Geometry, etc.

Problems and Solutions:

Problem 1

A bag contains beads of two colours: black and white. What is the smallest number of beads which must be drawn from the bag, without looking so that among these beads, two are of the same colour?

Solution: We can draw three beads from bags. If there were no more than one bead of each colour among these, then there would be no more than two beads altogether. This is obvious and contradicts the fact that we have chosen their beads. On the other hand, it is clear that choosing two beads is not enough. Here the beads play the role of pigeons, and the colours (black and white) play the role of pigeonhole.

Problem 2

Find the minimum number of students in a class such that three of them are born in the same month?

Solution: Number of month $n =12$

According to the given condition,

$K+1 = 3$

$K = 2$

$M = kn +1 = 2×12 + 1 = 25$.

Problem 3

Show that from any three integers, one can always choose two so that $a^3$b – a$b^3$ is divisible by 10.

Solution: We can factories the term $a^3$b – a$b^3$ = $ab(a + b)(a - b)$, which is always even, irrespective of the pair of integers we choose.

If one of three integers from the above factors is in the form of 5k, which is a multiple of 5, then our result is proved.

If none of the integers is a multiple of 5 then the chosen integers should be in the form of $(5k)+-(1)$ and $(5k)+-(2)$ respectively.

Clearly, two of these three numbers in the above factors from the given expression should lie in one of the above two, which follows by the virtue of this principle.

These two integers are the ones such that their sum and difference is always divisible by 5. Hence, our result is proved.

Problem 4

If n is a positive integer not divisible by 2 or 5 then n has a multiple made up of 1's.

Problem 5

Let $X \subseteq{1,2, \ldots, 99}$ and $|X|=10$. Show that it is possible to select two disjoint nonempty proper subsets $Y, Z$ of $X$ such that $\sum(y \mid y \in Y)=\sum(z \mid z \in Z)$.

Problem 6

Let $A_{1} B_{1} C_{1} D_{1} E_{1}$ be a regular pentagon. For $2 \leq n \leq 11$,
let $A_{n} B_{n} C_{n} D_{n} E_{n}$ be the pentagon whose vertices are the midpoints of the sides of the pentagon $A_{n-1} B_{n-1} C_{n-1} D_{n-1} E_{n-1}$. All the 5 vertices of each of the 11 pentagons are arbitrarily coloured red or blue. Prove that four points among these 55 points have the same colour and form the vertices of a cyclic quadrilateral.

Some Useful links: