A Probability Birthday problem along with Julia Programming

Probability theory is nothing but common sense reduced to calculation.
Pierre-Simon Laplace

Today we will be discussing a problem from the second chapter of A First Course in Probability(Eighth Edition) by Sheldon Ross.

Let's see what the problem says:

Describing the Problem

The problem(prob-48) says:

Given 20 people, what is the probability that among the 12 months in the year, there are 4 months containing exactly 2 birthdays and 4 containing exactly 3 birthdays?

So, what do we mean by this? Let's suppose we have $20$ people, named $p_1,p_2,\cdots , p_{20}$ and we will be representing the months by numbers like, January$\to 1$, February $\to 2$ , March $\to 3$ ,$\cdots $,December $\to 12$.

Then, One possible combination of birthday can be,

People Name	Birthday Month
$p_1$	$01$
$p_2$	$05$
$p_3$	$09$
$p_4$	$04$
$p_5$	$07$
$p_6$	$12$
$p_7$	$03$
$p_8$	$09$
$p_9$	$06$
$p_{10}$	$11$
$p_{11}$	$12$
$p_{12}$	$01$
$p_{13}$	$07$
$p_{14}$	$02$
$p_{15}$	$09$
$p_{16}$	$12$
$p_{17}$	$02$
$p_{18}$	$11$
$p_{19}$	$06$
$p_{20}$	$08$

In this case, January ($01$) has $2$ birthdays, February($02$) has $2$ birthdays. Similarly we can find the number of birthday in each month. But in our question it is said that one possible arrangement exist where any four months (suppose January, February, March and April) each holds $2$ birthday (in total in this group there are $4\times 2 = 8$ birthday) and another four months (suppose June, July, September and December) each holds $3$ birthday (in total this group has $4\times 3 = 12$ birthdays). We have to find the probability of existence of such groups.

I hope the problem is clear. Let's see how to solve this.

Solution - 1

To solve this problem one must notice the most important point. This problem treats each people in same footing, means they are indistinguishable. Normally, we humans are distinguishable as you can see people's face or maybe use their name to make difference.

So, The number of all possible combinations (outcome) = $12^{20}$.

Now, for our particular case, we must group $4$ months from $12$ months where each contains $2$ birthdays. This can be done in ${12 \choose 4}$ ways. After this again we have to choose $4$ months from remaining $8$, where each contains $3$ birthdays. This can be done in ${8 \choose 4}$ ways.

So, we can choose months (or the month combination is ) in ${12 \choose 4}{8 \choose 4}$ ways. Using multinomial theorem, we can write this number equals to ${12 \choose 4 4 4}$ (we will not use this).

Now, we have $20$ people and we have to to assign months with them.

Remember we have to take $3$ people and give them one single month. This can be done in ${20 \choose 3}$ ways. Again we have to take $3$ people and group them. This time the possible ways are ${17 \choose 3}$. Again doing the same gives us ${14 \choose 3}$ ways (this is done 4 times as we have $4$ such months which contains $3$ birthdays). So, for this group the number of possible combinations is ${20 \choose 3} {17 \choose 3} {14 \choose 3} {11 \choose 3} $.

Now, we still have $4$ months where each of those contains 2 birthdays. This gives a total of $ {8 \choose 2} {6 \choose 2} {4 \choose 2} {2 \choose 2} $ ways.

So, the total numbers of ways of assigning $20$ people to those months is $$ {20 \choose 3} {17 \choose 3} {14 \choose 3 } {11 \choose 3} {8 \choose 2} {6 \choose 2} {4 \choose 2} {2 \choose 2} = \frac{20!}{3!\cdot 17!} \frac{17!}{3!\cdot 14!} \frac{14!}{3!\cdot 11!} \frac{11!}{3!\cdot 8!} \frac{8!}{2!\cdot 6!} \frac{6!}{2!\cdot 4!} \frac{4!}{2!\cdot 2!} \frac{2!}{2!\cdot 0!} = \frac{20!}{12^4} $$

So, The probability = ${{12\choose 4}{8\choose 4}\frac{20!}{12^4}\over 12^{20}} = 0.00106042009902$

Easy right? This can also be solved using a analogy of dice with $12$ sides.

Solution-2

In this method we consider a dice with $12$ sides. Each person's birthday is within 12 months and each of them are independent. So, If that dice is rolled 20 times, then what is the probability to get same number $3$ times for $4$ different number and same number $2$ times for other $4$ different number, i.e., what is the probability to get a state like $11223344666777888999$?

This can be solved in $4$ simple steps.

Among $12$ faces, $8$ of them can be taken in $a={12 \choose 8} = 495$ ways.
Among those $8$ faces, $4$ are groups of $2$'s and $4$ are groups of $3$'s. The number of ways = $b = {8 \choose 4}{4 \choose 4} = 70$.
If we roll it $20$ times or take $20$ dices, then there are $4$ sets of $3$ and $4$ sets of $2$. In this case, the possibilities $c = \frac{20!}{(3!)^4 (2!)^4}$.
The sample space's element number $d = 12^{20}$

Hence, probability $=\frac{a\times b \times c}{d} = 0.0010604200990$

Whenever I solve this type of problems, It always feel like may be !! maybe!! I have made some mistake. To be sure that we are correct, we should perform experiment with a dice or may be take many groups of $20$ people and calculate probability. But It is practically impossible!!

Not quite... We can always use programming language to do the experiments and verify.

Verifying using Julia

First we have to know few tricks.

arr = [2,6,2,10]
#Creates an array

Now, how can we know if there are two 2's inside of this array?, an easy solution will be

arr.==2
#Check if any element equal to 2
#This gives an output as:
#BitVector: [true,false,true,false]

And then we use sum

sum(arr.==2)
#return number of times 2 is there
#output: 2

Using this we write a simple code:

begin
    function prob(n::Int)
	favour = 0
	for ex_num=1:n
		months = rand(1:12,20)
		nums = [sum(months.==i) for i=1:12]
		if sum(nums.==3)==4 && sum(nums.==2)==4
			favour += 1
		end
	end
	probability = favour/n
    end
end

This function beautifully give you the value. We need more sample, so I have used $n=1,00,00,000$.

This gives the result:3

prob_10_000_000 = prob(10_000_000)
#output: 0.0010616

Beautiful isn't it?, just using few lines of code we have verified our answer.

I encourage you to play with the code.

This is all for today. I hope you have learnt something new.

ISI B.Math objective 2006 problem -2 Number theory (Euler phi function)

PROBLEM

Let $p$ be an odd prime.Then the number of positive integers less than $2p$ and relatively prime to $2p$ is:

(A)$p-2$ (B) $\frac{p+1}{2} $(C) $p-1$(D)$p+1$

SOLUTION

This is a number theoretic problem .We can solve this problem in 2 different methods. Let us see them both one by one

Method -1

Let us look at the prime factorization of $2p$ it is

$2p=2\cdot p $

Note that there are $2p-1$ numbers that are less than $2p$

$2p$ is an even number so it has a common divisor with each of the even numbers that are smaller than $2p$ i.e. the numbers in the following set

${2,4,6,\dots(2p-2)}$

So we can disregard these $(p-1)$ numbers

Next note that $p$ is an odd prime number so the only odd number smaller than $2p$ that can have a common divisor with $2p$ is $p$ so we have to disregard this number too

Taking all these into account we come to conclusion that the no of positive integers less than $2p$ and relatively prime to $2p$ is $(2p-1)-(p-1)-1=p-1$

Method-2

We can also use Euler totient function or phi function to solve this problem

We know that Euler phi function is multiplicative i.e $\phi(m\cdot n)=\phi(m)\cdot \phi(n)$ for any positive integers $m$ and $n$

So We can write $\phi(2 \cdot p)=\phi(2)\phi(p)$

Now $\phi(p)$ is the no of positive integers that are less than $p$ and are relatively prime to to $p$ .

As $p$ is a prime no ,so this number is equal to $p-1$

similarly $\phi(2)=1$

As $\phi$ is multiplicative so we get $\phi(2p)=1\cdot(p-1)=p-1$

Pi calculating from Mandelbrot Set using Julia

There should be no such thing as boring mathematics.
Edsger W. Dijkstra

In one of our previous post, we have discussed on Mandelbrot Set. That set is one of the most beautiful piece of art and mystery. At the end of that post, I have said that we can calculate the value of $\pi $ using Mandelbrot set. So, today we will be doing exactly that.

Today our main goal is to write a Julia programming language. Well, I am using Julia as I am learning this new language and it's really great and also, "Mandelbrot" and "Julia" set are related. So, why not use Julia language.

It all started in $1991$. On that year, Dave Boll discovered a surprising occurrence of the number $\pi $ while exploring a seemingly unrelated property of the Mandelbrot set. So, let's see.

Discovery

It all started when Dave Boll was trying to see if the neck of the mandelbrot set ($c=-0.75+i\cdot 0$) is infinitely thin or not. We was trying to find this by taking a number $\epsilon $, which is as small as we want (in simple terms $\epsilon \to 0$) and apply the recurrence relation $z = z^2 + c$ where $c$ is taken as $c = -0.75 + i\cdot \epsilon$.

Now, we see the number of steps it needs to go outside of the mandelbrot set.

$\epsilon $	No if Iteration needed to go out
1.0	3
0.1	33
0.01	315
0.001	3143
0.0001	31417
0.00001	314160
0.000001	3141593

WTF

What the !!... It was exactly the reaction of Boll, when he first saw this. Somehow the digits of $\pi $ was generated.

Let's take an example to see what is happening. Let's start with $c = -0.75 + \epsilon i = -0.75 + i $. Now, we apply the algorithm of Mandelbrot set. This is shown in the image below.

This picture clearly shows the algorithm. After taking the $c$ value, we apply the iteration. That iteration has $3$ values of $z$ ($z_0$,$z_1$,$z_2$) before the value of $z$ goes outside of the mandelbrot set. And as you have guessed the number is the first digit of $\pi $.

As we decrease the number $\epsilon $, the number of steps needed by the $z$ to go outside the mandelbrot set increases as seen in the table. In the limit $\epsilon \to 0$, the number needed will generate the exact value of $\pi $ (This value actually goes to infinity).

It doesn't just work for $-0.75 + \epsilon i$ but also works for $(0.25 + \epsilon )+ 0i $, $-0.75 - i\epsilon $, $\cdots $

The same for $0.25+\epsilon +0i$ can be visualized using a parabola. This can also be compared with a reflective system formed by a parabolic and straight mirror. The number of reflection will generate the value of $\pi $.

The proof of this a bit technical. If you want a proof read this - Proof of this Phenomenon.

Programming using Julia

The code is very simple to write. Remember in Julia the imaginary $i$ is represented by $im$.

The code for the case $\epsilon = 0.1$ is shown below.

As you can see it's so easy and it is giving me $33$. And if you have noticed... in Julia we can use symbols like "$\epsilon $".

Now, what we can do.. we can make a function from this and let's also use a function called setprecision to get high precision to decimal values to get many digits of $\pi $. We will also be using BigInt and BigFloat to handle high precision.

So, The final code is:

This function takes $n$, which is the number of digits you want in $\pi$. Now, let's see some of it's output.

Notice, the number of digits are correct up to $n-1$ places.

Beautiful!!.. Isn't it?

We can visually see the path like shown below:

This is all for today. I hope you learnt something new and have grown a bit.

Partition Numbers and a code to generate one in Python

Author: Kazi Abu Rousan

The pure mathematician, like the musician, is a free creator of his world of ordered beauty.
Bertrand Russell

Today we will be discussing one of the most fascinating idea of number theory, which is very simple to understand but very complex to get into. Today we will see how to find the Partition Number of any positive integer $n$.

Like every blog, our main focus will be to write a program to find the partition using python. But today we will use a very special thing, we will be using Math Inspector - A visual programming environment. It uses normal python code, but it can show you visual (block representation) of the code using something called Node Editor. Here is a video to show it's feature.

Beautiful isn't it? Let's start our discussion.

What are Partition Numbers?

In number theory, a Partition of a positive integer $n$, also called an Integer Partition, is a way of writing $n$ as a sum of positive integers (two sums that differ only in the order of their summands are considered the same partition, i.e., $2+4 = 4+ 2$). The partition number of $n$ is written as $p(n)$

As an example let's take the number $5$. It's partitions are;

$$5 = 5 +0\ \ \ \ \\= 4 + 1\ \ \ \ \\ = 3 + 2\ \ \ \ \\= 3 + 1+ 1\ \ \\= 2 +2 + 1\ \ \\= 2+1+1+1 \ \\= 1+1+1+1+1$$

Hence, $p(5) = 7$. Easy right?, Let's see partitions of all numbers from $0$ to $6$.

Note: The partition number of $0$ is taken as $1$.

See how easy it is not understand the meaning of this and you can very easily find the partitions of small numbers by hand. But what about big numbers?, like $p(200)$?

Finding the Partition Number

There is no simple formula for Partition Number. We normally use recursion but if you want just a formula which generate partition number of any integer $n$, then you have to use this horrifying formula:

Scary!!!-- Try using this to find $p(2)$

Just looking at this can give you nightmare. So, are there any other method?, well yes.

Although, there are not any closed form expression for the partition function up until now, but it has both asymptotic expansions(Ramanujan's work) that accurately approximate it and recurrence relations(euler's work) by which it can be calculated exactly.

Generating Function

One method can be to use a generating function. In simple terms we want a polynomial whose coefficient of $x^n$ will give us $p(n)$. It is actually quite easy to find the generating function.

$$ \sum_{n =0}^{\infty} p(n)x^n = \Pi_{j = 0}^{\infty}\sum_{i=0}^{\infty}x^{ji} = \Pi_{j = 1}^{\infty}\frac{1}{1-x^j}$$

This can also written as:

$$ \sum_{n =0}^{\infty} p(n)x^n = \frac{1}{(1-x)}\frac{1}{(1-x^2)}\frac{1}{(1-x^3)}\cdots = (1+x^1+x^2+\cdots+x^r+\cdots) (1+x^2+x^4+\cdots+x^{2r}+\cdots)\cdots $$

Each of these $(1+x^k+x^{2k}+\cdots+x^{rk}+\cdots)$ are called sub-series.

We can use this to find the partition number of any $n$. Let's see an example for $n=7$. There are 2 simple steps.

First write the polynomial such that each sub-series's maximum power is smaller or equals to $n$. As an example, if $n=7$, then the polynomial is $(1+x+x^2+x^3+x^4+x^5+x^6+x^7)(1+x^2+x^4+x^6)(1+x^3+x^6)(1+x^4)(1+x^5)(1+x^6)(1+x^7)$.
Now, we simplify this and find coefficient of $x^n$. In this process we find the partition numbers for all $r<n$.

So, for our example, we can just expand the polynomial given in point-1. You can do that by hand. But I have used sympy library.

from sympy import *
x = symbols('x')
a = expand((1+x+x**2+x**3+x**4+x**5+x**6+x**7)*(1+x**2+x**4+x**6)*(1+x**3+x**6)*(1+x**4)*(1+x**5)*(1+x**6)*(1+x**7))
print(a)
#Output: x**41 + x**40 + 2*x**39 + 3*x**38 + 5*x**37 + 7*x**36 + 11*x**35 + 15*x**34 + 18*x**33 + 24*x**32 + 30*x**31 + 37*x**30 + 44*x**29 + 52*x**28 + 57*x**27 + 64*x**26 + 71*x**25 + 77*x**24 + 81*x**23 + 84*x**22 + 84*x**21 + 84*x**20 + 84*x**19 + 81*x**18 + 77*x**17 + 71*x**16 + 64*x**15 + 57*x**14 + 52*x**13 + 44*x**12 + 37*x**11 + 30*x**10 + 24*x**9 + 18*x**8 + 15*x**7 + 11*x**6 + 7*x**5 + 5*x**4 + 3*x**3 + 2*x**2 + x + 1

The coefficients of $x^k$ gives $p(k)$ correctly up to $k = 7$. So, from this $p(7) = 15$, $p(6)=11$ and so on. But for $n>7$, the coefficients will be less than $p(n)$, as an example $p(8)=22$ but here the coefficient is $18$. To calculate $p(8)$, we need to take sub-series with power $8$ or less.

Although this method is nice, it is quite lengthy. But this method can be modified to generate a recurrence relation.

Euler's Pentagonal Number Theorem

Euler's Pentagonal Theorem gives us a relation to find the polynomial of the product of sub-series. The relation is:

$$ \Pi_{i = 1}^{\infty }(1-x^n) = 1 + \sum_{k =1}^{\infty } (-1)^k \Big(x^{k\frac{(3k+1)}{2}} + x^{k\frac{(3k-1)}{2}}\Big) $$

Using this equation, we get the recurrence relation,

$$p(n) = p(n-1) + p(n-2) - p(n-5) - p(n-7) + \cdots = \sum_{k=1}^{n} \Bigg( p\Bigg(n-\frac{k(3k-1)}{2}\Bigg) + p\Bigg(n-\frac{k(3k+1)}{2}\Bigg)\Bigg) $$

The numbers $0, 1, 2, 5, 7, 12, 15, 22, 26, 35, 40,\cdots $ , are called pentagonal numbers. We generate these numbers by $\frac{k(3k+1)}{2}$ and $\frac{k(3k-1)}{2}$.

Let's try to find $p(7)$, using this.

Using this simple algorithm, we can write a code to get partition number.

def par(n):
	summ = 0
	if n == 0 or n == 1:
		result = 1
	else:
		for k in range(1,n+1):
			d1 = n - int((k*(3*k-1))/2)
			d2 = n - int((k*(3*k+1))/2)
			sign = pow(-1,k+1)
			summ = summ + sign*(par(d1) + par(d2))
		result = summ
	return result

This is the code to generate partition number. This is not that efficient as it uses recurrence. For big numbers it will take huge amount of time. In math-inspector, we can see it's block diagram.

Here the par block represent the function par(n). Whenever i creates a variable $n1=6$ or $n3=10$, it creates a circle. To apply par on $n3$, i just need to connect their nodes. And to show the output, add the other node to output node.

So, we have seen this simple function. Now, let's define one using the euler's formula but in a different manner. To understand the used algorithm watch this: Explanation

def partition(n):
    odd_pos = []; even_pos= []; pos_d = []
    for i in range(1,n+1):
        even_pos.append(i)
        odd_pos.append(2*i+1)
    m = 0; k = 0
    for i in range(n-1):
        if i % 2 == 0:
            pos_d.append(even_pos[m])
            m += 1
        else:
            pos_d.append(odd_pos[k])
            k += 1

    initial = 1; pos_index = [initial]; count = 1
    for i in pos_d:
        d = initial + i
        pos_index.append(d)
        initial = d
        count += 1
        if count > n:
            break
    sign = []
    for i in range(n):
        if i % 4 == 2 or i % 4 == 3:
            sign.append(-1)
        else:
            sign.append(1)
    pos_sign = []; k = 0
    for i in range(1,n+1):
        if i in pos_index:
            ps = (i,sign[k])
            k = k + 1
            pos_sign.append(ps)
        else:
            pos_sign.append(0)
    if len(pos_sign) != n:
        print("Error in position and sign list.")
    partition = [1]
    f_pos = []
    for i in range(n):
        if pos_sign[i]:
            f_pos.append(pos_sign[i])
        partition.append(sum(partition[-p]*s for p,s in f_pos))
    return partition

This is the code. This is very efficient. Let's try running this function.

Note: This function returns a list which contains all $p(n)$ in the range $0$ to $n$.

As you can see in the program, In[3] gives a list $[p(0),p(1),p(2),p(3),p(4),p(5)]$. Hence, to get the $p(n)$ using this function, just add $[-1]$ as it gives the last element of the list.

We can use a simple program to plot the graph.

import matplotlib.pyplot as plt
n = 5
x_vals = [i for i in range(n+1)]
y_vals = partition(n)
plt.grid(color='purple',linestyle='--')
plt.ylabel("Partition Number")
plt.xlabel("Numbers")
plt.step(x_vals,y_vals,c="Black")
plt.savefig("Parti_5.png",bbox_inches = 'tight',dpi = 300)
plt.show()

Output is given below. The output is a step function as it should be.

This is all for today. I hope you have learnt something new. If you find it useful, then share.

ISI B.STAT PAPPER 2018 |SUBJECTIVE

Problem

Let $f$:$\mathbb{R} \rightarrow \mathbb{R}$ be a continous function such that for all$x \in \mathbb{R}$ and all $t\geq 0$

f (x) = f (k t x)

where $k>1$ is a fixed constant

Hint

Case-1

choose any 2 arbitary nos $x,y$ using the functional relationship prove that $f(x)=f(y)$

Case-2

when $x,y$ are of opposite signs then show that $$f(x)=f(\frac{x}{2})=f(\frac{x}{4})\dots$$
use continuity to show that $f(x)=f(0)$

Solution

Let us take any $2$ real nos $x$ and $y$.

Case-1

$x$ and $y$ are of same sign . WLG $0<x<y$

Then$\frac{y}{x}>1$
so there is a no $t\geq 0$ such that
$\frac{y}{x}=k^t$
$f(y)=f(k^tx)=f(x)$ [using$f(x)=f(k^tx)$]

case-2

$x,y$ are of opposite sign. WLG $x<0<y$
Then $f(x)=f(k^tx)$

$\Rightarrow f(k^tx)=f(k^t2\frac{1}{2}x)$

$\Rightarrow f(k^t2\frac{1}{2}x)=f(k^tk^{log_k2}\frac{x}{2})$

$\Rightarrow f(k^tk^{log_k2}\frac{x}{2})=f(k^{t+log_k2}\frac{x}{2})$

$\Rightarrow f(k^{t+log_k2}\frac{x}{2})=f(\frac{x}{2})$

Using this logic repeatedly we get

$f(x)=f(\frac{x}{2})=f(\frac{x}{4})\dots =f(\frac{x}{2^n})$

Now $\frac{x}{2^n}\rightarrow0$ and $f$ is a continous function hence $\lim_{n\to\infty}f(\frac{x}{2^n})=f(0)$.

[Because we know if $f$ is a continous function and $x_n$ is a sequence that converges to $x$ then $\lim_{n\to\infty}f(x_n)=f(x)$]

using similar logic we can show that $f(y)=f(0)$ so $f(x)=f(y)$ for any $x,y\in \mathbb{R}$

I.S.I B.STAT 2018 | SUBJECTIVE -4

PROBLEM

Let $f (0,\infty)\rightarrow \mathbb{R}$ be a continous function such that for all $x \in (0,\infty)$ $f(x)=f(3x)$ Define $g(x)= \int_{x}^{3x} \frac{f(t)}{t}dt$ for $x \in (0,\infty)$ is a constant function

HINT

Use leibniz rule for differentiation under integral sign

SOLUTION

using leibniz rule for differentiation under integral sign we get
$g'(x)=f(3x)-f(x)$

$\Rightarrow g'(x)=0$ [ Because f(3x)=f(x)]
Since the derivative of $g(x)$ is $0$ for all $x$, Hence $g(x)$ is a constant function

TESTING THE CONCEPT OF COPRIME NUMBERS | CMI 2015 PART B PROBLEM-3

PROBLEM

Show that there are exactly $2$ numbers $a$ in the set $\{1,2,3\dots9400\}$ such that $a^2-a$ is divisible by $10000$

HINT

Use Modular arithmetic and concepts of coprime numbers

SOLUTION

we know

$10000=2^4*5^4$

In order for $10000$ to divide $a^2-a$ both $2^4$ and $5^4$ must divide $ a^2-a $

Write $a^2-a=a(a-1)$

Note that $a$ and $a-1$ are coprime numbers so they can not have any common divisors

That means either

$2^4$ divides $a$ and $5^4$ divides $(a-1)$

$2^4$ divides $a-1$ and $5^4$ divides $a$

Case-1

$2^4$ divides $a$ and $5^4$ divides $(a-1)$

AS $5^4=625$ divides $a-1$ we can write

$a-1=625k$ for some integer $k$

$\Rightarrow a=625k+1$

Now $2^4=16$ divides $a=625k+1$

Note that $16$ divides $624$ hence $16$ will also divide $624k$

We can write $625k+1=624k+k+1$

As $16$ divides $625k+1=624k+k+1$ so it must divide $k+1$

$\Rightarrow k$ is among the numbers $T=\{15,31,47\dots\}$

For $k=15$ we get $a=625*15+1=9376$

For $k=31$ the value of $a$ is $19376>9400$

So from $31$ onward none of the elements of $T$ is acceptable

So in this case the only possible value of $a$ is $9376$

Case-2

$2^4$ divides $a-1$ and $5^4$ divides $a$

As $5^4=625$ divides $a$ so $a=625k$ for some integer $k$

$\Rightarrow a-1=625k-1$

Now $2^4=16$ divides $a-1$

Write $625k-1=624k+k-1$

As in the last case we can argue that $16$ divides $624$ so it must divide $k-1$

So $K$ must be from the set $U=\{1,17,33\dots\}$

For $k=17 , a=10625>9400$ so any value of $k$ greater or equal $17$ does not count

So the only possible value that $k$ can take in this case is $k=1$

So $a=625$

So $625$ & $9376$ these are the only $2$ values that $a$ can take

Best algorithm to calculate Pi - Part1

Author: Kazi Abu Rousan

$\pi$ is not just a collection of random digits. $\pi$ is a journey; an experience; unless you try to see the natural poetry that exists in $\pi$, you will find it very difficult to learn.

Today we will see a python code to find the value of $\pi $ up to $1,00,000$ within $3$ to $4$ second. We will be using Chudnovsky Algorithm, which is one of the fastest method to calculate the value of $\pi $. Using this in 14^th August, 2021 $62.8$ trillion digits of $\pi $ was calculated. Can you imagine this?

This is based on Ramanujan's $\pi $ formula and was discovered in $1988$. Let's see the formula and how we are going to use that.

Chudnovsky Algorithm

The value of $\pi $ is given by,

$$\frac{1}{\pi } = 12\sum_{k=0}^{\infty } \frac{(-1)^k (6k)! (545140134k + 13591409)}{(3k)! (k!)^3(640320)^{3k+\frac{3}{2}}}$$

For proof visit here: Proof

Let's just take the first term of this series and see what it gives us.

Just the first term is this much accurate. Just thinking about it surprises me.

So, how are we going to write the code?, well for this blog, we are directly going to use above expression. In some later blog, we will see how to make this much more efficient. We are doing so, to show you guys how powerful this method is.

Let's Code

Let's start our coding. Note one thing. Normally, when we perform this sort of algorithm, mostly the error comes from the fractional part. To reduce this error, we have to increase the precision of our calculation. For this we will be using a special python library.

There are many libraries which can help us in this. Some of them are;

Decimal
mpmath
gmpy2

Our good old math library cannot do that as it mostly returns IEEE-$754$ $64$-bit result, which is roughly $17 $ digits only.

Here we will be using Decimal library.

Let's now begin our code:

import math as mp
from decimal import *
def pi_chudn(n):
    getcontext().prec = n+50
    k=0
    pi_chud = 0
    while k<n:
        pi_chud+=(((Decimal(-1))**k ) * (Decimal(mp.factorial(6*k)))*(13591409 + 545140134*k))/Decimal((mp.factorial(3*k)*((mp.factorial(k))**3)*(640320**((3*k)+(Decimal(1.5))))))
        k+=1
    pi_chud = (Decimal(pi_chud) * 12)
    pi_chud = (Decimal(pi_chud**(-1)))
    return int(pi_chud*10**n)

exact_pi_val = str(31415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055596446229489549303819644288109756659334461284756482337867831652712019091456485669234603486104543266482133936072602491412737245870066063155881748815209209628292540917153643678925903600113305305488204665213841469519415116094330572703657595919530921861173819326117931051185480744623799627495673518857527248912279381830119491298336733624406566430860213949463952247371907021798609437027705392171762931767523846748184676694051320005681271452635608277857713427577896091736371787214684409012249534301465495853710507922796892589235420199561121290219608640344181598136297747713099605187072113499999983729780499510597317328160963185950244594553469083026425223082533446850352619311881710100031378387528865875332083814206171776691473035982534904287554687311595628638823537875937519577818577805321712268066130019278766111959092164201989)
for n in range(1,1000):
    print(int(exact_pi_val[:n+1]))
    print(pi_chudn(n))
    is_true = (pi_chudn(n) == int(exact_pi_val[:n+1]))
    print("for n = ",n, " It is ",is_true)
    if is_true == False:
    	break

This have run this program up to $413$ terms of $\pi $ and belief we, each and every digit was correct (most chudnovsky's programs you will find in internet gives wrong value after $14-15$ digit).

The string exact_pi_val contains $1001$ correct digit of $\pi $. So, The program will itself check if the calculated $\pi $ value is right or wrong up to $1000$ digit.

Although it's slow. To calculate $400$ digits of $\pi $, it needs $4.177294731140137$ sec.

We can plot the number of digits and calculation time.

This is all for today. I have you have learnt something new.

Monte Carlo Method to calculate Pi

Author: Kazi Abu Rousan

Pi is not merely the ubiquitous factor in high school geometry problems; it is stitched across the whole tapestry of mathematics, not just geometry’s little corner of it.

$\pi$ is truly one of the most fascinating things exist in mathematics. It's not just there in geometry, but it's also there in pendulum, waves, quantum particles, integrations, probability and even in biology.

Today we will be seeing one of the places where it arises in probability.

The name "Monte Carlo" refers to the city in Monaco which is known for it's casinos and gambling. So, Monte Carlo is basically used as a symbol of randomness.

Monte Carlo's Method or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle.

We will be using one of those experiment to find value $\pi$.

Probability and $\pi $

Let's start from the basics.

Suppose, we have a line from $0$ to $5$ unit and we throw a dot at random. The question is: What is the probability that the dot will be on the line between $2$ to $3$ unit?

The answer is : $\frac{1}{5}$

Why?, because the probability is simply $\frac{length\ of\ favorable\ region}{length\ of\ whole\ region}$.

Now, suppose we have a circle inside a square as shown in the figure.

We now throw a ball at random inside the square. What is the probability that the ball will land inside the circle?, The answer is simply the ratio of the areas of the favorable region and the total region (here we are using the area as it is a two-dimensional region).

This simply means;

$$ P = \frac{Area\ of\ Green\ Region}{Area\ of\ Blue\ Region} $$

So, $$P = \frac{\pi R^2}{4\times R^2} = \frac{\pi}{4}\to \pi = 4\times P $$

Now, the question is how to find the value of $P$?

Well, we can just use the Experimental definition of probability. This definition simply says "The probability is equals to the ratio of number of events whose results are favorable and the total number of experiments done.

In our case, It simply means $P = \frac{No. \ of\ balls\ which\ are\ fallen\ inside\ circle}{Total\ No.\ of\ balls\ thrown}$

This exact idea can be used to write a program to find the value of $\pi $.

Python Code to find $\pi $

Let's begin our coding.

We will be taking a unit circle. Python's numpy library has many functions to generate random number, we will be using that.

import numpy as np
n = 4
x_vals = np.random.uniform(-1,1,n)
y_vals = np.random.uniform(-1,1,n)
print(x_vals,y_vals)
#Output:[0.83083661 0.40144553 0.07297935 0.21236558] [ 0.30707802 -0.42705451 -0.70296805 -0.44182543]

This block of code generates two arraies (list) containing values between $-1 to 1$. We will use each of those values from x_vals as x-coordinate of each ball and y_vals as y-coordinates of each ball. There $n$ represent number of balls.

Now, we need a function which will check if a ball is inside the circle or not. This is quite easy. We can just find $x^2+y^2$, if it's less than or equals to $1$, then it's inside the circle or else, it's outside.

def check_in_circle(x,y):
	if x**2+y**2<=1:
		return True
	else:
		return False
print(check_in_circle(0.5,0.5))
print(check_in_circle(1,1.2))
#Output: True
False

Now, we will combine all this to write a code and calculate the value of $\pi $.

import numpy as np
import matplotlib.pyplot as plt

n = 100000#also used 4, 10,100etc
x_vals = np.random.uniform(-1,1,n)
y_vals = np.random.uniform(-1,1,n)
coords = np.vstack((x_vals,y_vals)).T

def check_in_circle(coord):
	if coord[0]**2+coord[1]**2<=1:
		return True
	else:
		return False
inside_count = 0
for i in range(n):
	if check_in_circle(coords[i]):
		inside_count += 1
print("Value of pi = ",4*(inside_count/n)," for n = ",n)
#Output: Value of pi =  3.13572  for n =  100000
Value of pi =  3.0  for n =  4
Value of pi =  3.08  for n =  100
Value of pi =  3.1324  for n =  10000

As you can see, the value to $\pi $ from this method approaches it's real value as we increase $n$. Let's see a plot for few values of $n$ (along with code)

The code to generate this is here:

import numpy as np
import matplotlib.pyplot as plt

n = 100000
x_vals = np.random.uniform(-1,1,n)
y_vals = np.random.uniform(-1,1,n)
coords = np.vstack((x_vals,y_vals)).T
inside_x = []; inside_y = []
def check_in_circle(coord):
	if coord[0]**2+coord[1]**2<=1:
		return True
	else:
		return False
for i in range(n):
	if check_in_circle(coords[i]):
		inside_x.append(coords[i][0])
		inside_y.append(coords[i][1])
inside_count = len(inside_x)
pi_val = 4*inside_count/n
figure, axes = plt.subplots()
plt.scatter(x_vals,y_vals,color="BLUE",s=0.1)
plt.scatter(inside_x,inside_y,color="GREEN",s=0.1)
qa = plt.Circle((0,0),1,fill=False, color="RED", linewidth=2)
plt.xlim(-1.01,1.01)
plt.ylim(-1.01,1.01)
axes.set_aspect(1)
axes.add_artist(qa)
plt.title("n=%s and points_inside = %s, $\\pi$ = %s"%(n,inside_count,pi_val))
plt.show()

We can make a video of $\pi $ vs $n$ to see how it converges.

This is all for today. The code for the video can be found here: Code of Animation

I hope you learn something new.

A code to find Primes - Sieve of Eratoshenes

To some extent the beauty of number theory seems to be related to the contradiction between the simplicity of the integers and the complicated structure of the primes, their building blocks. This has always attracted people.
A. Knauf from "Number theory, dynamical systems and statistical mechanics"

This quote is indeed true. If you just think about the simplicity of integers and complexity of primes, it's really horrifying. It's almost like how easy and intuitive it is to analyse big "classical objects" but extremely un-intuitive to discuss about "quatum objects".

Today's blog will be a bit special. Today we will discuss about a simple algorithm which is one of the most easiest algorithm which find primes in a certain range. This is called The Sieve of Eratosthenes.

Main Idea of Eratosthenes

The main idea behind this algorithm is very simple. Let's try to understand this using an example.

Suppose, you want to find all primes within $20$.

First you write down all numbers from $2$ to $20$. The list will be $[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]$
Then, you take $2$ and eliminate all it's multiple from the list. This will remove all evens. The remaining list will be $[2,3,5,7,9,11,13,15,17,19]$.
After this, you take $3$ and start eliminating all it's multiple. After this step the new list is $[2,3,5,7,11,13,17,19]$
If we go on again and again by taking the smallest untouched number of the remaining list (in this case 5), then we will filter out all possible composite numbers from the list and finally we will have all primes.

This is basically our algorithm. Below you can find the algorithm for $30$.

Code to perform this

The code to perform this is quite easy, specially in python, because of the filter function. Let's see the code;

n = 20
all_list = [i for i in range(2,n+1)]
factor_2_remove = filter(lambda m: m %2 !=0, all_list)
factor_2_remove = list(factor_2_remove)
factor_2_remove.append(2)
print(factor_2_remove)
#output:[3, 5, 7, 9, 11, 13, 15, 17, 19, 2]

As you can clearly see, first we have created a list containing a list from $2$ to $n$, then we use filter. This function filter out the numbers which doesn't satisfy the given. Here the condition is $m \not\equiv 0$ mod $2$. This is written as $m%2 !=0$ in python. After using this we again have to add $2$ as it also doesn't satisfy the condition so, it also is removed.

So, to find all primes within any range, we simply do this for all number, not just $2$. You can use a simple for loop for this.

n = 100
all_list = [i for i in range(2,n+1)]
for i in len(all_list):
    first_ele = all_list[0]
    all_list = filter(lambda m: m %first_ele !=0, all_list)
    all_list = list(all_list)
    all_list.append(first_ele)
print(all_list)
#Output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]

See!, how easy it is.. But we can do even better. Notice, here we have to run the loop $\pi(n)$ times. But we don't have to do that.

Note: $\pi(n)$ is a function, which represent number of primes in between $2$ to $n$.

Remember: If an integer $a>1$ is not divisible by any prime $p\leq \sqrt{a}$, then $a$ is of necessity a prime.

Using this we can increase the efficiency of the program. Let's try to understand this using an example.

Suppose, we want to find primes up to $100$. Then, we create a list up to $100$, i.e., $2,3,4,5,6,7,\cdots ,98,99,100$. Now, we take $2$ and remove all multiples of $2$ from the list, except $2$ itself. We again do the same for $3$, $5$,$\cdots $. But notice $\sqrt{100}=10$, so we just do this up to $10$, i.e., up to $7$. And this will generate all primes within $100$.

If we do this, now the we only have to run the loop for $\pi(n)$ times. Let's see how much the speed has increased. I will now make two functions, one which runs the loop $\pi(n)$ times and one which runs $\pi(\sqrt{n})$ times.

import math as mp
def with_just_sqrtn(n):#uses sqrt n
    all_list = [i for i in range(2,n+1)]
    root_n = mp.ceil(mp.sqrt(n))
    for i in range(n):
        first_ele = all_list[0]
        all_list = filter(lambda m: m %first_ele !=0, all_list)
        all_list = list(all_list)
        all_list.append(first_ele)
        if first_ele == root_n:
            break
    return all_list
print(with_just_sqrtn(100))
#output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]

See, the output is same as before.

The previous one can also be written as a function.

def with_all(n):
    all_list = [i for i in range(2,n+1)]
    for i in range(n):
        first_ele = all_list[0]
        all_list = filter(lambda m: m %first_ele !=0, all_list)
        all_list = list(all_list)
        all_list.append(first_ele)
    return all_list

Now, let's find which one takes much time. We will use $n=1,00,000$

n = 1_00_000
import time
start_t = time.time()
with_all(n)
print("Time = %s seconds"%(time.time()-start_t))
#Output: Time = 131.59189105033875 seconds

Just close to $2$ min $11$ sec. Now, let's run the one with small loop number.

start_t = time.time()
with_just_sqrtn(n)
print("Time = %s seconds"%(time.time()-start_t))
#Output: Time = 0.1107027530670166 seconds

What the !!, just 0.11 sec?

So, i hope now you can see, how much faster it is compared to the previous one.

This is all for today, I hope you have learnt something new.