Sunday, February 8, 2015

py12. Normal Approximation to Binomial in Python

According to the Central Limit Theorem, average of distributions such as Binomial can be approximated by the Normal for large n.


The function norm and binom from scipy.stats are used. For the binomial, the probabilities are summed for a range near p*n.


For the normal, the cdf at x = 0 is p = 0.5. Thus only the right side probability is used and multiplied by 2 since normal is symmetric about mean.

# ex12.py
from __future__ import print_function, division
import numpy as np
from scipy.stats import norm, binom
import matplotlib.pyplot as plt
p = 0.4
n = np.arange(100,2001,100)
s = np.zeros(len(n))
nrm = np.zeros(len(n))
print('Probability of successes n*(p-.01):n*(p+.01) in n: = s')
for j in range(len(n)):
    vals = range(int(n[j]*(p-.01)), int(n[j]*(p+.01))+1)
    s[j] = sum(binom.pmf(vals, n[j], p ))
print("Normal approximation: = nrm")
for j in range(len(n)):
    mn = n[j]*p
    std = np.sqrt(n[j]*p*(1-p))
    val = n[j]*(p+.01)
    nrm[j] = 2*(norm.cdf(val+0.5, loc = mn, scale = std)-.5)
error = (s-nrm)**2
plt.plot(n, error, 'b')
plt.show()
#Probability of successes n*(p-.01):n*(p+.01) in n: = s
#Normal approximation: = nrm

Output:

1 comment:

  1. How would I write something like this if I have several 10-digit sequences of 0's and 1's, and each sequence has ~65% accuracy, and i'm trying to get higher accuracy, like ~90% or higher to find the correct 10 digit sequence?

    ReplyDelete