Tuesday, February 17, 2015

py21. Negative Binomial Distribution in Python

In negative binomial distribution, we find probability of k successes in n trials, with the requirement that the last trial be a success. This requirement is due to the fact that the total probability is 1, and we not not want any double counting.


Unlike binomial distribution, in negative binomials, k is fixed. Instead of using n, we use r, the number of failures, as the x-parameter, and which is just n-k.


Thus, if we have 4 trials and we want 3 successes we can have these combinations (p = success, q = failure):


p p q p


p q p p


q p p p


The above can be found from choose(3,2) = 3. The function scipy.misc.comb is used with the alias choose.


Thus, if n is the number of trials, and k is the number of success, we have choose(n-1,k-1) combinations. This is due to the loss of freedom of the final trial. Each of these combinations has the probability p^k*q^(n-k), where q, failure, is 1-p.


We can rewrite the formulas in terms of failures, r, where r = n - k. This distribution goes from r = 0, to infinity. However, it goes to zero very fast. The reason we change the formulas to depend on k and r is so we can compare it to the Python probability function for negative binomial distribution, scipy.stats.nbinom.pmf

# ex21.py
from __future__ import print_function, division
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import nbinom
from scipy.misc import comb as choose
N = 12
p = 0.55
k = 3 
suc = np.zeros(N)
for r in range(N):
    suc[r] = choose(r+k-1,k-1)*p**k*(1-p)**r

dis = nbinom.pmf(range(N),k,p)
# dis == suc
err = sum(dis-suc)
print("err =",err)
plt.bar(range(N),suc, color='gray')
plt.xlabel('r fails')
plt.ylabel('{} suc with 0:{} fails'.format(k,N-1))
plt.title('Negative Binomial Distribution, p = {}'.format(p))
plt.show()
# err = -3.0617869351e-16

Output:

No comments:

Post a Comment