Wednesday, February 18, 2015

py22. One Sample t-test in Python

A one-sample t-test is conducted over a range of mean values. The null hypothesis is that test mean is the true mean of the population.


The sample is of length 500 and is a random sample based on a standard normal distribution, which has a mean of 0. Thus, we expect values far from 0 will be rejected. For the normal sample, we use scipy.stats.norm.rvs function


If you uncomment the print() lines, you will get intermediate results. You can use the scipy.stats.ttest_1samp function or a calculation, based on the t-distribution. We have a two sided alternative hypotheses, that is, we sum over both tails. This is the default for ttest_1samp from scipy.stats module. The t distribution, from this module, is used but with the alias st.

# ex22.py
from __future__ import print_function, division
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm, ttest_1samp, t as st
N = 500
mu = np.arange(-.2,+.201,.001) # test for the population mean
pr = np.zeros(len(mu))
rej = np.zeros(len(mu))
samp = norm.rvs(size = N)
mean_samp = samp.mean()
sd_samp = samp.std(ddof = 1)
for i in range(len(mu)):
    m = mu[i]
    t_crit = (mean_samp-m)/(sd_samp/np.sqrt(N))
    if t_crit>=0:
        p = 2*(1-st.cdf(t_crit, df = N-1))
    else:
        p = 2*st.cdf(t_crit, df = N-1 )
    pr[i] = p
    res = ttest_1samp(samp, popmean = m)
    # t1,t2 tuples for prints
    t1 = t_crit, res[0]
    t2 = p, res[1]
    #print('\n-- t_crit = %.5f \tres[0] = %.5f' % t1)
    #print('\n-- p = %.5f \tres[1] = %.5f' % t2)
    if p<0.05:
        #print('\nmu =',m,' rejected')
        rej[i] = 1
    else:
        #print('\nmu =',m,' not rejected')
        rej[i] = 0
plt.plot(mu,rej,'r')
plt.plot(mu,pr,'b')
plt.plot(mu,len(mu)*[0.05],'g')
plt.title('Means rejected (p-value: blue), (threshold: green)')
plt.xlabel('mean')
plt.ylabel('Reject = 1 (red)')
plt.ylim(-.05,1.05)
plt.show()

Output:

No comments:

Post a Comment