
Test of Hypothesis, P Values and Related Concepts

The Principle of the Hypothesis Test

The principle is to formulate a hypothesis and an alternative hypothesis, $H_0$ and $H_1$ respectively, then select a statistic with a known distribution when $H_0$ is true, and select a rejection region which has a specified probability $\alpha$ when $H_0$ is true.

The rejection region is chosen to reflect $H_1$, i.e. to ensure a high probability of rejection when $H_1$ is true.

Examples

Example

Flip a coin to test

$$H_0: P = \frac{1}{2} \quad \text{vs} \quad H_1: P \neq \frac{1}{2}$$

Reject if no heads or all heads are obtained in 6 trials, where the error rate is

$$\begin{aligned} P[\text{Reject } H_0 \text{ when true}] &= P[\text{All heads or all tails}] \\ &= P[\text{All heads}] + P[\text{All tails}] \\ &= \frac{1}{2^6} + \frac{1}{2^6} \\ &= 2 \cdot \frac{1}{64} \\ &= \frac{1}{32} \\ &< 0.05 \end{aligned}$$
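As a sanity check, this error rate can be computed directly; a minimal Python sketch:

```python
# Probability of rejecting H0 when the coin is fair:
# all heads or all tails in 6 independent flips.
p_all_heads = (1 / 2) ** 6
p_all_tails = (1 / 2) ** 6
alpha = p_all_heads + p_all_tails

print(alpha)  # 0.03125, i.e. 1/32 < 0.05
```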

A variation of this test is called the sign test, which is used to test hypotheses of the form

$H_0$: the true median equals zero, using a count of the number of positive values.

The One-sided $z$-test for a Normal Mean

Consider testing

$$H_0: \mu = \mu_0$$

vs

$$H_1: \mu > \mu_0$$

where data $x_1, \ldots, x_n$ are collected as independent observations of $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$ and $\sigma^2$ is known. If $H_0$ is true, then

$$\bar{x} \sim N\!\left(\mu_0, \frac{\sigma^2}{n}\right)$$

So,

Z=xˉμ0σnN(0,1)Z = \displaystyle\frac{\bar {x} - \mu_0}{\displaystyle\frac{\sigma} {\sqrt{n}}} \sim N(0,1)

It follows that,

$$P[Z > z^\ast] = \alpha$$

where

$$z^\ast = z_{1-\alpha}$$

So if the data $x_1, \ldots, x_n$ are such that

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z^\ast$$

then $H_0$ is rejected.

Examples

Example

Consider the following data set: $47, 42, 41, 45, 46$.

Suppose we want to test the following hypotheses

$$H_0: \mu = 42$$

vs

$$H_1: \mu > 42$$

where $\sigma = 2$ is given. The mean of the given data set can be calculated as

$$\bar{x} = 44.2$$

We can calculate $z$ using the following equation:

$$\begin{aligned} z &= \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \\ &= \frac{44.2 - 42}{2/\sqrt{5}} \\ &= \frac{2.2}{0.8944} \\ &= 2.459 \end{aligned}$$

Here $\alpha = 0.05$, so we have that

$$z^\ast = 1.645$$

We obtain that $2.459 > 1.645$, i.e. $z > z^\ast$, and so $H_0$ is rejected at $\alpha = 0.05$.
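The arithmetic above is easy to verify with a short script; a sketch using Python's standard library (the data, $\mu_0 = 42$, and $\sigma = 2$ are from the example):

```python
import math
from statistics import NormalDist

x = [47, 42, 41, 45, 46]
mu0, sigma, alpha = 42, 2.0, 0.05

xbar = sum(x) / len(x)                    # 44.2
z = (xbar - mu0) / (sigma / math.sqrt(len(x)))
z_star = NormalDist().inv_cdf(1 - alpha)  # z_{0.95}, about 1.645

print(round(z, 3))   # 2.46
reject = z > z_star  # True: H0 is rejected at alpha = 0.05
```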

The Two-sided $z$-test for a Normal Mean

$$z := \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$$

Details

Consider testing $H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$ based on observations of $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, independent and identically distributed, where $\sigma^2$ is known. If $H_0$ is true, then

$$Z := \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$$

and

$$P[|Z| > z^\ast] = \alpha$$

with

$$z^\ast = z_{1-\alpha/2}$$

We reject $H_0$ if $|z| > z^\ast$; otherwise, we cannot reject $H_0$.

Examples

Example

In R, you can simulate values to approximate the critical value $z^\ast$. The command that is generally used is quantile.

To illustrate:

```r
z <- rnorm(1000, 0, 1)
quantile(z, c(0.025, 0.975))
##      2.5%     97.5%
## -1.995806  2.009849
```

So, the simulated critical value for a two-sided test of a normal mean is approximately $|-1.99| \approx 2$; the exact value is $z_{0.975} \approx 1.96$.
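Simulation only approximates the critical value; the exact quantile can be computed directly (a Python sketch, with NormalDist playing the role of R's qnorm):

```python
from statistics import NormalDist

# exact two-sided critical value at alpha = 0.05
z_star = NormalDist().inv_cdf(0.975)
print(round(z_star, 2))  # 1.96
```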

The One-sided $t$-test for a Single Normal Mean

Recall that if $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, independent and identically distributed, then

$$\frac{\overline{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$$

Details

Recall that if $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$, independent and identically distributed, then

$$\frac{\overline{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$$

To test the hypothesis $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$, first note that if $H_0$ is true, then

$$T = \frac{\overline{X} - \mu_0}{S/\sqrt{n}} \sim t_{n-1}$$

so

$$P[T > t^\ast] = \alpha$$

where

$$t^\ast = t_{n-1, 1-\alpha}$$

Hence, we reject $H_0$ if the data $x_1, \dots, x_n$ result in a value of $t := \frac{\overline{x} - \mu_0}{s/\sqrt{n}}$ such that $t > t^\ast$; otherwise $H_0$ cannot be rejected.

Examples

Example

Suppose the following data set $(12, 19, 17, 23, 15, 27)$ comes independently from a normal distribution and we need to test $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$. Here we have $n = 6$, $\overline{x} = 18.83$, $s = 5.46$, $\mu_0 = 18$,

so we obtain

$$t = \frac{\overline{x} - \mu_0}{s/\sqrt{n}} = 0.37$$

Since $0.37 < t^\ast = t_{5,0.95} = 2.015$, $H_0$ cannot be rejected.

In R, $t^\ast$ is found using qt(0.95, n-1), but the entire hypothesis test can be carried out using

```r
t.test(x, alternative = "greater", mu = 18)
```
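The statistic itself can be reproduced outside R; a Python sketch for the example data (the critical value $t_{5,0.95} \approx 2.015$ is taken from a $t$ table):

```python
import math
from statistics import mean, stdev

x = [12, 19, 17, 23, 15, 27]
mu0 = 18

# one-sided t statistic for H0: mu = 18 vs H1: mu > 18
t = (mean(x) - mu0) / (stdev(x) / math.sqrt(len(x)))
print(round(t, 2))  # 0.37

t_star = 2.015       # t_{5, 0.95}, from a t table
reject = t > t_star  # False: H0 cannot be rejected
```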

Comparing Means from Normal Populations

Suppose data are gathered independently from two normal populations, resulting in $x_1, \dots, x_n$ and $y_1, \dots, y_m$.

Details

We know that if

$$X_1, \dots, X_n \sim N(\mu_1, \sigma^2)$$

$$Y_1, \dots, Y_m \sim N(\mu_2, \sigma^2)$$

are all independent then

$$\bar{X} - \bar{Y} \sim N\!\left(\mu_1 - \mu_2, \frac{\sigma^2}{n} + \frac{\sigma^2}{m}\right)$$

Further,

$$\sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2} \sim \chi^2_{n-1}$$

and

$$\sum_{j=1}^{m} \frac{(Y_j - \bar{Y})^2}{\sigma^2} \sim \chi^2_{m-1}$$

so

$$\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2 + \sum_{j=1}^{m}(Y_j - \bar{Y})^2}{\sigma^2} \sim \chi^2_{n+m-2}$$

and it follows that

$$\frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{S\sqrt{\frac{1}{n} + \frac{1}{m}}} \sim t_{n+m-2}$$

where

$$S = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2 + \sum_{j=1}^{m}(Y_j - \bar{Y})^2}{n+m-2}}$$

Consider testing $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 > \mu_2$. If $H_0$ is true, then the observed value

$$t = \frac{\bar{x} - \bar{y}}{S\sqrt{\frac{1}{n} + \frac{1}{m}}}$$

comes from a $t$ distribution with $n+m-2$ degrees of freedom, and we reject $H_0$ if $t > t^\ast$. Here,

$$S = \sqrt{\frac{\sum_{i}(x_i - \bar{x})^2 + \sum_{j}(y_j - \bar{y})^2}{n+m-2}}$$

and $t^\ast = t_{n+m-2, 1-\alpha}$.
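The pooled two-sample $t$ statistic defined above can be sketched as follows (the two samples here are hypothetical):

```python
import math
from statistics import mean

def pooled_t(x, y):
    """Two-sample t statistic for H0: mu1 = mu2 with a pooled variance estimate."""
    n, m = len(x), len(y)
    xbar, ybar = mean(x), mean(y)
    ss = sum((xi - xbar) ** 2 for xi in x) + sum((yj - ybar) ** 2 for yj in y)
    s = math.sqrt(ss / (n + m - 2))  # pooled standard deviation
    return (xbar - ybar) / (s * math.sqrt(1 / n + 1 / m))

# hypothetical data; compare the result with t* = t_{n+m-2, 1-alpha}
t = pooled_t([47, 42, 41, 45, 46], [44, 40, 43, 39, 41])
```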

Comparing Means from Large Samples

If $X_1, \dots, X_n$ and $Y_1, \dots, Y_m$ are all independent (with finite variance), with expected values $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$ respectively, then

$$\frac{\overline{X} - \overline{Y} - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}}} \sim N(0,1)$$

if the sample sizes are large enough.

This is the central limit theorem.

Details

Another theorem (Slutsky's) states that replacing $\sigma_1^2$ and $\sigma_2^2$ with $S_1^2$ and $S_2^2$ results in the same (limiting) distribution.

It follows that for large samples we can test

H0:μ1=μ2vsH1:μ1>μ2H_0: \mu_1=\mu_2 \qquad \text{vs} \qquad H_1:\mu_1 > \mu_2

by computing

$$z = \frac{\overline{x} - \overline{y}}{\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{m}}}$$

and rejecting $H_0$ if $z > z_{1-\alpha}$.
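Following the formula above, the large-sample comparison can be sketched as (the function name and data are mine, for illustration):

```python
import math
from statistics import mean, variance

def large_sample_z(x, y):
    """z statistic for H0: mu1 = mu2, using sample variances (Slutsky)."""
    n, m = len(x), len(y)
    return (mean(x) - mean(y)) / math.sqrt(variance(x) / n + variance(y) / m)

# reject H0 in favour of H1: mu1 > mu2 if z > z_{1-alpha} (1.645 at alpha = 0.05)
z = large_sample_z(list(range(20, 60)), list(range(10, 50)))
```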

The P-value

The $p$-value of a test is the probability, computed under $H_0$, of obtaining a result at least as extreme as the one observed.

Examples

Example

Consider a dataset and the following hypotheses

$$H_0: \mu = 42$$

vs.

$$H_1: \mu > 42$$

and suppose we obtain

$$z = 2.3$$

We reject H0H_0 since

$$2.3 > 1.645 = z_{0.95}$$

The pp -value is

$$P[Z > 2.3] = 1 - \Phi(2.3)$$

obtained in R using

```r
1 - pnorm(2.3)
## [1] 0.01072411
```

If this had been a two-tailed test, then

$$\begin{aligned} P &= P[|Z| > 2.3] \\ &= P[Z < -2.3] + P[Z > 2.3] \\ &= 2 \cdot P[Z > 2.3] \end{aligned}$$
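Both p-values can also be computed without R; a Python sketch, with NormalDist standing in for pnorm:

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal CDF

p_one_sided = 1 - phi(2.3)        # same as 1 - pnorm(2.3) in R
p_two_sided = 2 * (1 - phi(2.3))  # two-tailed p-value

print(round(p_one_sided, 8))  # 0.01072411
```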

The Concept of Significance

Details

Two sample means are statistically significantly different if the null hypothesis $H_0: \mu_1 = \mu_2$ can be rejected. In this case, one can make the following statements:

  • The population means are different.
  • The sample means are significantly different.
  • $\mu_1 \ne \mu_2$
  • $\bar{x}$ is significantly different from $\bar{y}$.

But one does not say:

  • The sample means are different.
  • The population means are different with probability $0.95$.

Similarly, if the hypothesis $H_0: \mu_1 = \mu_2$ cannot be rejected, we can say:

  • There is no significant difference between the sample means.
  • We cannot reject the equality of population means.
  • We cannot rule out the possibility that the population means are equal.

But we cannot say:

  • The sample means are equal.
  • The population means are equal.
  • The population means are equal with probability $0.95$.