42. Statistics – Hypothesis Testing

In general, the main purpose of statistical sampling is to estimate the parameters of a population. Sampling can also be used to examine whether a hypothesis about the population is appropriate. Hypothesis testing is the procedure for checking that hypothesis against the sample data before drawing conclusions from it.

In hypothesis testing, the null hypothesis (H0) and the alternative hypothesis (H1) are the two primary components. The following table summarizes the outcomes when you accept or reject the null hypothesis.

Hypothesis Testing Evaluation 

A type I error is rejecting H0 when it is actually true, and a type II error is failing to reject H0 when it is actually false. The probability of a type I error, α (alpha), is also known as the significance level. α sets how strong the evidence in your sample must be before you reject the null hypothesis and conclude that the effect is statistically significant. A common default is α = 0.05, which corresponds to a 95% confidence level.
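To make the meaning of α concrete, here is a minimal simulation sketch. It assumes a normal population with a known standard deviation and uses made-up values (mean 69, σ = 3, n = 30); the function name is hypothetical. Because H0 is true by construction, every rejection is a type I error, and the observed rejection rate should land close to α.

```python
import random
from statistics import NormalDist, mean

def type_one_error_rate(mu0=69.0, sigma=3.0, n=30, alpha=0.05,
                        trials=2000, seed=0):
    """Estimate how often a two-tailed Z-test rejects a TRUE H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 (mu = mu0) really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # H0 is true but was rejected: a type I error
    return rejections / trials

rate = type_one_error_rate()
print(f"Observed type I error rate: {rate:.3f}")
```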

In general, a type II error is considered more tolerable than a type I error. Common statistics software such as JMP or Minitab also assumes that the null hypothesis (H0) is true for the population. Based on the form of the alternative hypothesis, there are two types of tests, which are listed below:

Two-tailed test: this applies when the alternative hypothesis states that the parameter is not equal to a given value.

e.g.

H0: μ = 69

H1: μ ≠ 69

If the sample average is significantly greater than or less than 69, H0 is rejected and H1 is accepted.

One-tailed test: this applies when the alternative hypothesis states that the parameter is smaller than or greater than a given value.

e.g.

H0: μ = 69

H1: μ > 69

If the sample average is significantly greater than 69, H0 is rejected and H1 is accepted.
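The difference between the two kinds of test can be sketched with a Z-test on the μ = 69 example. This is only an illustration: the sample mean (70.0), σ (3.5) and n (36) are made-up numbers, the population standard deviation is assumed known, and the function name is hypothetical.

```python
from statistics import NormalDist

def z_test_p_values(sample_mean, mu0, sigma, n):
    """Return (z, two-tailed p, upper-tailed p) for H0: mu = mu0."""
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    std_normal = NormalDist()
    p_upper = 1 - std_normal.cdf(z)            # H1: mu > mu0
    p_two = 2 * (1 - std_normal.cdf(abs(z)))   # H1: mu != mu0
    return z, p_two, p_upper

# Hypothetical sample: mean 70.0 from n = 36 observations, sigma = 3.5.
z, p_two, p_upper = z_test_p_values(sample_mean=70.0, mu0=69.0, sigma=3.5, n=36)
alpha = 0.05
print(f"z = {z:.3f}")
print(f"two-tailed p = {p_two:.4f}, reject H0: {p_two < alpha}")
print(f"upper-tailed p = {p_upper:.4f}, reject H0: {p_upper < alpha}")
```

With these numbers the upper-tailed test rejects H0 at α = 0.05 while the two-tailed test does not, which is why the direction of H1 should be decided before looking at the data.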

Depending on which population parameter is being tested, there are different scenarios, which are listed below.

Average hypothesis testing for one population. This covers two scenarios: the population variance is known, or it is unknown.

Average hypothesis testing for two populations. The samples are independent and the population variances may be known or unknown; the sample size, the placement method and the variances all affect which test applies.

Variance hypothesis testing for one population or multiple populations.

Ratio hypothesis testing for one population or multiple populations.
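As an illustration of the ratio (proportion) case for one population, the sketch below runs a two-tailed Z-test using the normal approximation. The defect counts and the hypothesized ratio of 0.10 are made-up values, and the function name is hypothetical.

```python
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-tailed Z-test for H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    se = (p0 * (1 - p0) / n) ** 0.5   # standard error under H0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_value

# Hypothetical data: 12 defective units out of 200, H0: defect ratio = 0.10.
p_hat, z, p_value = one_proportion_z_test(successes=12, n=200, p0=0.10)
print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

The normal approximation is usually considered reasonable only when both n·p0 and n·(1 − p0) are large enough (a common rule of thumb is at least 5).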

In general, the t-test, Z-test and F-test are the most commonly used tests for continuous data; they compare population averages and variances. The following sections illustrate the calculation conditions for hypothesis testing on a single population average, two population averages, and variances.

One Population Average’s Hypothesis Testing Description

Two Population Average’s Hypothesis Testing (t-Test) Description

Variance Hypothesis Testing Description
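For the variance case, a rough sketch of the two usual statistics is shown here: a Chi Square statistic for one population variance and an F statistic for the ratio of two sample variances. The data values are invented, the function names are hypothetical, and the critical values (16.919 for Chi Square with 9 degrees of freedom, 3.18 for F with 9 and 9 degrees of freedom, both upper-tailed at α = 0.05) are taken from standard tables rather than computed.

```python
from statistics import variance

def chi_square_variance_stat(sample, sigma0_sq):
    """Chi Square statistic (n - 1) * s^2 / sigma0^2 for H0: sigma^2 = sigma0^2."""
    df = len(sample) - 1
    return df * variance(sample) / sigma0_sq, df

def f_variance_ratio(sample_a, sample_b):
    """F statistic s_a^2 / s_b^2 for H0: sigma_a^2 = sigma_b^2."""
    f = variance(sample_a) / variance(sample_b)
    return f, len(sample_a) - 1, len(sample_b) - 1

# Hypothetical measurements from two processes (10 values each).
sample_a = [68, 70, 69, 71, 67, 72, 66, 70, 69, 68]
sample_b = [69, 70, 68, 69, 70, 68, 69, 70, 68, 69]

chi2, chi2_df = chi_square_variance_stat(sample_a, sigma0_sq=2.0)
f, df_a, df_b = f_variance_ratio(sample_a, sample_b)

CHI2_CRIT = 16.919  # table value, upper tail, alpha = 0.05, df = 9
F_CRIT = 3.18       # table value, upper tail, alpha = 0.05, df = (9, 9)

print(f"chi-square = {chi2:.2f} (df = {chi2_df}), exceeds critical: {chi2 > CHI2_CRIT}")
print(f"F = {f:.2f} (df = {df_a}, {df_b}), exceeds critical: {f > F_CRIT}")
```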

The calculation compares a score computed from the actual sample data against the statistical value that corresponds to the significance level α. The rule of thumb is given below.

Accept H0: when the calculated score (Z, t, F or Chi Square) does NOT exceed the statistical value indicated by the significance level.

Reject H0: when the calculated score (Z, t, F or Chi Square) does exceed the statistical value indicated by the significance level.
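The rule of thumb can be sketched for the Z case, where the critical value can be computed directly from α. The helper names below are hypothetical; for t, F or Chi Square scores the critical value would come from the corresponding table or distribution instead.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=True):
    """Critical Z value for a given significance level."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)

def decide(score, critical, two_tailed=True):
    """Reject H0 only when the calculated score exceeds the critical value."""
    exceeds = abs(score) > critical if two_tailed else score > critical
    return "reject H0" if exceeds else "fail to reject H0"

crit = z_critical(0.05)  # two-tailed, alpha = 0.05
print(f"critical value = {crit:.3f}")
print("score  1.50:", decide(1.50, crit))
print("score -2.30:", decide(-2.30, crit))
```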

During process analysis, you may want to compare two different sets of data, especially data from before and after an improvement. In that case, how H0 and H1 are set up is critical to interpreting the results correctly.

Usually, H0 states that the two populations (before and after improvement) remain the same, while H1 states that there is a statistically significant difference between them. Based on the data results, the interpretation is listed below.

Process Data Analysis Example for Hypothesis Testing

The following examples are some scenarios to illustrate the hypothesis tests.

Hypothesis test with one population mean

Hypothesis test with two population means (large samples)

Hypothesis test with two population means (small samples, same variance)

Hypothesis test with two population means (small samples, different variance)
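The four scenarios can be sketched with the standard test statistics, working from summary statistics only. All numbers below are made up for illustration and the function names are hypothetical; the large-sample case assumes the sample variances are good stand-ins for the population variances, and the different-variance case uses the Welch-Satterthwaite approximation for the degrees of freedom.

```python
from math import sqrt

def one_sample_t(xbar, s, n, mu0):
    """t statistic and df for H0: mu = mu0 (population variance unknown)."""
    return (xbar - mu0) / (s / sqrt(n)), n - 1

def two_sample_z(xbar1, s1, n1, xbar2, s2, n2):
    """Z statistic for two large samples, H0: mu1 = mu2."""
    return (xbar1 - xbar2) / sqrt(s1**2 / n1 + s2**2 / n2)

def pooled_t(xbar1, s1, n1, xbar2, s2, n2):
    """t statistic and df for two small samples assuming equal variances."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df  # pooled variance
    return (xbar1 - xbar2) / sqrt(sp2 * (1 / n1 + 1 / n2)), df

def welch_t(xbar1, s1, n1, xbar2, s2, n2):
    """t statistic and approximate df for two small samples, unequal variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (xbar1 - xbar2) / sqrt(v1 + v2), df

# Hypothetical summary statistics for each scenario.
t_one, df_one = one_sample_t(xbar=70.5, s=2.0, n=16, mu0=69.0)
z_two = two_sample_z(70.0, 3.0, 100, 69.0, 4.0, 100)
t_pool, df_pool = pooled_t(72.0, 2.0, 10, 70.0, 2.0, 10)
t_welch, df_welch = welch_t(72.0, 1.0, 10, 70.0, 3.0, 10)

print(f"one population mean:        t = {t_one:.3f}, df = {df_one}")
print(f"two means, large samples:   z = {z_two:.3f}")
print(f"two means, same variance:   t = {t_pool:.3f}, df = {df_pool}")
print(f"two means, diff. variance:  t = {t_welch:.3f}, df = {df_welch:.2f}")
```

Each score is then compared with the critical value for its distribution and degrees of freedom at the chosen α, for example the two-tailed table value of 2.101 for t with 18 degrees of freedom at α = 0.05.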
