43. Statistics – Categorical Hypothesis Testing

As discussed in the previous chapters, data are either categorical or continuous. This section covers hypothesis testing for categorical data, using chi-square tests based on contingency tables.

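Four types of tests are commonly performed in categorical data analysis: the goodness-of-fit test, the test of independence, the test of homogeneity, and the test of change. Each is described below.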

The goodness-of-fit test checks whether the observed distribution of responses matches an expected categorical distribution.

In this setting, the goodness-of-fit test generally examines a single population's observations or opinions, classified on one variable.

The test of independence examines two categorical variables and uses the chi-square statistic to check whether they are independent. A single sample of N observations is drawn from the population, and each observation is then classified on both variables.

Because all observations come from the same sample group, the classification into categories happens only after sampling.

Compared with the test of independence, the test of homogeneity verifies whether two independent samples come from the same population.

By default, the null hypothesis assumes that both sample groups come from the same population, while the alternative hypothesis states that they come from different populations.

The sampling also differs: the two samples are collected separately in advance, and only then are their distributions compared.

The test of change evaluates whether members of a population changed their decision after an experiment was implemented.

The analysis uses the chi-square distribution with 1 degree of freedom. If the chi-square statistic exceeds the critical value at the chosen significance level, the experiment's impact is statistically significant; otherwise it is not.

The null hypothesis for the test of change is that the experiment does not affect preferences; the alternative hypothesis is that preferences change because of the experiment.

For categorical tests, the chi-square statistic is calculated with the generic formula below, which applies to most hypothesis tests on categorical data.

Chi Square Observation Calculation
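
The standard Pearson chi-square statistic, which is presumably the calculation meant here, sums the squared gaps between observed and expected counts, each scaled by the expected count. The symbols O_i (observed count in category i), E_i (expected count under the null hypothesis) and k (number of categories or cells) are notation assumed for this sketch.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

A large value means the observed counts sit far from what the null hypothesis predicts.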

To make the tests concrete, the examples below demonstrate one case for each type of categorical test.

Goodness of Fit Example
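
As a minimal sketch of a goodness-of-fit test, the snippet below compares hypothetical survey counts against an equal-preference null hypothesis. The counts, the function name `chi_square_statistic` and the 0.05 significance level are all assumptions made for illustration.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical survey: 120 respondents each pick one of three brands.
observed = [50, 30, 40]
# Null hypothesis: all three brands are equally preferred.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1  # categories minus one

# For df = 2 the chi-square upper-tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)

CRITICAL_05_DF2 = 5.991  # tabulated critical value for alpha = 0.05, df = 2
reject_null = chi2 > CRITICAL_05_DF2

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.4f}, reject H0: {reject_null}")
```

With these made-up counts the statistic stays below the critical value, so the equal-preference hypothesis is not rejected at the 5% level.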

Test of Independence Example
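
A test of independence can be sketched on a 2x2 contingency table as follows. The table values, the variable labels and the helper names are hypothetical, and no continuity correction is applied. The expected count in each cell is the row total times the column total divided by N.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_table(table):
    """Pearson chi-square statistic summed over every cell of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical single sample of 100 people, classified on two variables:
# rows = group (A, B), columns = answer (yes, no).
table = [
    [30, 20],
    [20, 30],
]

chi2 = chi_square_table(table)
df = (len(table) - 1) * (len(table[0]) - 1)

# For df = 1 the upper-tail probability equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

CRITICAL_05_DF1 = 3.841  # tabulated critical value for alpha = 0.05, df = 1
reject_null = chi2 > CRITICAL_05_DF1

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.4f}, reject H0: {reject_null}")
```

Here the statistic exceeds the critical value, so under these illustrative counts the two variables would be judged dependent at the 5% level.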

Test of Homogeneity Example
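
The test of homogeneity uses the same arithmetic as the test of independence; what differs is the sampling design and the hypothesis. The sketch below assumes two separately drawn samples of 100 responses each, with made-up counts and hypothetical names.

```python
import math


def chi_square_table(table):
    """Pearson chi-square for a table whose rows are independent samples."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for row, r_total in zip(table, row_totals):
        for observed, c_total in zip(row, col_totals):
            # Expected count if both samples share one distribution.
            expected = r_total * c_total / n
            total += (observed - expected) ** 2 / expected
    return total


# Hypothetical data: two samples collected separately (e.g. two cities),
# each respondent choosing one of three options.
sample_a = [40, 35, 25]
sample_b = [30, 35, 35]

chi2 = chi_square_table([sample_a, sample_b])
df = (2 - 1) * (len(sample_a) - 1)

# For df = 2 the upper-tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)

CRITICAL_05_DF2 = 5.991  # tabulated critical value for alpha = 0.05, df = 2
reject_null = chi2 > CRITICAL_05_DF2

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.4f}, reject H0: {reject_null}")
```

With these counts the null hypothesis that both samples come from the same population is not rejected.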

Test of Change Illustration and Example
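
A chi-square test of change with 1 degree of freedom is commonly computed as McNemar's test. The source does not name the method, so treating it as McNemar's test (without continuity correction) is an assumption of this sketch, as are the counts and the function name. Only the respondents who switched their answer enter the statistic.

```python
import math


def mcnemar_statistic(yes_to_no, no_to_yes):
    """McNemar chi-square: (b - c)^2 / (b + c), using only the changers."""
    changers = yes_to_no + no_to_yes
    if changers == 0:
        raise ValueError("No respondent changed answer; statistic is undefined.")
    return (yes_to_no - no_to_yes) ** 2 / changers


# Hypothetical paired data: the same people answer before and after
# the experiment. Unchanged answers are not needed for the statistic.
yes_to_no = 10  # said yes before, no after
no_to_yes = 25  # said no before, yes after

chi2 = mcnemar_statistic(yes_to_no, no_to_yes)
df = 1

# For df = 1 the upper-tail probability equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

CRITICAL_05_DF1 = 3.841  # tabulated critical value for alpha = 0.05, df = 1
reject_null = chi2 > CRITICAL_05_DF1

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.4f}, reject H0: {reject_null}")
```

Under these illustrative counts the statistic exceeds the critical value, which would indicate that the experiment shifted preferences.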

In general, the larger the chi-square statistic (χ²), the stronger the evidence against the null hypothesis and the more likely it is to be rejected. A larger sample also makes rejection more likely, but only when the null hypothesis is actually false; when it is true, the rejection rate stays close to the chosen significance level regardless of sample size.
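
The sample-size effect can be seen by holding the observed proportions fixed and scaling the sample. The sketch below assumes hypothetical proportions of 40/30/30 percent tested against equal thirds; the chi-square statistic then grows in direct proportion to the sample size.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_05_DF2 = 5.991  # tabulated critical value for alpha = 0.05, df = 2

# Hypothetical pattern: 4 : 3 : 3 out of every 10 respondents,
# while the null hypothesis expects equal thirds.
base_counts = [4, 3, 3]

results = {}
for scale in (3, 30, 300):
    observed = [count * scale for count in base_counts]
    n = sum(observed)
    expected = [n / len(observed)] * len(observed)
    chi2 = chi_square_statistic(observed, expected)
    results[n] = (chi2, chi2 > CRITICAL_05_DF2)
    print(f"n = {n:5d}: chi2 = {chi2:7.3f}, reject H0: {chi2 > CRITICAL_05_DF2}")
```

The same deviation from the expected proportions is not significant with 30 respondents but becomes significant as the sample grows.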
