Hypothesis testing Using Excel
SectionA_Group8_Amitkumar Gandhi(13PGP017) : Session 08
Excel being used widely in industry it is important to learn analysis through excel. After learning Hypothesis testing through SPSS in Lecture (session 8 Sec A,PGP13-15,IIM Raipur), I am using this platform to make you aware of Advance Excel formulae & how to prove Hypothesis Using Excel.
Look at the cross tabulation below assuming data which is Pivoted for Verdict Pass vs Gender
The question raised is whether on the basis of these observed frequencies we can make any inferences about the degree of association between gender and verdict. To do this requires that we have some idea of the frequencies that would be expected on a purely random basis and then compare the actually observed frequencies with the expected ones.
To calculate the expected frequencies we argue as follows. 29/50 are Female, 24/50 fail so 29/50*24/50 =0.2784 of the total of 50 = 13.92 would be expected to be Female and fail. Once we cancel the grand total (50) from top and bottom we get: expected frequency = column total*row total/grand total =29*24/50 = 13.92.
To calculate all four expected frequencies in Excel we proceed as follows. Now in B9 enter:
and copy this cell down into B10 and then along into C9:C10. The expected frequencies will be calculated. With the actual frequencies located in the B9:C10 range and the expected in the B9:C10 range we can now use an Excel function known as =CHITEST to calculate the Chi Squared Statistic for these data. The syntax for this function is =CHITEST(range of actual frequencies, range of expected frequencies)
So use B12 to contain: =CHITEST(B5:C6,B9:C10)
A figure of 0.9634 will be returned, but what does this tell us?
Well, it can be shown that if every observed frequency were equal to the expected frequency then chi squared would be calculated as zero. In this case there is no association between the two variables – the actual frequencies are just what would be expected by chance. However, as the actual and the expected frequencies start to diverge then the calculated value of chi squared increases and as it does so provides evidence of a degree of association between the variables. But how high does it have to become before it becomes significant? The answer is to compute the significance of the calculated chi squared statistic from Excel’s =CHIDIST function.
The syntax is: =CHIDIST(Calculated Chi Squared Value, degrees of freedom)
Now note that in an R by C contingency (Pivot) table the Degrees of Freedom are given by (R-1)*(C-1). So with 2 rows and two columns our table has (2-1)*(2-1) = 1 degree of freedom. Now use B14 to contain: =CHIDIST(B12,1)
A result of 0.326331 will be obtained and as with we know that significance tests we require a value of 0.05 or less to provide an acceptable level of risk (5% risk of error,95% non-risk of error). We conclude in this case that the result is not significant at the 5% level.
Conclusion: In other words, Gender and Verdict are not statistically associated in a significant manner.
Through above example of effect on gender on verdict I tried to explain how to use excel for Hypothesis testing hope you have understood.
This article is written by Mr. Amitkumar Prakash Gandhi PGP2013-15, he can be reached on email id firstname.lastname@example.org