Section A _Group 1_Vishal Dandwate_13PGP013 (Session 8)

In the previous session we learned about different types of variables/data. In this session we will learn about “Analysis” of given variables and how to find the relation between two different variables.

Descriptive analysis of a data set summarises each variable on its own: its frequency distribution, mean, standard deviation, standard error etc.

After a preliminary analysis of the distribution of each variable, our next task is to look for relationships among two or more of the variables. Several tools may be used, including correlation and regression, the t-test etc. The type of analysis chosen depends on the research design, characteristics of the variables, shape of the distributions, level of measurement, and whether the assumptions required for a particular statistical test are met.

In this session we are going to discuss cross tabulation, which is a joint frequency distribution of cases based on two or more categorical variables.

When we do cross-tabulation we don't use a continuous variable as it is, because the output would be very long and difficult to interpret. To get more meaningful output, we transform the continuous variable into a categorical variable by dividing its range into mutually exclusive and collectively exhaustive intervals.
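The transformation described above can be sketched in Python. The ages, interval boundaries and labels below are hypothetical and chosen only for illustration; they are not from the session's data.

```python
# Hypothetical data: ages in years (not from the session's example).
ages = [19, 23, 31, 38, 45, 52, 67]

# Assumed intervals, each half-open [low, high): a value can fall in only
# one interval (mutually exclusive), and the open-ended last interval
# catches everything above 60 (collectively exhaustive for ages >= 0).
bins = [
    (0, 25, "under 25"),
    (25, 40, "25-39"),
    (40, 60, "40-59"),
    (60, float("inf"), "60 and over"),
]


def to_category(value):
    """Return the label of the interval that contains value."""
    for low, high, label in bins:
        if low <= value < high:
            return label
    raise ValueError(f"{value} is outside every interval")


age_groups = [to_category(a) for a in ages]

# Frequency of each category, listed in interval order.
frequency = {label: age_groups.count(label) for _, _, label in bins}
print(frequency)
```

The resulting categorical variable has four values instead of one row or column per distinct age, which is what keeps the cross-tabulation readable.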

The redefined categorical variables are then cross-tabulated. The row and column variables can be interchanged to get more substantive observations.

The observations from a cross tabulation can be validated with the chi-square statistic, which tests whether the variables are statistically independent or associated.

The chi-square test of statistical significance, first developed by Karl Pearson, assumes that both variables are measured at the nominal level. To be sure, chi-square may also be used with tables containing variables measured at a higher level; however, the statistic is calculated as if the variables were measured only at the nominal level. This means that any information regarding the order of, or distances between, categories is ignored.
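For reference, the standard form of the Pearson chi-square statistic for a table with r rows and c columns can be written as below. The symbols are notation introduced here, not taken from the session: O_ij is the observed count in row i and column j, E_ij the count expected under independence, R_i and C_j the row and column totals, and N the total number of cases.

```latex
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
df = (r - 1)(c - 1)
```

Only the counts enter the formula, which is why any ordering of the categories is ignored.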

The null hypothesis is that there is no relationship between the two variables, i.e. they are independent; the alternate hypothesis is that they are dependent. In business statistics we usually look only at the significance value (Sig., the p-value) in the output table. If Sig. > 0.05 we fail to reject the null hypothesis, i.e. there is no evidence of a relationship between the two variables; otherwise we reject the null hypothesis. The value of alpha is generally taken as 0.05; however, in critical applications such as medical ones it is taken as 0.01.

Example :- Generations of students at Washington School have taken field trips at both the elementary and secondary levels. The principal wonders if parents still support field trips for children at either level. Five hundred letters were mailed to parents, asking them to indicate either approval or disapproval; 100 parents returned the response postcard. Each postcard indicated whether the parents’ children were currently enrolled in elementary or high school, and the parents’ approval or disapproval of field trips.

The table below contains the collected data:

                 Approve   Disapprove   No Opinion   Row Totals
Elementary          28         14            5           47
High School         19         28            6           53
Column Totals       47         42           11          100

The analysis of the above data by cross tabulation and chi-square is shown below.

 

Case Processing Summary

                                        Cases
                      Valid             Missing           Total
                      N     Percent     N     Percent     N     Percent
Parents * Opinion     100   100.0%      0     0.0%        100   100.0%

 

Parents * Opinion Crosstabulation

                                                             Opinion
Parents                                         Approve   Disapprove   No Opinion    Total
Elementary school parents   Count                  28         14            5          47
                            % within Parents     59.6%      29.8%        10.6%      100.0%
High school parents         Count                  19         28            6          53
                            % within Parents     35.8%      52.8%        11.3%      100.0%
Total                       Count                  47         42           11         100
                            % within Parents     47.0%      42.0%        11.0%      100.0%
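Each "% within Parents" figure is a cell count divided by its row total. A minimal Python sketch using the counts from the example; the variable and function names are illustrative only:

```python
# Observed counts from the field trip example, one list per group of parents.
opinions = ["Approve", "Disapprove", "No Opinion"]
counts = {
    "Elementary": [28, 14, 5],
    "High School": [19, 28, 6],
}


def row_percentages(row):
    """Percentage of the row total in each cell, rounded to one decimal."""
    total = sum(row)
    return [round(100 * n / total, 1) for n in row]


percent_within_parents = {
    group: row_percentages(row) for group, row in counts.items()
}

for group, pcts in percent_within_parents.items():
    cells = ", ".join(f"{name} {p}%" for name, p in zip(opinions, pcts))
    print(f"{group}: {cells}")
```

Comparing along a row total rather than the grand total is what makes the two groups comparable even though they differ in size (47 against 53 parents).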

 

Chi-Square Tests

                                Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square              6.143(a)     2   .046
Likelihood Ratio                6.222        2   .045
Linear-by-Linear Association    3.262        1   .071
N of Valid Cases                100

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 5.17.

 

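From the Chi-Square test, Sig. = 0.046 < 0.05, so we reject the null hypothesis of independence. Hence we can conclude that parents' opinion of field trips is associated with the school level of their children: 59.6% of elementary school parents approve, against 35.8% of high school parents. The test does not by itself show that parents as a whole approve of field trips.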
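The Pearson chi-square value, its significance and the minimum expected count can be reproduced from the observed counts alone. The sketch below uses plain Python and is not the procedure that produced the output tables. It relies on the fact that, for 2 degrees of freedom only, the chi-square upper-tail probability equals exp(-x/2); a table with other dimensions would need a statistics library for the p-value.

```python
import math

# Observed counts: Approve, Disapprove, No Opinion.
observed = [
    [28, 14, 5],   # elementary school parents
    [19, 28, 6],   # high school parents
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Closed form valid for df = 2 only (assumption checked here).
assert df == 2
p_value = math.exp(-chi_square / 2)

min_expected = min(min(row) for row in expected)

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.3f}")
print(f"minimum expected count = {min_expected:.2f}")
```

This prints a chi-square of 6.143 with 2 degrees of freedom, p = 0.046 and a minimum expected count of 5.17, in agreement with the output above.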
