# Section B _Group 7_Amrit Jain_13PGP062

SPSS: A Statistical Tool

While looking for a topic for our blog, I could not decide what to write about. I happened to be solving a problem in SPSS at the time, and it struck me: why not write about the IBM SPSS software itself? It seemed like an interesting topic. SPSS originally stood for Statistical Package for the Social Sciences. It is a Windows-based program that can be used to enter and analyse data and to create tables, graphs and pictograms. SPSS can handle large amounts of data and performs statistical analysis on them well. New versions are released often; the one I am using is the SPSS 15.0 evaluation version. SPSS was first released in 1968 by Norman H. Nie, C. Hadlai Hull and Dale H. Bent, and it became an IBM product when IBM acquired SPSS Inc. in 2009. Versions are also available for Linux, UNIX and Mac operating systems. SPSS is among the most widely used programs for statistical analysis in the social sciences.

Before learning about SPSS, I was not sure whether spreadsheet applications like Microsoft Excel or OpenOffice Calc were a better choice than SPSS, since spreadsheets are also widely used for statistical analysis. After some secondary research, I was impressed by what this tool can do, and what I learned was a real value addition for me. SPSS looks a lot like a typical spreadsheet application: when we open it, we see the familiar tabular grid and enter values in cells. Spreadsheets, in turn, can do many of the things SPSS is good at, such as generating graphs and statistics on a data set. The differences can be summed up in the following points:

Flexibility: Spreadsheets are designed to be very flexible and broadly applicable to many different tasks, while SPSS is designed specifically for statistical processing of large amounts of data at an enterprise level. For example, unlike a spreadsheet, SPSS has the concepts of “case” and “variable” built in. The rows in SPSS always represent cases, such as survey responses (typically to a questionnaire) or experimental subjects, and the columns always represent variables observed for those cases, such as the specific answers given by a survey respondent or the measurements taken from an experimental subject. Owing to this case/variable arrangement, when a calculation is performed over a set of data, the result is not inserted into another cell of the table, as it would be in a typical spreadsheet, but appears in a separate output window. This is particularly advantageous with large data sets, since it keeps calculated statistics and graphs separate from the raw data but still easily accessible. A spreadsheet like MS Excel, for its part, has many more general-purpose functions than SPSS and gives more flexibility in how you use them.
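To make the case/variable idea concrete, here is a minimal Python sketch. It is not how SPSS works internally; the data, the variable names and the `descriptives` helper are all made up for illustration. Each row is one case, each key is one variable, and the summary is returned as a separate result instead of being written back into the data.

```python
import statistics

# Hypothetical survey data: each dict is one case (row),
# each key is one variable (column).
cases = [
    {"respondent_id": 1, "age": 23, "satisfaction": 4},
    {"respondent_id": 2, "age": 31, "satisfaction": 5},
    {"respondent_id": 3, "age": 27, "satisfaction": 3},
    {"respondent_id": 4, "age": 35, "satisfaction": 4},
]


def descriptives(cases, variable):
    """Summarise one variable across all cases without touching the data."""
    values = [case[variable] for case in cases]
    return {
        "variable": variable,
        "n": len(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation
        "minimum": min(values),
        "maximum": max(values),
    }


# The result lives in its own object, much like a separate output window.
output = descriptives(cases, "age")
print(output)
```

The raw `cases` list is unchanged after the call, which mirrors the separation between data and results described above.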

Ease of use: It is also much more convenient to perform statistical tests in SPSS, even though many of them are possible in a typical spreadsheet. For example, to perform a one-sample t-test in Excel, we have to calculate the t value for the sample ourselves and then use the “T.DIST” function to get the significance, while also choosing a cell for the result and labelling it in another cell. To perform the same test in SPSS, we select a variable and supply the value to compare our sample against, and when we click “OK”, SPSS generates a table with t, the degrees of freedom, the significance and a confidence interval, all neatly calculated. SPSS also makes statistical results easier to understand. It comes with a lot of help files and tutorials that explain how we can or should interpret the statistical jargon that the software produces. Spreadsheets don’t provide this kind of guidance.
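For readers who want to see what the one-sample t-test involves, the following is a rough Python sketch of the calculation using only the standard library. It is not SPSS's or Excel's implementation: the function names and the sample values are invented for illustration, the significance is obtained by numerically integrating the t density with Simpson's rule, and the confidence interval that SPSS also reports is left out.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=20000):
    """Two-tailed significance: area outside [-|t|, |t|] (Simpson's rule)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    return max(0.0, 1.0 - 2.0 * area_0_to_a)


def one_sample_t_test(sample, test_value):
    """Return t, degrees of freedom and two-tailed significance."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    t = (mean - test_value) / std_err
    df = n - 1
    return {
        "t": t,
        "df": df,
        "sig_2_tailed": two_tailed_p(t, df),
        "mean_difference": mean - test_value,
    }


# Hypothetical sample, tested against a value of 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7]
result = one_sample_t_test(sample, 5.0)
print(result)
```

Every step here has to be spelled out by hand, which is roughly the effort the spreadsheet route asks for; SPSS produces the same kind of table from a single dialog.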

Modernity: Probably the most significant advantage of SPSS is that it was designed with modern data collection methods in mind. A lot of the data that is collected, especially survey data, is numerically coded before it is stored electronically. For example, a response of “strongly agree” might become a 6, and a level of education such as “completed high school” or “some college” might become a 10 or 11. SPSS lets us define a variable so that the coded values stay connected to their original meanings. For this reason, various surveys and polls (including many that U of I students and faculty can access through Roper iPoll, ICPSR and other sets provided through the U of I library) make their raw data available in SPSS’s native “.sav” format.
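The idea of keeping coded values tied to their meanings can be sketched in a few lines of Python. This is only an illustration of the concept, not the SPSS feature itself: the `VALUE_LABELS` table, the variable names and the `frequency_table` helper are assumptions, and only the codes 6, 10 and 11 come from the examples above.

```python
from collections import Counter

# Hypothetical value labels: variable name -> {numeric code: meaning}.
VALUE_LABELS = {
    "agreement": {6: "strongly agree"},
    "education": {10: "completed high school", 11: "some college"},
}


def label_for(variable, code):
    """Look up the meaning of a code; fall back to the raw code if unlabelled."""
    return VALUE_LABELS.get(variable, {}).get(code, str(code))


def frequency_table(variable, codes):
    """Count responses per label, so the output shows meanings instead of numbers."""
    return Counter(label_for(variable, code) for code in codes)


# Coded responses as they might be stored in the data file.
education_codes = [10, 11, 11, 10, 11]
table = frequency_table("education", education_codes)
for label, count in table.items():
    print(f"{label}: {count}")
```

The data stay numeric and compact, while every table or chart built from them can show the original wording.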

The differences mentioned above are the major ones. SPSS has its disappointments too. For example, SPSS does not automatically update the values of cells when changes are made elsewhere in our data, even if they were set up with a compute command. Also, if we delete a variable, we cannot restore it. IBM SPSS is expensive, sometimes ridiculously so, and even when we do buy it we are really only leasing it, and its license is definitely not user friendly. There are often compatibility issues with prior versions.

Despite these minor issues, I really like working in SPSS compared with a spreadsheet application. To sum up in one line, ease of use and in-depth data analysis are the features that really impressed me.

Other Members: Amrit Jain, Ankit Saxena, Gugan N, Jyoti Kanwatia, Nitin Sonkar, Sonam Supriya, Sumit Ranjan, Yogesh Sham Gupta