**SAMPLING**

A sample survey is characterized by:

- clearly specified population
- sample selected by a random process from that population
- goal of estimating some population parameters

In the sample survey, randomization is used to reduce bias and to allow the results of the sample to be generalized to the population from which the sample was drawn.

**Some Terminology**

**Element:** An element is an object on which a measurement is made. This could be a voter in a precinct, a product as it comes off the assembly line, or a plant in a field that has either bloomed or not.

**Population:** A population is a collection of elements about which we wish to make an inference. The population must be clearly defined before the sample is taken.

**Sampling Units:** Sampling units are non-overlapping collections of elements from the population that cover the entire population. The sampling units partition the population of interest. The sampling units could be households or individual voters.

**Frame:** A frame is a list of sampling units.

**Sample:** A sample is a collection of sampling units drawn from a frame or frames. Data are obtained from the sample and are used to describe characteristics of the population.

**Example 1** Suppose we are interested in what students in a particular high school think about drilling for oil in our national wildlife preserves. The elements are the high school students, and the population is the collection of students who attend this high school. The sampling units could be the individual students, with the frame an alphabetical listing of all students enrolled in the school. Alternatively, the sampling units could be homerooms, since each student has one and only one homeroom, with the frame the list of homerooms.

**Example 2** Suppose we are interested in what voters in a particular precinct think about drilling for oil in our national wildlife preserves. The elements are the registered voters in the precinct. The population is the collection of registered voters. The sampling units will likely be households, each of which may contain several registered voters. The frame is a list of households in the precinct.

When the population is the residents of a city, the frame will commonly be the city phone book. However, not everyone in the city has a number listed in the phone book. In this situation, the frame does not match the population. A survey conducted from the phone book frame would likely suffer from undercoverage bias.

**Probability Samples**

Sample designs that utilize planned randomness are called probability samples.

**Simple random sample:** The most fundamental probability sample is the simple random sample. In a simple random sample, a sample of n sampling units is selected in such a way that each sample of size n has the same chance of being selected.
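As a rough illustration of the definition above, the following Python sketch draws a simple random sample from a frame. The frame of 500 student ID numbers, the sample size of 25, and the function name `simple_random_sample` are assumptions made for the example, not part of the original notes.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n sampling units without replacement.

    random.Random.sample gives every subset of size n the same
    chance of selection, which is the defining property of a
    simple random sample.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: ID numbers 1..500 for the students in a school.
srs_frame = list(range(1, 501))
srs = simple_random_sample(srs_frame, n=25, seed=1)
print(sorted(srs))
```

Passing a seed only makes the draw repeatable for checking; in an actual survey the selection would simply be left random.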

**Stratified Random Sample:** A stratified random sample is one obtained by separating the population elements into non-overlapping groups, called strata, and then selecting a simple random sample from each stratum.
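A minimal sketch of stratified sampling, assuming the strata are school grades and that 10% of each stratum is sampled (proportional allocation, which is one common choice and is not prescribed by the notes). The stratum names, sizes, and the function name `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of sizes[name] units from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

# Hypothetical strata: each student belongs to exactly one grade,
# so the groups are non-overlapping.
strata = {
    "grade 9": [f"g9-{i}" for i in range(120)],
    "grade 10": [f"g10-{i}" for i in range(100)],
    "grade 11": [f"g11-{i}" for i in range(100)],
    "grade 12": [f"g12-{i}" for i in range(80)],
}

# Assumed allocation rule: sample 10% of each stratum.
sizes = {name: len(units) // 10 for name, units in strata.items()}

stratified = stratified_sample(strata, sizes, seed=1)
for name, units in stratified.items():
    print(name, len(units))
```

Unlike a single simple random sample of the whole school, this design guarantees that every grade is represented in the sample.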

**Systematic Sample:** A systematic sample is obtained by selecting one element at random from the first k elements in the frame and then every kth element thereafter.
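The systematic selection rule can be sketched in a few lines of Python. The frame of 100 numbered units and the interval k = 10 are assumed values for illustration, and `systematic_sample` is a hypothetical helper name.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random position 0..k-1 within the first k units
    return frame[start::k]

# Hypothetical frame of 100 units in list order, sampled with interval k = 10.
sys_frame = list(range(1, 101))
sys_sample = systematic_sample(sys_frame, k=10, seed=1)
print(sys_sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is determined by the order of the frame.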

**Cluster Sample:** A cluster sample is a probability sample in which each sampling unit is a collection, or cluster, of elements.
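One simple way to realize a cluster sample is sketched below, under the assumption that clusters are chosen by simple random sampling and that every element in a chosen cluster is surveyed (a one-stage design; the notes do not specify how the clusters are selected). The homeroom names and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Select m clusters at random and keep every element in each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)  # the sampling units are the clusters
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: 20 homerooms of 25 students each.
homerooms = {
    f"room-{r:02d}": [f"student-{r:02d}-{s:02d}" for s in range(25)]
    for r in range(20)
}

selected = cluster_sample(homerooms, m=4, seed=1)
for room, students in selected.items():
    print(room, len(students))
```

Here the frame is the list of homerooms, not the list of students, which is what distinguishes the design from a simple random sample of individuals.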

**Sources of Errors in Surveys**

**Sampling error** is a part of any sampling process. If the sampling process were repeated a number of times, the results would differ each time, producing a variation in the estimates of the population parameters.
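The variation between repeated samples can be seen in a small simulation. This is only a sketch: the population of 10,000 elements with 60% in favour, the sample size of 100, and the 1,000 repetitions are all made-up values chosen for illustration.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 1 = in favour, 0 = opposed; true proportion is 0.60.
opinions = [1] * 6000 + [0] * 4000

def estimate_proportion(population, n, rng):
    """Estimate the population proportion from one simple random sample of size n."""
    return sum(rng.sample(population, n)) / n

# Repeat the same sampling process many times and record each estimate.
estimates = [estimate_proportion(opinions, 100, rng) for _ in range(1000)]

print("smallest estimate:", min(estimates))
print("largest estimate: ", max(estimates))
print("mean of estimates:", round(statistics.mean(estimates), 3))
print("std. deviation:   ", round(statistics.stdev(estimates), 3))
```

Every run of the sampling process uses the same population and the same design, yet the estimates differ from one another; that spread is the sampling error.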

**Coverage error** results when the frame does not match the population. For example, if the frame is the town phone book, then people with unlisted numbers and those without phones will be missing from the frame.
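A small worked example shows why coverage error cannot be fixed by sampling more carefully from the frame. The town below is entirely hypothetical: 700 residents are listed in the phone book and are evenly split on a proposal, while 300 unlisted residents favour it 240 to 60.

```python
# Each resident is a pair: (listed_in_phone_book, favours_proposal).
residents = (
    [(True, True)] * 350 + [(True, False)] * 350      # listed: 350 of 700 in favour
    + [(False, True)] * 240 + [(False, False)] * 60   # unlisted: 240 of 300 in favour
)

def proportion_in_favour(people):
    """Share of the given people who favour the proposal."""
    return sum(favours for _, favours in people) / len(people)

# The frame contains only the residents listed in the phone book.
phone_book = [person for person in residents if person[0]]

true_p = proportion_in_favour(residents)    # 590 / 1000
frame_p = proportion_in_favour(phone_book)  # 350 / 700

print(f"population: {true_p:.2f}, frame: {frame_p:.2f}")
```

Even a complete census of the phone book would report 0.50 when the population value is 0.59, because the unlisted residents never have a chance of being selected.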

**Non-response error** results when units selected from the frame have died, moved away, refused to participate, or are otherwise missing from the sample.

**Observation error** includes interviewer error, respondent error, measurement error, and errors in data collection.

**Interviewer error** is a result of the interaction between the interviewer and the subject being interviewed. Most people who agree to an interview do not want to appear disagreeable and will tend to side with the view apparently favoured by the interviewer, especially on questions for which the respondent does not have a strong opinion. Reading a question with inappropriate emphasis or intonation can push a response in one direction or another. Interviewers of the same gender, racial, and ethnic groups as those being interviewed are, in general, slightly more successful.

**Respondent error** is a result of the differing abilities of the respondents in a sample to answer correctly the questions asked. Most respondent errors are unintentional and are due to either recall bias (the respondent does not remember correctly) or prestige bias (the respondent exaggerates). At times, respondent error may be due to intentional deception (the respondent will not admit breaking a law or has a particular gripe against an agency).

**Measurement error** occurs when inaccurate responses are caused by errors of definition in survey questions. For example, what does the term unemployed mean? Should the unemployed include those who have given up looking for work, teenagers who cannot find summer jobs, and those who lost part-time jobs? Does education include only formal schooling, or technical training, on-the-job classes, and summer institutes as well? Items to be measured must be precisely defined and unambiguously measurable.

**Errors in data collection** occur in all surveys.

**Problems with telephone surveys**

A major problem with telephone surveys is the establishment of a frame that closely corresponds to the population. Telephone directories have many numbers that do not belong to households, and many households have unlisted numbers. A technique that avoids the problem of unlisted numbers is random digit dialling. In this method, a telephone exchange number (the first three digits of the seven-digit number) is selected, and then the last four digits are dialled randomly until a fixed number of households of a specified type are reached.

A mailed questionnaire sent to a specific group of interested persons can achieve good results, but response rates for this type of data collection are generally so low that all reported results are suspect. Nonresponse can be a problem in any form of data collection, but since we have the least contact with respondents in a mailed questionnaire, we frequently have the lowest rate of response. The low response rate can introduce a bias into the sample because the people who answer questionnaires may not be representative of the population of interest. To reduce some of this bias, investigators frequently contact the non-respondents through follow-up letters, telephone interviews, or personal interviews.
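The random digit dialling procedure described above can be sketched as follows. Placing a call is replaced by a hypothetical predicate `is_eligible_household`, and the exchange `555`, the quota of 20, and the set of 3,000 household numbers are invented for the example. The sketch also assumes no number is dialled twice, which the notes do not state.

```python
import random

def random_digit_dial(exchange, quota, is_eligible_household, seed=None):
    """Dial random four-digit endings within one exchange until the quota is met.

    Returns the numbers of eligible households reached and the number of calls made.
    """
    rng = random.Random(seed)
    reached = []
    calls = 0
    # Visit the 10,000 possible endings in random order, without repeats.
    for suffix in rng.sample(range(10_000), 10_000):
        if len(reached) == quota:
            break
        calls += 1
        number = f"{exchange}-{suffix:04d}"
        if is_eligible_household(number):  # stand-in for actually placing the call
            reached.append(number)
    return reached, calls

# Hypothetical ground truth: 3,000 of the 10,000 numbers belong to eligible households.
household_numbers = {
    f"555-{s:04d}" for s in random.Random(0).sample(range(10_000), 3_000)
}

reached, calls = random_digit_dial(
    "555", quota=20, is_eligible_household=household_numbers.__contains__, seed=1
)
print(len(reached), "households reached in", calls, "calls")
```

Because numbers are generated rather than read from a directory, listed and unlisted households have the same chance of being dialled.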

**Steps in Planning a Survey**

1. Statement of objectives

2. Target population

3. The frame

4. Sample design

5. Method of measurement

6. Measurement instrument

7. Selection and training of field-workers

8. The pre-test

9. Organization of fieldwork

10. Organization of data management

11. Data analysis

12. Final report

13. Recapitulation

Section B, Group 6: Rohan Kr. Jha (13FPM004)

Other members:

- Apurva Ramteke (13PGP068)
- Chandan Parsad (13FPM002)
- Komal Suchak (13PGP086)
- Silpa Bahera (13PGP107)
- Sushil Kumar (13FPM010)
- Vivek Roy (12FPM005)
- Vaneet Bhatia (13FPM008)