Statistical procedures can be divided into two major categories:
- Descriptive statistics
- Inferential statistics
Descriptive statistics includes statistical procedures that we use to describe the population we are studying. The data could be collected from either a sample or a population, but the results help us organize and describe data. On the other hand, inferential statistics is concerned with making predictions or inferences about a population from observations and analyses of a sample.
The method by which we select samples to learn more about the characteristics of a given population is called hypothesis testing. Hypothesis testing is a systematic way to test claims or ideas about a group or population.
The method for hypothesis testing can be described in four simple steps:
Step 1: Null and alternative hypotheses
The null hypothesis is a claim of “no difference”. In a mathematical formulation of the null hypothesis there will typically be an equal sign. The null hypothesis is what we are attempting to overturn by our hypothesis test. If we are studying a new treatment, the null hypothesis is that our treatment will not change our subjects in any meaningful way. This hypothesis is denoted by H0.
The opposing hypothesis is the alternative hypothesis. The alternative hypothesis is a claim of “a difference in the population,” and is the hypothesis one often hopes to bolster. This hypothesis is denoted by Ha (or H1). In a mathematical formulation of the alternative hypothesis there will typically be an inequality or a not-equal-to symbol.
Step 2: Test statistic
We calculate a test statistic from the data. There are different types of test statistics. One of them is the z statistic. The z statistic will compare the observed sample mean to an expected population mean μ0. Large test statistics indicate data are far from expected, providing evidence against the null hypothesis and in favour of the alternative hypothesis.
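As a sketch of this calculation (the sample values, μ0 = 100, and σ = 3 are made-up numbers for illustration), the z statistic for a known population standard deviation σ can be computed as:

```python
import math

def z_statistic(sample, mu0, sigma):
    """z = (sample mean - mu0) / (sigma / sqrt(n)).

    sigma is the known population standard deviation."""
    n = len(sample)
    xbar = sum(sample) / n
    return (xbar - mu0) / (sigma / math.sqrt(n))

# Hypothetical measurements; H0 says the population mean is mu0 = 100.
sample = [102, 104, 99, 101, 103, 100, 105, 102]
z = z_statistic(sample, mu0=100, sigma=3.0)  # about 1.89
```

The larger |z| is, the further the observed sample mean lies from the mean expected under the null hypothesis.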
Step 3: P-value and conclusion
The test statistic is converted to a conditional probability called a P-value. The P-value answers the question: “If the null hypothesis were true, what is the probability of observing the current data, or data that are more extreme?”
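For a z statistic, this conversion uses the standard normal distribution. A minimal sketch of a two-sided conversion, using only Python's standard library:

```python
import math

def two_sided_p_value(z):
    """P(|Z| >= |z|) when Z is standard normal, i.e. when H0 is true."""
    # Standard normal CDF evaluated via the error function.
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)

p = two_sided_p_value(1.96)  # about 0.05
```

Small P-values mean the observed data would be unlikely if the null hypothesis were true.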
Step 4: Decision
Alpha (α) is a probability threshold for the decision, called the significance level. If P ≤ α, we reject the null hypothesis; otherwise it is retained for want of evidence. The complement 1 − α is called the level of confidence. For example, if α = 0.05, conclusions drawn from the hypothesis test can be claimed with a confidence of 95%. The significance level is chosen based on the criticality of the problem: the more critical the problem, the smaller α (and hence the higher the level of confidence) should be set.
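The decision rule itself is a one-line comparison. A sketch (α = 0.05 is just the conventional default, not a requirement):

```python
def decide(p_value, alpha=0.05):
    """Reject H0 when the P-value does not exceed the significance level."""
    return "reject H0" if p_value <= alpha else "retain H0"

decision = decide(0.031)  # rejects H0 at the 5% significance level
```

Note that "retain H0" is a statement about insufficient evidence, not proof that the null hypothesis is true.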
TYPES OF ERRORS IN HYPOTHESIS TESTING
There are two kinds of errors possible in hypothesis testing:
Type I Error:
This kind of error is also known as a “false positive”: the error of rejecting a null hypothesis when it is actually true. In other words, this is the error of accepting the alternative hypothesis (the real hypothesis of interest) when the results can be attributed to chance. It occurs when one observes a difference when in truth there is none.
Type II Error:
This kind of error is also known as a “false negative”: the error of not rejecting a null hypothesis when the alternative hypothesis is the true state of nature. In other words, this is the error of failing to accept the alternative hypothesis, often because the test lacks adequate power. It occurs when one fails to observe a difference when in truth there is one.
Type I and type II errors are part of the process of hypothesis testing. Although the errors cannot be completely eliminated, we can minimize one type of error. With everything else (such as the sample size) held fixed, decreasing the probability of a Type I error nearly always increases the probability of a Type II error, and vice versa. There is always a trade-off between the two types of errors.
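The meaning of α as the Type I error rate can be checked by simulation: if the null hypothesis is true and we reject whenever P ≤ 0.05, we should reject in roughly 5% of repeated experiments. A sketch (the sample size, seed, and number of trials below are arbitrary choices):

```python
import math
import random

random.seed(1)

def z_for_sample(sample, mu0=0.0, sigma=1.0):
    """z statistic for a sample against hypothesized mean mu0."""
    n = len(sample)
    xbar = sum(sample) / n
    return (xbar - mu0) / (sigma / math.sqrt(n))

trials, n, rejections = 2000, 30, 0
for _ in range(trials):
    # Draw data from the null: standard normal with true mean mu0 = 0.
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    if abs(z_for_sample(sample)) > 1.96:  # two-sided test at alpha = 0.05
        rejections += 1

type_i_rate = rejections / trials  # close to 0.05
```

Tightening the rejection threshold (say, |z| > 2.58 for α = 0.01) lowers this false-positive rate, but makes the test less likely to detect a real effect, which is exactly the Type I / Type II trade-off described above.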
In some cases a Type I error is preferable to a Type II error. In other applications a Type I error is more dangerous to make than a Type II error.
Suppose one is designing a medical screening test for a disease. Is a Type I or a Type II error better? A false positive may give the patient some anxiety, but it will lead to further testing, and ultimately the patient will discover that the initial result was incorrect. By contrast, a false negative gives the patient the incorrect assurance of not having the disease when in fact they do. As a result of this incorrect information, the disease will go untreated. Given the choice between these two options, a false positive is more desirable than a false negative.
Now suppose that one has been put on trial for murder. The null hypothesis here is that one is not guilty. Which of the two errors is more serious? Again, it depends. A Type I error occurs when one is found guilty of a murder that one did not commit. This is a very dire outcome. A Type II error occurs when one is guilty but is found not guilty. This is a good outcome for the accused, but not for society as a whole. Here we see the value in a judicial system that seeks to minimize Type I errors.