Exploring the Null Hypothesis: Definition and Purpose

Key Points

Stating the null hypothesis is the first step in any hypothesis test.
Hypothesis tests rely on rejecting the null hypothesis to support the alternative hypothesis.
Some hypothesis tests can also help you judge whether your data is consistent with a normal distribution.

Hypothesis testing is a branch of statistics in which, using data from a sample, an inference is made about a population parameter or a population probability distribution.

First, a hypothesis statement and assumption are made about the population parameter or probability distribution. This initial statement is called the Null Hypothesis and is denoted by Ho. An alternative (or alternate) hypothesis, denoted Ha, is then stated; it is the opposite of the Null Hypothesis.

The hypothesis testing process uses sample data to determine whether you can, at a stated level of statistical confidence, reject Ho or must fail to reject it. If Ho is rejected, the statistical conclusion is that the data support the alternative hypothesis Ha. Rejecting Ho does not prove that Ha is true; it only shows that the data would be unlikely if Ho were true.

What Is the Null Hypothesis?

Hypothesis testing applies to many forms of statistical inquiry. For example, it can be used to determine whether population parameters differ, whether the slope of a regression line is different from zero, or whether two probability distributions are equal.

In all cases, the first thing you do is state the Null and Alternate Hypotheses. The word Null in the context of hypothesis testing means “nothing” or “zero.”

As an example, if we wanted to test whether there was a difference in two population means based on the calculations from two samples, we would state the Null Hypothesis in the form of:

Ho: mu1 = mu2, or equivalently, mu1 - mu2 = 0

In other words, there is no difference, or the difference is zero. Note that the notation is in the form of a population parameter, not a sample statistic.
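To make the notation concrete, here is a minimal Python sketch of testing Ho: mu1 - mu2 = 0 from two samples. It assumes the samples are large enough for a normal (z) approximation; with small samples, a t-test would be the more usual choice. The function name two_sample_z_test and the sample figures are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def two_sample_z_test(sample1, sample2):
    """Test Ho: mu1 - mu2 = 0 against Ha: mu1 - mu2 != 0.

    Assumption of this sketch: a large-sample normal approximation,
    with each population variance estimated from its sample.
    """
    n1, n2 = len(sample1), len(sample2)
    # Sample statistics are used to draw a conclusion about population parameters.
    diff = mean(sample1) - mean(sample2)
    std_error = sqrt(stdev(sample1) ** 2 / n1 + stdev(sample2) ** 2 / n2)
    z = diff / std_error
    # Two-sided p-value: chance of a |z| at least this large if Ho were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up illustrative data, not real measurements.
group_a = [20.1, 19.8, 21.2, 20.5, 19.9, 20.7, 20.3, 21.0]
group_b = [19.7, 20.2, 20.9, 20.4, 19.6, 20.8, 20.1, 20.6]
z, p_value = two_sample_z_test(group_a, group_b)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Note that the hypothesis is stated in terms of mu1 and mu2, while the code can only ever work with the sample means that estimate them.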

Since you are using sample data to make your inferences about the population, you may make an error. In the case of the Null Hypothesis, we can make one of two errors.

Type 1, or alpha error: An alpha error is when you mistakenly reject the Null and believe that something significant happened. In other words, you believe that the means of the two populations are different when they aren’t.
Type 2, or beta error: A beta error is when you fail to reject the null when you should have. In this case, you missed something significant and failed to take action.
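One way to see what an alpha error means is to simulate many experiments in which the Null is actually true and count how often it gets rejected anyway. The sketch below does that for a simple one-sample z-test. The function name, the sample size, the number of trials, and the choice of a standard normal population are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def type_1_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Estimate how often a true null (mu = 0, sigma = 1 known) is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where Ho really is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) * sqrt(n)  # (xbar - 0) / (sigma / sqrt(n)) with sigma = 1
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1  # a false positive: Ho rejected although it is true
    return rejections / trials


print(f"Share of true nulls rejected: {type_1_error_rate():.3f}")
```

The share of false rejections should land close to the chosen alpha, which is exactly the risk that alpha is meant to describe.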

How It Works

A classic example is when you get the results back from your doctor after taking a blood test. If the doctor says you have an infection when you don’t, that is an alpha error. That is thinking that there is something significant going on when there isn’t. We also call that a false positive. The doctor rejected the null that “there was zero infection” and missed the call.

On the other hand, if the doctor told you that everything was OK when you did have an infection, then he made a beta, or type 2, error. He failed to reject the Null Hypothesis when he should have. That is called a false negative.

The decision to reject or not to reject the Null Hypothesis is based on three numbers.

Alpha, which you get to choose. Alpha is the risk you are willing to assume of falsely rejecting the Null. The typical values for alpha are 1%, 5%, or 10%. Depending on the importance of the conclusion, you only want to falsely claim a difference when there is none, 1%, 5%, or 10% of the time.
Beta, which is typically 20%. This means you’re willing to be wrong 20% of the time in failing to reject the null when you should have.
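P-value, which is calculated from the data. The p-value is the probability of seeing a result at least as extreme as yours if the null were true. It is often described, loosely, as the actual risk of being wrong if you reject the null. You would like that to be low.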

Your decision as to what to do about the null is made by comparing the alpha value (your assumed risk) with the p-value (actual risk). If the p-value is lower than alpha, you can feel comfortable rejecting the null and claiming something has happened. But if the p-value is higher than alpha, you would be taking a bigger risk than you want by rejecting the null, so you fail to reject it.
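The comparison described above can be sketched in a few lines of Python. The function name decide and the default alpha of 5% are illustrative choices. The sketch rejects when the p-value is strictly lower than alpha, matching the wording here; some texts reject when the two are equal as well.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a given p-value and alpha risk."""
    if not (0.0 <= p_value <= 1.0 and 0.0 < alpha < 1.0):
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    # Reject only when the p-value is below the risk you agreed to accept.
    return "reject Ho" if p_value < alpha else "fail to reject Ho"


print(decide(0.03))              # p-value below the default 5% alpha
print(decide(0.03, alpha=0.01))  # same data, stricter alpha
```

The same p-value can lead to different decisions depending on the alpha you chose, which is why alpha should be set before looking at the data.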

Writing a Good Null Hypothesis

We’ve covered what null hypotheses are and how they work, but what does it take to write a good one? When writing one, you need to consider the population, not the sample. Further, the statement itself needs to indicate that there is no difference, or a difference of zero.

Benefits of the Null Hypothesis

The testing of the null hypothesis is the foundation of hypothesis testing. By doing so, you set the parameters for your statistical inference.

Statistical Assurance of Determining Differences Between Populations

Just looking at the mathematical difference between the means of two samples and making a decision is woefully inadequate. By statistically testing the null hypothesis, you will have more confidence in any inferences you want to make about populations based on your samples.

It Estimates the Probability of a Population Distribution

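Many statistical tests require assumptions of specific distributions. Many of these tests assume that the population follows the normal distribution. If it doesn’t, the test may be invalid. A hypothesis test can check that assumption, with the null stating that the data follow the assumed distribution.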

It Strengthens Your Hypothesis Tests

Hypothesis testing calculations will provide some relative strength to your decisions as to whether you reject or fail to reject the null hypothesis.

Why Is the Null Hypothesis Important to Understand?

The interpretation of the statistics relative to the null hypothesis is what’s important.

Properly Write the Hypothesis to Capture What You Want to Prove

The null is always written in the same format: the lack of a difference or the absence of some other condition. The alternative hypothesis can be written in three formats depending on what you want to show: not equal to, less than, or greater than.
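The choice among the three alternative formats (not equal, greater than, less than) changes how the p-value is computed from the same test statistic. A minimal sketch, assuming a z statistic has already been calculated; the function name and the labels "two-sided", "greater", and "less" are illustrative:

```python
from statistics import NormalDist


def p_value_from_z(z, alternative="two-sided"):
    """Convert a z statistic into a p-value for the chosen form of Ha."""
    std_normal = NormalDist()
    if alternative == "two-sided":  # Ha: mu1 != mu2
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "greater":    # Ha: mu1 > mu2
        return 1 - std_normal.cdf(z)
    if alternative == "less":       # Ha: mu1 < mu2
        return std_normal.cdf(z)
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


for form in ("two-sided", "greater", "less"):
    print(form, round(p_value_from_z(1.96, form), 3))
```

Because the form of Ha affects the p-value, it should be chosen from what you want to show before the data are analyzed, not afterward.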

Select an Appropriate Alpha Risk

An alpha value that is too low places too big a hurdle on acting on your results, while one that is too high makes it too easy to claim a difference that isn’t there.

You’ll Have Decision Errors When Choosing How to Respond

Since your decision relative to rejecting or not rejecting the null is based on statistical calculations, it is important to understand how that decision works.

An Industry Example

The new director of marketing just completed the rollout of a new marketing campaign targeting the Hispanic market. Early indications showed that the campaign was successful in increasing sales in the Hispanic market.

He came to that conclusion by comparing a sample of sales prior to the campaign with a sample of current sales after implementation of the campaign. He was eager to tell his boss how successful the campaign was. But he decided to first check with his Lean Six Sigma Black Belt to see whether she agreed with his conclusion.

The Black Belt first asked the director about his tolerance for the risk of being wrong by telling the boss the campaign was successful when, in fact, it wasn’t. That was the alpha value. The director picked 5% since he was new and didn’t want to make a false claim so early in his career. He also picked 20% as his beta value.

When the Black Belt was done analyzing the data, she found that the p-value was 15%. That meant that if the campaign had made no difference, a sales increase at least as large as the one observed would still show up about 15% of the time by chance alone. Since the director was only willing to accept a 5% risk of making a false claim, and that 5% assumed risk was less than the 15% p-value, the decision was to not reject the null. The data did not show that the campaign worked, and it probably needed some revising.

Null Hypothesis Best Practices

Using hypothesis testing to help make better data-driven decisions requires that you properly address the Null Hypothesis.

Use the Right Nomenclature

The null will always be written as a statement about the population parameter, not the sample statistic.

It Is Always Written as an Absence of Parameter or Process Characteristics

The writing of the Alternate Hypothesis can vary, so be sure you understand exactly what condition you are testing against.

Pick a Reasonable Alpha Risk

Being too cautious, by choosing a very small alpha, will lead you to make more beta errors, and you’ll rarely learn anything about your population data.
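The trade-off between alpha and beta can be illustrated with a short calculation. The sketch below assumes a one-sided z-test of Ho: mu = mu0 against Ha: mu > mu0 with a known standard deviation. The function name beta_risk and the effect size, sigma, and sample size used in the loop are illustrative values, not figures from a real study.

```python
from math import sqrt
from statistics import NormalDist


def beta_risk(alpha, effect, sigma, n):
    """Beta for a one-sided z-test when the true mean is `effect` above mu0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection cutoff set by alpha
    # Probability that the test statistic falls short of the cutoff
    # even though the null is false.
    return std_normal.cdf(z_crit - effect * sqrt(n) / sigma)


# Illustrative values: a shift of half a standard deviation, 25 observations.
for alpha in (0.10, 0.05, 0.01):
    beta = beta_risk(alpha, effect=0.5, sigma=1.0, n=25)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

With the effect size and sample size held fixed, each reduction in alpha raises beta, which is the cost of being too cautious.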

Other Useful Tools and Concepts

Looking for more? You might do well to learn all about platykurtic distributions. These distributions differ from normal distributions and can be useful for cluing you into whether your data falls within the domain of normality.

Further, you should learn all about special cause variations. These are factors you cannot control which create outliers in your datasets. Learning how they work and how to account for them is instrumental for anyone in a leadership position.

Conclusion

The proper writing of the Null Hypothesis is the basis for applying hypothesis testing to help you make better data-driven decisions. The format of the Null will always be in the form of zero, or the non-existence of some condition. It will always refer to a population parameter and not the sample you use to do your hypothesis-testing calculations.

Be aware of the two types of errors you can make when deciding on what to do with the Null. Select reasonable risk values for your alpha and beta risks. By comparing your alpha risk with the p-value computed from the data, you will have sufficient information to make a wise decision as to whether you should reject the Null Hypothesis or not.
