# Guide: Variance

Understanding variance is a cornerstone in both the field of statistics and the practice of continuous improvement. Whether you’re a data analyst, a Lean Six Sigma practitioner, or a manager aiming to optimize processes, knowing how to calculate and interpret variance can provide invaluable insights into the performance and reliability of a system.

Variance measures the dispersion or spread of a set of values, essentially telling you how much individual data points differ from the mean. A low variance indicates that the data points are close to the mean, signifying a more stable and predictable process. On the other hand, a high variance implies greater variability, which could signal potential issues that need addressing.

This comprehensive guide aims to demystify the concept of variance for you. We’ll cover its mathematical basis, its importance in various contexts including business and manufacturing, and its specific applications in continuous improvement methodologies like Lean Six Sigma. Whether you’re a beginner or have some experience, this guide has something valuable to offer.

## What is Variance?

### Mathematical Definition of Variance

Variance is a statistical measure that quantifies the dispersion or spread of a set of values. In simpler terms, it helps you understand how far individual numbers in a dataset are from the mean (average) of that dataset. The mathematical formula to calculate the variance ($\sigma^2$) for a population of $N$ values is:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$$

Where $x_i$ represents each value in the dataset, and $\mu$ is the mean of the dataset.

For a sample of $n$ values, the formula becomes:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

Here, $\bar{x}$ is the sample mean.

### Simple Example to Illustrate the Concept

Imagine you have test scores from two different classes: Class A has scores [90, 91, 92, 93, 94] and Class B has scores [70, 80, 90, 100, 110].

The mean score for Class A is 92, and for Class B, it is 90. At a glance, you might think the two classes performed similarly. However, the variance will tell you a different story about how consistent the scores are within each class.
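To put numbers on that different story, the two classes can be compared with Python's standard `statistics` module. This is a minimal sketch; treating each list of scores as a sample (dividing by n − 1) is an assumption, since the example does not say whether the five scores are the whole class.

```python
import statistics

class_a = [90, 91, 92, 93, 94]
class_b = [70, 80, 90, 100, 110]

mean_a = statistics.mean(class_a)
mean_b = statistics.mean(class_b)

# statistics.variance divides by n - 1 (sample variance).
var_a = statistics.variance(class_a)
var_b = statistics.variance(class_b)

print(f"Class A: mean={mean_a}, sample variance={var_a}")
print(f"Class B: mean={mean_b}, sample variance={var_b}")
```

The means differ by only two points, but the sample variance is 2.5 for Class A and 250 for Class B, so Class B's scores are far less consistent.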

## Types of Variance

### Sample Variance

Sample variance is calculated when you have a subset (sample) of a larger population. The formula for calculating sample variance ($s^2$) uses $n-1$ in the denominator to correct for the bias in estimating the population variance from a sample.

### Population Variance

Population variance is calculated when you have data for the entire population you’re studying. The formula for population variance ($\sigma^2$) uses $N$ in the denominator.
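The bias that the n − 1 denominator corrects can be checked directly on a population small enough to list every possible sample. The sketch below is illustrative only: the three-value population and the sample size of 2 are hypothetical choices, and the samples are drawn with replacement.

```python
import itertools
import math
import statistics

# Hypothetical tiny population, chosen so every possible sample can be listed.
population = [2, 4, 6]
true_variance = statistics.pvariance(population)  # divides by N

# Every ordered sample of size 2, drawn with replacement (9 samples in total).
samples = list(itertools.product(population, repeat=2))

# Average each estimator over all possible samples.
avg_dividing_by_n = statistics.fmean(
    [statistics.pvariance(s) for s in samples]
)
avg_dividing_by_n_minus_1 = statistics.fmean(
    [statistics.variance(s) for s in samples]
)

print("population variance:     ", true_variance)
print("average when dividing by n:    ", avg_dividing_by_n)
print("average when dividing by n - 1:", avg_dividing_by_n_minus_1)
```

Averaged over all samples, dividing by n − 1 recovers the population variance, while dividing by n comes out too small (here, half the true value).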

## Units of Variance

### Explanation of How the Units of Variance Are Determined

The unit of variance is the square of the unit of the original data. For example, if you are calculating the variance of heights measured in meters, the unit of variance will be ${\text{meters}}^{2}$ (square meters). This is important to remember because the units of variance are not directly comparable to the original units of the data set. To make them comparable, you can take the square root of the variance to get the standard deviation, which will be in the original unit of measurement.
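A short sketch of what squared units mean in practice. The heights are hypothetical values used only for illustration; the same three heights are written once in meters and once in centimeters.

```python
import math
import statistics

# Hypothetical heights, for illustration only.
heights_m = [1.60, 1.70, 1.80]   # meters
heights_cm = [160, 170, 180]     # the same heights in centimeters

var_m2 = statistics.pvariance(heights_m)     # units: square meters
var_cm2 = statistics.pvariance(heights_cm)   # units: square centimeters

# Taking the square root returns to the original unit of measurement.
sd_m = math.sqrt(var_m2)      # units: meters
sd_cm = math.sqrt(var_cm2)    # units: centimeters

print(f"variance: {var_m2:.6f} m^2 = {var_cm2:.2f} cm^2")
print(f"standard deviation: {sd_m:.4f} m = {sd_cm:.2f} cm")
```

Converting meters to centimeters multiplies the data by 100 but the variance by 100² = 10,000, which is why variances are awkward to compare directly with the data. The standard deviation scales by 100, just like the heights themselves.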

## Why is Variance Important?

Understanding the importance of variance is crucial for both statisticians and professionals engaged in continuous improvement methodologies like Lean Six Sigma. Below, we delve into why variance is significant in these two contexts.

### In Statistics

#### Role in Descriptive Statistics

In the realm of statistics, variance serves as a foundational pillar of descriptive statistics. Descriptive statistics aim to summarize and interpret a dataset, providing a snapshot of its main characteristics. Variance offers insights into the “spread” or dispersion of the data points around the mean. Without this measure, we would lack a comprehensive understanding of the dataset. For instance, two datasets could have the same mean but different variances; the latter would tell us how consistent or inconsistent the data points are around that mean.

#### Relationship with Standard Deviation

Standard deviation is one of the most commonly used statistical measures and is directly related to variance: it is simply the square root of the variance. Standard deviation is particularly useful because it shares the same unit as the data, making it easier to interpret and compare. While variance gives you a broad idea of dispersion, standard deviation provides a more tangible sense of the typical distance between a data point and the mean (strictly, the root-mean-square distance rather than a plain average).

### In Continuous Improvement

#### Variance Analysis in Six Sigma

In the context of Six Sigma, a methodology aimed at reducing defects and improving processes, variance is a key metric. Variance analysis can identify weaknesses in a process, thereby helping to pinpoint areas that require improvement. By understanding the variance, practitioners can apply Six Sigma tools like DMAIC (Define, Measure, Analyze, Improve, Control) to reduce variability and enhance process performance.

#### Importance in Process Control and Quality Management

Variance is pivotal in quality management and process control. A high variance often signals an unstable process, prone to defects and inefficiencies. On the other hand, a low variance usually indicates a well-controlled, reliable process. Therefore, monitoring and reducing variance can lead to more predictable outcomes, less waste, and higher quality products or services.

## Calculating Variance

Variance can be calculated in several ways, depending on the tools you have at hand and the context in which you’re working. Here, we cover two: manual calculation and Microsoft Excel.

### Step-by-step Guide

Calculating variance manually involves a series of steps. Here’s a step-by-step guide to help you compute variance for a dataset:

1. Identify the Dataset: Collect the set of values for which you want to calculate variance.
2. Calculate the Mean: Add all the numbers in the dataset and divide by the number of values to get the mean ($\bar{x}$ for a sample, $\mu$ for a population).
3. Find the Deviations: Subtract the mean from each number in the dataset to find the deviation of each number from the mean.
4. Square the Deviations: Square each of the deviations obtained in the previous step.
5. Sum the Squared Deviations: Add up all the squared deviations.
6. Calculate Variance: Finally, divide the sum of the squared deviations by $n-1$ for a sample, or $N$ for a population.
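The six steps above can be sketched as a small Python function. The function name `manual_variance` and its `sample` flag are hypothetical, introduced only for this illustration; the demonstration reuses the Class A scores from earlier in the guide.

```python
def manual_variance(values, sample=True):
    """Variance computed step by step; sample=True divides by n - 1."""
    n = len(values)                            # Step 1: the dataset
    if n < (2 if sample else 1):
        raise ValueError("not enough values to compute the variance")
    mean = sum(values) / n                     # Step 2: the mean
    deviations = [x - mean for x in values]    # Step 3: deviations from the mean
    squared = [d ** 2 for d in deviations]     # Step 4: squared deviations
    total = sum(squared)                       # Step 5: sum of squared deviations
    divisor = n - 1 if sample else n           # Step 6: n - 1 (sample) or N (population)
    return total / divisor


class_a = [90, 91, 92, 93, 94]
sample_result = manual_variance(class_a)                   # divides by 4
population_result = manual_variance(class_a, sample=False)  # divides by 5
print(sample_result, population_result)
```

For the Class A scores this gives 2.5 as a sample variance and 2.0 as a population variance; the only difference between the two is the divisor in the last step.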

### Example Calculation

Let’s say you have a dataset of exam scores: [80, 85, 88, 92, 95].

1. Dataset: [80, 85, 88, 92, 95]

2. Mean ($\bar{x}$): $\frac{80+85+88+92+95}{5}=88$

3. Deviations: [-8, -3, 0, 4, 7]

4. Squared Deviations: [64, 9, 0, 16, 49]

5. Sum of Squared Deviations: $64+9+0+16+49=138$

6. Variance ($s^2$): $\frac{138}{5-1}=34.5$

## Software and Calculators Available for Calculating Variance

There are various tools and software available for calculating variance, but for most users, Microsoft Excel provides an easy and accessible way to get started. This guide will focus specifically on how to calculate variance using Excel.

## How to Use Excel to Calculate Variance

Below is a step-by-step guide to calculating variance in Excel:

### For Population Variance

1. Open Excel: Start by opening a new Excel spreadsheet.

2. Input Data: In a column, say Column A, input the dataset for which you want to calculate the population variance. For example, if your data are [80, 85, 88, 92, 95], enter them into cells A1 to A5.

3. Select a Cell for the Result: Click on the cell where you want the variance result to be displayed. This could be any empty cell, like B1.

4. Use the VAR.P Function: In the formula bar, type =VAR.P( and then select the range of cells that contain your dataset. Close the parenthesis and hit Enter. For our example, the formula would look like =VAR.P(A1:A5).

5. View the Result: The cell you selected in Step 3 will now display the population variance.

### For Sample Variance

1. Open Excel: If you’re still in Excel, you can continue with the next steps. Otherwise, open a new Excel spreadsheet.

2. Input Data: Just like before, input your dataset into a column.

3. Select a Cell for the Result: Choose an empty cell for displaying the sample variance.

4. Use the VAR.S Function: In the formula bar, type =VAR.S( and then select the range of cells that contain your dataset. Close the parenthesis and hit Enter. For our example, the formula would look like =VAR.S(A1:A5).

5. View the Result: The cell you selected will now display the sample variance.
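If Excel is not at hand, the same two results can be cross-checked in Python. This sketch assumes the standard `statistics` module, whose `pvariance` and `variance` functions divide by N and n − 1 respectively, so they play the roles of VAR.P and VAR.S for the example data.

```python
import statistics

scores = [80, 85, 88, 92, 95]

# Counterpart of =VAR.P(A1:A5): divides by N.
population_variance = statistics.pvariance(scores)

# Counterpart of =VAR.S(A1:A5): divides by n - 1.
sample_variance = statistics.variance(scores)

print("population variance:", population_variance)
print("sample variance:", sample_variance)
```

For these five scores the sum of squared deviations is 138, so the population variance is 138 / 5 = 27.6 and the sample variance is 138 / 4 = 34.5, matching the manual calculation.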

## Applications of Variance

Understanding the concept and calculation of variance is just the tip of the iceberg. The true value of variance becomes evident when applied in various fields to make data-driven decisions and improvements. This section explores its applications in business, manufacturing, and other domains.

### In Business

#### Variance Analysis in Budgeting and Finance

In the business world, variance analysis is a cornerstone of budgeting and financial planning. Organizations often set financial targets and budgets at the start of a fiscal year. As the year progresses, actual results (revenue, expenditure, or any other financial metric) are compared against these planned figures. The difference between the actual and budgeted numbers is known as the variance. Note that this is a different use of the word from the statistical variance defined earlier: a budget variance is a simple difference, not an average of squared deviations. Understanding this variance is crucial for making timely corrections, reallocating resources, or even re-strategizing business objectives.
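A budget variance of this kind is a per-item subtraction. The sketch below is illustrative only: the line items, the figures, and the helper name `budget_variance` are all hypothetical.

```python
# Hypothetical budget and actual figures, for illustration only.
budget = {"revenue": 500_000, "materials": 120_000, "payroll": 200_000}
actual = {"revenue": 470_000, "materials": 131_000, "payroll": 198_000}


def budget_variance(actual, budget):
    """Return actual minus budget for each budgeted line item."""
    return {item: actual[item] - budget[item] for item in budget}


variances = budget_variance(actual, budget)
for item, diff in variances.items():
    pct = diff / budget[item] * 100
    print(f"{item}: {diff:+,} ({pct:+.1f}% of budget)")
```

Whether a given sign is good news depends on the line item: a negative variance on revenue is a shortfall, while a negative variance on a cost line means spending came in under budget.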

#### Use in Performance Evaluation

Variance is also utilized in performance evaluations, both at the individual and departmental levels. For example, a sales team’s performance can be gauged by calculating the variance between targeted and actual sales. A lower variance usually indicates that the team is meeting its targets consistently, while a higher variance might prompt an investigation into the reasons for the discrepancy, leading to potential improvements in strategy or operations.

### In Manufacturing

#### Use in Quality Control Processes

In manufacturing settings, variance is a critical metric in quality control. Production processes are designed to create products that meet certain specifications. Variance measures how much the actual output varies from unit to unit, and together with the process mean it shows how reliably the output stays within these specifications. High variance can indicate a flawed process or the need for better quality materials, while low variance signifies a well-controlled process.

#### Case Study Example

Consider a car manufacturing plant that aims to produce wheels with a diameter of 50 cm. After measuring 100 wheels, they find that the variance in diameter sizes is high. This leads to an investigation, revealing that the machinery needs recalibration. After making the necessary adjustments, a subsequent measurement shows a significantly reduced variance, confirming the effectiveness of the corrective action.

### In Other Domains

#### Applications in Logistics, Warehousing, and the Public Sector

• Logistics: In logistics, variance can be used to measure the reliability of delivery times. A low variance in delivery times suggests a reliable, well-optimized logistics network.

• Warehousing: Variance in inventory levels can indicate issues like overstocking or understocking, providing insights for better inventory management.

• Public Sector: In public administration, variance is often used to assess the effectiveness of various programs and initiatives, such as education programs or healthcare services.

## Variance in Lean Six Sigma

Lean Six Sigma is a methodology that aims to improve the quality of process outputs by identifying and removing the causes of defects and minimizing variability. Understanding variance is crucial in this context, as it directly influences the effectiveness of Lean Six Sigma projects. This section will delve into the role of variance in the DMAIC approach and explore techniques for reducing variance.

### DMAIC Approach

The DMAIC approach in Lean Six Sigma consists of five phases: Define, Measure, Analyze, Improve, and Control. Let’s look at how variance plays a critical role in each of these phases.

#### Define

In the Define phase, the project’s objectives and customer (internal or external) requirements are clearly outlined. While variance itself may not be the main focus, understanding the expected variance in the process can help set realistic project goals and customer expectations.

#### Measure

In this phase, data is collected to provide a baseline of the current state. Calculating the variance at this stage is crucial as it offers a quantitative measure of process dispersion. This baseline variance serves as a point of reference for subsequent stages.

#### Analyze

The Analyze phase involves identifying the root causes of variability. Understanding variance is key here, as higher variance often points to underlying issues in the process that need to be addressed.

#### Improve

In the Improve phase, solutions are developed and tested to address the root causes identified in the Analyze phase. The aim is usually to reduce variance, making the process more predictable and reliable. Various tools and methodologies, discussed in the next section, can be employed to achieve this.

#### Control

The Control phase aims to sustain the improvements made. Monitoring variance is critical here, as any increase could be an early indicator of a process falling out of control, necessitating immediate action.

### Tools and Methodologies for Reducing Variance

Reducing variance is often a primary objective in Lean Six Sigma projects. Below are some commonly used tools and methodologies aimed at achieving this:

1. Control Charts: These charts are used to monitor the stability of a process over time. Any spikes in variance can be easily identified and addressed.

2. Design of Experiments (DOE): DOE is a structured method for determining the relationship between factors affecting a process and the output of that process. It’s highly effective in identifying factors contributing to high variance.

3. Statistical Process Control (SPC): This involves using statistical methods to monitor and control a process, thereby aiding in reducing variance.

4. Process Mapping: This involves creating a visual representation of the process to identify areas of high variability that could be streamlined.

5. Root Cause Analysis: Techniques such as the 5 Whys or Fishbone Diagram can be used to identify the underlying causes of high variance.
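As one concrete illustration of the first tool in the list, the sketch below computes limits for an individuals (I) control chart from moving ranges. This particular chart type is an assumption made for the example, since the list above does not name one, and the wheel diameters are hypothetical values. The constant 2.66 is the usual 3 / d2 factor, with d2 = 1.128 for moving ranges of two consecutive points.

```python
import statistics


def individuals_chart_limits(measurements):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    if len(measurements) < 2:
        raise ValueError("need at least two measurements")
    center = statistics.fmean(measurements)
    # Moving ranges: absolute differences between consecutive measurements.
    moving_ranges = [
        abs(later - earlier)
        for earlier, later in zip(measurements, measurements[1:])
    ]
    mr_bar = statistics.fmean(moving_ranges)
    # 2.66 = 3 / d2, where d2 = 1.128 for moving ranges of two points.
    spread = 2.66 * mr_bar
    return center - spread, center, center + spread


# Hypothetical wheel diameters in cm, for illustration only.
diameters = [50.1, 49.9, 50.0, 50.2, 49.8, 50.0, 50.1, 49.9]
lcl, center, ucl = individuals_chart_limits(diameters)
outside = [x for x in diameters if not lcl <= x <= ucl]

print(f"LCL={lcl:.3f}, center={center:.3f}, UCL={ucl:.3f}")
print("points outside the limits:", outside)
```

Points that fall outside the limits are candidates for investigation with the root cause techniques listed above. In this made-up data every point falls inside them.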

## Common Mistakes and Pitfalls

As with any statistical measure, the application of variance is not without its pitfalls and challenges. This section aims to highlight some of the most common mistakes and misunderstandings that can occur when working with variance, especially in the context of continuous improvement.

### Errors in Calculating Variance

#### Using the Wrong Formula

One of the most frequent errors is using the wrong formula for the type of data you have. Remember that there are distinct formulas for calculating variance for a sample and a population. Using the incorrect formula distorts the result: dividing by $n$ instead of $n-1$ on a sample, for example, systematically underestimates the population variance, which could, in turn, affect the integrity of your analysis.

#### Incomplete Data

Another common mistake is calculating variance based on incomplete or insufficient data. If your dataset is not representative of the population or process you’re studying, your variance calculation may not provide an accurate picture of dispersion.

#### Manual Calculation Errors

While software and calculators have made it easier to compute variance, some people still prefer or need to calculate it manually. Simple arithmetic errors can result in incorrect variance values, impacting subsequent analyses and decisions.

### Misinterpretation of Results

#### Over-Reliance on Variance

Variance provides valuable information about data dispersion, but it’s not the end-all measure. Some people make the mistake of focusing too heavily on variance and neglecting other important statistical measures like mean, median, and mode. A comprehensive data analysis should consider multiple facets of the dataset.

#### Not Considering Units

The unit of variance is the square of the unit of the original data. Failing to consider this can lead to misunderstandings, especially when comparing variances of different datasets.

#### Ignoring Context

High or low variance is not inherently “bad” or “good.” The interpretation depends on the context in which you are analyzing the data. For instance, high variance in a financial portfolio may indicate a risky investment strategy, but in the context of innovation and product development, high variance could signify creative exploration.

## Conclusion

Understanding variance is not just a statistical endeavor but a critical skill for anyone involved in data analysis, business strategy, or continuous improvement methodologies like Lean Six Sigma. This comprehensive guide has aimed to demystify the concept of variance, from its mathematical foundation to its practical applications in various sectors. We delved into how to calculate variance, both manually and using statistical tools, and highlighted the pitfalls to avoid and best practices to adopt.

The role of variance in shaping key decisions in business, manufacturing, and other domains was also discussed, underscoring its pivotal role in quality control and performance evaluation. As you go forward, remember that the power of variance as a tool lies not just in its calculation but in its insightful application for making data-driven decisions and improvements. Whether you’re a seasoned professional or a beginner, mastering variance can significantly elevate your analytical capabilities.

## Frequently Asked Questions

Q: What is the difference between sample variance and population variance?

A: Sample variance is calculated when you have a subset of data from a larger population. The formula for sample variance uses n−1 in the denominator. Population variance is calculated when you have data for the entire population you’re interested in, and its formula uses N in the denominator. The main difference lies in what the data represents and the formula used for calculation.

Q: Why is the unit of variance the square of the original unit?

A: Variance involves squaring the deviations from the mean, which naturally squares the unit as well. This is why the unit of variance is the square of the unit of the original data. To get a measure with the same unit as the original data, you can take the square root of the variance to get the standard deviation.

Q: How is variance used in Lean Six Sigma?

A: In Lean Six Sigma, variance plays a crucial role in the DMAIC approach (Define, Measure, Analyze, Improve, Control). Variance helps quantify how much a process deviates from the expected outcome, thereby aiding in identifying areas for improvement. Tools like Control Charts and Design of Experiments are often used to analyze and reduce variance in processes.

Q: Is high variance good or bad?

A: Whether high variance is “good” or “bad” depends on the context. In financial portfolios, high variance usually indicates higher risk. However, in contexts like innovation and research, high variance may signify creative exploration and diversity of thought. The key is to understand what variance represents in your specific situation.

Q: What are common mistakes when working with variance?

A: Some common mistakes include using the wrong formula (confusing sample and population variance), calculating based on incomplete data, and manual calculation errors. Misinterpretation often arises from over-relying on variance as the sole measure of dispersion, not considering the unit of variance, and ignoring the contextual meaning of high or low variance.

## Author

#### Daniel Croft

Daniel Croft is a seasoned continuous improvement manager with a Black Belt in Lean Six Sigma. With over 10 years of real-world application experience across diverse sectors, Daniel has a passion for optimizing processes and fostering a culture of efficiency. He's not just a practitioner but also an avid learner, constantly seeking to expand his knowledge. Outside of his professional life, Daniel has a keen interest in investing, statistics, and knowledge-sharing, which led him to create the website learnleansigma.com, a platform dedicated to Lean Six Sigma and process improvement insights.
