When you’ve got a set of data, such as a big list of numbers, the range tells you how spread out the values are, and outliers are the values that don’t look right.
Are all the values bunched up or scattered all over? The easiest measure of spread is the range. Spotting outliers tells us which data points are very strange or may have been recorded incorrectly, so we can decide whether to ignore them.
TLDR
- Range = highest value – lowest value
- The range is dead simple to calculate, but it’s easily affected by unusual values
- Outliers are values that sit way outside the pattern of the rest of the data
- An outlier can make the range much bigger than it would be otherwise
- When reporting the range, you should mention if outliers are included or excluded
- Range is still useful despite being affected by outliers – it gives a quick overall impression of spread
- You can also calculate the range without the outliers – and this can be the right or wrong thing to do. You have to think about it!
What You Need to Know (Taken From The Specification)
- You need to be able to interpret, analyse and compare data distributions
- For measures of spread, you must understand and be able to calculate the range
- You need to consider outliers when working with the range
- This applies to both Foundation and Higher tiers
This topic connects to other statistical concepts like measures of central tendency (mean, median, mode) and graphical representations of data.
Range – The Basics
What is the Range and How do we Calculate it?
The range is the simplest measure of spread. It’s just the difference between the maximum and minimum values in your data set.
Range = Maximum value – Minimum value
That’s it – nothing more complicated than that. Just find the biggest number, find the smallest number, and subtract.
Example 1
Let’s find the range of this data set: 5, 8, 12, 15, 16, 18, 22
Step 1: Find the maximum value.
Maximum = 22
Step 2: Find the minimum value.
Minimum = 5
Step 3: Calculate the range.
Range = Maximum – Minimum = 22 – 5 = 17
So the range of this data set is 17.
Example 2
Now let’s look at a slightly messier example with decimal values.
Find the range of these test scores: 65.5, 71.2, 68.7, 82.9, 75.0, 69.3
Step 1: Find the maximum value.
Maximum = 82.9
Step 2: Find the minimum value.
Minimum = 65.5
Step 3: Calculate the range.
Range = Maximum – Minimum = 82.9 – 65.5 = 17.4
So the range of the test scores is 17.4.
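Examples 1 and 2 can be checked with a few lines of Python. This is only a minimal sketch: the function name `data_range` is made up for illustration, and the rounding is there because decimal subtraction on a computer can leave tiny floating-point leftovers.

```python
def data_range(values):
    """Return the range: the maximum value minus the minimum value."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


example_1 = [5, 8, 12, 15, 16, 18, 22]
example_2 = [65.5, 71.2, 68.7, 82.9, 75.0, 69.3]

print(data_range(example_1))            # 22 - 5 -> 17
print(round(data_range(example_2), 1))  # 82.9 - 65.5, rounded -> 17.4
```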
Outliers
What are Outliers?
Outliers are values that are unusually high or low compared to the rest of the data. They sit away from the main cluster of values and can skew your statistics.
The usual rule is that a value is an outlier if it lies more than 1.5 × the interquartile range (IQR) below the first quartile, or more than 1.5 × the IQR above the third quartile.
In reality there’s no single rule for what counts as an outlier – it depends on the context of your data. But an outlier is always a value that’s very different from most of the values in your data set. Unless told otherwise, stick to the 1.5 × IQR rule.
Identifying Outliers
There are several ways to identify outliers:
- Visual inspection – plot the data and see if any points lie far away from the rest on the graph
- Common sense and context – if you know what values are reasonable for your data
- Statistical methods – like the 1.5 × IQR rule (you’ll learn this in the higher tier)
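For the statistical method, here is a rough sketch of the 1.5 × IQR rule in Python, tried on the data set 15, 17, 16, 18, 14, 16, 42 that Example 3 uses. The helper names `iqr_fences` and `find_outliers` are invented for this illustration. It assumes quartiles from Python's `statistics.quantiles` with its default method; other textbooks and tools use slightly different quartile conventions, so borderline values may be judged differently.

```python
import statistics


def iqr_fences(values):
    """Return (lower_fence, upper_fence) for the 1.5 x IQR rule."""
    # With n=4, statistics.quantiles returns [Q1, median, Q3].
    # Its default "exclusive" method is one of several quartile conventions.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values):
    """Return the values lying outside the 1.5 x IQR fences."""
    lower, upper = iqr_fences(values)
    return [v for v in values if v < lower or v > upper]


data = [15, 17, 16, 18, 14, 16, 42]
print(iqr_fences(data))     # (10.5, 22.5)
print(find_outliers(data))  # [42]
```

Here Q1 is 15 and Q3 is 18, so the IQR is 3 and anything below 10.5 or above 22.5 gets flagged.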
Example 3
Let’s look at this data set: 15, 17, 16, 18, 14, 16, 42
Most values are clustered between 14 and 18, but 42 is way higher. It doesn’t fit the pattern, so 42 is an outlier.
If we calculate the range including the outlier:
Range = 42 – 14 = 28
But if we remove the outlier:
Range = 18 – 14 = 4
A massive difference! The range with the outlier is 7 times larger than without it.
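The same comparison can be sketched in Python. Note that the code does not detect the outlier for you: the value 42 is supplied by hand, based on the judgement made in the example.

```python
data = [15, 17, 16, 18, 14, 16, 42]
outlier = 42  # chosen by eye in the example, not detected automatically

with_outlier = max(data) - min(data)

# Drop the outlier, then recalculate the range on what is left.
trimmed = [v for v in data if v != outlier]
without_outlier = max(trimmed) - min(trimmed)

print(with_outlier)                    # 28
print(without_outlier)                 # 4
print(with_outlier / without_outlier)  # 7.0
```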
Example 4
Consider these temperatures (°C) recorded over a week: 22, 24, 23, 25, -2, 21, 23
The value -2 looks really weird compared to the others. Let’s think about it:
- Is it a recording error? Maybe someone put down -2 instead of 20?
- Is it a true value? Maybe there was a really cold day?
Without more context, we’d typically consider -2 an outlier.
Range with the outlier: 25 – (-2) = 27
Range without the outlier: 25 – 21 = 4
So the outlier here massively affects the range. Bearing in mind the size of the change and the short time frame, we should probably ignore it. It’s almost unheard of for the temperature to plunge from summer to winter levels for one day of the week!
From these two examples you can see the range is extremely sensitive to outliers because it only uses the two most extreme values in the data set. Just one unusual value can completely change the range.
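To see how much one value matters, here is a small what-if sketch for the temperature data in Example 4. It compares the range as recorded, with -2 dropped, and with -2 replaced by 20. The replacement is only the guess floated in the example (that someone wrote -2 instead of 20), not a known correction.

```python
def data_range(values):
    """Maximum value minus minimum value."""
    return max(values) - min(values)


recorded = [22, 24, 23, 25, -2, 21, 23]
suspect = -2

# Option 1: leave the suspect reading out altogether.
dropped = [t for t in recorded if t != suspect]

# Option 2: assume (hypothetically) that -2 was a mis-recorded 20.
corrected = [20 if t == suspect else t for t in recorded]

print(data_range(recorded))   # 25 - (-2) -> 27
print(data_range(dropped))    # 25 - 21   -> 4
print(data_range(corrected))  # 25 - 20   -> 5
```

Only one reading changes between the three versions, yet the range moves between 27, 4 and 5.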
A Weakness of Range as a Measure of Spread
Sensitivity to outliers is the main weakness of the range as a measure of spread. The range also doesn’t tell you anything about how the data is distributed between the extremes.
Example 5
Consider these two data sets:
Data Set A: 10, 11, 12, 13, 14, 15, 16
Data Set B: 10, 10, 10, 13, 16, 16, 16
Both have the same range (16 – 10 = 6), but the distribution of values is completely different:
- In Data Set A, the values are evenly spread
- In Data Set B, the values are clustered at the two ends, with a single value in the middle
This shows why the range alone doesn’t give you the complete picture of how data is spread out.
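A quick way to see this for yourself is to compute the range and then count how often each value appears. The sketch below uses Python's `collections.Counter` on the two data sets from Example 5.

```python
from collections import Counter

set_a = [10, 11, 12, 13, 14, 15, 16]
set_b = [10, 10, 10, 13, 16, 16, 16]

for name, values in (("A", set_a), ("B", set_b)):
    spread = max(values) - min(values)
    counts = sorted(Counter(values).items())  # (value, how many times it occurs)
    print(name, spread, counts)
```

Both lines print a range of 6, but the counts show seven different values in Data Set A and only three different values in Data Set B.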
Reporting Range With and Without Outliers
When you have outliers in your data, it’s often useful to report the range both with and without them.
For example: “The range of temperatures was 27°C, but excluding the unusually cold day, the range was 4°C.”
This gives a more complete picture of the data’s spread and acknowledges the unusual values without letting them dominate the description.
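If you wanted to build that kind of sentence automatically, one possible sketch looks like this. The function name `describe_range` and its wording are invented for illustration, and the outliers are passed in by hand rather than detected.

```python
def describe_range(values, outliers, unit=""):
    """Build a one-sentence summary of the range with and without outliers."""
    full = max(values) - min(values)
    kept = [v for v in values if v not in outliers]
    removed = len(values) - len(kept)
    if removed == 0:
        return f"The range was {full}{unit}."
    if not kept:
        raise ValueError("every value was marked as an outlier")
    trimmed = max(kept) - min(kept)
    return (f"The range was {full}{unit}, but excluding "
            f"{removed} outlier(s), the range was {trimmed}{unit}.")


temperatures = [22, 24, 23, 25, -2, 21, 23]
print(describe_range(temperatures, outliers=[-2], unit="°C"))
```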
Common Mistakes to Avoid
- Forgetting to subtract:
  - Incorrect: Just giving the maximum and minimum values (e.g., “the range is 5 to 22”)
  - Correct: Subtract minimum from maximum to get a single value (e.g., “the range is 17”)
- Ignoring outliers completely:
  - Incorrect: Removing outliers without mentioning them
  - Correct: Acknowledge outliers and consider calculating the range both with and without them
- Confusing range with other measures:
  - Incorrect: Mixing up range with interquartile range or standard deviation
  - Correct: Remember that range is just max – min
- Not considering context:
  - Incorrect: Blindly identifying values as outliers without considering if they’re reasonable
  - Correct: Use your knowledge of the context to help judge whether extreme values are outliers
- Thinking the range tells the whole story:
  - Incorrect: Relying only on the range to describe spread
  - Correct: Recognize that the range is limited because it only uses two values
Questions
Try these questions to practice working with range and outliers:
- Calculate the range of this data set:
  7, 12, 18, 21, 25, 30, 32
- For the following data set, identify any outliers and calculate the range both with and without them:
  15, 18, 17, 16, 19, 62, 14
- The heights (in cm) of plants in a garden are:
  25, 28, 30, 32, 29, 31, 27, 5
  a) Calculate the range of the heights.
  b) Is there an outlier? Explain your reasoning.
  c) If there is an outlier, calculate the range without it.
- Two students measured the temperature of water samples. Their results were:
  Student A: 22°C, 23°C, 21°C, 24°C, 22°C
  Student B: 10°C, 35°C, 22°C, 23°C, 20°C
  a) Calculate the range for each student’s measurements.
  b) Which student’s measurements show greater consistency? Explain your answer.
Solutions
Question 1
Range = Maximum – Minimum = 32 – 7 = 25
Question 2
The data set is: 15, 18, 17, 16, 19, 62, 14
Most values are between 14 and 19, but 62 is much higher. This is clearly an outlier.
Range with the outlier:
Range = 62 – 14 = 48
Range without the outlier:
Range = 19 – 14 = 5
Question 3
a) Range = Maximum – Minimum = 32 – 5 = 27
b) Yes, 5 cm appears to be an outlier because:
- All other values are between 25 and 32 cm
- 5 cm is significantly smaller than the rest
- It’s possible this plant hasn’t grown properly or was damaged
c) Range without the outlier:
Range = 32 – 25 = 7
Question 4
a) Student A’s range:
Range = 24 – 21 = 3°C
Student B’s range:
Range = 35 – 10 = 25°C
b) Student A’s measurements show greater consistency because:
- The range is much smaller (3°C compared to 25°C)
- All values are close to each other
- Student B’s measurements vary widely, suggesting possible errors in measurement or very different conditions
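If you like, the arithmetic in these solutions can be double-checked with a short Python sketch. The helper names `data_range` and `without` are made up for this check, and the outliers (62 and 5) are the ones identified by eye in the solutions above.

```python
def data_range(values):
    """Maximum value minus minimum value."""
    return max(values) - min(values)


def without(values, outlier):
    """Copy of the data with the chosen outlier removed."""
    return [v for v in values if v != outlier]


q1 = [7, 12, 18, 21, 25, 30, 32]
q2 = [15, 18, 17, 16, 19, 62, 14]
q3 = [25, 28, 30, 32, 29, 31, 27, 5]
student_a = [22, 23, 21, 24, 22]
student_b = [10, 35, 22, 23, 20]

print(data_range(q1))                                # 25
print(data_range(q2), data_range(without(q2, 62)))   # 48 5
print(data_range(q3), data_range(without(q3, 5)))    # 27 7
print(data_range(student_a), data_range(student_b))  # 3 25
```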
Summary
- The range is the simplest measure of spread: just the maximum value minus the minimum value
- It’s super easy to calculate but has limitations because it only uses two values
- Outliers are extreme values that don’t fit the pattern of the rest of the data
- Outliers can massively affect the range, which is why it’s often useful to calculate the range both with and without them
- When interpreting data, always consider whether outliers are errors that should be excluded or legitimate values that should be included
- Despite its limitations, the range gives you a quick impression of how spread out your data is
Remember, in statistics, no single measure tells the complete story. The range is just one number that describes data. It isn’t even the only measure of spread. It works best when used alongside other measures like the mean, median, and mode.