Standard error of the mean

This is an Observable notebook I created to explain what the standard error of the mean is.

The standard deviation helps us understand how much variation there is amongst individual observations. For example, Nancy wants to understand how well her students did in the exams. She would summarize the results of her students as a mean and standard deviation.

However, the standard deviation doesn't tell us how 'good' our estimate of the mean is. This is something we need to know when we're comparing the means between samples from two or more populations. For example, if we want to compare the mileage between two different types of cars: A and B, we could measure it on a sample of cars from both populations. We can then calculate a mean and standard deviation for both samples. However, we can't make a meaningful comparison between A and B's populations based on just the mean from two small samples. This is where the standard error of the mean comes in.

In this explorable, we will develop an intuitive understanding of the standard error of the mean. We first cover the basics, the normal distribution and the standard deviation, before delving into the standard error. There are more than a few formulas, but feel free to skip past them if you are not a fan. I've included them here as they are really useful for conveying mathematical ideas concisely. Anyhow, there'll be plenty of interactive plots to keep you informed and entertained.

It'll be fun. I promise!

Normal distributions can be characterised using the mean and the standard deviation.

Good things may come in twos, but most things in life come in normal distributions. A normal distribution, sometimes called the bell curve, is a distribution that occurs naturally in many situations, e.g., heights and IQ scores. We can describe such a population using parameters such as the mean and the standard deviation.

The population mean, \textcolor{green}{\mu}, is an average of a group characteristic.

\textcolor{green}{\mu}=\frac{\sum_{i=1}^{\textcolor{green}{N}} x_{i}}{\textcolor{green}{N}}

where:

  • \textcolor{green}{N} is the total number of individuals or cases in the population,
  • x_{i} is a single observation, where i can vary from 1 to \textcolor{green}{N}.

The population standard deviation, \textcolor{green}{\sigma}, is a measure of the spread (variability) of the scores.

\textcolor{green}{\sigma}=\sqrt{\frac{\sum_{i=1}^{\textcolor{green}{N}}\left(x_{i}-\textcolor{green}{\mu}\right)^{2}}{\textcolor{green}{N}}}
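The two formulas above translate directly into code. Here is a minimal sketch in plain JavaScript, using a small set of hypothetical exam scores as the "population".

```javascript
// Hypothetical population: exam scores for a class of 10 students.
const scores = [62, 75, 81, 68, 90, 74, 79, 66, 85, 70];
const N = scores.length;

// Population mean: sum of all observations divided by N.
const mu = scores.reduce((sum, x) => sum + x, 0) / N;

// Population standard deviation: root of the mean squared deviation from mu.
const sigma = Math.sqrt(
  scores.reduce((sum, x) => sum + (x - mu) ** 2, 0) / N
);

console.log(mu.toFixed(2));    // 75.00
console.log(sigma.toFixed(2)); // 8.38
```

Note the denominator here is N, not N − 1, because we are treating these scores as the entire population rather than a sample from it.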

The mean and standard deviation are pretty easy to understand visually. In the plot below, we can see a 'standard' normal distribution, i.e., a normal distribution with a mean of 0 and a standard deviation of 1. Use the slider below to see how the standard deviation helps us describe the spread of the data. It'll be useful while going through the rest of this article if you can remember that approximately 68% of the population lies within 1 standard deviation of the mean.

[Interactive plot: a standard normal distribution and its Probability Density Function (PDF), with a slider for the number of standard deviations (default 1). Approximately 68% of the population scores are within 1 standard deviation of the mean.]

Sampling allows us to estimate characteristics of a population.

Unfortunately, population parameters such as \textcolor{green}{\mu} and \textcolor{green}{\sigma} are usually unknown because it's generally impossible to measure an entire population. However, you can use random samples to calculate estimates of these parameters.


The sample mean, \textcolor{red}{\overline{x}}, is the average score of a sample on a given variable,

\textcolor{red}{\overline{x}}=\frac{\sum_{i=1}^{\textcolor{red}{n}} x_{i}}{\textcolor{red}{n}}

where \textcolor{red}{n} is the sample size.

The sample standard deviation, \textcolor{red}{s}, is a measure of the spread (variability) of the scores in the sample on a given variable.

\textcolor{red}{s}=\sqrt{\frac{\sum_{i=1}^{\textcolor{red}{n}}\left(x_{i}-\textcolor{red}{\overline{x}}\right)^{2}}{\textcolor{red}{n} - 1}}

Don't worry too much about the 'n − 1' term in the denominator. If you are puzzled by why it's not just 'n', read up on Bessel's correction.
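The sample statistics look almost identical in code; the only change from the population formulas is the n − 1 denominator. A minimal sketch, using a small made-up sample:

```javascript
// Hypothetical sample of 5 measurements.
const sample = [4.1, 5.3, 6.2, 3.8, 5.6];
const n = sample.length;

// Sample mean.
const xBar = sample.reduce((sum, x) => sum + x, 0) / n;

// Sample standard deviation: note the (n - 1) denominator
// (Bessel's correction), unlike the population formula.
const s = Math.sqrt(
  sample.reduce((sum, x) => sum + (x - xBar) ** 2, 0) / (n - 1)
);

console.log(xBar.toFixed(2)); // 5.00
console.log(s.toFixed(2));    // 1.02
```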

The visualisation below shows us what a random sample from a population with a set mean and standard deviation looks like. Did you notice that the sample mean isn't always very accurate?

[Interactive plot: a random sample drawn from a normal distribution, with sliders for pop mean (μ = 0), pop std dev (σ = 5), and sample size (n = 10), plus a 'Resample' button. The observations (x) are shown against the population PDF as a frequency histogram, with the sample mean (x̅) marked alongside the pop mean (μ).]

Increasing sample size (n) increases the accuracy of the sample mean (x̅).

The sample mean won't be exactly equal to the population mean that we're trying to estimate. If our sample size is small, our estimate of the mean won't be as good as an estimate based on a larger sample size.

Below we have a sample size of 3 compared with a sample size of 10. Both samples are from the same population.

[Interactive plot: effect of sample size (n), with a slider for the number of samples (default 10). Samples of n = 3 and n = 10 are drawn from the same population (μ = 0, σ = 5); the sample means (x̅) scatter more widely around the pop mean (μ) for n = 3.]

The std error (σx̅) tells us how accurate our estimate of the pop mean (μ) is likely to be.

The standard error of the mean (also called just 'standard error') is the standard deviation of the different sample means you would get if you took multiple samples from the same population. Thus, we know that approximately 68% of the sample means would be within one standard error of the population mean.
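We can check this definition directly by simulation. The sketch below draws many samples from a normal population (generated with the Box-Muller transform, a standard way to produce normally distributed random numbers) and compares the standard deviation of the sample means with σ/√n; the variable names mirror the parameters used in the plots (μ = 0, σ = 5, n = 10).

```javascript
// Box-Muller transform: turns two uniform random numbers into one
// normally distributed random number with the given mean and std dev.
function randomNormal(mu, sigma) {
  const u = 1 - Math.random(); // shift to (0, 1] so log(0) can't occur
  const v = Math.random();
  const z = Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
  return mu + sigma * z;
}

const mu = 0, sigma = 5, n = 10, numberOfSamples = 100000;

// Draw many samples of size n and record each sample mean.
const sampleMeans = [];
for (let i = 0; i < numberOfSamples; i++) {
  let sum = 0;
  for (let j = 0; j < n; j++) sum += randomNormal(mu, sigma);
  sampleMeans.push(sum / n);
}

// The std deviation of the sample means is the (empirical) standard error.
const meanOfMeans = sampleMeans.reduce((a, b) => a + b, 0) / numberOfSamples;
const empiricalStdError = Math.sqrt(
  sampleMeans.reduce((a, x) => a + (x - meanOfMeans) ** 2, 0) / numberOfSamples
);

console.log(empiricalStdError.toFixed(2));
console.log((sigma / Math.sqrt(n)).toFixed(2)); // theoretical: 5 / sqrt(10) ≈ 1.58
```

With 100,000 samples the empirical value lands very close to the theoretical 1.58.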

[Interactive plot: sample means (x̅) from repeated samples, with sliders for pop std dev (σ = 5), sample size (n = 10), and number of samples (10). The std error (sx̅) is shown as the std deviation of the different sample means around the pop mean (μ = 0).]

However, in reality, we won't have multiple samples to use to estimate the standard error. Luckily, we can calculate the standard error of the mean from just one sample using the following formula.

\textcolor{green}{\sigma_{\overline{x}}}=\frac{\textcolor{green}{\sigma}}{\sqrt{\textcolor{red}{n}}}

Since we seldom know the pop std dev (σ), we use an approximation using the sample std dev (s).

\textcolor{green}{\sigma_{\overline{x}}} \approx \frac{\textcolor{red}{s}}{\sqrt{\textcolor{red}{n}}}

In practice, the standard error of the mean is often simply defined as follows.

\textcolor{red}{s_{\overline{x}}}=\frac{\textcolor{red}{s}}{\sqrt{\textcolor{red}{n}}}
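Putting the pieces together, here is a minimal sketch of estimating the standard error from a single sample, using a small set of made-up measurements:

```javascript
// Hypothetical single sample of 10 measurements.
const sample = [12, 15, 9, 14, 11, 13, 10, 16, 12, 14];
const n = sample.length;

// Sample mean and sample std dev (with Bessel's n - 1 correction).
const xBar = sample.reduce((a, x) => a + x, 0) / n;
const s = Math.sqrt(
  sample.reduce((a, x) => a + (x - xBar) ** 2, 0) / (n - 1)
);

// Standard error of the mean: s as a stand-in for the unknown sigma.
const stdError = s / Math.sqrt(n);

console.log(xBar.toFixed(2));     // 12.60
console.log(stdError.toFixed(2)); // 0.70
```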

In the following visualisation, we can see the standard error calculated for each sample. Approximately 68% of the samples will include the population mean within ±1 standard error of the sample mean.

[Interactive plot: the std error (sx̅) calculated for each individual sample, with sliders for pop std dev (σ = 5), sample size (n = 10), and number of samples (10). Error bars of ±1 std error around each sample mean (x̅) include the pop mean (μ) for roughly 68% of samples.]

Increasing sample size (n) reduces std error (sx̅).

With increasing sample size, the sample mean becomes a more accurate estimate of the population mean. Therefore, the standard error of the mean becomes smaller. Below we have a sample size of 3 compared with a sample size of 20. Both samples are from the same population.

[Interactive plot: samples of n = 3 vs n = 20 from the same population (μ = 0, σ = 5, 10 samples each), with a slider for pop std dev (σ); the std error bars are visibly narrower for n = 20.]

In the visualisation below, we can see how the standard error reduces with sample size. Since it is inversely proportional to the square root of \textcolor{red}{n}, it reduces drastically at first but, beyond a sample size of around 20, the reduction is less pronounced. By contrast, the sample std dev (s) will not tend to shrink as we increase the size of our sample; it simply tends towards the pop std dev (σ).
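The 1/√n decay is easy to tabulate. A quick sketch, fixing the sample std dev at s = 5 as in the plot:

```javascript
// Standard error s / sqrt(n) for a fixed sample std dev s = 5.
const s = 5;
for (const n of [3, 10, 20, 50, 100]) {
  console.log(n, (s / Math.sqrt(n)).toFixed(2));
}
// 3 → 2.89, 10 → 1.58, 20 → 1.12, 50 → 0.71, 100 → 0.50
```

Going from n = 3 to n = 20 cuts the standard error by more than half, but going from n = 20 to n = 100 only halves it again, despite requiring 80 extra observations.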

[Interactive plot: std error (sx̅) against sample size (n) for a fixed sample std dev (s = 5), showing the 1/√n decay.]

The standard error allows us to compare the means between two populations.

Let's take a sample from two different populations. You can define the two populations using the sliders below. Use the button to resample.

[Interactive plot: comparing means from samples of two different populations, with a 'Resample' button. Population 1: pop mean (μ) = 5, pop std dev (σ) = 1, sample size (n) = 10. Population 2: pop mean (μ) = 6, pop std dev (σ) = 1, sample size (n) = 10. The sample means (x̅) are plotted against the pop means (μ), with error bars showing the std error (sx̅).]

Knowing the standard errors allows us to make a meaningful comparison of the population means, not just the sample means.

Visually comparing standard error bars still isn't a rigorous way of testing the significance of the difference in means between two samples. There is a statistical tool called the t-test that we can use for this. Perhaps a topic for another explorable :)

Conclusion

I hope you enjoyed that :) Here's what we've learnt.

  • Normal distributions can be characterised using the pop std dev (σ) and the pop mean (μ).
  • Sampling allows us to estimate characteristics of a population.
  • Increasing sample size (n) increases the accuracy of the sample mean (x̅).
  • The std error (sx̅) tells us how accurate our estimate of the pop mean (μ) is likely to be.
  • Increasing sample size (n) reduces std error (sx̅).
  • The standard error allows us to compare the means between two populations.


Appendix

I've captured all of the setup and calculations in this section. To see the code, head over to my Observable notebook here. You can also make a copy and play with it!

Imports

Imported libraries: d3, Plotly, math.js, lodash (_), stdlib, and a normal pdf helper function.

Defaults

muMin = -10
muMax = 10
muDefault = 0
sigmaMin = 0
sigmaMax = 10
sigmaDefault = 5
rangeVarMean = Array(2) [-20, 20]
range = Array(2) [-10, 10]
nMin = 3
nMax = 100
nDefault = 10
numberOfSamplesDefault = 10
xText = "observations (x)"
muText = "pop mean (μ)"
sigmaText = "pop std dev (σ)"
sampleText = "sample (i)"
meanText = "sample mean (x̅)"
sText = "sample std dev (s)"
nText = "sample size (n)"
NText = "population size (N)"
numberOfSamplesText = "no. of samples"
stdErrorSigmaText = "std error (σ<sub>x̅</sub>)"
stdErrorSText = "std error (s<sub>x̅</sub>)"
probabilityDensityText = "Probability Density Function (PDF)"
frequencyText = "frequency (f)"
scoreText = "score"
meanTex = "\\textcolor{red}{\\overline{x}}"
sigmaTex = "\\textcolor{green}{\\sigma}"
muTex = "\\textcolor{green}{\\mu}"
sTex = "\\textcolor{red}{s}"
nTex = "\\textcolor{red}{n}"
stdErrorSigmaTex = "\\textcolor{green}{{\\sigma_{\\overline{x}}}}"
stdErrorSTex = "\\textcolor{red}{{s_{\\overline{x}}}}"
NTex = "\\textcolor{green}{N}"

Slider generators

setMu = ƒ()
setSigma = ƒ()
setn = ƒ()
setNumberOfSamples = ƒ()
setS = ƒ()
setNumberOfStdDev = ƒ()
setPop = ƒ(…)

Helper functions

getTex = ƒ(symbolTex)
getInfo = ƒ(…)
getTraceErrorsEnvelope = ƒ(…)
getPDF = ƒ(…)
getAxis = ƒ(axis)

Generate samples and traces

getRandomNormal = ƒ()
getExperiment = ƒ(…)
getTracesExperiment = ƒ(…)
getTraceStdErrorsEnvelopeExperiment = ƒ(…)

Colors

lighterGreen = "rgba(0, 255, 0, 0.2)"
lightGreen = "rgba(0, 255, 0, 0.4)"
lighterBlue = "rgba(0, 199, 255, 0.2)"
lightBlue = "rgba(0, 199, 255, 0.4)"
lighterRed = "rgba(255, 0, 0, 0.2)"
lightRed = "rgba(255, 0, 0, 0.4)"
transparent = "rgba(0, 0, 0, 0)"