Standard error of the mean

This is an Observable notebook I created to explain what the standard error of the mean is.

The standard deviation helps us understand how much variation there is amongst individual observations. For example, Nancy wants to understand how well her students did in the exams. She would summarize the results of her students as a mean and standard deviation.

However, the standard deviation doesn't tell us how 'good' our estimate of the mean is. This is something we need to know when we're comparing the means between samples from two or more populations. For example, if we want to compare the mileage between two different types of cars: A and B, we could measure it on a sample of cars from both populations. We can then calculate a mean and standard deviation for both samples. However, we can't make a meaningful comparison between A and B's populations based on just the mean from two small samples. This is where the standard error of the mean comes in.

In this explorable, we will develop an intuitive understanding of the standard error of the mean. We first cover the basics, the normal distribution and the standard deviation, before delving into the standard error. There are more than a few formulas, but feel free to skip past them if you are not a fan. I've included them here as they are really useful for conveying mathematical ideas concisely. Anyhow, there'll be plenty of interactive plots to keep you informed and entertained.

It'll be fun. I promise!

Normal distributions can be characterised using the mean and the standard deviation.

Good things may come in twos, but most things in life come in normal distributions. A normal distribution, sometimes called the bell curve, is a distribution that occurs naturally in many situations, e.g., heights and IQ scores. We can describe such a population using parameters such as the mean and the standard deviation.

The population mean, \textcolor{green}{\mu}, is an average of a group characteristic.

\textcolor{green}{\mu}=\frac{\sum_{i=1}^{\textcolor{green}{N}} x_{i}}{\textcolor{green}{N}}

where:

  • \textcolor{green}{N} is the total number of individuals or cases in the population,
  • x_{i} is a single observation, where i can vary from 1 to \textcolor{green}{N}.

The population standard deviation, \textcolor{green}{\sigma}, is a measure of the spread (variability) of the scores.

\textcolor{green}{\sigma}=\sqrt{\frac{\sum_{i=1}^{\textcolor{green}{N}}\left(x_{i}-\textcolor{green}{\mu}\right)^{2}}{\textcolor{green}{N}}}
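The two formulas above translate directly into code. Here is a minimal sketch in plain JavaScript, using a small set of hypothetical exam scores as the "population".

```javascript
// Hypothetical population: exam scores for a class of 10 students.
const scores = [62, 75, 81, 68, 90, 74, 79, 66, 85, 70];
const N = scores.length;

// Population mean: sum of all observations divided by N.
const mu = scores.reduce((sum, x) => sum + x, 0) / N;

// Population standard deviation: root of the mean squared deviation from mu.
const sigma = Math.sqrt(
  scores.reduce((sum, x) => sum + (x - mu) ** 2, 0) / N
);

console.log(mu.toFixed(2));    // 75.00
console.log(sigma.toFixed(2)); // 8.38
```

Note the denominator here is N, not N − 1, because we are treating these scores as the entire population rather than a sample from it.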

The mean and standard deviation are pretty easy to understand visually. In the plot below, we can see a 'standard' normal distribution, i.e., a normal distribution with a mean of 0 and a standard deviation of 1. Use the slider below to see how the standard deviation helps us describe the spread of the data. It'll be useful while going through the rest of this article if you can remember that approximately 68% of the population lies within 1 standard deviation of the mean.

[Interactive plot: a standard normal distribution and its Probability Density Function (PDF), with a slider for the number of standard deviations (default 1). Approximately 68% of the population scores are within 1 standard deviation of the mean.]

Sampling allows us to estimate characteristics of a population.

Unfortunately, population parameters such as \textcolor{green}{\mu} and \textcolor{green}{\sigma} are usually unknown because it's generally impossible to measure an entire population. However, you can use random samples to calculate estimates of these parameters.


The sample mean, \textcolor{red}{\overline{x}}, is the average score of a sample on a given variable,

\textcolor{red}{\overline{x}}=\frac{\sum_{i=1}^{\textcolor{red}{n}} x_{i}}{\textcolor{red}{n}}

where \textcolor{red}{n} is the sample size.

The sample standard deviation, \textcolor{red}{s}, is a measure of the spread (variability) of the scores in the sample on a given variable.

\textcolor{red}{s}=\sqrt{\frac{\sum_{i=1}^{\textcolor{red}{n}}\left(x_{i}-\textcolor{red}{\overline{x}}\right)^{2}}{\textcolor{red}{n} - 1}}

Don't worry too much about the 'n − 1' term in the denominator. If you are puzzled by why it's not just 'n', read up on Bessel's correction.
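The sample statistics look almost identical in code; the only change from the population formulas is the n − 1 denominator. A minimal sketch, using a small made-up sample:

```javascript
// Hypothetical sample of 5 measurements.
const sample = [4.1, 5.3, 6.2, 3.8, 5.6];
const n = sample.length;

// Sample mean.
const xBar = sample.reduce((sum, x) => sum + x, 0) / n;

// Sample standard deviation: note the (n - 1) denominator
// (Bessel's correction), unlike the population formula.
const s = Math.sqrt(
  sample.reduce((sum, x) => sum + (x - xBar) ** 2, 0) / (n - 1)
);

console.log(xBar.toFixed(2)); // 5.00
console.log(s.toFixed(2));    // 1.02
```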

The visualisation below shows us what a random sample from a population with a set mean and standard deviation looks like. Did you notice that the sample mean isn't always very accurate?

[Interactive plot: a random sample drawn from a normal distribution, with sliders for pop mean (μ = 0), pop std dev (σ = 5), and sample size (n = 10), plus a 'Resample' button. The observations (x) are shown against the population PDF as a frequency histogram, with the sample mean (x̅) marked alongside the pop mean (μ).]

Increasing sample size (n) increases the accuracy of the sample mean (x̅).

The sample mean won't be exactly equal to the population mean that we're trying to estimate. If our sample size is small, our estimate of the mean won't be as good as an estimate based on a larger sample size.

Below we have a sample size of 3 compared with a sample size of 10. Both samples are from the same population.

[Interactive plot: effect of sample size (n), with a slider for the number of samples (default 10). Samples of n = 3 and n = 10 are drawn from the same population (μ = 0, σ = 5); the sample means (x̅) scatter more widely around the pop mean (μ) for n = 3.]

The std error (σx̅) tells us how accurate our estimate of the pop mean (μ) is likely to be.

The standard error of the mean (also called just 'standard error') is the standard deviation of the different sample means you would get if you took multiple samples from the same population. Thus, we know that approximately 68% of the sample means would be within one standard error of the population mean.
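We can check this definition directly by simulation. The sketch below draws many samples from a normal population (generated with the Box-Muller transform, a standard way to produce normally distributed random numbers) and compares the standard deviation of the sample means with σ/√n; the variable names mirror the parameters used in the plots (μ = 0, σ = 5, n = 10).

```javascript
// Box-Muller transform: turns two uniform random numbers into one
// normally distributed random number with the given mean and std dev.
function randomNormal(mu, sigma) {
  const u = 1 - Math.random(); // shift to (0, 1] so log(0) can't occur
  const v = Math.random();
  const z = Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
  return mu + sigma * z;
}

const mu = 0, sigma = 5, n = 10, numberOfSamples = 100000;

// Draw many samples of size n and record each sample mean.
const sampleMeans = [];
for (let i = 0; i < numberOfSamples; i++) {
  let sum = 0;
  for (let j = 0; j < n; j++) sum += randomNormal(mu, sigma);
  sampleMeans.push(sum / n);
}

// The std deviation of the sample means is the (empirical) standard error.
const meanOfMeans = sampleMeans.reduce((a, b) => a + b, 0) / numberOfSamples;
const empiricalStdError = Math.sqrt(
  sampleMeans.reduce((a, x) => a + (x - meanOfMeans) ** 2, 0) / numberOfSamples
);

console.log(empiricalStdError.toFixed(2));
console.log((sigma / Math.sqrt(n)).toFixed(2)); // theoretical: 5 / sqrt(10) ≈ 1.58
```

With 100,000 samples the empirical value lands very close to the theoretical 1.58.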

[Interactive plot: sample means (x̅) from repeated samples, with sliders for pop std dev (σ = 5), sample size (n = 10), and number of samples (10). The std error (sx̅) is shown as the std deviation of the different sample means around the pop mean (μ = 0).]

However, in reality, we won't have multiple samples to use to estimate the standard error. Luckily, we can calculate the standard error of the mean from just one sample using the following formula.

\textcolor{green}{\sigma_{\overline{x}}}=\frac{\textcolor{green}{\sigma}}{\sqrt{\textcolor{red}{n}}}

Since we seldom know the pop std dev (σ), we use an approximation using the sample std dev (s).

\textcolor{green}{\sigma_{\overline{x}}} \approx \frac{\textcolor{red}{s}}{\sqrt{\textcolor{red}{n}}}

In practice, the standard error of the mean is often simply defined as follows.

\textcolor{red}{s_{\overline{x}}}=\frac{\textcolor{red}{s}}{\sqrt{\textcolor{red}{n}}}
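Putting the pieces together, here is a minimal sketch of estimating the standard error from a single sample, using a small set of made-up measurements:

```javascript
// Hypothetical single sample of 10 measurements.
const sample = [12, 15, 9, 14, 11, 13, 10, 16, 12, 14];
const n = sample.length;

// Sample mean and sample std dev (with Bessel's n - 1 correction).
const xBar = sample.reduce((a, x) => a + x, 0) / n;
const s = Math.sqrt(
  sample.reduce((a, x) => a + (x - xBar) ** 2, 0) / (n - 1)
);

// Standard error of the mean: s as a stand-in for the unknown sigma.
const stdError = s / Math.sqrt(n);

console.log(xBar.toFixed(2));     // 12.60
console.log(stdError.toFixed(2)); // 0.70
```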

In the following visualisation, we can see the standard error calculated for each sample. Approximately 68% of the samples will include the population mean within ±1 standard error of the sample mean.

[Interactive plot: the std error (sx̅) calculated for each individual sample, with sliders for pop std dev (σ = 5), sample size (n = 10), and number of samples (10). Error bars of ±1 std error around each sample mean (x̅) include the pop mean (μ) for roughly 68% of samples.]

Increasing sample size (n) reduces std error (sx̅).

With increasing sample size, the sample mean becomes a more accurate estimate of the population mean. Therefore, the standard error of the mean becomes smaller. Below we have a sample size of 3 compared with a sample size of 20. Both samples are from the same population.

[Interactive plot: samples of n = 3 vs n = 20 from the same population (μ = 0, σ = 5, 10 samples each), with a slider for pop std dev (σ); the std error bars are visibly narrower for n = 20.]

In the visualisation below, we can see how the standard error reduces with sample size. Since it is inversely proportional to the square root of \textcolor{red}{n}, it reduces drastically at first but, beyond a sample size of around 20, the reduction is less pronounced. By contrast, the sample std dev (s) will not tend to shrink as we increase the size of our sample; it simply tends towards the pop std dev (σ).
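The 1/√n decay is easy to tabulate. A quick sketch, fixing the sample std dev at s = 5 as in the plot:

```javascript
// Standard error s / sqrt(n) for a fixed sample std dev s = 5.
const s = 5;
for (const n of [3, 10, 20, 50, 100]) {
  console.log(n, (s / Math.sqrt(n)).toFixed(2));
}
// 3 → 2.89, 10 → 1.58, 20 → 1.12, 50 → 0.71, 100 → 0.50
```

Going from n = 3 to n = 20 cuts the standard error by more than half, but going from n = 20 to n = 100 only halves it again, despite requiring 80 extra observations.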

[Interactive plot: std error (sx̅) against sample size (n) for a fixed sample std dev (s = 5), showing the 1/√n decay.]

The standard error allows us to compare the means between two populations.

Let's take a sample from two different populations. You can define the two populations using the sliders below. Use the button to resample.

[Interactive plot: comparing means from samples of two different populations, with a 'Resample' button. Population 1: pop mean (μ) = 5, pop std dev (σ) = 1, sample size (n) = 10. Population 2: pop mean (μ) = 6, pop std dev (σ) = 1, sample size (n) = 10. The sample means (x̅) are plotted against the pop means (μ), with error bars showing the std error (sx̅).]

Knowing the standard errors allows us to make a meaningful comparison of the population means, not just the sample means.

Visually comparing standard error bars still isn't a rigorous way of testing the significance of the difference in means between two samples. There is a statistical tool called the t-test that we can use for this. Perhaps a topic for another explorable :)

Conclusion

I hope you enjoyed that :) Here's what we've learnt.

  • Normal distributions can be characterised using the pop std dev (σ) and the pop mean (μ).
  • Sampling allows us to estimate characteristics of a population.
  • Increasing sample size (n) increases the accuracy of the sample mean (x̅).
  • The std error (sx̅) tells us how accurate our estimate of the pop mean (μ) is likely to be.
  • Increasing sample size (n) reduces std error (sx̅).
  • The standard error allows us to compare the means between two populations.


Appendix

I've captured all of the setup and calculations in this section. To see the code, head over to my Observable notebook here. You can also make a copy and play with it!

Imports

Imported libraries: d3, Plotly, math.js, lodash (_), stdlib, and a normal pdf helper function.

Defaults

muMin = -10
muMax = 10
muDefault = 0
sigmaMin = 0
sigmaMax = 10
sigmaDefault = 5
rangeVarMean = Array(2) [-20, 20]
range = Array(2) [-10, 10]
nMin = 3
nMax = 100
nDefault = 10
numberOfSamplesDefault = 10
xText = "observations (x)"
muText = "pop mean (μ)"
sigmaText = "pop std dev (σ)"
sampleText = "sample (i)"
meanText = "sample mean (x̅)"
sText = "sample std dev (s)"
nText = "sample size (n)"
NText = "population size (N)"
numberOfSamplesText = "no. of samples"
stdErrorSigmaText = "std error (σ<sub>x̅</sub>)"
stdErrorSText = "std error (s<sub>x̅</sub>)"
probabilityDensityText = "Probability Density Function (PDF)"
frequencyText = "frequency (f)"
scoreText = "score"
meanTex = "\\textcolor{red}{\\overline{x}}"
sigmaTex = "\\textcolor{green}{\\sigma}"
muTex = "\\textcolor{green}{\\mu}"
sTex = "\\textcolor{red}{s}"
nTex = "\\textcolor{red}{n}"
stdErrorSigmaTex = "\\textcolor{green}{{\\sigma_{\\overline{x}}}}"
stdErrorSTex = "\\textcolor{red}{{s_{\\overline{x}}}}"
NTex = "\\textcolor{green}{N}"

Slider generators

setMu = ƒ()
setSigma = ƒ()
setn = ƒ()
setNumberOfSamples = ƒ()
setS = ƒ()
setNumberOfStdDev = ƒ()
setPop = ƒ(…)

Helper functions

getTex = ƒ(symbolTex)
getInfo = ƒ(…)
getTraceErrorsEnvelope = ƒ(…)
getPDF = ƒ(…)
getAxis = ƒ(axis)

Generate samples and traces

getRandomNormal = ƒ()
getExperiment = ƒ(…)
getTracesExperiment = ƒ(…)
getTraceStdErrorsEnvelopeExperiment = ƒ(…)

Colors

lighterGreen = "rgba(0, 255, 0, 0.2)"
lightGreen = "rgba(0, 255, 0, 0.4)"
lighterBlue = "rgba(0, 199, 255, 0.2)"
lightBlue = "rgba(0, 199, 255, 0.4)"
lighterRed = "rgba(255, 0, 0, 0.2)"
lightRed = "rgba(255, 0, 0, 0.4)"
transparent = "rgba(0, 0, 0, 0)"