The Central Limit Theorem
The central limit theorem is one of the most important results in statistics. It explains why the normal distribution appears so often even when the underlying data is not normal.
What it states
Take many independent samples, each of n observations, from any distribution with a finite mean and variance. Compute the mean of each sample. As n grows, the distribution of those sample means approaches a normal distribution, regardless of the shape of the original distribution.
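The mean of that sampling distribution equals the population mean. Its standard deviation, called the standard error, equals the population standard deviation divided by the square root of the sample size n. The standard error therefore shrinks in proportion to one over the square root of n: quadrupling the sample size halves it.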
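The behaviour described above can be checked with a small simulation. The sketch below is illustrative only. It assumes an exponential population with rate 1, so the population mean and standard deviation are both 1. The function name sample_means, the sample sizes, and the seed are arbitrary choices made for this example. It uses only the Python standard library.

```python
import random
import statistics


def sample_means(n, num_samples=2000, seed=0):
    """Draw num_samples samples of size n from Exp(1) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


for n in (4, 16, 64):
    means = sample_means(n)
    observed_se = statistics.stdev(means)  # spread of the sample means
    predicted_se = 1.0 / n ** 0.5          # sigma / sqrt(n), with sigma = 1
    print(
        f"n={n:3d}  mean of means={statistics.fmean(means):.3f}  "
        f"observed SE={observed_se:.3f}  predicted SE={predicted_se:.3f}"
    )
```

If the theorem applies here, the mean of the means should stay near 1 for every n, and the observed standard error should track the predicted value closely, roughly halving each time n is quadrupled.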
Why it matters
- It justifies using normal-based tools such as z-tests and confidence intervals on sample averages.
- It explains why measurement noise from many small additive sources tends to look Gaussian.
- It tells us that larger samples give more precise estimates, since the standard error falls as the sample size grows.
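As one illustration of the first and third points, here is a minimal sketch of a normal-approximation confidence interval for a mean. It rests on several assumptions: the function name mean_confidence_interval is hypothetical, the sample standard deviation is plugged in for the unknown population value, and the sample is large enough for the normal approximation to be reasonable. For small samples a t-based interval is the more usual choice. The uniform test data and the seed are arbitrary.

```python
import random
import statistics


def mean_confidence_interval(data, confidence=0.95):
    """Return a (low, high) normal-approximation interval for the mean."""
    n = len(data)
    mean = statistics.fmean(data)
    # Estimated standard error: sample standard deviation / sqrt(n).
    se = statistics.stdev(data) / n ** 0.5
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * se, mean + z * se


rng = random.Random(42)
for n in (25, 100, 400):
    data = [rng.uniform(0.0, 10.0) for _ in range(n)]
    low, high = mean_confidence_interval(data)
    print(f"n={n:3d}  interval=({low:.2f}, {high:.2f})  width={high - low:.2f}")
```

The interval width is proportional to the estimated standard error, so it should narrow by roughly half each time the sample size is quadrupled.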
A common caution
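The theorem applies to the distribution of the sample mean, not to the raw data. Individual data points keep their original shape. It also requires the observations to be at least roughly independent and the underlying distribution to have a finite variance. For a heavy-tailed distribution such as the Cauchy, which has no finite variance, sample means never settle toward a normal shape.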
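The difference between raw data and sample means can be made concrete by comparing skewness, which is 0 for a symmetric distribution such as the normal. The sketch below is an illustration under assumed settings: an exponential population with rate 1, samples of size 50, and a hypothetical helper named skewness that computes the standardized third moment.

```python
import random
import statistics


def skewness(values):
    """Standardized third central moment (0 for symmetric data)."""
    values = list(values)
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


rng = random.Random(0)

# Raw observations: these keep the skewed exponential shape.
raw = [rng.expovariate(1.0) for _ in range(20000)]

# Means of samples of size 50: these should be much closer to symmetric.
means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(50))
    for _ in range(2000)
]

print(f"skewness of raw data:     {skewness(raw):.2f}")
print(f"skewness of sample means: {skewness(means):.2f}")
```

The raw data should remain strongly right-skewed however many points are drawn, while the sample means should show only mild skew that fades further as the sample size increases.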
Key idea
The central limit theorem says sample means become approximately normally distributed as the sample size n grows, with the standard error shrinking in proportion to one over the square root of n.