Overview
The term mean refers to a central value that summarizes a collection of numbers. In both mathematics and statistics the mean is one of several measures commonly called an average. Although people often use "average" and "mean" interchangeably, the word "mean" has specific formal definitions that depend on which kind of mean is intended.
Common kinds of mean
Several types of mean are used in different contexts. Each emphasizes a different mathematical property and can lead to different numerical summaries for the same data set. The most widely used are:
- Arithmetic mean: the sum of values divided by the count. For numbers x1, x2, …, xn it is (x1 + x2 + … + xn)/n. It is the familiar "add up and divide" average.
- Geometric mean: the nth root of the product of n positive numbers, (x1 · x2 · … · xn)^(1/n). It is appropriate for quantities that combine multiplicatively, such as growth rates or ratios.
- Harmonic mean: n divided by the sum of reciprocals, n / (1/x1 + 1/x2 + … + 1/xn). It is typically used for averaging rates (for example, the average speed over several legs of equal distance).
- Weighted mean: each value contributes in proportion to an associated weight. For weights w1, w2, …, wn it is (w1·x1 + w2·x2 + … + wn·xn) / (w1 + w2 + … + wn). This generalizes the arithmetic mean when observations have different importance or frequency; equal weights recover the arithmetic mean.
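The four definitions above can be sketched directly in Python. This is a minimal illustration, not a reference implementation: the function names and the sample data are chosen here for the example, and the positivity checks on the geometric and harmonic means are an assumption matching the typical use cases described above.

```python
import math


def arithmetic_mean(xs):
    """Sum of the values divided by their count."""
    return sum(xs) / len(xs)


def geometric_mean(xs):
    """nth root of the product, computed through logarithms to avoid overflow."""
    if any(x <= 0 for x in xs):
        raise ValueError("geometric mean requires positive values")
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def harmonic_mean(xs):
    """Count divided by the sum of reciprocals (positive values assumed here)."""
    if any(x <= 0 for x in xs):
        raise ValueError("this sketch assumes positive values")
    return len(xs) / sum(1 / x for x in xs)


def weighted_mean(xs, ws):
    """Sum of weight * value divided by the total weight."""
    if len(xs) != len(ws):
        raise ValueError("values and weights must have the same length")
    total = sum(ws)
    if total == 0:
        raise ValueError("total weight must be nonzero")
    return sum(w * x for x, w in zip(xs, ws)) / total


data = [2.0, 4.0, 8.0]  # illustrative data, not from the text
print(arithmetic_mean(data))           # 14/3, about 4.667
print(geometric_mean(data))            # about 4.0
print(harmonic_mean(data))             # 24/7, about 3.429
print(weighted_mean(data, [1, 1, 2]))  # (2 + 4 + 16) / 4 = 5.5
```

The same data set yields three different unweighted summaries, which is why it matters to say which mean is being reported.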
Illustrative example and an outlier effect
Consider the data set {1, 2, 2, 100, 100}. The arithmetic mean equals (1 + 2 + 2 + 100 + 100) / 5 = 205 / 5 = 41. Although 41 is numerically correct as the arithmetic mean, it is not close to any individual value: three observations are at most 2, and the two large observations (100, 100) pull the average upward. This demonstrates the arithmetic mean's sensitivity to extreme values: a few large or small observations can pull the mean away from the bulk of the data. In such cases, other summaries like the median (here 2) or a trimmed mean may better represent a typical value.
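The comparison can be reproduced with a short script. It is a sketch under one assumption: `trimmed_mean` is a hypothetical helper written for this example that drops the k smallest and k largest values before averaging, which is one common way to define a trimmed mean.

```python
import statistics


def trimmed_mean(xs, k):
    """Drop the k smallest and k largest values, then average the rest."""
    if k < 0 or 2 * k >= len(xs):
        raise ValueError("k must leave at least one value")
    kept = sorted(xs)[k:len(xs) - k]
    return sum(kept) / len(kept)


data = [1, 2, 2, 100, 100]

mean_value = statistics.mean(data)      # 205 / 5 = 41
median_value = statistics.median(data)  # middle of the sorted values = 2
trimmed_value = trimmed_mean(data, 1)   # mean of [2, 2, 100] = 104 / 3

print(mean_value, median_value, round(trimmed_value, 2))
```

Trimming one value from each end still leaves one of the 100s in the average, so the trimmed mean stays high here. Because the large values make up two of the five observations, only the median ignores them entirely.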
Mean in statistics: population and sample
In statistical contexts the mean appears with two interpretations. The population mean denotes the true average of an entire population and is usually written with the Greek letter μ. The sample mean, written x̄ (x-bar), is computed from a finite sample drawn from that population. The sample mean is a primary estimator of the population mean. For a random sample from a population with a finite mean, the sample mean is unbiased (its expected value equals the population mean). If the observations are independent and the population variance is also finite, the central limit theorem implies that the distribution of the sample mean approaches a normal (Gaussian) shape as the sample size grows, which underpins many methods of statistical inference.
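A small simulation can make the estimator's behavior concrete. Everything specific below is an assumption made for illustration: the population is taken to be exponential with mean μ = 2, the sample size is 50, and the seed is fixed only so the run is repeatable.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

MU = 2.0     # assumed population mean (exponential population, also sd = 2)
N = 50       # observations per sample
REPS = 2000  # number of independent samples


def draw_sample(n):
    """Draw n independent observations from the assumed population."""
    return [rng.expovariate(1 / MU) for _ in range(n)]


# One sample mean per repetition.
sample_means = [statistics.fmean(draw_sample(N)) for _ in range(REPS)]

center = statistics.fmean(sample_means)  # should be close to MU
spread = statistics.stdev(sample_means)  # should be close to 2 / sqrt(50), about 0.283

print(f"average of sample means: {center:.3f}")
print(f"std. dev. of sample means: {spread:.3f}")
```

The average of the sample means should land near μ, consistent with unbiasedness, and their spread should land near σ/√n. A histogram of `sample_means` would look roughly bell-shaped even though the underlying population is strongly skewed.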
Mathematical and practical properties
- Linearity: for data sets x and y of equal length and constants a and b, the arithmetic mean of the element-wise combination a·x + b·y equals a·mean(x) + b·mean(y).
- Least squares: the arithmetic mean minimizes the sum of squared deviations from itself, making it the best single-number predictor in a mean squared error sense.
- Sensitivity to scale and units: the arithmetic mean behaves predictably under changes of scale and origin; multiplying all observations by a constant multiplies the mean by the same constant, and adding a constant to all observations adds that constant to the mean.
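- Robustness: the geometric mean as defined above applies only to positive values, and the harmonic mean is undefined when any value is zero. None of these means is robust to extreme values; when robustness to outliers is desired, alternatives such as the trimmed mean or the median (which is not a mean) are used.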
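The linearity, least-squares, and scaling properties listed above can be checked numerically on a small example. The data sets, the constants, and the helper `sum_sq_dev` are illustrative choices, not part of any standard API.

```python
def mean(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)


def sum_sq_dev(xs, c):
    """Sum of squared deviations of the values from a candidate center c."""
    return sum((x - c) ** 2 for x in xs)


x = [1.0, 4.0, 7.0]
y = [2.0, 2.0, 8.0]
a, b = 3.0, -2.0

# Linearity: mean(a*x + b*y) equals a*mean(x) + b*mean(y).
combo = [a * xi + b * yi for xi, yi in zip(x, y)]
lhs = mean(combo)
rhs = a * mean(x) + b * mean(y)
print(lhs, rhs)  # 4.0 4.0

# Least squares: the mean gives a smaller sum of squared deviations
# than nearby candidate centers.
m = mean(x)
print(sum_sq_dev(x, m))        # 18.0
print(sum_sq_dev(x, m - 0.5))  # 18.75
print(sum_sq_dev(x, m + 0.5))  # 18.75

# Scaling: multiplying every observation by 10 multiplies the mean by 10.
print(mean([10 * xi for xi in x]))  # 40.0
```

Checking two nearby candidates does not prove the least-squares property. The general argument is that the sum of squared deviations is a quadratic in c whose derivative vanishes exactly at the mean.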
History, generalizations, and uses
Means have been used for centuries to summarize data. Beyond the basic types described above, mathematicians define generalized power means (also called Hölder or generalized means), which interpolate between the harmonic, geometric and arithmetic means by varying an exponent p: for positive values, the power mean is ((x1^p + x2^p + … + xn^p) / n)^(1/p), so p = −1 gives the harmonic mean, p = 1 gives the arithmetic mean, and the limit as p → 0 gives the geometric mean. In applied fields, means appear throughout science, engineering, finance, and policy: they summarize central tendency, support decision-making, and form components of more complex indices. Choosing the appropriate mean depends on the data scale, the presence of outliers, and the substantive meaning of averaging in the application.
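A power mean can be sketched in a few lines. The function name `power_mean` and the sample data are chosen for this example, and the sketch assumes positive inputs. The case p = 0 is handled separately as the geometric mean, since it is the limit of the general formula as p approaches 0.

```python
import math


def power_mean(xs, p):
    """Power (generalized) mean with exponent p, for positive values."""
    if any(x <= 0 for x in xs):
        raise ValueError("this sketch assumes positive values")
    n = len(xs)
    if p == 0:
        # Limit as p -> 0: the geometric mean, computed through logarithms.
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1 / p)


data = [1.0, 2.0, 4.0]  # illustrative data
for p in (-1, 0, 1, 2):
    print(p, round(power_mean(data, p), 4))
# p = -1: harmonic mean, 12/7
# p =  0: geometric mean, 2
# p =  1: arithmetic mean, 7/3
# p =  2: quadratic mean (root mean square), sqrt(7)
```

For data that are not all equal, the result increases with p, which is the ordering expressed by the generalized mean inequality.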
Distinctions and notable facts
It is important to distinguish between different averages: mean (many varieties), median (middle value), and mode (most frequent value). For positive data, the inequality M_harmonic ≤ M_geometric ≤ M_arithmetic holds (a special case of the generalized mean inequality), with equality only when all values are equal. When summarizing data, stating which mean is used, and why, helps avoid misinterpretation.
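The ordering of the three means can be checked with Python's standard `statistics` module (`geometric_mean` and `fmean` require Python 3.8 or later). The helper `hm_gm_am` and the two data sets are illustrative choices. Comparisons in the equal-values case use a tolerance because the results are floating-point.

```python
import math
import statistics


def hm_gm_am(xs):
    """Return the harmonic, geometric, and arithmetic means of positive data."""
    return (
        statistics.harmonic_mean(xs),
        statistics.geometric_mean(xs),
        statistics.fmean(xs),
    )


# Unequal values: the inequality is strict.
hm, gm, am = hm_gm_am([1, 2, 4])
print(hm < gm < am)  # True (about 1.714 < 2 < 2.333)

# Equal values: all three means coincide, up to rounding.
equal_case = hm_gm_am([3, 3, 3])
print(all(math.isclose(v, 3.0) for v in equal_case))  # True
```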
Further reading on definitions and mathematical properties can be found in introductory texts on probability and mathematical statistics; basic encyclopedic entries or primers typically contrast the mean with other measures of central tendency and illustrate practical examples of when each is appropriate.