# Mathematical Statistics Lesson of the Day – Minimally Sufficient Statistics

November 18, 2014

In using a statistic to estimate a parameter in a probability distribution, it is important to remember that there can be multiple sufficient statistics for the same parameter. Indeed, the entire data set, $X_1, X_2, \ldots, X_n$, can be a sufficient statistic – it certainly contains all of the information that is needed to estimate the parameter. However, using all $n$ variables is not very satisfying as a sufficient statistic, because it doesn’t reduce the information in any meaningful way – and a more compact, concise statistic is better than a complicated, multi-dimensional statistic. If we can use a lower-dimensional statistic that still contains all necessary information for estimating the parameter, then we have truly reduced our data set without stripping any value from it.
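As a minimal sketch of this idea, assume the data are i.i.d. Bernoulli draws with success probability $\theta$ (a model chosen here for illustration; the post does not fix one). Under that assumption the likelihood depends on the sample only through its sum, so two different samples with the same sum give the same likelihood at every $\theta$. The function name `bernoulli_likelihood` and the sample values are hypothetical.

```python
import math

def bernoulli_likelihood(sample, theta):
    """Likelihood of an i.i.d. Bernoulli(theta) sample of 0s and 1s."""
    return math.prod(theta if x == 1 else 1.0 - theta for x in sample)

# Two different samples of size 5 with the same sum (3 successes each).
sample_a = [1, 1, 1, 0, 0]
sample_b = [0, 1, 0, 1, 1]

# A small grid of candidate parameter values (illustrative only).
thetas = [0.1, 0.25, 0.5, 0.75, 0.9]
lik_a = [bernoulli_likelihood(sample_a, t) for t in thetas]
lik_b = [bernoulli_likelihood(sample_b, t) for t in thetas]

# The two likelihoods agree at every theta on the grid (up to rounding),
# so knowing the sum alone is enough to evaluate the likelihood.
same_likelihood = all(math.isclose(a, b) for a, b in zip(lik_a, lik_b))
print(same_likelihood)
```

Here a five-dimensional data set is replaced by a single number without changing anything the likelihood can tell us about $\theta$.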

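Our saviour for this problem is a **minimally sufficient statistic**. This is defined as a statistic, $T(\mathbf{X})$, of the data $\mathbf{X} = (X_1, X_2, \ldots, X_n)$, such that

- $T(\mathbf{X})$ is a sufficient statistic
- if $U(\mathbf{X})$ is any other sufficient statistic, then there exists a function $g$ such that $T(\mathbf{X}) = g[U(\mathbf{X})]$

Note that, if there exists a one-to-one function $h$ such that

$$T(\mathbf{X}) = h[U(\mathbf{X})],$$

then $T(\mathbf{X})$ and $U(\mathbf{X})$ are equivalent.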
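To make the two conditions concrete, here is a small sketch that assumes an i.i.d. Bernoulli sample of size 4 (the model, the sample size, and the Python names `T`, `U`, `g` and `sample_mean` are chosen for illustration and do not come from the post). The pair of half-sample sums `U` is sufficient, and the total `T` is recovered from it by addition, so `T = g(U)`. That map is not one-to-one, so `U` is a less reduced statistic than `T`. The sample mean, by contrast, is a one-to-one function of `T` and is therefore equivalent to it.

```python
from itertools import product

n = 4  # sample size, kept small so every possible sample can be enumerated

def T(x):
    """Total number of successes in the sample."""
    return sum(x)

def U(x):
    """Pair of half-sample sums: sufficient, but less reduced than T."""
    half = len(x) // 2
    return (sum(x[:half]), sum(x[half:]))

def g(u):
    """Recovers T from U by adding the two half-sample sums."""
    return u[0] + u[1]

def sample_mean(x):
    """Sample mean, which equals T divided by the sample size."""
    return sum(x) / len(x)

samples = list(product([0, 1], repeat=n))

# Condition from the definition: T = g(U) for every possible sample.
t_is_function_of_u = all(T(x) == g(U(x)) for x in samples)

# U takes more distinct values than T, so g cannot be one-to-one.
u_values = {U(x) for x in samples}
t_values = {T(x) for x in samples}
g_is_one_to_one = len(u_values) == len(t_values)

# T and the sample mean pair up value for value, so they are equivalent.
mean_values = {sample_mean(x) for x in samples}
pairs = {(T(x), sample_mean(x)) for x in samples}
mean_equivalent_to_t = len(pairs) == len(t_values) == len(mean_values)

print(t_is_function_of_u, g_is_one_to_one, mean_equivalent_to_t)
```

The enumeration only checks these relations for this one small model; it does not by itself prove that `T` is minimally sufficient, which requires showing that `T` is a function of every sufficient statistic.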

I think that before getting too excited about sufficient statistics, one should consider the Pitman-Koopman lemma, which states that only a limited class of distributions allows for dimension reduction via sufficiency. Among families whose support does not depend on the parameter, those distributions are the exponential families.

I did not know about this – thanks for your comment, Christian.

If the use of sufficient statistics for data reduction is practically restricted to exponential families, are there other strategies for data reduction that work for all families (or at least more families)?

I wish that my mathematical statistics classes at Simon Fraser University and the University of Toronto had taught me about this theorem. This caution about the limitations of sufficient statistics is wise and valuable to know.

I became aware of the PKD lemma only after my PhD, when reading Lehmann in full detail! Now I teach it to undergrads as a cautionary tale, along with another one: the range of transforms of a mean parameter that allow for unbiased estimator(s) is mostly restricted to polynomials of that mean, a negligible set in the collection of functions…