Mathematical Statistics Lesson of the Day – Sufficient Statistics

Update on 2014-11-06: Thanks to Christian Robert’s comment, I have removed the sample median as an example of a sufficient statistic.

Suppose that you collected data

\mathbf{X} = X_1, X_2, ..., X_n

in order to estimate a parameter \theta.  Let f_\theta(x) be the joint probability density function (PDF)* for X_1, X_2, ..., X_n.

Let

t = T(\mathbf{X})

be a statistic based on \mathbf{X}.  Let g_\theta(t) be the PDF for T(\mathbf{X}).

If the conditional PDF of the data given the statistic,

h_\theta(\mathbf{X}) = f_\theta(x) \div g_\theta[T(\mathbf{X})]

does not depend on \theta, then T(\mathbf{X}) is a sufficient statistic for \theta.  In other words,

h_\theta(\mathbf{X}) = h(\mathbf{X}),

and \theta does not appear in h(\mathbf{X}).

Intuitively, this means that T(\mathbf{X}) contains all of the information in the data about \theta: once you know T(\mathbf{X}) (i.e. once you condition f_\theta(x) on T(\mathbf{X})), the rest of the data tells you nothing further about \theta.
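To make the definition concrete, here is a minimal sketch in Python. It uses an example that is not in the post above: n independent Bernoulli(\theta) observations with T(\mathbf{X}) equal to their sum, which follows a Binomial(n, \theta) distribution. The function names (joint_pmf, pmf_of_sum, conditional_ratio) and the data are made up for illustration. The ratio f_\theta(x) \div g_\theta[T(\mathbf{X})] should come out the same for every value of \theta.

```python
from math import comb

def joint_pmf(x, theta):
    """Joint PMF of i.i.d. Bernoulli(theta) observations x (a list of 0s and 1s)."""
    t = sum(x)
    return theta ** t * (1 - theta) ** (len(x) - t)

def pmf_of_sum(t, n, theta):
    """PMF of T = X_1 + ... + X_n, which is Binomial(n, theta)."""
    return comb(n, t) * theta ** t * (1 - theta) ** (n - t)

def conditional_ratio(x, theta):
    """The ratio f_theta(x) / g_theta(T(x)) from the definition of sufficiency."""
    return joint_pmf(x, theta) / pmf_of_sum(sum(x), len(x), theta)

# Hypothetical data: 5 observations, 3 of which are successes.
x = [1, 0, 1, 1, 0]
ratios = [conditional_ratio(x, theta) for theta in (0.2, 0.5, 0.9)]
print(ratios)  # every entry equals 1 / comb(5, 3), up to floating-point rounding
```

The \theta terms cancel in the ratio, leaving 1 \div \binom{n}{t}, so the sum is a sufficient statistic for \theta in this model.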

Often, the sufficient statistic for \theta is a summary statistic of X_1, X_2, ..., X_n, such as their

  • sample mean
  • sample median – removed thanks to comment by Christian Robert (Xi’an)
  • sample minimum
  • sample maximum

If such a summary statistic is sufficient for \theta, then knowing this one statistic is just as useful as knowing all n data for estimating \theta.
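As an illustration of the sample maximum playing this role, here is a rough sketch under an assumed model that the post itself does not specify: n independent Uniform(0, \theta) observations. In that model, the joint PDF is \theta^{-n} whenever all of the data lie in [0, \theta], and the PDF of the maximum at t is n t^{n-1} \div \theta^n. The function names and the data below are hypothetical.

```python
import math

def uniform_joint_pdf(x, theta):
    """Joint PDF of i.i.d. Uniform(0, theta) observations x."""
    inside = min(x) >= 0 and max(x) <= theta
    return theta ** (-len(x)) if inside else 0.0

def pdf_of_max(t, n, theta):
    """PDF of the sample maximum of n i.i.d. Uniform(0, theta) observations."""
    return n * t ** (n - 1) / theta ** n if 0 <= t <= theta else 0.0

def uniform_ratio(x, theta):
    """The ratio f_theta(x) / g_theta(T(x)), with T(x) = max(x)."""
    return uniform_joint_pdf(x, theta) / pdf_of_max(max(x), len(x), theta)

# Hypothetical data; only values of theta at or above max(x) are tried,
# because both densities are zero otherwise.
x = [0.3, 1.2, 0.7, 2.0]
uniform_ratios = [uniform_ratio(x, theta) for theta in (2.5, 4.0, 10.0)]
print(uniform_ratios)  # each entry is 1 / (n * max(x)**(n - 1)), up to rounding
```

For any \theta that is consistent with the data, the ratio reduces to 1 \div (n t^{n-1}), which contains no \theta. Under this assumed model, then, knowing the maximum alone is as useful as knowing all n observations.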

*The above definition holds for both discrete and continuous random variables; in the discrete case, read "PDF" as "probability mass function".

4 Responses to Mathematical Statistics Lesson of the Day – Sufficient Statistics

  1. xi'an says:

    You mention the median among the potential sufficient statistics. I just cannot think of a non-trivial example where this occurs. Would you know any? Thank you.

    • Hi Christian,

      Thank you very much for your comment. I was wrong to include the sample median in the list of sufficient statistics. After much thought, I cannot think of such an example, and my searches on the Internet have revealed no examples, either. Thus, I have deleted it from the list.

      Thanks again for your astute comment.

      Eric

      • xi'an says:

        Thanks for putting this on the list! It made me think with a piece of paper for half an hour, which does not happen soon enough. I think I can prove it cannot happen with exponential families.
