November 5, 2014
*Update on 2014-11-06: Thanks to Christian Robert’s comment, I have removed the sample median as an example of a sufficient statistic.
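Suppose that you collected data
$$X_1, X_2, \dots, X_n$$
in order to estimate a parameter $\theta$. Let $f_\theta(x)$ be the probability density function (PDF)* for $X = (X_1, X_2, \dots, X_n)$.
Let $T(X)$ be a statistic based on $X$. Let $g_\theta(t)$ be the PDF for $T(X)$.
If the conditional PDF
$$h_\theta(x) = \frac{f_\theta(x)}{g_\theta(T(x))}$$
is independent of $\theta$, then $T(X)$ is a sufficient statistic for $\theta$. In other words,
$$h_\theta(x) = h(x),$$
and $\theta$ does not appear in $h(x)$.
Intuitively, this means that $T(X)$ contains everything you need to estimate $\theta$, so knowing $T(X)$ (i.e. conditioning $X$ on $T(X)$) is sufficient for estimating $\theta$.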
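As a quick numerical check of this definition, here is a minimal sketch in Python. It assumes a model that is not in the post itself: $n$ independent Bernoulli($\theta$) observations, with the sum of the data as the statistic. The function names and the data are hypothetical. Here `joint_pmf` plays the role of the joint distribution of the data, `pmf_of_sum` is the distribution of the statistic, and `conditional_pmf` is their ratio. If the sum is sufficient, the ratio should not change as $\theta$ changes.

```python
from math import comb

def joint_pmf(x, theta):
    """Joint PMF of independent Bernoulli(theta) observations x."""
    t = sum(x)
    return theta**t * (1 - theta)**(len(x) - t)

def pmf_of_sum(t, n, theta):
    """PMF of the statistic sum(X), which is Binomial(n, theta)."""
    return comb(n, t) * theta**t * (1 - theta)**(n - t)

def conditional_pmf(x, theta):
    """Ratio of the joint PMF to the PMF of the statistic, evaluated at sum(x)."""
    return joint_pmf(x, theta) / pmf_of_sum(sum(x), len(x), theta)

x = [1, 0, 1, 1, 0]  # hypothetical data: n = 5, sum = 3
ratios = [conditional_pmf(x, theta) for theta in (0.1, 0.3, 0.5, 0.9)]
print(ratios)  # every entry equals 1 / comb(5, 3), up to floating-point rounding
```

The $\theta$ terms cancel in the ratio, leaving $1 / \binom{n}{t}$, so in this assumed model the sum (and equivalently the sample mean) is sufficient for $\theta$.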
Often, the sufficient statistic for $\theta$ is a summary statistic of $X_1, X_2, \dots, X_n$, such as their
- sample mean
- (sample median: removed thanks to a comment by Christian Robert (Xi’an))
- sample minimum
- sample maximum
If such a summary statistic is sufficient for $\theta$, then knowing this one statistic is just as useful as knowing all $n$ data for estimating $\theta$.
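To illustrate what "just as useful" means, the sketch below assumes a normal model with unknown mean and known standard deviation 1, a case where the sample mean is sufficient for the mean. The model, the function name `normal_loglik`, and both data sets are assumptions made for illustration. Two data sets with the same size and the same sample mean should have log-likelihoods that differ only by a constant that does not involve the parameter, so they lead to the same estimate.

```python
from math import log, pi

def normal_loglik(data, mu, sigma=1.0):
    """Log-likelihood of independent Normal(mu, sigma^2) observations."""
    return sum(
        -0.5 * log(2 * pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
        for x in data
    )

a = [1.0, 2.0, 3.0]  # hypothetical data, sample mean 2
b = [0.0, 2.0, 4.0]  # different data, same sample mean 2

# The difference in log-likelihoods at several candidate values of mu.
diffs = [normal_loglik(a, mu) - normal_loglik(b, mu) for mu in (-1.0, 0.0, 2.0, 5.0)]
print(diffs)  # the same value at every mu, up to floating-point rounding
```

Because the difference is flat in the parameter, the two likelihood curves have the same shape and peak at the same place. Under the assumed model, the individual observations add nothing beyond the sample mean.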
*The above definition holds for both discrete and continuous random variables.