Mathematical Statistics Lesson of the Day – Sufficient Statistics
November 5, 2014
Update on 2014-11-06: Thanks to Christian Robert’s comment, I have removed the sample median as an example of a sufficient statistic.
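Suppose that you collected data $X_1, X_2, \ldots, X_n$ in order to estimate a parameter $\theta$. Let $f_\theta(x_1, x_2, \ldots, x_n)$ be the probability density function (PDF)* for $X_1, X_2, \ldots, X_n$.
Let $t = T(x_1, x_2, \ldots, x_n)$ be a statistic based on $X_1, X_2, \ldots, X_n$. Let $g_\theta(t)$ be the PDF for $T(X_1, X_2, \ldots, X_n)$.
If the conditional PDF
$h_\theta(x_1, x_2, \ldots, x_n) = f_\theta(x_1, x_2, \ldots, x_n) / g_\theta(t)$
is independent of $\theta$, then $T(X_1, X_2, \ldots, X_n)$ is a sufficient statistic for $\theta$. In other words,
$h_\theta(x_1, x_2, \ldots, x_n) = h(x_1, x_2, \ldots, x_n)$,
and $\theta$ does not appear in $h(x_1, x_2, \ldots, x_n)$.
Intuitively, this means that $T(X_1, X_2, \ldots, X_n)$ contains everything you need to estimate $\theta$, so knowing $T$ (i.e. conditioning on $T$) is sufficient for estimating $\theta$.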
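As a worked illustration of the definition (not part of the original post), assume the data are $n$ independent Bernoulli($\theta$) observations and take the statistic to be their sum. Here $f_\theta$ denotes the joint probability function of the data, $g_\theta$ the probability function of the statistic, and $h_\theta$ their ratio; the Bernoulli model and this notation are assumptions made for the example.

```latex
\begin{align*}
f_\theta(x_1, \ldots, x_n)
  &= \prod_{i=1}^{n} \theta^{x_i} (1 - \theta)^{1 - x_i}
   = \theta^{t} (1 - \theta)^{n - t},
   \qquad t = \sum_{i=1}^{n} x_i, \\
g_\theta(t)
  &= \binom{n}{t} \theta^{t} (1 - \theta)^{n - t}, \\
h_\theta(x_1, \ldots, x_n)
  &= \frac{f_\theta(x_1, \ldots, x_n)}{g_\theta(t)}
   = \frac{1}{\binom{n}{t}}.
\end{align*}
```

The parameter $\theta$ cancels in the ratio, so under this assumed model the sum (equivalently, the sample mean) is a sufficient statistic for $\theta$.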
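Often, the sufficient statistic for the parameter $\theta$ is a summary statistic of the data $X_1, X_2, \ldots, X_n$, such as their
- sample mean
- sample minimum
- sample maximum
(The sample median was previously on this list; it was removed thanks to a comment by Christian Robert (Xi’an).)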
If such a summary statistic is sufficient for the parameter $\theta$, then knowing this one statistic is just as useful as knowing all of the data for estimating $\theta$.
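A minimal numerical sketch of this idea, assuming i.i.d. Bernoulli data with success probability `theta` and using the sum of the observations as the statistic (the sum is just $n$ times the sample mean, so the two carry the same information). The model and all function names below are illustrative assumptions, not something taken from the post. The code computes the ratio of the joint probability of the data to the probability of the observed sum and checks that it comes out the same for several values of `theta`.

```python
from itertools import product
from math import comb, isclose


def joint_pmf(x, theta):
    """Joint PMF of i.i.d. Bernoulli(theta) observations x = (x_1, ..., x_n)."""
    t = sum(x)
    return theta**t * (1 - theta) ** (len(x) - t)


def sum_pmf(t, n, theta):
    """PMF of the sum of n i.i.d. Bernoulli(theta) draws, i.e. Binomial(n, theta)."""
    return comb(n, t) * theta**t * (1 - theta) ** (n - t)


def conditional_pmf(x, theta):
    """Ratio f_theta(x) / g_theta(t): the PMF of the data given the observed sum t."""
    return joint_pmf(x, theta) / sum_pmf(sum(x), len(x), theta)


def max_theta_dependence(n, thetas):
    """Largest spread of the conditional PMF across thetas, over all binary samples of size n."""
    worst = 0.0
    for x in product((0, 1), repeat=n):
        values = [conditional_pmf(x, theta) for theta in thetas]
        worst = max(worst, max(values) - min(values))
    return worst


if __name__ == "__main__":
    # Assumed example values; any theta strictly between 0 and 1 works.
    print(max_theta_dependence(4, [0.2, 0.5, 0.9]))  # zero up to floating-point error
    print(conditional_pmf((1, 0, 1, 0), 0.3), 1 / comb(4, 2))  # the two numbers agree
```

Because the ratio equals `1 / comb(n, t)` whatever `theta` is, the sample carries no further information about `theta` once its sum is known, which is what sufficiency asserts for this model.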
*The above definition holds for both discrete and continuous random variables.
You mention the median among the potential sufficient statistics. I just cannot think of a non-trivial example where this occurs. Would you know any? Thank you.
Hi Christian,
Thank you very much for your comment. I was wrong to include the sample median in the list of sufficient statistics. After much thought, I cannot think of such an example, and my searches on the Internet have revealed no examples, either. Thus, I have deleted it from the list.
Thanks again for your astute comment.
Eric
Thanks for putting this on the list! It made me think with a piece of paper for half an hour, which does not happen soon enough. I think I can prove it cannot happen with exponential families.