# Mathematical Statistics Lesson of the Day – Sufficient Statistics

November 5, 2014

*Update on 2014-11-06: Thanks to Christian Robert’s comment, I have removed the sample median as an example of a sufficient statistic.*

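Suppose that you collected data

$$X_1, X_2, \ldots, X_n$$

in order to **estimate** a **parameter** $\theta$. Let $f_\theta(x_1, x_2, \ldots, x_n)$ be the **probability density function (PDF)**\* for $X_1, X_2, \ldots, X_n$.

Let

$$t = T(x_1, x_2, \ldots, x_n)$$

be a **statistic** based on $X_1, X_2, \ldots, X_n$. Let $g_\theta(t)$ be the PDF for $T(X_1, X_2, \ldots, X_n)$.

If the **conditional PDF**

$$h_\theta(x_1, x_2, \ldots, x_n) = \frac{f_\theta(x_1, x_2, \ldots, x_n)}{g_\theta[T(x_1, x_2, \ldots, x_n)]}$$

is **independent** of $\theta$, then $T(X_1, X_2, \ldots, X_n)$ is a **sufficient statistic** for $\theta$. In other words,

$$h_\theta(x_1, x_2, \ldots, x_n) = h(x_1, x_2, \ldots, x_n),$$

and $\theta$ does not appear in $h(x_1, x_2, \ldots, x_n)$.

Intuitively, this means that $T(X_1, X_2, \ldots, X_n)$ contains everything you need to estimate $\theta$, so knowing $T(X_1, X_2, \ldots, X_n)$ (i.e. conditioning on $T(X_1, X_2, \ldots, X_n)$) is sufficient for estimating $\theta$.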
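
To make the definition concrete, here is a short worked example that is not part of the original lesson. It assumes the data are $n$ independent Bernoulli observations with success probability $\theta$, and it takes the statistic to be their sum, $T = \sum_{i=1}^{n} X_i$. The names $f_\theta$, $g_\theta$ and $h_\theta$ below stand for the joint PMF of the data, the PMF of $T$, and their ratio, respectively. With $t = \sum_{i=1}^{n} x_i$,

$$f_\theta(x_1, \ldots, x_n) = \prod_{i=1}^{n} \theta^{x_i} (1 - \theta)^{1 - x_i} = \theta^{t} (1 - \theta)^{n - t},$$

$$g_\theta(t) = \binom{n}{t} \theta^{t} (1 - \theta)^{n - t},$$

$$h_\theta(x_1, \ldots, x_n) = \frac{f_\theta(x_1, \ldots, x_n)}{g_\theta(t)} = \frac{1}{\binom{n}{t}}.$$

The ratio does not contain $\theta$, so under these assumptions the sum is a sufficient statistic for $\theta$. Because the sample mean is a one-to-one function of the sum, the sample mean is sufficient as well.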

Often, the sufficient statistic for $\theta$ is a **summary statistic** of $X_1, X_2, \ldots, X_n$, such as their

- **sample mean**
- **sample median** (removed thanks to a comment by Christian Robert, Xi’an)
- **sample minimum**
- **sample maximum**

If such a summary statistic is sufficient for $\theta$, then knowing this one statistic is just as useful as knowing all $n$ data for estimating $\theta$.
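
As a numerical check of this idea, the sketch below is my own illustration and makes several assumptions: the data are i.i.d. Poisson with rate `lam`, the statistic is the sample sum (equivalent to the sample mean), and the helper names `poisson_pmf` and `conditional_pmf` are hypothetical. It divides the joint PMF of a small data set by the PMF of the sum, which is Poisson with rate `n * lam`, and repeats this for several rates.

```python
from math import exp, factorial, isclose, prod


def poisson_pmf(k, rate):
    """PMF of a Poisson(rate) random variable evaluated at the integer k."""
    return exp(-rate) * rate**k / factorial(k)


def conditional_pmf(xs, lam):
    """Joint PMF of i.i.d. Poisson(lam) data divided by the PMF of their sum.

    The sum of n i.i.d. Poisson(lam) variables is Poisson(n * lam).
    """
    n, t = len(xs), sum(xs)
    joint = prod(poisson_pmf(x, lam) for x in xs)
    return joint / poisson_pmf(t, n * lam)


data = (2, 0, 3)  # made-up observations, for illustration only
for lam in (0.5, 1.0, 4.0):
    # The printed value should be the same for every rate.
    print(lam, conditional_pmf(data, lam))
```

Up to floating-point rounding, every rate gives the same value, $5! / (2! \, 0! \, 3!) \cdot 3^{-5} = 10/243$. Once the sum is known, the individual observations carry no further information about the rate, which is what sufficiency of the sum (and hence of the sample mean) means in this model.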

\**The above definition holds for both discrete and continuous random variables.*

You mention the median among the potential sufficient statistics. I just cannot think of a non-trivial example where this occurs. Would you know any? Thank you.

Hi Christian,

Thank you very much for your comment. I was wrong to include the sample median in the list of sufficient statistics. After much thought, I cannot think of such an example, and my searches on the Internet have revealed no examples, either. Thus, I have deleted it from the list.

Thanks again for your astute comment.

Eric

Thanks for putting this on the list! It made me think with a piece of paper for half an hour, which does not happen soon enough. I think I can prove it cannot happen with exponential families.