Mathematical Statistics Lesson of the Day – Basu’s Theorem

Today’s Statistics Lesson of the Day will discuss Basu’s theorem, which connects the previously discussed concepts of minimally sufficient statistics, complete statistics and ancillary statistics.  As before, I will begin with the following set-up.

Suppose that you collected data

\mathbf{X} = X_1, X_2, ..., X_n

in order to estimate a parameter \theta.  Let f_\theta(x) be the probability density function (PDF) or probability mass function (PMF) for X_1, X_2, ..., X_n.

Let

t = T(\mathbf{X})

be a statistics based on \textbf{X}.

Basu’s theorem states that, if T(\textbf{X}) is a complete and minimal sufficient statistic, then T(\textbf{X}) is independent of every ancillary statistic.

Establishing the independence between 2 random variables can be very difficult if their joint distribution is hard to obtain.  This theorem allows the independence between minimally sufficient statistic and every ancillary statistic to be established without their joint distribution – and this is the great utility of Basu’s theorem.

However, establishing that a statistic is complete can be a difficult task.  In a later lesson, I will discuss another theorem that will make this task easier for certain cases.

Analytical Chemistry Lesson of the Day – Specificity in Method Validation and Quality Assurance

In pharmaceutical chemistry, one of the requirements for method validation is specificity, the ability of an analytical method to distinguish the analyte from other chemicals in the sample.  The specificity of the method may be assessed by deliberately adding impurities into a sample containing the analyte and testing how well the method can identify the analyte.

Statistics is an important tool in analytical chemistry, and, ideally, there is no overlap in the vocabulary that is used between the 2 fields.  Unfortunately, the above definition of specificity is different from that in statistics.  In a previous Machine Learning and Applied Statistics Lesson of the Day, I introduced the concepts of sensitivity and specificity in binary classification.  In the context of assessing the predictive accuracy of a binary classifier, its specificity is the proportion of truly negative cases among the classified negative cases.