Applied Statistics Lesson of the Day – The Matched-Pair (or Paired) t-Test

My last lesson introduced the matched pairs experimental design, which is a special type of the randomized blocked design.  Let’s now talk about how to analyze the data from such a design.

Since the experimental units are organized in pairs, the units between pairs (blocks) are not independently assigned.  (The units within each pair are independently assigned – returning to the glove example, one hand is randomly chosen to wear the nitrile glove, while the other is randomly chosen to wear the latex glove.)  Because of this lack of independence between pairs, the independent 2-sample t-test is not applicable.  Instead, use the matched pair t-test (also called the paired or the paired difference t-test).  This is really a 1-sample t-test that tests the difference between the responses of the experimental and the control groups.


Applied Statistics Lesson of the Day: Sample Size and Replication in Experimental Design

The goal of an experiment is to determine

  1. whether or not there is a cause-and-effect relationship between the factor and the response
  2. the strength of the causal relationship, should such a relationship exist.

To answer these questions, the response variable is measured in both the control group and the experimental group.  If there is a difference between the 2 responses, then there is evidence to suggest that the causal relationship exists, and the difference can be measured and quantified.

However, in most* experiments, there is random variation in the response.  Random variation exists in the natural sciences, and there is even more of it in the social sciences.  Thus, an observed difference between the control and experimental groups could be mistakenly attributed to a cause-and-effect relationship when the source of the difference is really just random variation.  In short, the difference may simply be due to the noise rather than the signal.  

To detect an actual difference beyond random variation (i.e. to obtain a higher signal-to-noise ratio), it is important to use replication to obtain a sufficiently large sample size in the experiment.  Replication is the repeated application of the treatments to multiple independently assigned experimental units.  (Recall that randomization is an important part of controlling for confounding variables in an experiment.  Randomization ensures that the experimental units are independently assigned to the different treatments.)  The number of independently assigned experimental units that receive the same treatment is the sample size.

*Deterministic computer experiments are unlike most experiments; they do not have random variation in the responses.

Applied Statistics Lesson of the Day – Basic Terminology in Experimental Design #2: Controlling for Confounders

A well designed experiment must have good control, which is the reduction of effects from confounding variables.  There are several ways to do so:

  • Include a control group.  This group will receive a neutral treatment or a standard treatment.  (This treatment may simply be nothing.)  The experimental group will receive the new treatment or treatment of interest.  The response in the experimental group will be compared to the response in the control group to assess the effect of the new treatment or treatment of interest.  Any effect from confounding variables will affect both the control group and the experimental group equally, so the only difference between the 2 groups should be due to the new treatment or treatment of interest.
  • In medical studies with patients as the experimental units, it is common to include a placebo group.  Patients in the placebo group get a treatment that is known to have no effect.  This accounts for the placebo effect.
    • For example, in a drug study, a patient in the placebo group may get a sugar pill.
  • In experiments with human or animal subjects, participants and/or the experimenters are often blinded.  This means that they do not know which treatment the participant received.  This ensures that knowledge of receiving a particular treatment – for either the participant or the experimenters – is not a confounding variable.  An experiment that blinds both the participants and the experimenters is called a double-blinded experiment.
  • For confounding variables that are difficult or impossible to control for, the experimental units should be assigned to the control group and the experimental group by randomization.  This can be done with random number tables, flipping a coin, or random number generators from computers.  This ensures that confounding effects affect both the control group and the experimental group roughly equally.
    • For example, an experimenter wants to determine if the HPV vaccine will make new students immune to HPV.  There will be 2 groups: the control group will not receive the vaccine, and the experimental group will receive the vaccine.  If the experimenter can choose students from 2 schools for her study, then the students should be randomly assigned into the 2 groups, so that each group will have roughly the same number of students from each school.  This would minimize the confounding effect of the schools.