Applied Statistics Lesson of the Day – Fractional Factorial Design and the Sparsity-of-Effects Principle

Consider again an experiment that seeks to determine the causal relationships between G factors and the response, where G > 1.  Ideally, the sample size is large enough for a full factorial design to be used.  However, if the sample size is small and the number of possible treatments is large, then a fractional factorial design can be used instead.  Such a design assigns the experimental units to a select fraction of the treatments; these treatments are chosen carefully to investigate the most significant causal relationships, while leaving aside the insignificant ones.  

When, then, are the significant causal relationships?  According to the sparsity-of-effects principle, it is unlikely that complex, higher-order effects exist, and that the most important effects are the lower-order effects.  Thus, assign the experimental units so that main (1st-order) effects and the 2nd-order interaction effects can be investigated.  This may neglect the discovery of a few significant higher-order effects, but that is the compromise that a fractional factorial design makes when the sample size available is low and the number of possible treatments is high.  

Applied Statistics Lesson of the Day – The Full Factorial Design

An experimenter may seek to determine the causal relationships between G factors and the response, where G > 1.  On first instinct, you may be tempted to conduct G separate experiments, each using the completely randomized design with 1 factor.  Often, however, it is possible to conduct 1 experiment with G factors at the same time.  This is better than the first approach because

  • it is faster
  • it uses less resources to answer the same questions
  • the interactions between the G factors can be examined

Such an experiment requires the full factorial design; in this design, the treatments are all possible combinations of all levels of all factors.  After controlling for confounding variables and choosing the appropriate range and number of levels of the factor, the different treatments are applied to the different groups, and data on the resulting responses are collected.  

The simplest full factorial experiment consists of 2 factors, each with 2 levels.  Such an experiment would result in 2 \times 2 = 4 treatments, each being a combination of 1 level from the first factor and 1 level from the second factor.  Since this is a full factorial design, experimental units are independently assigned to all treatments.  The 2-factor ANOVA model is commonly used to analyze data from such designs.

In later lessons, I will discuss interactions and 2-factor ANOVA in more detail.

Applied Statistics Lesson of the Day – Blocking and the Randomized Complete Blocked Design (RCBD)

A completely randomized design works well for a homogeneous population – one that does not have major differences between any sub-populations.  However, what if a population is heterogeneous?

Consider an example that commonly occurs in medical studies.  An experiment seeks to determine the effectiveness of a drug on curing a disease, and 100 patients are recruited for this double-blinded study – 50 are men, and 50 are women.  An abundance of biological knowledge tells us that men and women have significantly physiologies, and this is a heterogeneous population with respect to gender.  If a completely randomized design is used for this study, gender could be a confounding variable; this is especially true if the experimental group has a much higher proportion of one gender, and the control group has a much higher proportion of the other gender.  (For instance, purely due to the randomness, 45 males may be assigned to the experimental group, and 45 females may be assigned to the control group.)  If a statistically significant difference in the patients’ survival from the disease is observed between such a pair of experimental and control groups, this effect could be attributed to the drug or to gender, and that would ruin the goal of determining the cause-and-effect relationship between the drug and survival from the disease.

To overcome this heterogeneity and control for the effect of gender, a randomized blocked design could be used.  Blocking is the division of the experimental units into homogeneous sub-populations before assigning treatments to them.  A randomized blocked design for our above example would divide the males and females into 2 separate sub-populations, and then each of these 2 groups is split into the experimental and control group.  Thus, the experiment actually has 4 groups:

  1. 25 men take the drug (experimental)
  2. 25 men take a placebo (control)
  3. 25 women take the drug (experimental)
  4. 25 women take a placebo (control)

Essentially, the population is divided into blocks of homogeneous sub-populations, and a completely randomized design is applied to each block.  This minimizes the effect of gender on the response and increases the precision of the estimate of the effect of the drug.

Applied Statistics Lesson of the Day – The Completely Randomized Design with 1 Factor

The simplest experimental design is the completely randomized design with 1 factor.  In this design, each experimental unit is randomly assigned to a factor level.  This design is most useful for a homogeneous population (one that does not have major differences between any sub-populations).  It is appealing because of its simplicity and flexibility – it can be used for a factor with any number of levels, and different treatments can have different sample sizes.  After controlling for confounding variables and choosing the appropriate range and number of levels of the factor, the different treatments are applied to the different groups, and data on the resulting responses are collected.  The means of the response variable in the different groups are compared; if there are significant differences, then there is evidence to suggest that the factor and the response have a causal relationship.  The single-factor analysis of variance (ANOVA) model is most commonly used to analyze the data in such an experiment, but it does assume that the data in each group have a normal distribution, and that all groups have equal variance.  The Kruskal-Wallis test is a non-parametric alternative to ANOVA in analyzing data from single-factor completely randomized experiments.

If the factor has 2 levels, you may think that an independent 2-sample t-test with equal variance can also be used to analyze the data.  This is true, but the square of the t-test statistic in this case is just the F-test statistic in a single-factor ANOVA with 2 groups.  Thus, the results of these 2 tests are the same.  ANOVA generalizes the independent 2-sample t-test with equal variance to more than 2 groups.

Some textbooks state that “random assignment” means random assignment of experimental units to treatments, whereas other textbooks state that it means random assignment of treatments to experimental units.  I don’t think that there is any difference between these 2 definitions, but I welcome your thoughts in the comments.

Applied Statistics Lesson of the Day – Positive Control in Experimental Design

In my recent lesson on controlling for confounders in experimental design, the control group was described as one that received a neutral or standard treatment, and the standard treatment may simply be nothing.  This is a negative control group.  Not all experiments require a negative control group; some experiments instead have positive control group.

A positive control group is a group of experimental units that receive a treatment that is known to cause an effect on the response.  Such a causal relationship would have been previously established, and its inclusion in the experiment allows a new treatment to be compared to this existing treatment.  Again, both the positive control group and the experimental group experience the same experimental procedures and conditions except for the treatment.  The existing treatment with the known effect on the response is applied to the positive control group, and the new treatment with the unknown effect on the response is applied to the experimental group.  If the new treatment has a causal relationship with the response, both the positive control group and the experimental group should have the same responses.  (This assumes, of course, that the response can only be changed in 1 direction.  If the response can increase or decrease in value (or, more generally, change in more than 1 way), then it is possible for the positive control group and the experimental group to have the different responses.

In short, in an experiment with a positive control group, an existing treatment is known to “work”, and the new treatment is being tested to see if it can “work” just as well or even better.  Experiments to test for the effectiveness of a new medical therapies or a disease detector often have positive controls; there are existing therapies or detectors that work well, and the new therapy or detector is being evaluated for its effectiveness.

Experiments with positive controls are useful for ensuring that the experimental procedures and conditions proceed as planned.  If the positive control does not show the expected response, then something is wrong with the experimental procedures or conditions, and any “good” result from the new treatment should be considered with skepticism.


Applied Statistics Lesson of the Day – Choosing the Range of Levels for Quantitative Factors in Experimental Design

In addition to choosing the number of levels for a quantitative factor in designing an experiment, the experimenter must also choose the range of the levels of the factor.

  • If the levels are too close together, then there may not be a noticeable difference in the corresponding responses.
  • If the levels are too far apart, then an important trend in the causal relationship could be missed.

Consider the following example of making sourdough bread from Gänzle et al. (1998).  The experimenters sought to determine the relationship between temperature and the growth rates of 2 strains of bacteria and 1 strain of yeast, and they used mathematical models and experimental data to study this relationship.  The plots below show the results for Lactobacillus sanfranciscensis LTH2581 (Panel A) and LTH1729 (Panel B), and Candida milleri LTH H198 (Panel C).  The figures contain the predicted curves (solid and dashed lines) and the actual data (circles).  Notice that, for all 3 organisms,

  • the relationship is relatively “flat” in the beginning, so choosing temperatures that are too close together at low temperatures (e.g. 1 and 2 degrees Celsius) would not yield noticeably different growth rates
  • the overall relationship between growth rate and temperature is rather complicated, and choosing temperatures that are too far apart might miss important trends.

yeast temperature

Once again, the experimenter’s prior knowledge and hypothesis can be very useful in making this decision.  In this case, the experimenters had the benefit of their mathematical models in guiding their hypothesis and choosing the range of temperatures for collecting the data on the growth rates.


Gänzle, Michael G., Michaela Ehmann, and Walter P. Hammes. “Modeling of growth of Lactobacillus sanfranciscensis and Candida milleri in response to process parameters of sourdough fermentation.” Applied and environmental microbiology 64.7 (1998): 2616-2623.

Applied Statistics Lesson of the Day: Sample Size and Replication in Experimental Design

The goal of an experiment is to determine

  1. whether or not there is a cause-and-effect relationship between the factor and the response
  2. the strength of the causal relationship, should such a relationship exist.

To answer these questions, the response variable is measured in both the control group and the experimental group.  If there is a difference between the 2 responses, then there is evidence to suggest that the causal relationship exists, and the difference can be measured and quantified.

However, in most* experiments, there is random variation in the response.  Random variation exists in the natural sciences, and there is even more of it in the social sciences.  Thus, an observed difference between the control and experimental groups could be mistakenly attributed to a cause-and-effect relationship when the source of the difference is really just random variation.  In short, the difference may simply be due to the noise rather than the signal.  

To detect an actual difference beyond random variation (i.e. to obtain a higher signal-to-noise ratio), it is important to use replication to obtain a sufficiently large sample size in the experiment.  Replication is the repeated application of the treatments to multiple independently assigned experimental units.  (Recall that randomization is an important part of controlling for confounding variables in an experiment.  Randomization ensures that the experimental units are independently assigned to the different treatments.)  The number of independently assigned experimental units that receive the same treatment is the sample size.

*Deterministic computer experiments are unlike most experiments; they do not have random variation in the responses.

Applied Statistics Lesson of the Day – Basic Terminology in Experimental Design #2: Controlling for Confounders

A well designed experiment must have good control, which is the reduction of effects from confounding variables.  There are several ways to do so:

  • Include a control group.  This group will receive a neutral treatment or a standard treatment.  (This treatment may simply be nothing.)  The experimental group will receive the new treatment or treatment of interest.  The response in the experimental group will be compared to the response in the control group to assess the effect of the new treatment or treatment of interest.  Any effect from confounding variables will affect both the control group and the experimental group equally, so the only difference between the 2 groups should be due to the new treatment or treatment of interest.
  • In medical studies with patients as the experimental units, it is common to include a placebo group.  Patients in the placebo group get a treatment that is known to have no effect.  This accounts for the placebo effect.
    • For example, in a drug study, a patient in the placebo group may get a sugar pill.
  • In experiments with human or animal subjects, participants and/or the experimenters are often blinded.  This means that they do not know which treatment the participant received.  This ensures that knowledge of receiving a particular treatment – for either the participant or the experimenters – is not a confounding variable.  An experiment that blinds both the participants and the experimenters is called a double-blinded experiment.
  • For confounding variables that are difficult or impossible to control for, the experimental units should be assigned to the control group and the experimental group by randomization.  This can be done with random number tables, flipping a coin, or random number generators from computers.  This ensures that confounding effects affect both the control group and the experimental group roughly equally.
    • For example, an experimenter wants to determine if the HPV vaccine will make new students immune to HPV.  There will be 2 groups: the control group will not receive the vaccine, and the experimental group will receive the vaccine.  If the experimenter can choose students from 2 schools for her study, then the students should be randomly assigned into the 2 groups, so that each group will have roughly the same number of students from each school.  This would minimize the confounding effect of the schools.

Applied Statistics Lesson of the Day – Basic Terminology in Experimental Design #1

The word “experiment” can mean many different things in various contexts.  In science and statistics, it has a very particular and subtle definition, one that is not immediately familiar to many people who work outside of the field of experimental design. This is the first of a series of blog posts to clarify what an experiment is, how it is conducted, and why it is so central to science and statistics.

Experiment: A procedure to determine the causal relationship between 2 variables – an explanatory variable and a response variable.  The value of the explanatory variable is changed, and the value of the response variable is observed for each value of the explantory variable.

  • An experiment can have 2 or more explanatory variables and 2 or more response variables.
  • In my experience, I find that most experiments have 1 response variable, but many experiments have 2 or more explanatory variables.  The interactions between the multiple explanatory variables are often of interest.
  • All other variables are held constant in this process to avoid confounding.

Explanatory Variable or Factor: The variable whose values are set by the experimenter.  This variable is the cause in the hypothesis.  (*Many people call this the independent variable.  I discourage this usage, because “independent” means something very different in statistics.)

Response Variable: The variable whose values are observed by the experimenter as the explanatory variable’s value is changed.  This variable is the effect in the hypothesis.  (*Many people call this the dependent variable.  Further to my previous point about “independent variables”, dependence means something very different in statistics, and I discourage using this usage.)

Factor Level: Each possible value of the factor (explanatory variable).  A factor must have at least 2 levels.

Treatment: Each possible combination of factor levels.

  • If the experiment has only 1 explanatory variable, then each treatment is simply each factor level.
  • If the experiment has 2 explanatory variables, X and Y, then each treatment is a combination of 1 factor level from X and 1 factor level from Y.  Such combining of factor levels generalizes to experiments with more than 2 explanatory variables.

Experimental Unit: The object on which a treatment is applied.  This can be anything – person, group of people, animal, plant, chemical, guitar, baseball, etc.