## Applied Statistics Lesson of the Day – Notation for Fractional Factorial Designs

Fractional factorial designs use the $L^{F-p}$ notation; unfortunately, this notation is not clearly explained in most textbooks or web sites about experimental design.  I hope that my explanation below is useful.

• $L$ is the number of levels in each factor; note that the $L^{F-p}$ notation assumes that all factors have the same number of levels.
  • If a factor has 2 levels, then the levels are usually coded as $+1$ and $-1$.
  • If a factor has 3 levels, then the levels are usually coded as $+1$, $0$, and $-1$.
• $F$ is the number of factors in the experiment.
• $p$ is the number of times that the full factorial design is fractionated by $L$.  This number is badly explained by most textbooks and web sites that I have seen, because they simply say that $p$ is the fraction – this is confusing, because a fraction has a numerator and a denominator, and $p$ is just 1 number.  To clarify,
  • the fraction is $L^{-p}$
  • the number of treatments in the fractional factorial design is $L^{-p}$ multiplied by the total possible number of treatments in the full factorial design, which is $L^F$.

If all $L^F$ possible treatments are used in the experiment, then a full factorial design is used.  If a fractional factorial design is used instead, then $L^{-p}$ denotes the fraction of the $L^F$ treatments that is used.

Most factorial experiments use binary factors (i.e. factors with 2 levels, $L = 2$).  Thus,

• if $p = 1$, then the fraction of treatments that is used is $2^{-1} = 1/2$.
• if $p = 2$, then the fraction of treatments that is used is $2^{-2} = 1/4$.

This is why

• a $2^{F-1}$ design is often called a half-fraction design.
• a $2^{F-2}$ design is often called a quarter-fraction design.

However, most sources that I have read do not bother to mention that $L$ can be greater than 2; experiments with 3-level factors are less frequent but still common.  Thus, the terms half-fraction design and quarter-fraction design only apply to binary factors.  If $L = 3$, then

• a $3^{F-1}$ design uses one-third of all possible treatments.
• a $3^{F-2}$ design uses one-ninth of all possible treatments.
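
The arithmetic behind the notation can be checked with a short Python sketch.  The function name `fractional_design_size` and the example values of $L$, $F$ and $p$ are my own choices for illustration; the only facts used are that the fraction is $L^{-p}$ and that the design has $L^{-p} \times L^F = L^{F-p}$ treatments.

```python
from fractions import Fraction


def fractional_design_size(levels, factors, p):
    """Return (fraction used, number of treatments) for an L^(F-p) design.

    Hypothetical helper for illustration: `levels` is L, `factors` is F,
    and `p` is the number of times the full design is fractionated by L.
    """
    if levels < 2 or factors < 1 or not 0 <= p < factors:
        raise ValueError("need levels >= 2, factors >= 1 and 0 <= p < factors")
    fraction = Fraction(1, levels ** p)    # L^(-p)
    treatments = levels ** (factors - p)   # L^(-p) * L^F
    return fraction, treatments


# Example designs (arbitrary values chosen for illustration).
for levels, factors, p in [(2, 5, 1), (2, 5, 2), (3, 4, 1), (3, 4, 2)]:
    fraction, treatments = fractional_design_size(levels, factors, p)
    full = levels ** factors
    print(f"{levels}^({factors}-{p}): fraction {fraction}, "
          f"{treatments} of {full} treatments")
```

For example, a $2^{5-1}$ design uses 16 of the 32 possible treatments, and a $3^{4-2}$ design uses 9 of the 81 possible treatments.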

## Applied Statistics Lesson of the Day – The Full Factorial Design

An experimenter may seek to determine the causal relationships between $G$ factors and the response, where $G > 1$.  On first instinct, you may be tempted to conduct $G$ separate experiments, each using the completely randomized design with 1 factor.  Often, however, it is possible to conduct 1 experiment with $G$ factors at the same time.  This is better than the first approach because

• it is faster
• it uses fewer resources to answer the same questions
• the interactions between the $G$ factors can be examined

Such an experiment requires the full factorial design; in this design, the treatments are all possible combinations of all levels of all factors.  After controlling for confounding variables and choosing the appropriate range and number of levels of each factor, the different treatments are applied to the different groups, and data on the resulting responses are collected.

The simplest full factorial experiment consists of 2 factors, each with 2 levels.  Such an experiment would result in $2 \times 2 = 4$ treatments, each being a combination of 1 level from the first factor and 1 level from the second factor.  Since this is a full factorial design, experimental units are independently assigned to all treatments.  The 2-factor ANOVA model is commonly used to analyze data from such designs.
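
To make "all possible combinations of all levels of all factors" concrete, here is a minimal Python sketch that lists the treatments of a full factorial design.  The helper name `full_factorial` and the factor names `A` and `B` are assumptions for illustration, and the levels use the usual $-1$/$+1$ coding for 2-level factors.

```python
from itertools import product


def full_factorial(factor_levels):
    """List every treatment of a full factorial design.

    `factor_levels` maps a factor name to its list of levels; each treatment
    is returned as a dict with one level per factor.  Hypothetical helper.
    """
    names = list(factor_levels)
    level_lists = [factor_levels[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


# 2 factors, each with 2 levels: 2 x 2 = 4 treatments.
design = full_factorial({"A": [-1, +1], "B": [-1, +1]})
for treatment in design:
    print(treatment)
```

The same function handles more factors or more levels per factor; the number of treatments is the product of the numbers of levels.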

In later lessons, I will discuss interactions and 2-factor ANOVA in more detail.

## Applied Statistics Lesson of the Day – The Completely Randomized Design with 1 Factor

The simplest experimental design is the completely randomized design with 1 factor.  In this design, each experimental unit is randomly assigned to a factor level.  This design is most useful for a homogeneous population (one that does not have major differences between any sub-populations).  It is appealing because of its simplicity and flexibility – it can be used for a factor with any number of levels, and different treatments can have different sample sizes.  After controlling for confounding variables and choosing the appropriate range and number of levels of the factor, the different treatments are applied to the different groups, and data on the resulting responses are collected.  The means of the response variable in the different groups are compared; if there are significant differences, then there is evidence to suggest that the factor and the response have a causal relationship.  The single-factor analysis of variance (ANOVA) model is most commonly used to analyze the data in such an experiment, but it does assume that the data in each group have a normal distribution, and that all groups have equal variance.  The Kruskal-Wallis test is a non-parametric alternative to ANOVA in analyzing data from single-factor completely randomized experiments.
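
The random assignment step can be sketched in Python as follows.  This is only one possible way to do it: the function name, the unit labels, the level names and the group sizes are all made up for illustration, and the seed is fixed only so that the example is reproducible.

```python
import random


def completely_randomized_assignment(units, group_sizes, seed=None):
    """Randomly assign experimental units to factor levels.

    `group_sizes` maps each factor level to the number of units it receives;
    the sizes may differ between levels.  Hypothetical helper.
    """
    if sum(group_sizes.values()) != len(units):
        raise ValueError("group sizes must add up to the number of units")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)          # random order of the experimental units
    assignment, start = {}, 0
    for level, size in group_sizes.items():
        assignment[level] = shuffled[start:start + size]
        start += size
    return assignment


units = [f"unit_{i}" for i in range(1, 11)]
assignment = completely_randomized_assignment(
    units, {"low": 3, "medium": 3, "high": 4}, seed=42
)
for level, members in assignment.items():
    print(level, members)
```

Each unit lands in exactly 1 group, and the unequal group sizes illustrate that different treatments can have different sample sizes.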

If the factor has 2 levels, you may think that an independent 2-sample t-test with equal variance can also be used to analyze the data.  This is true, but the square of the t-test statistic in this case is just the F-test statistic in a single-factor ANOVA with 2 groups.  Thus, the results of these 2 tests are the same.  ANOVA generalizes the independent 2-sample t-test with equal variance to more than 2 groups.
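
The identity $t^2 = F$ for 2 groups can be verified numerically.  The sketch below computes both statistics from their textbook formulas using only the standard library; the function names and the 2 small data sets are invented purely for illustration.

```python
from statistics import mean


def pooled_t_statistic(x, y):
    """Independent 2-sample t statistic assuming equal variances."""
    nx, ny = len(x), len(y)
    mx, my = mean(x), mean(y)
    ss_x = sum((v - mx) ** 2 for v in x)
    ss_y = sum((v - my) ** 2 for v in y)
    pooled_var = (ss_x + ss_y) / (nx + ny - 2)
    return (mx - my) / (pooled_var * (1 / nx + 1 / ny)) ** 0.5


def one_way_anova_f(groups):
    """F statistic of a single-factor ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up responses for the 2 factor levels.
group_1 = [4.1, 5.0, 5.6, 4.8, 5.3]
group_2 = [6.2, 5.9, 6.8, 7.1]

t = pooled_t_statistic(group_1, group_2)
f = one_way_anova_f([group_1, group_2])
print(t ** 2, f)   # the 2 numbers agree up to floating-point rounding
```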

Some textbooks state that “random assignment” means random assignment of experimental units to treatments, whereas other textbooks state that it means random assignment of treatments to experimental units.  I don’t think that there is any difference between these 2 definitions, but I welcome your thoughts in the comments.

## Applied Statistics Lesson of the Day – Choosing the Range of Levels for Quantitative Factors in Experimental Design

In addition to choosing the number of levels for a quantitative factor in designing an experiment, the experimenter must also choose the range of the levels of the factor.

• If the levels are too close together, then there may not be a noticeable difference in the corresponding responses.
• If the levels are too far apart, then an important trend in the causal relationship could be missed.

Consider the following example of making sourdough bread from Gänzle et al. (1998).  The experimenters sought to determine the relationship between temperature and the growth rates of 2 strains of bacteria and 1 strain of yeast, and they used mathematical models and experimental data to study this relationship.  The plots below show the results for Lactobacillus sanfranciscensis LTH2581 (Panel A) and LTH1729 (Panel B), and Candida milleri LTH H198 (Panel C).  The figures contain the predicted curves (solid and dashed lines) and the actual data (circles).  Notice that, for all 3 organisms,

• the relationship is relatively “flat” in the beginning, so choosing temperatures that are too close together at low temperatures (e.g. 1 and 2 degrees Celsius) would not yield noticeably different growth rates
• the overall relationship between growth rate and temperature is rather complicated, and choosing temperatures that are too far apart might miss important trends.

Once again, the experimenter’s prior knowledge and hypothesis can be very useful in making this decision.  In this case, the experimenters had the benefit of their mathematical models in guiding their hypothesis and choosing the range of temperatures for collecting the data on the growth rates.

#### Reference:

Gänzle, Michael G., Michaela Ehmann, and Walter P. Hammes. “Modeling of growth of Lactobacillus sanfranciscensis and Candida milleri in response to process parameters of sourdough fermentation.” Applied and environmental microbiology 64.7 (1998): 2616-2623.

## Applied Statistics Lesson of the Day – Choosing the Number of Levels for Factors in Experimental Design

The experimenter needs to decide the number of levels for each factor in an experiment.

• For a qualitative (categorical) factor, the number of levels may simply be the number of categories for that factor.  However, because of cost constraints, an experimenter may choose to drop a certain category.  Based on the experimenter’s prior knowledge or hypothesis, the category with the least potential for showing a cause-and-effect relationship between the factor and the response should be dropped.
• For a quantitative (numeric) factor, the number of levels should reflect the cause-and-effect relationship between the factor and the response.  Again, the experimenter’s prior knowledge or hypothesis is valuable in making this decision.
  • If the relationship in the chosen range of the factor is hypothesized to be roughly linear, then 2 levels (perhaps the minimum and the maximum) should be sufficient.
  • If the relationship in the chosen range of the factor is hypothesized to be roughly quadratic, then 3 levels would be useful.  Often, 3 levels are enough.
  • If the relationship in the chosen range of the factor is hypothesized to be more complicated than a quadratic relationship, consider using 4 or more levels.
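
As a small illustration for a quantitative factor, the Python sketch below generates 2, 3 or 4 levels across a chosen range.  Equal spacing is my own simplifying assumption, as are the function name and the example range of 10 to 30; nothing above requires the levels to be evenly spaced.

```python
def evenly_spaced_levels(low, high, n_levels):
    """Return `n_levels` equally spaced levels from `low` to `high`.

    Hypothetical helper; equal spacing is an assumption for illustration.
    """
    if n_levels < 2:
        raise ValueError("a factor must have at least 2 levels")
    step = (high - low) / (n_levels - 1)
    return [low + i * step for i in range(n_levels)]


# Roughly linear, roughly quadratic, and more complicated hypotheses.
for n_levels in (2, 3, 4):
    print(n_levels, evenly_spaced_levels(10, 30, n_levels))
```

With 2 levels, only the minimum and the maximum of the range are used; with 3 levels, the midpoint is added so that curvature can be detected.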

## Applied Statistics Lesson of the Day – Basic Terminology in Experimental Design #1

The word “experiment” can mean many different things in various contexts.  In science and statistics, it has a very particular and subtle definition, one that is not immediately familiar to many people who work outside of the field of experimental design. This is the first of a series of blog posts to clarify what an experiment is, how it is conducted, and why it is so central to science and statistics.

Experiment: A procedure to determine the causal relationship between 2 variables – an explanatory variable and a response variable.  The value of the explanatory variable is changed, and the value of the response variable is observed for each value of the explanatory variable.

• An experiment can have 2 or more explanatory variables and 2 or more response variables.
• In my experience, most experiments have 1 response variable, but many experiments have 2 or more explanatory variables.  The interactions between the multiple explanatory variables are often of interest.
• All other variables are held constant in this process to avoid confounding.

Explanatory Variable or Factor: The variable whose values are set by the experimenter.  This variable is the cause in the hypothesis.  (*Many people call this the independent variable.  I discourage this usage, because “independent” means something very different in statistics.)

Response Variable: The variable whose values are observed by the experimenter as the explanatory variable’s value is changed.  This variable is the effect in the hypothesis.  (*Many people call this the dependent variable.  Further to my previous point about “independent variables”, dependence means something very different in statistics, and I discourage this usage.)

Factor Level: Each possible value of the factor (explanatory variable).  A factor must have at least 2 levels.

Treatment: Each possible combination of factor levels.

• If the experiment has only 1 explanatory variable, then each treatment is simply each factor level.
• If the experiment has 2 explanatory variables, X and Y, then each treatment is a combination of 1 factor level from X and 1 factor level from Y.  Such combining of factor levels generalizes to experiments with more than 2 explanatory variables.

Experimental Unit: The object to which a treatment is applied.  This can be anything – person, group of people, animal, plant, chemical, guitar, baseball, etc.