How to Calculate a Partial Correlation Coefficient in R: An Example with Oxidizing Ammonia to Make Nitric Acid
May 5, 2013
Today, I will talk about the math behind calculating partial correlation and illustrate the computation in R with an example involving the oxidation of ammonia to make nitric acid using a built-in data set in R called stackloss. In a separate post, I will also share an R function that I wrote to estimate partial correlation. In a later post, I will discuss the interpretation of the partial correlation coefficient at length.
I read pages 234-237 in Section 6.6 of “Discovering Statistics Using R” by Andy Field, Jeremy Miles, and Zoe Field to learn about partial correlation. They used a data set called “Exam Anxiety.dat”, available from their companion web site (look under “6 Correlation”), to illustrate this concept; they calculated the partial correlation coefficient between exam anxiety and revision time while controlling for exam score. As I discuss further below, a plot of the residuals of these 2 variables, each taken after regressing the variable on exam score, helps to illustrate the calculation of partial correlation coefficients. This plot makes intuitive sense; if you take more time to study for an exam, you tend to have less exam anxiety, so there is a negative correlation between revision time and exam anxiety.
They used a function called pcor() in a package called “ggm”; however, I suspect that this package is no longer working properly, because it depends on a package called “RBGL” that is no longer available on CRAN. See this discussion thread for further information. Thus, I wrote my own R function to illustrate partial correlation.
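Before getting to that function, here is a minimal sketch of the residual-based idea using only base R and the built-in stackloss data set. The choice of variables (air flow as X, stack loss as Y, water temperature as Z) and the object names are my own assumptions for illustration; this is not the function mentioned above, and it handles only a single control variable.

```r
# Built-in data set on the oxidation of ammonia to nitric acid
data(stackloss)

# Illustrative choice of variables (an assumption, not prescribed by the data set)
x <- stackloss$Air.Flow    # X
y <- stackloss$stack.loss  # Y
z <- stackloss$Water.Temp  # Z, the variable being held constant

# Step 1: regress X on Z and Y on Z, and keep the residuals
res_x <- resid(lm(x ~ z))
res_y <- resid(lm(y ~ z))

# Step 2: the partial correlation is the Pearson correlation of the 2 sets of residuals
pcor_resid <- cor(res_x, res_y)

# Cross-check with the closed-form formula for one control variable
r_xy <- cor(x, y)
r_xz <- cor(x, z)
r_yz <- cor(y, z)
pcor_formula <- (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))

print(c(residuals = pcor_resid, formula = pcor_formula))
```

The 2 printed values should agree up to floating-point error, since both routes compute the same first-order partial correlation.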
Partial correlation is the correlation between 2 random variables while holding other variables constant. To calculate the partial correlation between X and Y while holding Z constant (or controlling for the effect of Z, or averaging out Z),