# Exploratory Data Analysis: Quantile-Quantile Plots for New York’s Ozone Pollution Data

September 22, 2013 3 Comments

Eric Cai shares his love of statistics, machine learning, chemistry and math while teaching the concepts and writing readily usable code. Watch Eric's video tutorials on YouTube. Follow Eric on Twitter @chemstateric.

Using the ‘fitDistr’ function of the ‘propagate’ package gives me the following ranking of best-fitting distributions for ‘ozone’ (lowest AIC first):

library(propagate)

res <- fitDistr(ozone)

res$aic

| Row | Distribution | AIC |
|---|---|---|
| 16 | Johnson SU | -838.1165 |
| 4 | Log-normal | -834.4381 |
| 11 | Generalized Trapezoidal | -828.3748 |
| 12 | Gamma | -812.3913 |
| 19 | 4P Beta | -803.8396 |
| 18 | 3P Weibull | -802.6010 |
| 8 | Triangular | -801.4067 |
| 14 | Laplace | -799.1498 |
| 3 | Generalized normal | -797.8691 |
| 15 | Gumbel | -797.0433 |
| 13 | Cauchy | -796.9815 |
| 5 | Scaled/shifted t- | -795.3024 |
| 6 | Logistic | -783.7429 |
| 1 | Normal | -775.1892 |
| 2 | Skewed-normal | -773.1892 |
| 9 | Trapezoidal | -760.0612 |
| 7 | Uniform | -720.5523 |
| 10 | Curvilinear Trapezoidal | -718.7252 |
| 20 | Arcsine | -694.8609 |
| 21 | von Mises | -578.6548 |
| 17 | Johnson SB | -566.5763 |

with Johnson SU and log-normal in first and second place. I haven't checked the Q-Q plots for those, though…

Cheers,

Andrej
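For readers who want to follow up on the Q-Q check mentioned in the comment above, here is a minimal sketch in base R for the second-ranked log-normal distribution. It assumes that `ozone` is the `Ozone` column of R's built-in `airquality` data set with missing values removed, and it estimates the log-normal parameters directly from the log-transformed data rather than reusing the fit from `fitDistr`, so the two fits need not agree exactly.

```r
# Assumption: `ozone` is the Ozone column of the built-in airquality
# data set, with missing values removed.
ozone <- na.omit(airquality$Ozone)
n <- length(ozone)

# Log-normal parameters estimated from the log-transformed data
# (an assumption of this sketch; not the parameters fitted by fitDistr).
meanlog <- mean(log(ozone))
sdlog <- sd(log(ozone))

# Theoretical log-normal quantiles at n evenly spaced probability points.
theoretical <- qlnorm(ppoints(n), meanlog = meanlog, sdlog = sdlog)

# Q-Q plot: sorted data against the theoretical quantiles.
plot(theoretical, sort(ozone),
     xlab = "Theoretical log-normal quantiles",
     ylab = "Sample quantiles of ozone",
     main = "Log-normal Q-Q plot for ozone")
abline(0, 1)  # reference line y = x
```

Points lying close to the reference line would suggest that the log-normal distribution describes the data reasonably well; systematic curvature in the tails would suggest otherwise.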

Thanks for taking the initiative to conduct this analysis and share the results with us, Andrej! I appreciate you mentioning this function in the past, and it’s nice to see it in action!

I like to use multiple tools to explore a data set, and it’s best to combine the wisdom of multiple tools to gain that exploratory inference. Thanks for demonstrating another tool that I will add to my bag of tricks!
