# Exploratory Data Analysis: Quantile-Quantile Plots for New York’s Ozone Pollution Data

September 22, 2013 3 Comments

Eric Cai shares his love of statistics, machine learning, chemistry and math while teaching the concepts and writing readily usable code. Watch Eric's video tutorials on YouTube. Follow Eric on Twitter @chemstateric.

Using the ‘fitDistr’ function of the ‘propagate’ package gives me the following ranking of the best-fitting distributions for ‘ozone’ (lowest AIC first):

library(propagate)

res <- fitDistr(ozone)

res$aic

| Row | Distribution | AIC |
|---|---|---|
| 16 | Johnson SU | -838.1165 |
| 4 | Log-normal | -834.4381 |
| 11 | Generalized Trapezoidal | -828.3748 |
| 12 | Gamma | -812.3913 |
| 19 | 4P Beta | -803.8396 |
| 18 | 3P Weibull | -802.6010 |
| 8 | Triangular | -801.4067 |
| 14 | Laplace | -799.1498 |
| 3 | Generalized normal | -797.8691 |
| 15 | Gumbel | -797.0433 |
| 13 | Cauchy | -796.9815 |
| 5 | Scaled/shifted t- | -795.3024 |
| 6 | Logistic | -783.7429 |
| 1 | Normal | -775.1892 |
| 2 | Skewed-normal | -773.1892 |
| 9 | Trapezoidal | -760.0612 |
| 7 | Uniform | -720.5523 |
| 10 | Curvilinear Trapezoidal | -718.7252 |
| 20 | Arcsine | -694.8609 |
| 21 | von Mises | -578.6548 |
| 17 | Johnson SB | -566.5763 |

with Johnson SU and log-normal in first and second place. I haven't checked the Q-Q plots for those, though…

Cheers,

Andrej
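For readers who want to follow up on the Q-Q plot check mentioned in the comment above, here is a minimal sketch in base R for the log-normal case. It rests on assumptions that are not confirmed in this thread: that `ozone` is the non-missing `Ozone` column of R's built-in `airquality` data set, and that simple moment estimates on the log scale are an acceptable stand-in for the parameters that `fitDistr` would estimate (they will generally differ).

```r
# Assumption: `ozone` is the non-missing Ozone column of the built-in airquality data
ozone <- airquality$Ozone[!is.na(airquality$Ozone)]

# Estimate log-normal parameters from the logged data
# (simple moment estimates, NOT the fit returned by propagate::fitDistr)
meanlog <- mean(log(ozone))
sdlog <- sd(log(ozone))

# Theoretical log-normal quantiles at the standard plotting positions
n <- length(ozone)
theoretical <- qlnorm(ppoints(n), meanlog = meanlog, sdlog = sdlog)

# Plot sorted sample values against theoretical quantiles
plot(theoretical, sort(ozone),
     xlab = "Theoretical log-normal quantiles",
     ylab = "Sample quantiles of ozone",
     main = "Log-normal Q-Q plot for ozone")

# Reference line y = x; points close to it suggest a reasonable fit
abline(0, 1)
```

If the points bend away from the reference line in the upper tail, that would suggest the log-normal does not capture the largest ozone values well, which is the kind of detail an AIC ranking alone does not show.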

Thanks for taking the initiative to conduct this analysis and share the results with us, Andrej! I appreciate you mentioning this function in the past, and it’s nice to see it in action!

I like to use multiple tools to explore a data set, and it’s best to combine the wisdom of multiple tools to gain that exploratory inference. Thanks for demonstrating another tool that I will add to my bag of tricks!
