## Exploratory Data Analysis: Combining Histograms and Density Plots to Examine the Distribution of the Ozone Pollution Data from New York in R

July 29, 2013

#### Introduction

This is a follow-up to my recent introduction to histograms. Previously, I presented the conceptual foundations of histograms and used a histogram to approximate the distribution of the “Ozone” data from the built-in data set “airquality” in R. Today, I will examine this distribution in more detail by overlaying the histogram with parametric density curves and a non-parametric kernel density estimate. I will finally answer the question that I have asked (and hinted at answering) several times: Are the “Ozone” data normally distributed, or is another distribution more suitable?
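As a rough preview of the idea, here is a minimal sketch of such an overlay in base R. It is only an illustration under my own assumptions, not necessarily the code used later in this post: the variable names (`ozone`, `kde`, `h`), the default bandwidth and bin choices, and the use of the sample mean and standard deviation for the normal curve are all choices made for this sketch.

```r
# Sketch: histogram of Ozone with a normal curve and a kernel density overlay.
# Assumption: missing values are simply dropped before plotting.
ozone <- airquality$Ozone[!is.na(airquality$Ozone)]

# Parametric candidate: normal density using the sample mean and sd
m <- mean(ozone)
s <- sd(ozone)

# Non-parametric candidate: kernel density estimate (default Gaussian kernel)
kde <- density(ozone)

# Compute the histogram first so the y-axis can fit all three layers
h <- hist(ozone, plot = FALSE)
y.max <- max(h$density, kde$y, dnorm(m, mean = m, sd = s))

# Draw the histogram on the density scale, then add the two curves
plot(h, freq = FALSE, ylim = c(0, y.max),
     main = "Ozone: histogram with density overlays",
     xlab = "Ozone (ppb)")
curve(dnorm(x, mean = m, sd = s), add = TRUE, lty = 2)  # normal density
lines(kde, lwd = 2)                                     # kernel density
legend("topright", legend = c("Normal", "Kernel"),
       lty = c(2, 1), lwd = c(1, 2))
```

Plotting the histogram with `freq = FALSE` puts it on the density scale, which is what makes it comparable with the overlaid curves.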

**Read the rest of this post to learn how to combine histograms with density curves!**

This is another post in my continuing series on exploratory data analysis (EDA). Previous posts in this series include:

- Descriptive statistics
- Box plots
- The conceptual foundations of kernel density estimation
- How to construct kernel density plots and rug plots in R
- Violin plots
- The conceptual foundations of empirical cumulative distribution functions (CDFs)
- 2 ways of plotting empirical CDFs in R
- The conceptual foundations of histograms
