Eric’s Enlightenment for Wednesday, June 3, 2015

  1. Jodi Beggs uses the Rule of 70 to explain why small differences in GDP growth rates have large ramifications.
  2. Rick Wicklin illustrates the importance of choosing bin widths carefully when plotting histograms.
  3. Shana Kelley et al. have developed an electrochemical sensor for detecting selected mutated nucleic acids (i.e. cancer markers in DNA!).  “The sensor comprises gold electrical leads deposited on a silicon wafer, with palladium nano-electrodes.”
  4. Rhett Allain provides a very detailed and analytical critique of Mjölnir (Thor’s hammer) – specifically, its unrealistic centre of mass.  This is an impressive exercise in physics!
  5. Congratulations to the Career Services Centre at Simon Fraser University for winning TalentEgg’s Special Award for Innovation by a Career Centre!  I was fortunate to volunteer there as a career advisor for 5 years, and it was a wonderful place to learn, grow and give back to the community. My career has benefited greatly from that experience, and it is a pleasure to continue my involvement as a guest blogger for its official blog, The Career Services Informer. Way to go, everyone!
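As a brief aside on item 1, the Rule of 70 can be checked numerically. The sketch below is mine, not taken from the linked article: the function names `rule_of_70` and `exact_doubling` and the example growth rates are illustrative assumptions. It compares the approximation (70 divided by the growth rate in percent) with the exact doubling time under annual compounding, log(2) / log(1 + r).

```r
# Rule of 70: approximate doubling time in years for a growth rate given in percent per year
# (function names and example rates are illustrative assumptions)
rule_of_70 <- function(rate_pct) 70 / rate_pct

# Exact doubling time under annual compounding at the same rate
exact_doubling <- function(rate_pct) log(2) / log(1 + rate_pct / 100)

rates <- c(1, 2, 3)  # hypothetical annual growth rates, in percent

comparison <- data.frame(
  rate_pct     = rates,
  approx_years = rule_of_70(rates),
  exact_years  = exact_doubling(rates)
)
print(comparison)
```

Under the rule, an economy growing at 1% per year needs about 70 years to double, while one growing at 2% needs about 35, which is why a single percentage point matters so much over the long run.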

Exploratory Data Analysis: Combining Histograms and Density Plots to Examine the Distribution of the Ozone Pollution Data from New York in R


This is a follow-up post to my recent introduction of histograms.  Previously, I presented the conceptual foundations of histograms and used a histogram to approximate the distribution of the “Ozone” data from the built-in data set “airquality” in R.  Today, I will examine this distribution in more detail by overlaying the histogram with parametric density curves and non-parametric kernel density plots.  I will finally answer the question that I have asked (and hinted at answering) several times: Are the “Ozone” data normally distributed, or is another distribution more suitable?

[Figure: histogram and kernel density plot]

Read the rest of this post to learn how to combine histograms with density curves like the plot above!
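For readers who want a quick preview, here is a minimal sketch of the general technique in base R. It is not necessarily the code from the full post: the number of breaks, the line styles, and the choice of the normal distribution as the parametric candidate are my assumptions for illustration.

```r
# Ozone contains missing values; drop them before plotting or estimating
ozone <- airquality$Ozone[!is.na(airquality$Ozone)]

# freq = FALSE draws the histogram on the density scale (total bar area = 1),
# which is what allows density curves to be overlaid on the same axes
hist(ozone, freq = FALSE, breaks = 15,
     main = "Ozone in New York", xlab = "Ozone (ppb)")

# Non-parametric: Gaussian kernel density estimate with the default bandwidth
kde <- density(ozone)
lines(kde, lwd = 2)

# Parametric: normal density using the sample mean and standard deviation
# (the normal is only one candidate; this choice is an assumption)
curve(dnorm(x, mean = mean(ozone), sd = sd(ozone)),
      add = TRUE, lty = 2, lwd = 2)

legend("topright", legend = c("kernel density", "normal fit"),
       lty = c(1, 2), lwd = 2)
```

If the dashed normal curve departs visibly from both the bars and the kernel density estimate, that is a first visual hint that another distribution may describe the data better.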

This is another post in my continuing series on exploratory data analysis (EDA).  Previous posts in this series on EDA include


Exploratory Data Analysis: Conceptual Foundations of Histograms – Illustrated with New York’s Ozone Pollution Data


Continuing my recent series on exploratory data analysis (EDA), today’s post focuses on histograms, which are very useful plots for visualizing the distribution of a data set.  I will discuss how histograms are constructed and use histograms to assess the distribution of the “Ozone” data from the built-in “airquality” data set in R.  In a later post, I will assess the distribution of the “Ozone” data in greater depth by combining histograms with various types of density plots.
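To make the construction concrete, here is a small sketch that builds a histogram's bins and bar heights by hand and compares them with what `hist()` computes. The bin width of 20 and the variable names are my own illustrative choices, not values taken from the full post.

```r
# Drop the missing Ozone values before binning
ozone <- airquality$Ozone[!is.na(airquality$Ozone)]

# Equal-width bin edges covering the data (a width of 20 is an arbitrary choice)
width  <- 20
breaks <- seq(0, ceiling(max(ozone) / width) * width, by = width)

# Count the observations falling in each interval (a, b];
# include.lowest = TRUE closes the first interval on the left, as hist() does by default
counts <- table(cut(ozone, breaks = breaks, include.lowest = TRUE))

# Density-scale bar heights: relative frequency divided by bin width,
# so that the bar areas sum to 1
heights <- as.numeric(counts) / (length(ozone) * width)

# The same quantities as computed by hist(), without drawing the plot
h <- hist(ozone, breaks = breaks, plot = FALSE)

print(counts)
print(all(as.numeric(counts) == h$counts))
```

Changing `width` and rerunning shows how strongly the apparent shape of the distribution depends on the choice of bins.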

Previous posts in this series on EDA include


Read the rest of this post to learn how to construct a histogram and get the R code for producing the above plot!


