June 24, 2013 11 Comments
Continuing my recent series on exploratory data analysis (EDA), this post focuses on the conceptual foundations of empirical cumulative distribution functions (CDFs); in a separate post, I will show how to plot them in R. (Previous posts in this series include descriptive statistics, box plots, kernel density estimation, and violin plots.)
To give you a sense of what an empirical CDF looks like, here is an example created from 100 randomly generated numbers from the standard normal distribution. The ecdf() function in R was used to generate this plot; the entire code is provided at the end of this post, but read my next post for more detail on how to generate plots of empirical CDFs in R.
Read to rest of this post to learn what an empirical CDF is and how to produce the above plot!