Highlighting cells to quickly view Average, Count, and Sum in Excel

I recently needed to check my answers after some data analysis in Alteryx.  I computed many averages using a formula in Alteryx, and I wanted to check those results by calculating the average for a few randomly selected rows.  I did this by invoking a helpful tool in Microsoft Excel.  I will illustrate this functionality with some random data.

In Column E, I used a formula to calculate the average of the 3 populations in Columns B, C, and D.  To manually check that the formula is correct, I highlighted the 3 columns for ID #125.  On the bottom right, Excel calculates the average; it’s difficult to see in the picture below, but Excel confirms that the average is 707,154.

Excel average

Whenever you highlight a range of cells containing numeric data, Excel will provide the average, count, and sum of the selected cells.  I did not know about this functionality when I first began working as a statistician, and I am very glad that I did learn it eventually – it is very useful for checking answers by analyzing a few randomly selected rows in Excel!

Advertisements

The advantages of using count() to get N-way frequency tables as data frames in R

Introduction

I recently introduced how to use the count() function in the “plyr” package in R to produce 1-way frequency tables in R.  Several commenters provided alternative ways of doing so, and they are all appreciated.  Today, I want to extend that tutorial by demonstrating how count() can be used to produce N-way frequency tables in the list format – this will magnify the superiority of this function over other functions like table() and xtabs().

 

2-Way Frequencies: The Cross-Tabulated Format vs. The List-Format

To get a 2-way frequency table (i.e. a frequency table of the counts of a data set as divided by 2 categorical variables), you can display it in a cross-tabulated format or in a list format.

In R, the xtabs() function is good for cross-tabulation.  Let’s use the “mtcars” data set again; recall that it is a built-in data set in Base R.

> y = xtabs(~ cyl + gear, mtcars)
> y
          gear
 cyl      3     4     5
 4        1     8     2
 6        2     4     1
 8        12    0     2

Read more of this post

How to Get the Frequency Table of a Categorical Variable as a Data Frame in R

Introduction

One feature that I like about R is the ability to access and manipulate the outputs of many functions.  For example, you can extract the kernel density estimates from density() and scale them to ensure that the resulting density integrates to 1 over its support set.

I recently needed to get a frequency table of a categorical variable in R, and I wanted the output as a data table that I can access and manipulate.  This is a fairly simple and common task in statistics and data analysis, so I thought that there must be a function in Base R that can easily generate this.  Sadly, I could not find such a function.  In this post, I will explain why the seemingly obvious table() function does not work, and I will demonstrate how the count() function in the ‘plyr’ package can achieve this goal.

The Example Data Set – mtcars

Let’s use the mtcars data set that is built into R as an example.  The categorical variable that I want to explore is “gear” – this denotes the number of forward gears in the car – so let’s view the first 6 observations of just the car model and the gear.  We can use the subset() function to restrict the data set to show just the row names and “gear”.

> head(subset(mtcars, select = 'gear'))
                     gear
Mazda RX4            4
Mazda RX4 Wag        4
Datsun 710           4
Hornet 4 Drive       3
Hornet Sportabout    3
Valiant              3

Read more of this post

Checking the Goodness of Fit of the Poisson Distribution in R for Alpha Decay by Americium-241

Introduction

Today, I will discuss the alpha decay of americium-241 and use R to model the number of emissions from a real data set with the Poisson distribution.  I was especially intrigued in learning about the use of Am-241 in smoke detectors, and I will elaborate on this clever application.  I will then use the Pearson chi-squared test to check the goodness of fit of my model.  The R script for the full analysis is given at the end of the post; there is a particularly useful code for superscripting the mass number of a chemical isotope in the title of a plot.  While there are many examples of superscripts in plot titles and axes that can be found on the web, none showed how to put the superscript before a text.  I hope that this and other tricks in this script are of use to you.

smoke detector

 

Smoke Detector with Americium-241

Source: Creative Commons via Eric Mason’s Coursework for Physics 241 at Stanford University

Read more of this post