Getting the names, types, formats, lengths, and labels of variables in a SAS data set

After reading my blog post on getting the variable names of a SAS data set, a reader named Robin asked how to get the formats as well.  I asked SAS Technical Support for help, and a consultant named Jerry Leonard provided a beautiful solution using PROC SQL.  Besides the names and formats of the variables, it also gives the types, lengths, and labels.  Here is an example of how to do so with the CLASS data set in the built-in SASHELP library.

* add formats and labels to 3 of the variables in the CLASS data set;
data class;                                                      
       set sashelp.class;                                            
       format 
            age 8.  
            weight height 8.2 
            name $15.;          
       label 
            age = 'Age'
            weight = 'Weight'
            height = 'Height';
run;                                                             
                  

* extract the variable information using PROC SQL; 
proc sql 
       noprint;                                                
       create table class_info as 
       select libname as library, 
              memname as data_set, 
              name as variable_name, 
              type, 
              length, 
              format, 
              label       
       from dictionary.columns                                       
       where libname = 'WORK' and memname = 'CLASS';                     
       /* libname and memname values must be upper case  */         
quit;                                                          
                   
 
* print the resulting table;
proc print 
       data = class_info;                                            
run;

Here is the result of that PROC PRINT step in the Results Viewer.  Notice that it also has the type, length, format, and label of each variable.

Obs  library  data_set  variable_name  type  length  format  label
1    WORK     CLASS     Name           char  8       $15.
2    WORK     CLASS     Sex            char  1
3    WORK     CLASS     Age            num   8       8.      Age
4    WORK     CLASS     Height         num   8       8.2     Height
5    WORK     CLASS     Weight         num   8       8.2     Weight
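
As a minimal sketch of an alternative, and assuming the same WORK.CLASS data set created above, the same dictionary information can also be read in a DATA step through the SASHELP.VCOLUMN view.  This is not part of Jerry's solution; the output data set name class_info_dstep is a hypothetical name chosen for illustration.

```sas
* read the column metadata through the SASHELP.VCOLUMN view;
* class_info_dstep is a hypothetical name used only for this sketch;
data class_info_dstep;
       set sashelp.vcolumn;
       /* libname and memname values must be upper case */
       where libname = 'WORK' and memname = 'CLASS';
       keep libname memname name type length format label;
run;

* print the resulting table;
proc print
       data = class_info_dstep;
run;
```

The PROC SQL approach is usually the more direct choice, since it can rename the columns in the same step; the DATA step version may be convenient when the metadata needs further DATA step processing.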

Thank you, Jerry, for sharing your tip!

Physical Chemistry Lesson of the Day: Pressure-Volume Work

In chemistry, a common type of work is the expansion or compression of a gas under constant pressure.  Recall from physics that pressure is defined as force applied per unit of area.

P = F \div A

P \times A = F

Consider a chemical reaction that releases a gas as its product inside a sealed cylinder with a movable piston.

 

[Figure: a gas expanding and doing work on a piston in a cylinder]

Image from Dpumroy via Wikimedia.

As the gas expands inside the cylinder, it pushes against the piston, and work is done by the system on the surroundings.  The atmospheric pressure on the piston remains constant while the gas expands, and the volume of the gas inside the cylinder increases as a result.  The volume of the gas at any given point is the area of the piston times the length of the gas-filled part of the cylinder.  The change in volume is therefore equal to the area of the piston times the distance along which the piston was pushed by the expanding gas.

w = -P \times \Delta V

w = -P \times A \times \Delta L

w = -F \times \Delta L

Note that this last line is just the definition of work under a constant force in the same direction as the displacement, with a negative sign added to follow the sign convention in chemistry: work done by the system on the surroundings is negative.
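
To make the formula concrete, here is a short worked example.  The numbers are assumptions chosen only for illustration: a gas expanding from 1.00 L to 3.00 L against a constant external pressure of 1.00 atm.

P = 1.00 \text{ atm} = 101325 \text{ Pa}

\Delta V = 3.00 \text{ L} - 1.00 \text{ L} = 2.00 \text{ L} = 2.00 \times 10^{-3} \text{ m}^3

w = -P \times \Delta V = -(101325 \text{ Pa}) \times (2.00 \times 10^{-3} \text{ m}^3) \approx -203 \text{ J}

The negative result indicates that the system did about 203 J of work on the surroundings as the gas expanded.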

Exploratory Data Analysis – Computing Descriptive Statistics in R for Data on Ozone Pollution in New York City

Introduction

This is the first of a series of posts on exploratory data analysis (EDA).  This post will calculate the common summary statistics of a univariate continuous data set – the data on ozone pollution in New York City that is part of the built-in “airquality” data set in R.  This is a particularly good data set to work with, since it has missing values – a common problem in many real data sets.  In later posts, I will continue this series by exploring other methods in EDA, including box plots and kernel density plots.
