Convert multiple variables between character and numeric formats in SAS
April 27, 2018
Introduction
I often get data that are coded as character but are actually meant to be numeric. Converting them into the correct variable types is therefore a common task, and SAS Note #24590 shows how to do it for a single variable. However, I recently needed to do hundreds of these conversions, so I wanted some code to accomplish this quickly and accurately. This tutorial shows that code.
Let’s consider this small data set in SAS as an example. It contains hypothetical statistics for 3 players in a basketball game.
data basketball1;
    input jersey points $ rebounds $ assists $;
    datalines;
21 10 14 1
4 11 3 12
23 29 4 5
;
run;
The 3 performance metrics (points, rebounds, and assists) are clearly numeric, but they are currently coded as character. (You can use PROC CONTENTS to confirm this if needed.)
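As a quick check, the variable types can be listed with PROC CONTENTS. This is a minimal sketch that assumes BASKETBALL1 has been created in the WORK library by the DATA step above.

```sas
/* List each variable in BASKETBALL1 with its type and length. */
/* Assumes the data set exists in the WORK library.            */
proc contents data = basketball1;
run;
```

In the variable listing, the Type column should show Num for JERSEY and Char for POINTS, REBOUNDS, and ASSISTS.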
The jersey number is really a character variable, because its magnitude has no real-life meaning. The National Basketball Association (NBA) allows “00” as a possible jersey number. (Robert Parish wore this jersey number; he won 4 NBA championships and reached the Naismith Basketball Hall of Fame.) If you code “00” as a numeric variable, then it will render as “0”. Thus, it is best to store NBA jersey numbers as a character variable.
I can convert these variables into the correct types using the following code. Note that I chose the format “2.” for JERSEY, because I know that jersey numbers in the NBA have, at most, 2 digits.
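The leading-zero problem can be seen in a small stand-alone sketch. The variable names JERSEY_NUM and JERSEY_CHR are hypothetical and used only for this illustration.

```sas
/* Compare how "00" is kept as a numeric value versus a character value. */
/* JERSEY_NUM and JERSEY_CHR are illustrative names only.                */
data _null_;
    jersey_num = input('00', 8.);   /* numeric: stored as the number 0   */
    jersey_chr = '00';              /* character: both digits are kept   */
    put jersey_num= jersey_chr=;    /* write both values to the SAS log  */
run;
```

The log should show a value of 0 for JERSEY_NUM and 00 for JERSEY_CHR, which is why the character type suits jersey numbers better.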
data basketball2;
    set basketball1;

    jersey2 = put(jersey, 2.);
    drop jersey;
    rename jersey2 = jersey;

    points2 = input(points, 8.);
    drop points;
    rename points2 = points;

    rebounds2 = input(rebounds, 8.);
    drop rebounds;
    rename rebounds2 = rebounds;

    assists2 = input(assists, 8.);
    drop assists;
    rename assists2 = assists;
run;
The code above works, but it becomes very cumbersome when I need to convert many variables, a situation that arose in my job recently. In this tutorial, I will show a fast way of doing these conversions for many variables at once. I will use the BASKETBALL1 data set as an example and convert POINTS, REBOUNDS, and ASSISTS from character to numeric simultaneously.