Tag Archive for SAS

ERROR: Ambiguous reference, column xx is in more than one table.

Sometimes, SAS messages are not all that they seem and a number of them are issued from PROC SQL when something goes awry with your code. In fact, I got a message like the above when ordering the results of the join using a variable that didn’t exist in either of the datasets that were joined. This type of thing has been around for a while (I have been using SAS since version 6.11 and it was there then) and it amazes me that we haven’t seen a better message in more recent versions of SAS; it was SAS 9.2 where I saw it most recently.

proc sql noprint;
select a.yy, a.yyy, b.zz
from a left join b
on a.yy=b.yy
order by xx;
quit;

Calculating geometric means using SAS

Recently, I needed to calculate geometric means after a break of a number of years and ended racking a weary brain until it was brought back to me. In order that I am not in the same situation again, I am recording it here and sharing it always is good too.

The first step is to take the natural log (to base e or the approximated irrational value of the mathematical constant, 2.718281828) of the actual values in the data. Since you cannot have a log of zero, the solution is either to exclude those values or substitute a small value that will not affect the overall result as is done in the data step below. In SAS, the log function uses the number e as its base and you need to use the log10 equivalent when base 10 logs are needed.

data temp;
set temp;
if result=0 then _result=0.000001;
else _result=result;
ln_result=log(_result);
run;

Next, the mean of the log values is determined and you can use any method of doing that so long as it gives the correct values. PROC MEANS is used here but PROC SUMMARY (identical to MEANS except it defaults to data set creation while that generates output by default; because of that, we need to use the NOPRINT option here), PROC UNIVARIATE or even the MEAN function in PROC SQL.

proc means data=temp noprint;
output out=mean mean=mean;
var ln_result;
run;

With the mean of the log values obtained, we need to take the exponential of the obtained value(s) using the EXP function. This returns values of the same magnitude as in the original data using the formula emean.

data gmean;
set mean;
gmean=exp(mean);
run;

Creating placeholder graphics in SAS using PROC GSLIDE for when no data are available

Recently, I found myself with a plot to produce but there were no data to be presented so a placeholder output is needed. For a lisitng or a table, this is a matter of detecting if there are observations to be listed or summarised and then issuing a placeholder lisitng using PROC REPORT if there are no data available. Using SAS/GRAPH, something similar can be acheived using one of its curiosities.

In the case of SAS/GRAPH, PROC GSLIDE looks like the tool to user for the same purpose. The procedure does get covered as part of a SAS Institute SAS/GRAPH training course but they tend to gloss over it. After all, there is little reason to go creating presentations in SAS when PowerPoint and its kind offer far more functionality. However, it would make an interesting tale to tell how GSLIDE became part of SAS/GRAPH in the first place. Its existence makes me wonder if it pre-exists the main slideshow production tools that we use today.

The code that uses PROC GSLIDE to create a placeholder graphic is as follows (detection of the number of observations in a SAS dataset is another entry on here):

proc gslide;
note height=10;
note j=center “No data are available”;
run;
quit;

PROC GSLIDE is one of those run group procedures in SAS so a QUIT statement is needed to close it. The NOTE statements specify the text to be added to the graphic. The first of these creates a blank line of the required height for placing the main text in the middle of the graphic. It is the second one that adds the centred text that tells users of the generated output what has happened.

Creating waterfall plots in SAS using PROC GCHART

Recently, I needed to create a waterfall plot couldn’t use PROC SGPLOT since it was incompatible with publishing macros that use PROC GREPLAY on the platform that I was using; SGPLOT doesn’t generate plots in SAS catalogs but directly creates graphics files instead. Therefore, I decided that PROC GCHART needed to be given a go and it delivered what was needed .

The first step is to get the data into the required sort order:

proc sort data=temp;
by descending result;
run;

Then, it is time to add an ID variable for use in the plot’s X-axis (or midpoint axis in PROC GCHART) using an implied value retention to ensure that every record in the dataset had a unique identifier:

data temp;
set temp;
id+1;
run;

After that, axes have to be set up as needed. For instance, the X-axis (the axis2 statement below) needs to be just a line with no labels or tick marks on there and the Y-axis was fully set up with these, turning the label from vertical to horizontal as needed with the ANGLE option controlling the overall angle of the word(s) and the ROTATE option dealing with the letters, and a range declaration using the ORDER option.

axis1 label=none major=none minor=none value=none;
axis2 label=(rotate=0 angle=90 “Result”) order=(-50 to 80 by 10);

With the axis statements declared, the GCHART procedure can be defined. Of this, the VBAR statement is the engine of the plot creation with the ID variable used for the midpoint axis and the result variable used as the summary variable for the Y-axis. The DISCRETE keyword is needed to produce a bar for every value of the ID variable or GCHART will bundle them by default. Next, references for the above axis statements (MAXIS option for midpoint axis and AXIS option for Y-axis) are added and the plot definition is complete. One thing that has to be remembered is that GCHART uses run group processing so a QUIT statement is needed at the end to close it at execution time. This feature has its uses and appears in other procedures too though SAS procedures generally are concluded by a RUN statement.

proc gchart data=temp;
vbar id / sumvar=result discrete axis=axis2 maxis=axis1;
run;
quit;

Smoother use of more than one SAS DMS session at a time

Unless you have access to SAS Enterprise Guide, being able to work on one project at a time can be a little inconvenient. It is possible to open up more than one Display Manager System (DMS, the traditional SAS programming interface) session at a time but you can get a pop up window for SAS documentation for second and subsequent sessions and you don’t get your settings shared across them either.

The cause of both of the above is the locking of the SASUSER directory files by the first SAS session. However, it is possible to set up a number of directories and set the -sasuser option to point at different ones for different sessions.

On Windows, the command in the SAS shortcut becomes:

C:\Program Files\SAS\SAS 9.1\sas.exe -sasuser “c:\sasuser\session 1\”

On UNIX or Linux, it would look similar to this:

sas -sasuser “~/sasuser/session1/”

The “session1″ in the folder paths above can be replaced with whatever you need and you can have as many as you want too. It might not seem much of a need but synchronising the SASUSER folders every now and again can give you a more consistent set of settings across each and you don’t get intrusive pop up boxes or extra messages in the log either.

Dealing with variable length warnings in SAS 9.2

A habit of mine is to put a LENGTH or ATTRIB statement between DATA and SET statements in a SAS data step to reset variable lengths. By default, it seems that this can triggers truncation warnings in SAS 9.2 or SAS 9.3 when it didn’t in previous versions. SAS 9.1.3, for instance, allowed you have something like the following for shortening a variable length without issuing any messages at all:

data b;
length x $100;
set a;
run;

In this case, x could have a length of 200 previously and SAS 9.1.3 wouldn’t have complained. Now, SAS 9.2 and 9.3 will issue a warning if the new length is less than the old length. This can be useful to know but it can be changed using the VARLENCHK system option. The default value is WARN but it can be set to ERROR if you really want to ensure that there is no chance of truncation. Then, you get error messages and the program fails where it normally would run with warnings. Setting the value of the option to NOWARN restores the type of behaviour seen in SAS 9.1.3 and versions prior to that.

The SAS documentation says that the ability to change VARLENCHK can be restricted by an administrator so you might need to deal with this situation in a more locked down environment. Then, one option would be to do something like the following:

data b;
drop x;
rename _x=x;
set a;
length _x $100;
_x=strip(x);
run;

It’s a bit more laborious than setting to the VARLENCHK option to NOWARN but the idea is that you create a new variable of the right length and replace the old one with it. That gets rid of warnings or errors in the log and resets the variable length as needed. Of course, you have to ensure that there is no value truncation with either remedy. If any is found, then the dataset specification probably needs updating to accommodate the length of the values in the data. After all, there is no substitute for getting to know your data and doing your own checking should you decide to take matters into your hands.

There is a use for the default behaviour though. If you use a specification to specify a shell for a dataset, then you will be warned when the shell shortens variable lengths. That allows you to either adjust the dataset or your program. Also, it gives more information when you get variable length mismatch warnings when concatenating or merging datasets. There was a time when SAS wasn’t so communicative in these situations and some investigation was needed to establish which variable was affected. Now, that has changed without leaving the option to work differently if you so do desire. Sometimes, what can seem like an added restriction can have its uses.

  • As is commonly the case with places like these, all the views that you find expressed on here in postings and articles are mine alone and not those of any organisation with which I have any association, through work or otherwise. With regards to any comments left on the site, I reserve the right to reject any that are inappropriate. Otherwise, whatever is said is the sole responsibility of whoever is leaving the comment.

Private