Technology Tales

Adventures & experiences in contemporary technology

Using the IN operator in SAS Macro programming

8th October 2012

This useful addition came in SAS 9.2 and I am amazed that it isn’t enabled by default. To accomplish that, you need to set the MINOPERATOR option unless someone has done it for you in the SAS AUTOEXEC or another configuration program. Thus, the safety first approach is to have code like the following:

options minoperator;

%macro inop(x);

%if &x in (a b c) %then %do;
%put Value is included;
%end;
%else %do;
%put Value not included;
%end;

%mend inop;

%inop(a);

Also, the default delimiter is the space, so if you need to change that, then the MINDELIMITER option needs setting. Adjusting the above code so that the delimiter now is the comma character gives us the following:

options minoperator mindelimiter=",";

%macro inop(x);

%if &x in (a,b,c) %then %do;
%put Value is included;
%end;
%else %do;
%put Value not included;
%end;

%mend inop;

%inop(a);

Without any of the above, the only approach is to have the following and that is what we had to do for SAS versions prior to 9.2:

%macro inop(x);

%if &x=a or &x=b or &x=c %then %do;
%put Value is included;
%end;
%else %do;
%put Value not included;
%end;

%mend inop;

%inop(a);

It may be clunky but it does work and remains a fallback in newer versions of SAS. Saying that, having the IN operator available makes writing SAS Macro code that little bit more swish so it’s a good thing to know.

ERROR: This range is repeated, or values overlap: – .

15th September 2012

This is another posting in an occasional series on SAS error and warning messages that aren’t as clear as they’d need to be. What produced the message was my creation of a control data set that I then wished to use to create a data-driven (in)format. It was the PROC FORMAT step that issued the message and I got no (in)format created. However, there were no duplicate entries in the control data set as the message suggested to me so a little more investigation was needed.

What that revealed was that there might be one variable missing from the data set that I needed to have. The SAS documentation has FMTNAME, START and LABEL as compulsory variables with they containing the following: format name, initial value and displayed value. My intention was to create a numeric code variable for one containing character strings using my data-driven format with then numbers specified within a character variable as it should be. What was missing then was TYPE.

This variable can be one of the following values: C for character formats, I for numeric informats, J for character informats, N for numeric formats and P for picture formats. Due to it being a conversion from character values to numeric ones, I set the values of TYPE to I and used an input function to do the required operations. The code for successfully creating the informat is below:

proc sql noprint;
create table tpts as
select distinct "_vstpt" as fmtname,
"I" as type,
vstpt as start,
vstpt as end,
strip(put(vstptnum,best.)) as label
from test
where not missing(vstptnum);
quit;

proc format library=work cntlin=tpts;
run;
quit;

Though I didn’t need to do it, I added an END variable too for sake of completeness. In this case, the range is such that its start and end are the same and there are cases where that will not be the case though I am not dwelling on those.

WARNING: Engine XPORT does not support SORTEDBY operations. SORTEDBY information cannot be copied.

24th July 2012

When recently creating a transport file using PROC COPY and the XPORT library engine, I found the above message in the log. The code used was similar to the following:

libname tran xport "c:\temp\tran.xpt";

proc copy in=data out=tran;
run;

When I went seeking out the cause on the web, I discovered this SAS Note that dates from before the release of SAS 6.12, putting the issue at more than ten years old. My take on its continuing existence is that we still to use a transport file format that was introduced in SAS 5.x for sake of interoperability, both between SAS versions and across alternatives to the platform.

The SORTEDBY flag in a dataset header holds the keys used to sort the data and it isn’t being copied into the XPORT transport files, hence the warning. To get rid of it, you need to remove the information manually in data step using the SORTEDBY option on the DATA statement or using PROC DATASETS, which avoids rewriting the entire data set.

First up is the data step option:

data test(sortedby=_null_);
set sashelp.class;
run;

Then, there’s PROC DATASETS:

proc datasets;
modify test(sortedby=_null_);
run;
quit;

It might seem counterproductive to exclude the information but it makes no sense to keep what’s being lost anyway. So long as the actual sort order is unchanged, and I believe that the code that that below will not alter it, we can live with its documentation in a specification until transport files created using PROC CPORT are as portable as those from PROC COPY.

ERROR: Ambiguous reference, column xx is in more than one table.

5th May 2012

Sometimes, SAS messages are not all that they seem and a number of them are issued from PROC SQL when something goes awry with your code. In fact, I got a message like the above when ordering the results of the join using a variable that didn’t exist in either of the datasets that were joined. This type of thing has been around for a while (I have been using SAS since version 6.11 and it was there then) and it amazes me that we haven’t seen a better message in more recent versions of SAS; it was SAS 9.2 where I saw it most recently.

proc sql noprint;
select a.yy, a.yyy, b.zz
from a left join b
on a.yy=b.yy
order by xx;
quit;

Creating placeholder graphics in SAS using PROC GSLIDE for when no data are available

18th March 2012

Recently, I found myself with a plot to produce but there were no data to be presented so a placeholder output is needed. For a lisitng or a table, this is a matter of detecting if there are observations to be listed or summarised and then issuing a placeholder lisitng using PROC REPORT if there are no data available. Using SAS/GRAPH, something similar can be acheived using one of its curiosities.

In the case of SAS/GRAPH, PROC GSLIDE looks like the tool to user for the same purpose. The procedure does get covered as part of a SAS Institute SAS/GRAPH training course but they tend to gloss over it. After all, there is little reason to go creating presentations in SAS when PowerPoint and its kind offer far more functionality. However, it would make an interesting tale to tell how GSLIDE became part of SAS/GRAPH in the first place. Its existence makes me wonder if it pre-exists the main slideshow production tools that we use today.

The code that uses PROC GSLIDE to create a placeholder graphic is as follows (detection of the number of observations in a SAS dataset is another entry on here):

proc gslide;
note height=10;
note j=center "No data are available";
run;
quit;

PROC GSLIDE is one of those run group procedures in SAS so a QUIT statement is needed to close it. The NOTE statements specify the text to be added to the graphic. The first of these creates a blank line of the required height for placing the main text in the middle of the graphic. It is the second one that adds the centred text that tells users of the generated output what has happened.

Creating waterfall plots in SAS using PROC GCHART

17th March 2012

Recently, I needed to create a waterfall plot couldn’t use PROC SGPLOT since it was incompatible with publishing macros that use PROC GREPLAY on the platform that I was using; SGPLOT doesn’t generate plots in SAS catalogs but directly creates graphics files instead. Therefore, I decided that PROC GCHART needed to be given a go and it delivered what was needed .

The first step is to get the data into the required sort order:

proc sort data=temp;
by descending result;
run;

Then, it is time to add an ID variable for use in the plot’s X-axis (or midpoint axis in PROC GCHART) using an implied value retention to ensure that every record in the dataset had a unique identifier:

data temp;
set temp;
id+1;
run;

After that, axes have to be set up as needed. For instance, the X-axis (the axis2 statement below) needs to be just a line with no labels or tick marks on there and the Y-axis was fully set up with these, turning the label from vertical to horizontal as needed with the ANGLE option controlling the overall angle of the word(s) and the ROTATE option dealing with the letters, and a range declaration using the ORDER option.

axis1 label=none major=none minor=none value=none;
axis2 label=(rotate=0 angle=90 "Result") order=(-50 to 80 by 10);

With the axis statements declared, the GCHART procedure can be defined. Of this, the VBAR statement is the engine of the plot creation with the ID variable used for the midpoint axis and the result variable used as the summary variable for the Y-axis. The DISCRETE keyword is needed to produce a bar for every value of the ID variable or GCHART will bundle them by default. Next, references for the above axis statements (MAXIS option for midpoint axis and AXIS option for Y-axis) are added and the plot definition is complete. One thing that has to be remembered is that GCHART uses run group processing so a QUIT statement is needed at the end to close it at execution time. This feature has its uses and appears in other procedures too though SAS procedures generally are concluded by a RUN statement.

proc gchart data=temp;
vbar id / sumvar=result discrete axis=axis2 maxis=axis1;
run;
quit;

Smoother use of more than one SAS DMS session at a time

11th March 2012

Unless you have access to SAS Enterprise Guide, being able to work on one project at a time can be a little inconvenient. It is possible to open up more than one Display Manager System (DMS, the traditional SAS programming interface) session at a time but you can get a pop up window for SAS documentation for second and subsequent sessions and you don’t get your settings shared across them either.

The cause of both of the above is the locking of the SASUSER directory files by the first SAS session. However, it is possible to set up a number of directories and set the -sasuser option to point at different ones for different sessions.

On Windows, the command in the SAS shortcut becomes:

C:\Program Files\SAS\SAS 9.1\sas.exe -sasuser "c:\sasuser\session 1\"

On UNIX or Linux, it would look similar to this:

sas -sasuser "~/sasuser/session1/"

The “session1” in the folder paths above can be replaced with whatever you need and you can have as many as you want too. It might not seem much of a need but synchronising the SASUSER folders every now and again can give you a more consistent set of settings across each and you don’t get intrusive pop up boxes or extra messages in the log either.

Dealing with variable length warnings in SAS 9.2

11th January 2012

A habit of mine is to put a LENGTH or ATTRIB statement between DATA and SET statements in a SAS data step to reset variable lengths. By default, it seems that this triggers truncation warnings in SAS 9.2 or SAS 9.3 when it didn’t in previous versions. SAS 9.1.3, for instance, allowed you have something like the following for shortening a variable length without issuing any messages at all:

data b;
length x $100;
set a;
run;

In this case, x could have a length of 200 previously and SAS 9.1.3 wouldn’t have complained. Now, SAS 9.2 and 9.3 will issue a warning if the new length is less than the old length. This can be useful to know but it can be changed using the VARLENCHK system option. The default value is WARN but it can be set to ERROR if you really want to ensure that there is no chance of truncation. Then, you get error messages and the program fails where it normally would run with warnings. Setting the value of the option to NOWARN restores the type of behaviour seen in SAS 9.1.3 and versions prior to that.

The SAS documentation says that the ability to change VARLENCHK can be restricted by an administrator so you might need to deal with this situation in a more locked down environment. Then, one option would be to do something like the following:

data b;
drop x;
rename _x=x;
set a;
length _x $100;
_x=strip(x);
run;

It’s a bit more laborious than setting to the VARLENCHK option to NOWARN but the idea is that you create a new variable of the right length and replace the old one with it. That gets rid of warnings or errors in the log and resets the variable length as needed. Of course, you have to ensure that there is no value truncation with either remedy. If any is found, then the dataset specification probably needs updating to accommodate the length of the values in the data. After all, there is no substitute for getting to know your data and doing your own checking should you decide to take matters into your hands.

There is a use for the default behaviour though. If you use a specification to specify a shell for a dataset, then you will be warned when the shell shortens variable lengths. That allows you to either adjust the dataset or your program. Also, it gives more information when you get variable length mismatch warnings when concatenating or merging datasets. There was a time when SAS wasn’t so communicative in these situations and some investigation was needed to establish which variable was affected. Now, that has changed without leaving the option to work differently if you so do desire. Sometimes, what can seem like an added restriction can have its uses.

Setting VIEWTABLE to show column names in SAS

15th September 2011

By default in the DMS, Base SAS opens datasets from its Explorer using VIEWTABLE and with variable labels in the column headings and not variable names. Because I have been fortunate to use systems with SAS/FSP both installed and licensed, I have taken to using FSVIEW for browsing SAS datasets as a workaround and, though the interface may look old to some, it proves to be a very flexible tool that still has a few things to teach newer ones. With SAS Enterprise Guide, the dataset viewing functionality is different to both VIEWTABLE and FSVIEW but I have been to make it work for me. The SAS EG dataset viewing tool may appear like the former of these but it has a few tricks to teach its forbear.

Now that I find myself working again with the traditional SAS DMS interface and without SAS/FSP, I decided to see if there was a way to get VIEWTABLE to display variable names instead of variable labels by default. As it happened, the answer was found in an internet forum discussion. From the SAS command line, you can achieve the result issuing a command like the following:

VT SASHELP.VCOLUMN COLHEADING=NAMES

VT is the VIEWTABLE shortcut but it is the COLHEADING=NAMES option on the line that gets variable names shown in column headings. Taking it further, you can set this as the default setting for datasets opened using a mouse from Explorer panes using the following procedure:

  • Click in or on the Explorer pane to highlight the the Explorer window.
  • Select Tools->Options->Explorer in the menus.
  • Select the Members tab.
  • Double click on the TABLE icon.
  • Double click on the &Open action.
  • Set the Action command to:  VIEWTABLE %8b.’%s’.DATA COLHEADING=NAMES.
  • Click on the Set Default button.
  • Save changes and close the Explorer Options window.

Because the DMS looks similar across versions 8.0 through to 9.2, the above instructions should be relevant to all of those. While I have yet to get the opportunity to use SAS 9.3, I would be surprised to find that the traditional SAS interface has changed there too, even though much else has changed about SAS. In fact, the latest version of SAS has brought quite a few new interesting features for programmers so it seems that you can do more through a familiar interface, not entirely a bad thing. It looks as if this VIEWTABLE tweak could be useful for a while yet.

Using Data Step to Create a Dataset Template from a Dataset in SAS

23rd November 2010

Recently, I wanted to make sure that some temporary datasets that were being created during data processing in a dataset creation program weren’t truncating values or differed from the variable lengths in the original. It was then that a brainwave struck me: create an empty dataset shell using data step and use that set all the variable lengths for me when the new datasets were concatenated to it. The code turned out to be very simple and here is an example of how it looked:

data shell;
stop;
set example;
run;

The STOP statement, prevents the data step from reading in any of the values in the template dataset and just its header is written out to another (empty) dataset that can be used to set things up as you would want them to be. It certainly was a quick solution in my case.

  • All the views that you find expressed on here in postings and articles are mine alone and not those of any organisation with which I have any association, through work or otherwise. As regards editorial policy, whatever appears here is entirely of my own choice and not that of any other person or organisation.

  • Please note that everything you find here is copyrighted material. The content may be available to read without charge and without advertising but it is not to be reproduced without attribution. As it happens, a number of the images are sourced from stock libraries like iStockPhoto so they certainly are not for abstraction.

  • With regards to any comments left on the site, I expect them to be civil in tone of voice and reserve the right to reject any that are either inappropriate or irrelevant. Comment review is subject to automated processing as well as manual inspection but whatever is said is the sole responsibility of the individual contributor.