Technology Tales

Adventures & experiences in contemporary technology

Smoother use of more than one SAS DMS session at a time

11th March 2012

Unless you have access to SAS Enterprise Guide, being able to work on one project at a time can be a little inconvenient. It is possible to open up more than one Display Manager System (DMS, the traditional SAS programming interface) session at a time but you can get a pop up window for SAS documentation for second and subsequent sessions and you don’t get your settings shared across them either.

The cause of both of the above is the locking of the SASUSER directory files by the first SAS session. However, it is possible to set up a number of directories and set the -sasuser option to point at different ones for different sessions.

On Windows, the command in the SAS shortcut becomes:

C:\Program Files\SAS\SAS 9.1\sas.exe -sasuser "c:\sasuser\session 1\"

On UNIX or Linux, it would look similar to this:

sas -sasuser "~/sasuser/session1/"

The “session1” in the folder paths above can be replaced with whatever you need and you can have as many as you want too. It might not seem much of a need but synchronising the SASUSER folders every now and again can give you a more consistent set of settings across each and you don’t get intrusive pop up boxes or extra messages in the log either.

Dealing with variable length warnings in SAS 9.2

11th January 2012

A habit of mine is to put a LENGTH or ATTRIB statement between DATA and SET statements in a SAS data step to reset variable lengths. By default, it seems that this triggers truncation warnings in SAS 9.2 or SAS 9.3 when it didn’t in previous versions. SAS 9.1.3, for instance, allowed you have something like the following for shortening a variable length without issuing any messages at all:

data b;
length x $100;
set a;
run;

In this case, x could have a length of 200 previously and SAS 9.1.3 wouldn’t have complained. Now, SAS 9.2 and 9.3 will issue a warning if the new length is less than the old length. This can be useful to know but it can be changed using the VARLENCHK system option. The default value is WARN but it can be set to ERROR if you really want to ensure that there is no chance of truncation. Then, you get error messages and the program fails where it normally would run with warnings. Setting the value of the option to NOWARN restores the type of behaviour seen in SAS 9.1.3 and versions prior to that.

The SAS documentation says that the ability to change VARLENCHK can be restricted by an administrator so you might need to deal with this situation in a more locked down environment. Then, one option would be to do something like the following:

data b;
drop x;
rename _x=x;
set a;
length _x $100;
_x=strip(x);
run;

It’s a bit more laborious than setting to the VARLENCHK option to NOWARN but the idea is that you create a new variable of the right length and replace the old one with it. That gets rid of warnings or errors in the log and resets the variable length as needed. Of course, you have to ensure that there is no value truncation with either remedy. If any is found, then the dataset specification probably needs updating to accommodate the length of the values in the data. After all, there is no substitute for getting to know your data and doing your own checking should you decide to take matters into your hands.

There is a use for the default behaviour though. If you use a specification to specify a shell for a dataset, then you will be warned when the shell shortens variable lengths. That allows you to either adjust the dataset or your program. Also, it gives more information when you get variable length mismatch warnings when concatenating or merging datasets. There was a time when SAS wasn’t so communicative in these situations and some investigation was needed to establish which variable was affected. Now, that has changed without leaving the option to work differently if you so do desire. Sometimes, what can seem like an added restriction can have its uses.

Setting VIEWTABLE to show column names in SAS

15th September 2011

By default in the DMS, Base SAS opens datasets from its Explorer using VIEWTABLE and with variable labels in the column headings and not variable names. Because I have been fortunate to use systems with SAS/FSP both installed and licensed, I have taken to using FSVIEW for browsing SAS datasets as a workaround and, though the interface may look old to some, it proves to be a very flexible tool that still has a few things to teach newer ones. With SAS Enterprise Guide, the dataset viewing functionality is different to both VIEWTABLE and FSVIEW but I have been to make it work for me. The SAS EG dataset viewing tool may appear like the former of these but it has a few tricks to teach its forbear.

Now that I find myself working again with the traditional SAS DMS interface and without SAS/FSP, I decided to see if there was a way to get VIEWTABLE to display variable names instead of variable labels by default. As it happened, the answer was found in an internet forum discussion. From the SAS command line, you can achieve the result issuing a command like the following:

VT SASHELP.VCOLUMN COLHEADING=NAMES

VT is the VIEWTABLE shortcut but it is the COLHEADING=NAMES option on the line that gets variable names shown in column headings. Taking it further, you can set this as the default setting for datasets opened using a mouse from Explorer panes using the following procedure:

  • Click in or on the Explorer pane to highlight the the Explorer window.
  • Select Tools->Options->Explorer in the menus.
  • Select the Members tab.
  • Double click on the TABLE icon.
  • Double click on the &Open action.
  • Set the Action command to:  VIEWTABLE %8b.’%s’.DATA COLHEADING=NAMES.
  • Click on the Set Default button.
  • Save changes and close the Explorer Options window.

Because the DMS looks similar across versions 8.0 through to 9.2, the above instructions should be relevant to all of those. While I have yet to get the opportunity to use SAS 9.3, I would be surprised to find that the traditional SAS interface has changed there too, even though much else has changed about SAS. In fact, the latest version of SAS has brought quite a few new interesting features for programmers so it seems that you can do more through a familiar interface, not entirely a bad thing. It looks as if this VIEWTABLE tweak could be useful for a while yet.

Using Data Step to Create a Dataset Template from a Dataset in SAS

23rd November 2010

Recently, I wanted to make sure that some temporary datasets that were being created during data processing in a dataset creation program weren’t truncating values or differed from the variable lengths in the original. It was then that a brainwave struck me: create an empty dataset shell using data step and use that set all the variable lengths for me when the new datasets were concatenated to it. The code turned out to be very simple and here is an example of how it looked:

data shell;
stop;
set example;
run;

The STOP statement, prevents the data step from reading in any of the values in the template dataset and just its header is written out to another (empty) dataset that can be used to set things up as you would want them to be. It certainly was a quick solution in my case.

Creating a Data Set Containing Confidence Intervals Using PROC UNIVARIATE

5th September 2010

While you could generate data sets containing means and confidence intervals using PROC SUMMARY or PROC MEANS, curiosity and the need to verify a program using a different technique were what drove me to consider using PROC UNIVARIATE for the task. For the record, the PROC SUMMARY code is below and the only difference between it and MEANS is that it doesn’t produce output by default, something that’s not needed in this case anyway. Quite why there are two SAS procedures doing exactly the same thing is beyond me though I do wonder if the NOPRINT options was a later addition than these two procedures. The LCLM and UCLM keywords are what triggers the calculation of confidence limits and the ALPHA option controls the confidence interval used; 0.05 specifies a 95% interval, 0.1 a 90% one and so on.

proc summary data=sashelp.class mean lclm uclm alpha=0.05;
var age;
output out=sasuser.lims mean=mean lclm=lclm uclm=uclm;
run;

Given that I have had PROC UNIVARIATE producing statistics that MEANS/SUMMARY didn’t in previous versions of SAS (I believe that is was standard deviation that was absent from MEANS/SUMMARY), I might have expected the calculation and export of confidence limits to a data set to be straightforward. Sadly, it’s not a case of simply adding LCLM and UCLM keywords in the OUTPUT statement for the procedure and ODS OUTPUT is needed to create the data set instead. An ODS SELECT statement is needed to pick out the BasicIntervals output object (UNIVARIATE creates quite a few, it seems) that is created through specification of the CIBASIC and ALPHA (performs the same role as it does for PROC MEANS/SUMMARY) options on the PROC UNIVARIATE statement. The reason for the ODS LISTING and ODS RTF statements below is to stop output being sent to the output window in a standard SAS session. For some reason, it appears that you need the sending of output to one of the LISTING, HTML or RTF destinations or there will be no data in the data set; I met up with the same behaviour when using ODS PS, an ODS PRINTER destination. The data set will contain statistics for mean, standard deviation and variance so that’s why there is a WHERE clause on the ODS OUTPUT statement.

ods listing close;
ods rtf body="c:\temp\uni_eg.doc";
ods select BasicIntervals;
ods output BasicIntervals=sasuser.stats(where=(lowcase(parameter)="mean") );

proc univariate cibasic alpha=0.05 data=sashelp.class;
var age;
run;

ods output close;
ods rtf close;
ods listing;

Using ODS Graphics to Create Plots Using PROC LIFETEST

3rd September 2010

One of the nice things about SAS 9.2 is that creation of statistical graphics is enhanced using ODS. One of the beneficiaries of this is PROC LIFETEST, a procedure that gained a lot when data sets could be created from it using ODS OUTPUT  statements. Before that, it was a matter of creating text output and converting it to a SAS data set using Data Step and that was a nuisance on a system that attached special significance to output destinations set up using PROC PRINTTO. What you’ll find below is a sample of the type of code for creating a Kaplan-Meier survival plot for time to adverse events resulting in discontinuation of study treatment with actual and censored times. The IMAGENAME parameter on the ODS GRAPHICS statement line controls the name of the file and it is possible to change the type using the IMAGEFMT parameter too.

ods graphics on / imagename=”fig5″;
proc lifetest data=km3 method=km plots=survival;
time timetoae*cens_ae(0);
run;
ods graphics off;

On Making PROC REPORT Work Harder

1st September 2010

In the early years of my SAS programming career, there seemed to be just the one procedure to use if you wanted to create a summary table. That was TABULATE and it was great for generating columns according to the value of a variable such as the treatment received by a subject in a clinical study. To a point, it could generate statistics for you too and I often used it to sum frequency and percentage variables. Since then, it seems to have been enhanced a little and it surprised me with the statistics it could produce when I had a recent play. Here’s the code:

proc tabulate data=sashelp.class;
class sex;
var age;
table age*(n median*f=8. mean*f=8.1 std*f=8.1 min*f=8. max*f=8. lclm*f=8.1 uclm*f=8.1),sex / misstext="0";
run;

When you compare that with the idea of creating one variable per column and then defining them in PROC REPORT as many do, it has to look more elegant and the results aren’t bad either though they can be tweaked further from the quick example that I generated. That last comment brings me to the point that PROC REPORT seems to have taken over from TABULATE wherever I care to look these days and I do ask myself if it is the right tool for that for which it is being used or if it is being used in the best way.

Using Data Step to create one variable per column in a PROC REPORT output doesn’t strike me as the best way to write reusable code but there are ways to make REPORT do more for you. For example, by defining GROUP, ACROSS and ANALYSIS columns in an output, you can persuade the procedure to do the summarising for you and there’s some example code below with the comma nesting height under sex in the resulting table. Sums are created by default if you do this and forgoing an analysis column definition means that you get a frequency table, not at all a useless thing in many cases.

proc report data=sashelp.class nowd missing;
columns age sex,height;
define age / group "Age";
define sex / across "Sex";
define height / analysis mean f=missing. "Mean Height";
run;

For those times when you need to create more heavily formatted statistics (summarising range as min-max rather showing min and max separately, for example), you might feel that the GROUP/ACROSS set-up’s non-display of character values puts a stop to using that approach. However, I found that making every value combination unique and attaching a cell ID helps to work around the problem. Then, you can create a format control data set from the data like in the code below and create a format from that which you can apply to the cell ID’s to display things as you need them. This method does make things more portable from situation to situation than adding or removing columns depending on the values of a classification variable.

proc sql noprint;
create table cntlin as
select distinct "fmtname" as fmtname, cellid as start, cellid as end, decode as label
from report;
quit;

proc format lib=work cntlin=cnlin;
run;

A look at Emacs

10th August 2010

It’s amazing what work can bring your way in terms of technology. For me, (GNU) Emacs Has proved to be such a thing recently. It may have been around since 1975, long before my adventures in computing ever started, in fact, but I am asking myself why I never really have used it much. There are vague recollections of my being aware of its existence in the early days of my using UNIX over a decade ago. Was it a shortcut card with loads of seemingly esoteric keyboard shortcuts and commands that put me off it back then? The truth may have been that I got bedazzled with the world of Microsoft Windows instead, and so began a distraction that lingered until very recently. As unlikely as it looks now, Word and Office would have been part of the allure of what some consider as the dark side these days. O how OpenOffice.org and their ilk have changed that state of affairs…

The unfortunate part of the Emacs story might be that its innovations were never taken up as conventions by mainstream computing. If its counterparts elsewhere used the same keyboard shortcuts, it would feel like learning such an unfamiliar tool. Still, it’s not as if there isn’t logic behind it because it will work both in a terminal session (where I may have met it for the first time) and a desktop application GUI. The latter is the easier to learn, and the menus list equivalent keyboard shortcuts for many of their entries, too. For a fuller experience though, I can recommend the online manual, and you can buy it in paper form too if you prefer.

One thing that I discovered recently is that external factors can sour the impressions of a piece of software. For instance, I was using a UNIX session where the keyboard mapping weren’t optimal. There’s nothing like unfamiliar behaviour for throwing you off track because you felt your usual habits were being obstructed. For instance, finding that a Backspace key is behaving like a Delete one is such an obstruction. It wasn’t the fault of Emacs, and I have found that using Ctrl+K (C-k in the documentation) to delete whole lines is invaluable.

Apart from keyboard mapping niggles, Emacs has to be respected as a powerful piece of software in its own right. It may not have the syntax highlighting capabilities of some, like gedit or NEdit for instance, but I have a hunch that a spot of Lisp programming would address that need. What you get instead is support for version control systems like RCS or CVS, along with integration with GDB for debugging programs written in a number of languages. Then, there are features like file management, email handling, newsgroup browsing, a calendar and calculator that make you wonder if they tried to turn a text editor into something like an operating system. With Google trying to use Chrome as the basis of one, it almost feels as is Emacs was ahead of its time, though that may have been more due to its needing to work within a UNIX shell in those far-off pre-GUI days. It really is saying something that it has stood the test of time when so much has fallen by the wayside. Like Vi, it looks as if the esteemable piece of software is showing no signs of going away just yet. Maybe it was well-designed in the beginning, and the thing certainly seems more than a text editor with its extras. Well, it has to offer a good reason for making its way into Linux too…

ERROR 22-322: Syntax error, expecting one of the following: a name, *.

14th June 2010

This is one of the classic SAS errors that you can get from PROC SQL and it can be thrown by a number of things. Missing out a comma in a list of variables on a SELECT statement is one situation that will do it, as will having an extraneous one. As I discovered recently, an ill-defined SAS function nesting like LEFT(TRIM(PERIOD,BEST.)) will have the same effect; notice the missing PUT function in the example. The latter surprised me because I might have expected something more descriptive for this as would be the case in data step code. In the event, it took some looking before the problem hit me because it’s amazing how blind you can become to things that are staring you in the face. Familiarity really can make you pay less attention.

ERROR: Invalid value for width specified – width out of range

8th June 2010

This could be the beginning of a series on error messages from PROC SQL that may appear unclear to a programmer more familiar with Data Step. The cause of my getting the message that heads this posting is that there was a numeric variable with a length less that the default of 8, not the best of situations. Sadly, the message doesn’t pin point the affected variable so it took some commenting out of pieces of code before I found the cause of the problem. That’s never to say that PROC SQL does not have debugging functionality in the form of FEEDBACK, NOEXEC, _METHOD and _TREE options on the PROC SQL line itself or the validation statement but neither of these seemed to help in this instance. Still, they’re worth keeping in mind for the future as is SAS Institute’s own page on SQL query debugging. Of course, now that I know what might be the cause, a simple PROC SQL report using the dictionary tables should help. The following code should do the needful:

proc sql;
select memname, name, type, length
from dictionary.columns
where libname="DATA" and type="num" and length ne 8;
quit;

  • All the views that you find expressed on here in postings and articles are mine alone and not those of any organisation with which I have any association, through work or otherwise. As regards editorial policy, whatever appears here is entirely of my own choice and not that of any other person or organisation.

  • Please note that everything you find here is copyrighted material. The content may be available to read without charge and without advertising but it is not to be reproduced without attribution. As it happens, a number of the images are sourced from stock libraries like iStockPhoto so they certainly are not for abstraction.

  • With regards to any comments left on the site, I expect them to be civil in tone of voice and reserve the right to reject any that are either inappropriate or irrelevant. Comment review is subject to automated processing as well as manual inspection but whatever is said is the sole responsibility of the individual contributor.