Data Step | Technology Tales

ERROR 22-322: Syntax error, expecting one of the following: a name, *.

14th June 2010

This is one of the classic SAS errors that you can get from PROC SQL and it can be thrown by a number of things. Missing out a comma in a list of variables on a SELECT statement is one situation that will do it, as will having an extraneous one. As I discovered recently, an ill-defined SAS function nesting like LEFT(TRIM(PERIOD,BEST.)) will have the same effect; notice the missing PUT function in the example. The latter surprised me because I might have expected something more descriptive for this as would be the case in data step code. In the event, it took some looking before the problem hit me because it’s amazing how blind you can become to things that are staring you in the face. Familiarity really can make you pay less attention.

ERROR: Invalid value for width specified – width out of range

8th June 2010

This could be the beginning of a series on error messages from PROC SQL that may appear unclear to a programmer more familiar with Data Step. The cause of my getting the message that heads this posting is that there was a numeric variable with a length less that the default of 8, not the best of situations. Sadly, the message doesn’t pin point the affected variable so it took some commenting out of pieces of code before I found the cause of the problem. That’s never to say that PROC SQL does not have debugging functionality in the form of FEEDBACK, NOEXEC, _METHOD and _TREE options on the PROC SQL line itself or the validation statement but neither of these seemed to help in this instance. Still, they’re worth keeping in mind for the future as is SAS Institute’s own page on SQL query debugging. Of course, now that I know what might be the cause, a simple PROC SQL report using the dictionary tables should help. The following code should do the needful:

proc sql; select memname, name, type, length from dictionary.columns where libname="DATA" and type="num" and length ne 8; quit;

SAS Macro and Dataline/Cards Statements in Data Step

28th October 2008

Recently, I tried code like this in a SAS macro:

data sections; infile datalines dlm=","; input graph_table_number $15. text_line @1 @; datalines; "11.1 ,Section 11.1", "11.2 ,Section 11.2", "11.3 ,Section 11.3" ; run;

While it works in its own right, including it as part of a macro yielded this type of result:

ERROR: The macro X generated CARDS (data lines) for the DATA step, which could cause incorrect results. The DATA step and the macro will stop executing.

A bit of googling landed me on SAS-L where I spotted a solution like this one that didn’t involving throwing everything out:

filename temp temp;

data _null_; file temp; put; run;

data sections; length graph_table_number $15 text_line $100; infile temp dlm=","; input @; do _infile_= "11.1 ,Section 11.1", "11.2 ,Section 11.2", "11.3 ,Section 11.3" ; input graph_table_number $15. text_line @1 @; output; end; run;

filename temp clear;

The filename statement and ensuing data step creates a dummy file in the SAS work area that gets cleared at the end of every session. That seems to fool the macro engine into thinking that input is from a file and not the CARDS/DATALINES method to which it takes grave exception. The trailing @’s hold an input record for the execution of the next INPUT statement within the same iteration of the DATA step so that the automatic variable _infile_ can be fed as part of the input process in a do block with the output statement ensure that new records from the input buffer reach the data set being created.

This method does work but I would like to know the underlying reason as to why SAS Macro won’t play well with included data entry using DATALINES or CARDS statements in a data step, particularly when it allows other methods that using either SQL insert statements or standard variable assignment in data step. I find it such a curious behaviour that I remain on the lookout for the explanation why it is like this.

SAS Data Step Hash Objects and Memory

3rd June 2008

Using hash objects in SAS data step code offers some great advantages from the speed point of view; having a set of data in memory rather than on disk makes things much faster. However, that means that you need to keep more of an eye on the amount of memory that’s being used. The first thing is to work out how much memory is available and it’s not necessarily the total amount installed on the system or, for that matter, the amount of memory per processor on a multi-processor system. What you really need is the number, in bytes, that is stored in the XMRLMEM system option and here’s a piece of code that’ll do just that:

data _null_; mem=getoption('xmrlmem'); put mem; run;

The XMRLMEM is itself an option that you can only declare in the system call that starts SAS up in the first place and there are advantages to keeping it under control, particularly on large multi-user servers. However, if your hash objects start to exceed what is available, here’s the sort of thing that you can expect to see:

ERROR: Hash object added 49136 items when memory failure occurred. FATAL: Insufficient memory to execute data step program. Aborted during the EXECUTION phase. NOTE: The SAS System stopped processing this step because of insufficient memory. NOTE: SAS set option OBS=0 and will continue to check statements. This may cause NOTE: No observations in data set.

Those messages are a cue for you to learn to keep those hash objects and to only ever make them as large as your memory settings will allow. Another thing to note is that hash objects are best retained for rather fixed data volumes instead of ones that could outgrow their limits. There’s a certain amount of common sense in operation here but it may be that promoters of hash objects don’t mention their limitations as much as they should. If you want to find out more, SAS have a useful paper on their website and the their Knowledge Base has more on the error messages that you can get.

Append or update?

25th November 2007

SAS can generate many types of output: plain text, XML, PDF, RTF, Excel, etc. With all of these and the SAS procedures like PROC REPORT, PROC TABULATE and so on, it might seem surprising for me to say that I have been generating output with data step PUT and FILE statements. There was, of course, a reason for this: creating text files for loading into a new database-driven software application. At one stage, I also did some data interleaving at the output stage and that’s when I discovered that the default behaviour for SAS FILE statements is to completely overwrite a file unless the MOD option was specified. Adding that switches on APPEND behaviour. The code below adds a header in one step while adding data below it in another. I know that there are slicker ways to achieve this like setting up your data as you want it or using _N_ to ensure that something only appears once but here’s another way. As per the Perl, there’s often more than one way to do something with SAS.

data _null_; file ds_data; put "fieldtype;datasetname;datasetlabel;datasetlayout;datasetclass;datasetstandardversion"; run;

data _null_; set ds_ispec; file ds_data mod; line="datasetstandard;"||trim(memname)||";"||trim(memlabel)||";;;"||trim(memver); put line; run;

SAS9 SQL Constraints

23rd July 2007

With SAS 9, SAS Institute have introduced the sort sort of integrity constraints that have been bread and butter for relational database SQL programs but some SAS programmers may find them more restrictive than they might like. The main one that comes to my mind is the following:

proc sql noprint; create table a as select a.*,b.var from a left join b on a.index=b.index; quit;

Before SAS 9, that worked merrily with nary a comment but you now will see a warning like this:

WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.

In data step, the following still runs without a complaint:

data a; merge a b(keep=index var); by index; run;

On the surface of it this does look inconsistent. From a database programmer’s point of view having to use different source and target datasets is no hardship but seems a little surplus to requirements for a SAS programmer trained to keep down the number of temporary datasets in an effort to reduce I/O and keep things tidy, an academic concept perhaps in these days of high processing power and large disks. Adding UNDO_POLICY=NONE to the PROC SQL line does make everything consistent again but I see this as being anathema to a database programming type. I do admit to indulging in the override for personal quick and dirty purposes but abiding by the constraint is how I do things for formal purposes like inclusion in an application.

» Newer Entries »

Technology Tales