Technology Tales

Adventures & experiences in contemporary technology

SAS9 SQL Constraints

23rd July 2007

With SAS 9, SAS Institute have introduced the sort of integrity constraints that have been bread and butter for relational database SQL programmers, but some SAS programmers may find them more restrictive than they might like. The main one that comes to my mind is the following:

proc sql noprint;
create table a as select a.*,b.var from a left join b on a.index=b.index;
quit;

Before SAS 9, that worked merrily with nary a comment but you now will see a warning like this:

WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.

In data step, the following still runs without a complaint:

data a;
merge a b(keep=index var);
by index;
run;

On the surface of it, this does look inconsistent. From a database programmer’s point of view, having to use different source and target datasets is no hardship. To a SAS programmer trained to keep down the number of temporary datasets in an effort to reduce I/O and keep things tidy, it seems a little surplus to requirements, though that is perhaps an academic concern in these days of high processing power and large disks. Adding UNDO_POLICY=NONE to the PROC SQL statement does make everything consistent again, but I see this as being anathema to a database programming type. I do admit to indulging in the override for quick and dirty personal purposes, but abiding by the constraint is how I do things for formal purposes like inclusion in an application.
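For illustration, here is a minimal sketch of both approaches. The sample data, the extra variable x and the target name c are made up for the example; only a, b, index and var come from the code above.

```sas
/* Hypothetical sample data for the sketch */
data a;
  input index x;
  datalines;
1 10
2 20
3 30
;
run;

data b;
  input index var;
  datalines;
1 100
3 300
;
run;

/* Formal route: write to a separate target table, so no warning arises */
proc sql noprint;
  create table c as
    select a.*, b.var
    from a left join b
    on a.index = b.index;
quit;

/* Quick and dirty route: overwrite the source table and switch off */
/* the undo policy so that the recursive reference warning is not issued */
proc sql noprint undo_policy=none;
  create table a as
    select a.*, b.var
    from a left join b
    on a.index = b.index;
quit;
```

The formal route costs one extra dataset in WORK, which can always be deleted with PROC DATASETS once it has been checked.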

WARNING: The quoted string currently being processed has become more than 262 characters long…

20th June 2007

This is a SAS warning that can be seen from time to time:

WARNING: The quoted string currently being processed has become more than 262 characters long. You may have unbalanced quotation marks.

In the days prior to SAS version 8, this was something that needed to be immediately corrected. In these days of SAS character variables extending beyond 200 characters in length, it becomes a potential millstone around a SAS programmer’s neck. If you run a piece of code like this:

data _null_;
x="[string with more than 262 characters (putting in an actual string wrecks the appearance of the website)]";
run;

What you get back is the warning message at the heart of the matter. The code is legitimate and works fine, but the spurious warning is returned because SAS hasn’t found a closing quote by the required position, and the 262 character limit is a hard constraint that cannot be extended. There is another way though: the new QUOTELENMAX option in SAS 9. Setting it as follows removes the messages in most situations (yes, I did find one where it didn’t play ball):

options noquotelenmax;

This does however beg the question as to how you check for unbalanced quotes in SAS logs these days; clearly, looking for a closing quote is an outmoded approach. Thanks to code highlighting, it is far easier to pick them out before the code gets submitted. The other question that arises is why you would cause this to happen anyway but there are occasions where you assign the value of a macro variable to a data set one and the string is longer than the limit set by SAS. Here’s some example code:

data _null_;
length y $400;
y=repeat("f",400);
call symput("y",y);
run;

data _null_;
x="&y";
run;

My own weakness is where I use PROC SQL to combine strings into a macro variable, a lazy man’s method of combining all distinct values for a variable into a delimited list like this:

proc sql noprint;
select distinct compress(string_var) into :vals separated by " " from dataset;
quit;

Of course, creating a long delimited string using the CATX (new to SAS9) function avoids the whole situation and there are other means but there may be occasions, like the use of system macro variables, where it is unavoidable and NOQUOTELENMAX makes a much better impression when these arise.
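As a rough sketch of how the option might be used in practice, it can be switched off around just the step that needs it and restored afterwards, so that genuinely unbalanced quotes elsewhere still get flagged. The variable and macro variable names below are only for illustration.

```sas
/* Build a 400 character macro variable; REPEAT returns n+1 copies */
data _null_;
  length y $400;
  y = repeat("f", 399);
  call symputx("y", y);
run;

/* Suppress the warning only where the long string is expected */
options noquotelenmax;

data _null_;
  length x $400;
  x = "&y";
run;

/* Restore the default so that other code is still checked */
options quotelenmax;
```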

SQL Developer Java error

6th June 2007

I tried starting up Oracle’s SQL Developer last night so that I could add a listing of my hillwalking blog posts to my website’s site map with a spot of PHP scripting. However, all that I got was something like that which you see below:

[Screenshot: Java error returned on launching Oracle SQL Developer]

I must confess that this one threw me. The solution, though hard to find (they often are, even with the abilities of Google), was to use a batch file called sqldeveloper.bat that you can find in the [installation directory]\sqldeveloper\bin directory. It does start the thing when all else seems to fail and got me up and running again. I did get that blog post listing added to the site map after all; having more visibility of the MySQL tables was a definite plus point.

Finding the number of observations in a SAS dataset

16th May 2007

There are a number of ways of finding out the number of observations (also known as records or rows) in a SAS data set and, while they are documented in a number of different places, I have decided to collect them together in one place. At the very least, it means that I can find them again.

First up is the most basic and least efficient method: read the whole data set and increment a counter to pick up its last value. The END option allows you to find the last value of count without recourse to FIRST.x/LAST.x logic.

data _null_;
set test end=eof;
count+1;
if eof then call symput("nobs",count);
run;

The next option is a more succinct SQL variation on the same idea. The colon prefix denotes a macro variable whose value is to be assigned in the SELECT statement; there should be no surprise as to what the COUNT(*) does…

proc sql noprint;
select count(*) into :nobs from test;
quit;

Continuing the SQL theme, accessing the dictionary tables is another route to the same end and has the advantage of not needing to access the actual data set in question. You may have an efficiency saving when you are testing large datasets, but you are still reading some data here.

proc sql noprint;
select nobs into :nobs from dictionary.tables where libname="WORK" and memname="TEST";
quit;

The most efficient way to do the trick is just to access the data set header. Here’s the data step way to do it:

data _null_;
if 0 then set test nobs=nobs;
call symputx("nobs",nobs);
stop;
run;

The IF/STOP logic stops the data set read in its tracks so that only the header is accessed, saving the time otherwise used to read the data from the data set. Using the SYMPUTX routine avoids the need to explicitly code a numeric to character transformation; it’s a SAS 9 feature, though.

I’ll finish with the most succinct and efficient way of all: the use of macro and SCL functions. It’s my preferred option, and you don’t need a SAS/AF licence to do it, either.

%let dsid=%sysfunc(open(work.test,in));
%let nobs=%sysfunc(attrn(&dsid,nobs));
%if &dsid > 0 %then %let rc=%sysfunc(close(&dsid));

The first line opens the data set, and the last one closes it; this is needed because you are not using data step or SCL and could leave a data set open, causing problems later. The second line is what captures the number of observations from the header of the data set using the SCL ATTRN function called by %SYSFUNC. Bear in mind that %IF is only allowed in open code from SAS 9.4M5 onwards, so in earlier releases the last line needs to sit inside a macro definition.
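One way to package this up is a function-style macro. What follows is only a sketch: the macro name get_nobs is my own invention, and the usage line assumes that the SASHELP.CLASS sample dataset is available.

```sas
%macro get_nobs(ds);
  %local dsid nobs rc;
  %let nobs=.;

  /* Open the data set for input; a zero identifier means failure */
  %let dsid=%sysfunc(open(&ds,i));

  %if &dsid > 0 %then %do;
    /* Read the observation count from the data set header */
    %let nobs=%sysfunc(attrn(&dsid,nobs));
    /* Always close what was opened */
    %let rc=%sysfunc(close(&dsid));
  %end;
  %else %put WARNING: Could not open &ds..;

  /* Return the value in place of the macro call */
  &nobs
%mend get_nobs;

/* Usage: the count appears in the log */
%put Number of observations: %get_nobs(sashelp.class);
```

Returning a missing value when the data set cannot be opened is a design choice of this sketch; a calling program could equally test for a zero identifier itself.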

Updating Oracle data tables that have associated sequence objects

3rd May 2007

Here’s something that I want to put somewhere for future reference before I forget it: keep sequences associated with Oracle data tables up to date while adding records. Given that it took me a while to find it, it might come in useful for someone else too.

The first thing is to update the sequence itself:

SELECT TABLE_SEQ.NEXTVAL FROM DUAL;

Dual is a handy one record table that you can use to update sequences. Use the actual associated table itself if you want to see that sequence number rocket…

The next thing is to use the new value to assign a table ID as part of an INSERT statement:

INSERT INTO "TABLE" VALUES (TABLE_SEQ.CURRVAL, 1, 'Test value');

Quoted strings in Oracle SQL

2nd May 2007

Here’s a gotcha that caught up with me on my journey into the world of Oracle SQL: string quoting. Anything enclosed in double quotes (") is the name of an Oracle object (variable, table and so on) while values are enclosed in single quotes ('). The reason that this one caught me out is that I have a preference for double quotes because of my SAS programming background; SAS macro variables inside a quoted string resolve only when that string is enclosed in double quotes, hence the convention.
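The SAS side of that habit is easy to demonstrate with a small example; the macro variable name is arbitrary.

```sas
%let name=World;

data _null_;
  a="Hello &name";  /* double quotes: the macro variable resolves */
  b='Hello &name';  /* single quotes: the text is left as it is */
  put a= b=;        /* log shows: a=Hello World b=Hello &name */
run;
```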

SAS/Access and Oracle Timestamp format

25th April 2007

Until SAS 9.2, SAS/Access will not support the Oracle timestamp format. There still is no word on when 9.2 might appear so it’s over to the SQL pass-through facility…
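As a sketch of what that might look like, the timestamp can be converted on the Oracle side before SAS ever sees it. Everything specific here is hypothetical: the connection details, the table file_metadata and its columns file_id and modified_ts are placeholders, not anything from a real system.

```sas
proc sql;
  /* Hypothetical credentials and path; substitute your own */
  connect to oracle (user=myuser password=mypass path=mydb);

  create table work.file_meta as
    select *
    from connection to oracle
      (
        select file_id,
               /* text version that keeps fractional seconds */
               to_char(modified_ts, 'YYYY-MM-DD HH24:MI:SS.FF3') as modified_txt,
               /* Oracle DATE version, which drops fractional seconds */
               cast(modified_ts as date) as modified_dt
        from file_metadata
      );

  disconnect from oracle;
quit;
```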

Learning about Oracle

20th April 2007

My work in the last week has put me on something of a learning curve with Oracle. This is down to my needing to add file metadata to a database as part of an application that I am developing. The application is written in SAS but I am using SAS/Access for Oracle to update the database using SQL pass-through statements written in Oracle SQL. I am used to SAS SQL and there is commonality between it and Oracle’s implementation, which is a big help. Nevertheless, there of course are things specific to the Oracle world about which I have needed to learn. My experiences have introduced me to concepts like triggers, sequences, constraints, primary keys, foreign keys and the like. In addition, I have also seen the results of database normalisation at first hand.

Using Oracle’s SQL Developer has been a great help in my endeavours thanks to its online help and the way that you can view database objects in an easy to use manner. It also runs SQL scripts, giving you a feel for how Oracle works, and anyone can download it for free upon registration on the Oracle website. Also useful is the Express edition of the Oracle 10g database that I now have at home for personal learning purposes. That is another free download from Oracle’s website.

My Safari bookshelf has been another invaluable resource, providing access to O’Reilly’s Oracle books. Of these, Mastering Oracle SQL has proved particularly useful and I made a journey to Manchester after work this evening (Waterstone’s on Deansgate is open until 21:00 on weekdays) to see if I could acquire a copy. That quest was to prove fruitless but I now have got the doorstop that is Oracle Database 10g: The Complete Reference from The Oracle Press, an imprint of Osborne and McGraw Hill. I needed a broader grounding in all things Oracle so this should help and it also covers SQL, but the aforementioned O’Reilly volume could return to the wish list if that provision is insufficient.

SAS and Oracle

19th April 2007

It seems that SAS have put a good deal of effort into making their software work with Oracle. Admittedly, you have got to buy SAS/Access for Oracle in addition to the other components that you already have but it is worth it. The Oracle library engine makes things easy so long as there are no incompatibilities on the database side. For instance, SAS has no plans to support the Oracle timestamp format until the forthcoming SAS 9.2 and this does make things a little interesting. However, this can be resolved with the Oracle SQL pass-through facility where you pass Oracle SQL through to the database itself for processing, avoiding incompatibilities. A more pressing issue is that using PROC APPEND to add records to the data tables does not update any sequences that are associated with table IDs. The SQL pass-through facility is the best way around this: you can update the sequence with a SELECT statement and use the current value for the ID in the following EXECUTE statement. It may sound far from ideal because you need to process your data row by row; once set up though, everything works well.
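A minimal sketch of that sequence handling follows. The connection details are placeholders, and the sequence TABLE_SEQ and the table "TABLE" are stand-in names rather than real objects; the sketch also assumes that both statements run over the same connection so that CURRVAL refers to the value just generated.

```sas
proc sql noprint;
  /* Hypothetical credentials and path; substitute your own */
  connect to oracle (user=myuser password=mypass path=mydb);

  /* Step 1: advance the sequence and keep the new value for reference */
  select new_id into :new_id
    from connection to oracle
      (select table_seq.nextval as new_id from dual);

  /* Step 2: insert one row using the current sequence value */
  execute (
    insert into "TABLE" values (table_seq.currval, 1, 'Test value')
  ) by oracle;

  /* Make the insert permanent */
  execute (commit) by oracle;

  disconnect from oracle;
quit;
```

For more than one record, the two steps would need repeating for each row, perhaps from within a macro loop.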

Oracle SQL Developer and MySQL

17th April 2007

Because of my work, I recently have had a bit of exposure to Oracle SQL Developer, which I have been using as part of application development and testing activities. I decided to have a copy at home for further perusal (it’s a free download) and it was with some interest that I found out that it could access MySQL databases. To do this, you need Connector/J for MySQL so that communication can occur between the two. Though you quickly notice the differences in feature sets between Oracle and MySQL, it seems a good tool for exploring MySQL data tables and issuing queries.
