Comparison of programming languages

TOPIC: COMPARISON OF PROGRAMMING LANGUAGES

Using the LIKE operator in PROC SQL WHERE clauses in SAS

26^th November 2025

Recently, I was working in SAS and decided to trying picking out datasets and variables from its dictionary tables, eventually picking out the maximum length of a variable type for assigning the length of a new variable. This could have been done using a long-established technique:

proc sql;
    select distinct memname into :dsns separated by '#'
        from dictionary.tables
            where lowcase(libname) = 'work'
            and index(lowcase(memname), "r_") = 1
            and index(lowcase(memname), "visit") = 0;
quit;

The result is that it creates a macro variable containing a delimited list of work datasets with names beginning with r_ and not containing the string visit. As well as using the index function to find the placing of one string within another, I have seen the count function used for similar purposes, albeit without the placement specificity. Since the =: operator which looks for a search string at the start of a larger is not something that works in SQL (data step is more than fine), you cannot do something like this:

proc print data = sashelp.vmember noobs;
    where lowcase(libname) = 'work'
    and lowcase(memname) =: "r_";
run;

While the contains operator works similarly to the count function when it comes to search text positioning, yet another option is the like operator, and that is shown in the example below:

proc sql;
    select distinct memname into :dsns separated by '#'
        from dictionary.tables
            where lowcase(libname) = 'work'
            and lowcase(memname) like 'r\_%' escape '\'
            and lowcase(memname) not like '%visit%';
quit;

Here, % and _ are placeholder characters, with the first matching zero or more characters and the second matching one character. Thus, the underscore in r_ needs escaping to look for that pattern (otherwise, it will look for the letter r at the start of a string and a single character after it) and a backslash character (\) will cover that duty. To ensure that it does what you want, adding escape '\' after the expression tells SAS what is happening.

Another thing to watch is that the percent (%) character needs a form escaping from the SAS Macro language processor, and placing the search term in single quotes attends to that. That means that %visit% does not cause any errors when you are looking for visit within a dataset name and using negation (the not operator) to exclude that possibility from the search results. However, using _%visit% might be a better pattern for finding visit at the end of a name, though.

Should you wish to play around with the above to see what happens for your own learning, try using something like this to give you a few test datasets:

data r_test r_visit visit;
    set sashelp.class;
run;

Otherwise, feel free to add your own test cases to cement the ideas even further. All too often, we look up something, deploy it and then forget about, especially when AI is involved. Nevertheless, the fastest way to write code can be to use what is embedded in your memory.

Using multi-line commenting in Perl to inactivate blocks of code during testing

26^th December 2019

Recently, I needed to inactivate blocks of code in a Perl script while doing some testing. Since this is something that I often do in other computing languages, I sought the same in Perl. To accomplish that, I need to use the POD methodology. This meant enclosing the code as follows.

=start

<< Code to be inactivated by inclusion in a comment >>

=cut

While the =start line could use any word after the equality sign, it seems that =cut is required to close the multi-line comment. If this was actual programming documentation, then the comment block should include some meaningful text for use with perldoc. However, that was not a concern here because the commenting statements would be removed afterwards anyway. It also is good practice not to leave commented code in a production script or program to avoid any later confusion.

In my case, this facility allowed me to isolate the code that I had to alter and test before putting everything back as needed. It also saved time since I did not need to individually comment out every executable line because multiple lines could be inactivated at a time.

Searching file contents using PowerShell

25^th October 2018

Having made plenty of use of grep on the Linux/UNIX command and findstr on the legacy Windows command line, I wondered if PowerShell could be used to search the contents of files for a text string. Usefully, this turns out to be the case, but I found that the native functionality does not use what I have used before. The form of the command is given below:

Select-String -Path <filename search expression> -Pattern "<search expression>" > <output file>

While you can have the output appear on the screen, it always seems easier to send it to a file for subsequent use, and that is what I am doing above. The input to the -Path switch can be a filename or a wildcard expression, while that to the -Pattern can be a text string enclosed in quotes or a regular expression. Given that it works well once you know what to do, here is an example:

Select-String -Path *.sas -Pattern "proc report" > c:\temp\search.txt

The search.txt file then includes both the file information and the text that has been found for the sake of checking that you have what you want. What you do next is up to you.

WARNING: The quoted string currently being processed has become more than 262 characters long…

20^th June 2007

This is a SAS error that can be seen from time to time:

WARNING: The quoted string currently being processed has become more than 262 characters long. You may have unbalanced quotation marks.

In the days before SAS version 8, this was something that needed to be immediately corrected. In these days of SAS character variables extending beyond 200 characters in length, it becomes a potential millstone around a SAS programmer's neck. If you run a piece of code like this:

data _null_;
    x="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
run;

What you get back is the warning message at the heart of the matter. While the code is legitimate and works fine, the spurious error is returned because SAS hasn't found a closing quote by the required position and the 262-character limit is a hard constraint that cannot be extended. There is another way, though: the new QUOTELENMAX option in SAS9. Setting it as follows removes the messages in most situations (yes, I did find one where it didn't play ball):

options noquotelenmax;

This does, however, beg the question as to how you check for unbalanced quotes in SAS logs these days; clearly, looking for a closing quote is an outmoded approach. Thanks to code highlighting, it is far easier to pick them out before the code gets submitted. The other question that arises is why you would cause this to happen anyway, but there are occasions where you assign the value of a macro variable to a data set one and the string is longer than the limit set by SAS. Here's some example code:

data _null_;
    length y $400;
    y=repeat("f",400);
    call symput("y",y)
run;

data _null_;
    x="&y";
run;

My own weakness is where I use PROC SQL to combine strings into a macro variable, a lazy man's method of combining all distinct values for a variable into a delimited list like this:

proc sql noprint;
    select distinct compress(string_var) into :vals separated by " " from dataset;
quit;

Of course, creating a long delimited string using the CATX (new to SAS9) function avoids the whole situation and there are other means, but there may be occasions, like the use of system macro variables, where it is unavoidable and NOQUOTELENMAX makes a much better impression when these arise.