Technology Tales

Adventures in consumer and enterprise technology

TOPIC: PROCEDURAL PROGRAMMING LANGUAGES

Escaping brackets in SAS macro language

14th November 2007

Rendering opening and closing brackets as pieces in SAS macro language programming caused me a bit of grief until I got it sorted a few months back. All the usual suspects for macro quoting (or escaping in other computer languages) let me down: even the likes of %SUPERQ or %NRBQUOTE didn't do the trick. The honours were left to %NRQUOTE(%(), which performed what was required very respectably indeed. The second % escapes the bracket for %NRQUOTE to do the rest.

Setting up a test web server on Ubuntu

1st November 2007

Installing all the bits and pieces is painless enough so long as you know what's what; Synaptic does make it thus. Interestingly, Ubuntu's default installation is a lightweight affair with the addition of any additional components involving downloading the packages from the web. The whole process is all very well integrated and doesn't make you sweat every time you need to install additional software. In fact, it resolves any dependencies for you so that those packages can be put in place too; it lists them, you select them and Synaptic does the rest.

Returning to the job in hand, my shopping list included Apache, Perl, PHP and MySQL, the usual suspects in other words. Perl was already there, as it is on many UNIX systems, so installing the appropriate Apache module was all that was needed. PHP needed the base installation as well as the additional Apache module. MySQL needed the full treatment too, though its being split up into different pieces confounded things a little for my tired mind. Then, there were the MySQL modules for PHP to be set in place too.

The addition of Apache preceded all of these, but I have left it until now to describe its configuration, something that took longer than for the others; the installation itself was as easy as it was for the others. However, what surprised me were the differences in its configuration set up when compared with Windows. There are times when we get the same software but on different operating systems, which means that configuration files get set up differently. The first difference is that the main configuration file is called apache2.conf on Ubuntu rather than httpd.conf as on Windows. Like its Windows counterpart, Ubuntu's Apache does use subsidiary configuration files. However, there is an additional layer of configurability added courtesy of a standard feature of UNIX operating systems: symbolic links. Rather than having a single folder with the all configuration files stored therein, there are two pairs of folders, one pair for module configuration and another for site settings: mods-available/mods-enabled and sites-available/sites-enabled, respectively. In each pair, there is a folder with all the files and another containing symbolic links. It is the presence of a symbolic link for a given configuration file in the latter that activates it. I learned all this when trying to get mod_rewrite going and changing the web server folder from the default to somewhere less susceptible to wrecking during a re-installation or, heaven forbid, a destructive system crash. It's unusual, but it does work, even if it takes that little bit longer to get things sorted out when you first meet up with it.

Apart from the Apache set up and finding the right things to install, getting a test web server up and running was a fairly uneventful process. All's working well now, and I'll be taking things forward from here; making website Perl scripts compatible with their new world will be one of the next things that need to be done.

Numeric for loops in Korn shell scripting: from ksh88 to ksh93

18th October 2007

The time-honoured syntax for a for loop in a UNIX script is what you see below, and that is what works with the default shell in Sun's Solaris UNIX operating system, ksh88.

for i in 1 2 3 4 5 6 7 8 9 10
do
    if [[ -d dir$i ]]
    then
        :
    else
        mkdir dir$i
    fi
done

However, there is a much nicer syntax supported since the advent of ksh93. It follows C language conventions found in all sorts of places like Java, Perl, PHP and so on. Here is an example:

for (( i=1; i<11; i++ ))
do
    if [[ -d dir$i ]]
    then
        :
    else
        mkdir dir$i
    fi
done

WARNING: The quoted string currently being processed has become more than 262 characters long…

20th June 2007

This is a SAS error that can be seen from time to time:

WARNING: The quoted string currently being processed has become more than 262 characters long. You may have unbalanced quotation marks.

In the days before SAS version 8, this was something that needed to be immediately corrected. In these days of SAS character variables extending beyond 200 characters in length, it becomes a potential millstone around a SAS programmer's neck. If you run a piece of code like this:

data _null_;
    x="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
run;

What you get back is the warning message at the heart of the matter. While the code is legitimate and works fine, the spurious error is returned because SAS hasn't found a closing quote by the required position and the 262-character limit is a hard constraint that cannot be extended. There is another way, though: the new QUOTELENMAX option in SAS9. Setting it as follows removes the messages in most situations (yes, I did find one where it didn't play ball):

options noquotelenmax;

This does, however, beg the question as to how you check for unbalanced quotes in SAS logs these days; clearly, looking for a closing quote is an outmoded approach. Thanks to code highlighting, it is far easier to pick them out before the code gets submitted. The other question that arises is why you would cause this to happen anyway, but there are occasions where you assign the value of a macro variable to a data set one and the string is longer than the limit set by SAS. Here's some example code:

data _null_;
    length y $400;
    y=repeat("f",400);
    call symput("y",y)
run;
data _null_;
    x="&y";
run;

My own weakness is where I use PROC SQL to combine strings into a macro variable, a lazy man's method of combining all distinct values for a variable into a delimited list like this:

proc sql noprint;
    select distinct compress(string_var) into :vals separated by " " from dataset;
quit;

Of course, creating a long delimited string using the CATX (new to SAS9) function avoids the whole situation and there are other means, but there may be occasions, like the use of system macro variables, where it is unavoidable and NOQUOTELENMAX makes a much better impression when these arise.

Tidying dynamic URL’s

15th June 2007

A few years back, I came across a very nice article discussing how you would make a dynamic URL more palatable to a search engine, and I made good use of its content for my online photo gallery. The premise was that URL's that look like that below are no help to search engines indexing a website. Though this is received wisdom in some quarters, it doesn't seem to have done much to stall the rise of WordPress as a blogging platform.

http://www.mywebsite.com/serversidescript.php?id=394

That said, WordPress does offer a friendlier URL display option too, which you can see in use on this blog; they look a little like the example URL that you see below, and the approach is equally valid for both Perl and PHP. Since I have been using the same approach for the Perl scripts powering my online phone gallery, now want to apply the same thinking to a gallery written in PHP:

http://www.mywebsite.com/serversidescript.pl/id/394

The way that both expressions work is that a web server will chop pieces from a URL until it reaches a physical file. For a query URL, the extra information after the question mark is retained in its QUERY_STRING variable, while extraneous directory path information is passed in the variable PATH_INFO. For both Perl and PHP, these are extracted from the entries in an array; for Perl, this array is called is $ENV and $_SERVER is the PHP equivalent. Thus, $ENV{QUERY_STRING} and $_SERVER{'QUERY_STRING'} traps what comes after the ? while $ENV{PATH_INFO} and $_SERVER{'PATH_INFO'} picks up the extra information following the file name (/id/394/ in the example). From there on, the usual rules apply regarding cleaning of any input but changing from one to another should be too arduous.

Getting one’s HTTP headers in a twist

9th June 2007

I am in the process of further linking the content of hillwalking blog into the fabric of other parts of my website. To date, this has included adding a list of all the posts to the site map and a dossier of the latest entries to the site's welcome page. One thing that started to crop up were a series of warnings of the form:

Cannot modify header information - headers already sent by...

The cause of this was my calling a PHP script that was part of my hillwalking blog installation, and this caused Bad Behaviour to be invoked. The result was that an HTTP header was being sent to create a cookie after one already had been sent to display a web page. A spot of conditional coding and the use of an extra flag resolved the problem.

Apparently, there is another surefire way of getting the same result: whitespace before or after the <?php ... ?> tag in an included script. The cookie issue is one that I can understand, but it does seem strange that an attempt is made to send HTTP header information when the latter arises. It causes loads of questions, though...

Restrictions on SAS libraries when macro catalogs are used

8th June 2007

When you open up a SAS macro catalogue so that its entries for use by other programs, it has a major impact on the ability to change the library reference used to access the catalogue after it has apparently been unlocked.

options mstored sasmstore=bld_v001;

Using the line above will open the catalogue for reading, but there is no way to close it to change the library reference or unassign it until the SAS session is shut down. Even this line will not do the trick:

options nomstored sasmstore='';

What it means in practice is that if you have a standard macro setting up access to a number of standard macro libraries, then that setup macro needs to check for any library references used and not try to reassign them, causing errors in the process.

HennessyBlog theme update

12th February 2007

Over the weekend, I have been updating the theme on my other blog, HennessyBlog. It has been a task that projected me onto a learning curve with the WordPress 2.1 codebase. Thus, I have collected what I encountered, so I know that it’s out there on the web for you (and I) to use and peruse. It took some digging to get to know some of what you find below. Since any function used to power WordPress takes some finding, I need to find one place on the web where the code for WordPress is more fully documented. The sites presenting tutorials on how to use WordPress are more often than not geared towards non-techies rather than code cutters like myself. Then again, they might be waiting for someone to do it for them…

The changes made are as follows:

Tweaks to the interface

These are subtle, with the addition of navigation controls to the sidebar and the change in location of the post metadata being the most obvious enhancements. “Decoration” with solid and dashed lines (using CSS border attributes rather than the deprecated hr tagset) and standards compliance links.

Standards compliance

Adding standards compliance links does mean that you’d better check that all is in order; it was then that I discovered that there was work to be done. There is an issue with the WordPress wpautop function (it lives in the formatting.php file) in that it sometimes doesn’t add closing tags. Finding out that it was this function that is implicated took a trip to the WordPress.org website; while a good rummage in the wp-includes folder does a lot, it can’t achieve everything.

Like many things in the WordPress code, the wpautop function isn’t half buried. The the_content function (see template-functions-post.php) used to output blog entries calls the get_content function (also in template-functions-post.php) to extract the data from MySQL. The add_filter function (in plugin.php) associates the wpautop function and others with the get_the_content function to add the p tags to the output.

To return to the non-ideal behaviour that caused me to start out on the above quest, an example is where you have an img tag enclosed by div tags. The required substitution involves the use of regular expressions that work most of the time but get confused here. So adding a hack to the wpautop function was needed to change the code so that the p end tag got inserted. I’ll be keeping an eye out for any more scenarios like this that slip through the net and for any side effects. Otherwise, compliance is just making sure that all those img tags have their alt attributes completed.

Tweaks to navigation code

Most of my time has been spent on tweaking of the PHP code supporting the navigation. Because different functions were being called in different places, I wanted to harmonise things. To accomplish this, I created new functions in the functions.php for my theme and needed to resolve a number of issues along the way. Not least among these were regular expressions used for subsetting with the preg_match function; these were not Perl-compliant to my eyes, as would be implied by the choice of function. Now that I have found that PCRE’s in PHP use a more pragmatic syntax, there appeared to be issues with the expressions that were being used. These seemed to behave OK in their native environment but fell out of favour within the environs of my theme. Being acquainted with Perl, I went for a more familiar expression style and the issue has been resolved.

Along the way, I broke the RSS feed. This was on my off-line test blog so no one, apart from myself, that is, would have noticed. After a bit of searching, I realised that some stray white-space from the end of a PHP file (wp-config.php being a favourite culprit), after the PHP end tag in the script file as it happens, was finding its way into the feed and causing things to fall over. Feed readers don’t take too kindly to the idea of the XML declaration not making an appearance on the first line of the file. Some confusion was caused by the refusal of Firefox to refresh things as it should before I realised that a forced refresh of the feed display was needed. Sometimes, it takes a while for an addled brain to think of these kinds of things.

  • The content, images, and materials on this website are protected by copyright law and may not be reproduced, distributed, transmitted, displayed, or published in any form without the prior written permission of the copyright holder. All trademarks, logos, and brand names mentioned on this website are the property of their respective owners. Unauthorised use or duplication of these materials may violate copyright, trademark and other applicable laws, and could result in criminal or civil penalties.

  • All comments on this website are moderated and should contribute meaningfully to the discussion. We welcome diverse viewpoints expressed respectfully, but reserve the right to remove any comments containing hate speech, profanity, personal attacks, spam, promotional content or other inappropriate material without notice. Please note that comment moderation may take up to 24 hours, and that repeatedly violating these guidelines may result in being banned from future participation.

  • By submitting a comment, you grant us the right to publish and edit it as needed, whilst retaining your ownership of the content. Your email address will never be published or shared, though it is required for moderation purposes.