Technology Tales

Adventures in consumer and enterprise technology

TOPIC: PROCEDURAL PROGRAMMING LANGUAGES

Some PowerShell fundamentals for practical automation

27th October 2025

In the last few months, I have taken to using PowerShell for automating tasks while working on a new contract. There has been an element of vibe programming some of the scripts, which is why I wished to collate a reference guide that anyone can have to hand. While working with PowerShell every day does help to reinforce the learning, it also helps to look up granular concepts on a more bite-sized level. This especially matters given PowerShell's object-oriented approach. After all, many of us build things up iteratively from little steps, which also allows for more flexibility. Using an AI is all very well, yet the fastest recall is always from your on head.

1. Variables and Basic Data Types

Variables start with a dollar sign and hold values you intend to reuse, so names like $date, $outDir and $finalDir become anchors for later operations. Dates are a frequent companion in filenames and logs, and PowerShell's Get-Date makes this straightforward. A format string such as Get-Date -Format "yyyy-MM-dd" yields values like 2025-10-27, while Get-Date -Format "yyyy-MM-dd HH:mm:ss" adds a precise timestamp that helps when tracing the order of events. Because these commands return text when a format is specified, you can stitch the results into other strings without fuss.

2. File System Operations

As soon as you start handling files, you will meet a cluster of commands that make navigation robust rather than fragile. Join-Path assembles folder segments without worrying about stray slashes, Test-Path checks for the existence of a target, and New-Item creates folders when needed. Moving items with Move-Item keeps the momentum going once the structure exists.

Environment variables give cross-machine resilience; reading $env:TEMP finds the system's temporary area, and [Environment]::GetFolderPath("MyDocuments") retrieves a well-known Windows location without hard-coding. Setting context helps too, so Set-Location acts much like cd to make a directory the default focus for subsequent file operations. You can combine these approaches, as in cd ([Environment]::GetFolderPath("MyDocuments")), which navigates directly to the My Documents folder without hard-coded paths.

Scripts are often paired with nearby files, and Split-Path $ScriptPath -Parent extracts a parent folder from a full path so you can create companions in the same place. Network locations behave like local ones, with Universal Naming Convention paths beginning \ supported throughout, and Windows paths do not require careful case matching because the file systems are generally case-insensitive, which differs from many Unix-based systems. Even simple details matter, so constructing strings such as "$Folder*" is enough for wildcard searches, with backslashes treated correctly and whitespace handled sensibly.

3. Arrays and Collections

Arrays are created with @() and make it easy to keep related items together, whether those are folders in a $locs array or filenames gathered into $progs1, $progs2 and others. Indexing retrieves specific positions with square brackets, so $locs[0] returns the first entry, and a variable index like $outFiles[$i] supports loop counters.

A single value can still sit in an array using syntax such as @("bm_rc_report.sas"), which keeps your code consistent when functions always expect sequences. Any collection advertises how many items it holds using the Count property, so checking $files.Count equals zero tells you whether there is anything to process.

4. Hash Tables

When you need fast lookups, a hash table works as a dictionary that associates keys and values. Creating one with @{$locs[0] = $progs1} ties the first location to its corresponding programme list and then $locsProgs[$loc] retrieves the associated filenames for whichever folder you are considering. This is a neat stepping stone to loops and conditionals because it organises data around meaningful relationships rather than leaving you to juggle parallel arrays.

5. Control Flow

Control flow is where scripts begin to feel purposeful. A foreach loop steps through the items in a collection and is comfortable with nested passes, so you might iterate through folders, then the files inside each folder, and then a set of search patterns for those files. A for loop offers a counting pattern with initialisation, a condition and an increment written as for ($i = 0; $i -lt 5; $i++). It differs from foreach by focusing on the numeric progression rather than the items themselves.

Counters are introduced with $i = 0 and advanced with $i++, which in turn blends well with array indexing. Conditions gate work to what needs doing. Patterns such as if (-not (Test-Path ...)) reduce needless operations by creating folders only when they do not exist, and an else branch can note alternative outcomes, such as a message that a search pattern was not found.

Sometimes there is nothing to gain from proceeding, and break exits the current loop immediately, which is an efficient way to stop retrying once a log write succeeds. At other times it is better to skip just the current iteration, and continue moves directly to the next pass, which proves useful when a file list turns out to be empty for a given pattern.

6. String Operations

Strings support much of this work, so several operations are worth learning well. String interpolation allows variables to be embedded inside text using "$variable" or by wrapping expressions as "$($expression)", which becomes handy when constructing paths like "psoutput$($date)".

Splitting text is as simple as -split, and a statement such as $stub, $type = $File -split "." divides a filename around its dot, assigning the parts to two variables in one step. This demonstrates multiple variable assignment, where the first part goes to $stub and the second to $type, allowing you to decompose strings efficiently.

When transforming text, the -Replace operator substitutes all occurrences of one pattern with another, and you can chain replacements, as in -replace $Match, $Replace -replace $Match2, $Replace2, so each change applies to the modified output of the previous one.

Building new names clearly is easier with braces, as in "${stub}_${date}.txt", which prevents ambiguity when variable names abut other characters. Escaping characters is sometimes needed, so using "." treats a dot as a literal in a split operation. The backtick character ` serves as PowerShell's escape character and introduces special sequences like a newline written as `n, a tab as `t and a carriage return as `r. When you need to preserve formatting across lines without worrying about escapes, here-strings created with @" ... "@ keep indentation and line breaks intact.

7. Pipeline Operations

PowerShell's pipeline threads operations together so that the output of one command flows to the next. The pipe character | links each stage, and commands such as ForEach-Object (which processes each item), Where-Object (which filters items based on conditions) and Sort-Object -Unique (which removes duplicates) become building blocks that shape data progressively.

Within these blocks, the current item appears as $_, and properties exposed by commands can be read with syntax like $_.InputObject or $_.SideIndicator, the latter being especially relevant when handling comparison results. With pipeline formatting, you can emit compact summaries, as in ForEach-Object { "$($_.SideIndicator) $($_.InputObject)" }, which brings together multiple properties into a single line of output.

A multi-stage pipeline filtering approach often follows three stages: Select-String finds matches, ForEach-Object extracts only the values you need, and Where-Object discards anything that fails your criteria. This progressive refinement lets you start broad and narrow results step by step. There is no compulsion to over-engineer, though; a simplified pipeline might omit filtering stages if the initial search is already precise enough to return only what you need.

8. Comparison and Matching

Behind many of these steps sit comparison and matching operators that extend beyond simple equality. Pattern matching appears through -notmatch, which uses regular expressions to decide whether a value does not fit a given pattern, and it sits alongside -eq, -ne and -lt for equality, inequality and numeric comparison.

Complex conditions chain with -and, so an expression such as $_ -notmatch '^%macro$' -and $_ -notmatch '^%mend$' ensures both constraints are satisfied before an item passes through. Negative matching in particular helps exclude unwanted lines while leaving the rest untouched.

9. Regular Expressions

Regular expressions define patterns that match or search for text, often surfacing through operators such as -match and -replace. Simple patterns like .log$ identify strings ending with .log, while more elaborate ones capture groups using parentheses, as in (sdtm.|adam.), which finds two alternative prefixes.

Anchors matter, so ^ pins a match to the start of a line and $ pins it to the end, which is why ^%macro$ means an entire line consists of nothing but %macro. Character classes provide shortcuts such as w for word characters (letters, digits or underscores) and s for whitespace. The pattern "GRCw*" matches "GRC" followed by zero or more word characters, demonstrating how * controls repetition. Other quantifiers like + (one or more) and ? (zero or one) offer further control.

Escaping special characters with a backslash turns them into literals, so . matches a dot rather than any character. More complex patterns like '%(m|v).*?(?=[,(;s])' combine alternation with non-greedy matching and lookaheads to define precise search criteria.

When working with matches in pipelines, $_.Matches.Value extracts the actual text that matched the pattern, rather than returning the entire line where the match was found. This proves essential when you need just the matching portion for further processing. The syntax can appear dense at first, but PowerShell's integration means you can test patterns quickly within a pipeline or conditional, refining as you go.

10. File Content Operations

Searching file content with Select-String applies regular expressions to lines and returns match objects, while Out-File writes text to files with options such as -Append and -Encoding UTF8 to control how content is persisted.

11. File and Directory Searching

Commands for locating files typically combine path operations with filters. Get-ChildItem retrieves items from a folder, and parameters like -Filter or -Include narrow results by pattern. Wildcards such as * are often enough, but regular expressions provide finer control when integrated with pipeline operations. Recursion through subdirectories is available with -Recurse, and combining these techniques allows you to find specific files scattered across a directory tree. Once items are located, properties like FullName, Name and LastWriteTime let you decide what to do next.

12. Object Properties

Objects exposed by commands carry properties that you can access directly. $File.FullName retrieves an absolute path from a file object, while names, sizes and modification timestamps are all available as well. Subexpressions introduced with $() evaluate an inner expression within a larger string or command, which is why $($File.FullName) is often seen when embedding property values in strings. However, subexpressions are not always required; direct property access works cleanly in many contexts. For instance, $File.FullName -Replace ... reads naturally and works as you would expect because the property access is unambiguous when used as a command argument rather than embedded within a string.

13. Output and Logging

Producing output that can be read later is easier if you apply a few conventions. Write-Output sends structured lines to the console or pipeline, while Write-Warning signals notable conditions without halting execution, a helpful way to flag missing files. There are times when command output is unnecessary, and piping to Out-Null discards it quietly, for example when creating directories. Larger scripts benefit from consistency and a short custom function such as Write-Log establishes a uniform format for messages, optionally pairing console output with a line written to a file.

14. Functions

Functions tie these pieces together as reusable blocks with a clear interface. Defining one with function Get-UniquePatternMatches { } sets the structure, and a param() block declares the inputs. Strongly typed parameters like [string[]] make it clear that a function accepts an array of strings, and naming parameters $Folder and $Pattern describes their roles without additional comments.

Functions are called using named parameters in the format Get-UniquePatternMatches -Folder $loc -Pattern '(sdtm.|adam.)', which makes the intent explicit. It is common to pass several arrays into similar functions, so a function might have many parameters of the same type. Using clear, descriptive names such as $Match, $Replace, $Match2 and $Replace2 leaves little doubt about intent, even if an array of replacement rules would sometimes reduce repetition.

Positional parameters are also available; when calling Do-Compare you can omit parameter names and rely on the order defined in param(). PowerShell follows verb-noun naming conventions for functions, with common verbs including Get, Set, New, Remove, Copy, Move and Test. Following this pattern, as in Multiply-Files, places your code in the mainstream of PowerShell conventions.

It is worth avoiding a common pitfall where a function declares param([string[]]$Files) but inadvertently reads a variable like $progs from outside the function. PowerShell allows this via scope inheritance, where functions can access variables from parent scopes, but it makes maintenance harder and disguises the function's true dependencies. Being explicit about parameters creates more maintainable code.

Simple functions can still do useful work without complexity. A minimal function implementation with basic looping and conditional logic can accomplish useful tasks, and a recurring structure can be reused with minor revisions, swapping one regular expression for another while leaving the looping and logging intact. Replacement chains are flexible; add as many -replace steps as are needed, and no more.

Parameters can be reused meaningfully too, demonstrating parameter reuse where a $Match variable serves double duty: first as a filename filter in -Include, then as a text pattern for -replace. Nested function calls tie output and input together, as when piping a here-string to Out-File (Join-Path ...) to construct a file path at the moment of writing.

15. Comments

Comments play a quiet but essential role. A line starting with # explains why something is the way it is or temporarily disables a line without deleting it, which is invaluable when testing and refining.

16. File Comparison

Comparison across datasets rounds out common tasks, and Compare-Object identifies differences between two sets, telling you which items are unique to each side or shared by both. Side indicators in the output are compact: <= shows the first set, => the second, and == indicates an item present in both.

17. Common Parameters

Across many commands, common parameters behave consistently. -Force allows operations that would otherwise be blocked and overwrites existing items without prompting in contexts that support it, -LiteralPath treats a path exactly as written without interpreting wildcards, and -Append adds content to existing files rather than overwriting them. These options smooth edges when you know what you want a command to do and prefer to avoid interactive questions or unintended pattern expansion.

18. Advanced Scripting Features

A number of advanced features make scripts sturdier. Automatic variables such as $MyInvocation.MyCommand.Path provide information about the running script, including its full path, which is practical for locating resources relative to the script file. Set-StrictMode -Version Latest enforces stricter rules that turn common mistakes into immediate errors, such as using an uninitialised variable or referencing a property that does not exist. Clearing the console at the outset with Clear-Host gives a clean slate for output when a script begins.

19. .NET Framework Integration

Integration with the .NET Framework extends PowerShell's reach, and here are some examples. For instance, calling [System.IO.Path]::GetFileNameWithoutExtension() extracts a base filename using a tested library method. To gain more control over file I/O, [System.IO.File]::Open() and System.IO.StreamWriter expose low-level handles that specify sharing and access modes, which can help when you need to coordinate writing without blocking other readers. File sharing options like [System.IO.FileShare]::Read allow other processes to read a log file while the script writes to it, reducing contention and surprises.

20. Error Handling

Error handling deserves a clear pattern. Wrapping risky operations in try { } catch { } blocks captures exceptions, so a script can respond gracefully, perhaps by writing a warning and moving on. A finally block can be added for clean-up operations that must run regardless of success or failure.

When transient conditions are expected, a retry logic pattern is often enough, pairing a counter with Start-Sleep to attempt an operation several times before giving up. Waiting for a brief period such as Start-Sleep -Milliseconds 200 gives other processes time to release locks or for temporary conditions to clear.

Alongside this, checking for null values keeps assumptions in check, so conditions like if ($null -ne $process) ensure that you only read properties when an object was created successfully. This defensive approach prevents cascading errors when operations fail to return expected objects.

21. External Process Management

Managing external programmes is a common requirement, and PowerShell's Start-Process offers a controlled route. Several parameters control its behaviour precisely:

The -Wait parameter makes PowerShell pause until the external process completes, essential for sequential processing where later steps depend on earlier ones. The -PassThru parameter returns a process object, allowing you to inspect properties like exit codes after execution completes. The -NoNewWindow parameter runs the external process in the current console rather than opening a new window, keeping output consolidated. If a command expects the Command Prompt environment, calling it via cmd.exe /c $cmd integrates cleanly, ensuring compatibility with programmes designed for the CMD shell.

Exit codes reported with $process.ExitCode indicate success with zero and errors with non-zero values in most tools, so checking these numbers preserves confidence in the sequence of steps. The script demonstrates synchronous execution, processing files one at a time rather than in parallel, which can be an advantage when dependencies exist between stages or when you need to ensure ordered completion.

22. Script Termination

Scripts need to finish in a way that other tools understand. Exiting with Exit 0 signals success to schedulers and orchestrators that depend on numeric codes, while non-zero values indicate error conditions that trigger alerts or retries.

Bringing It All Together

Because this is a granular selection, it leaves it to us to piece everything together to accomplish the tasks that we have to complete. In that way, we can embed the knowledge so that we are vibe coding all the time, ensuring that a more deterministic path is followed.

Keeping a graphical eye on CPU temperature and power consumption on the Linux command line

20th March 2025

Following my main workstation upgrade in January, some extra monitoring has been needed. This follows on from the experience with building its predecessor more than three years ago.

Being able to do this in a terminal session keeps things lightweight, and I have done that with text displays like what you see below using a combination of sensors and nvidia-smi in the following command:

watch -n 2 "sensors | grep -i 'k10'; sensors | grep -i 'tdie'; sensors | grep -i 'tctl'; echo "" | tee /dev/fd/2; nvidia-smi"

Everything is done within a watch command that refreshes the display every two seconds. Then, the panels are built up by a succession of commands separated with semicolons, one for each portion of the display. The grep command is used to pick out the desired output of the sensors command that is piped to it; doing that twice gets us two lines. The next command, echo "" | tee /dev/fd/2, adds an extra line by sending a space to STDERR output before the output of nvidia-smi is displayed. The result can be seen in the screenshot below.

However, I also came across a more graphical way to do things using commands like turbostat or sensors along with AWK programming and ttyplot. Using the temperature output from the above and converting that needs the following:

while true; do sensors | grep -i 'tctl' | awk '{ printf("%.2f\n", $2); fflush(); }'; sleep 2; done | ttyplot -s 100 -t "CPU Temperature (Tctl)" -u "°C"

This is done in an infinite while loop to keep things refreshing; the watch command does not work for piping output from the sensors command to both the awk and ttyplot commands in sequence and on a repeating, periodic basis. The awk command takes the second field from the input text, formats it to two places of decimals and prints it before flushing the output buffer afterwards. The ttyplot command then plots those numbers on the plot seen below in the screenshot with a y-axis scaled to a maximum of 100 (-s), units of °C (-u) and a title of CPU Temperature (Tctl) (-t).

A similar thing can be done for the CPU wattage, which is how I learned of the graphical display possibilities in the first place. The command follows:

sudo turbostat --Summary --quiet --show PkgWatt --interval 1 | sudo awk '{ printf("%.2f\n", $1); fflush(); }' | sudo ttyplot -s 200 -t "Turbostat - CPU Power (watts)" -u "watts"

Handily, the turbostat can be made to update every so often (every second in the command above), avoiding the need for any infinite while loop. Since only a summary is needed for the wattage, all other output can be suppressed, though everything needs to work using superuser privileges, unlike the sensors command earlier. Then, awk is used like before to process the wattage for plotting; the first field is what is being picked out here. After that, ttyplot displays the plot seen in the screenshot below with appropriate title, units and scaling. All works with output from one command acting as input to another using pipes.

All of this offers a lightweight way to keep an eye on system load, with the top command showing the impact of different processes if required. While there are graphical tools for some things, command line possibilities cannot be overlooked either.

Upgrading Julia packages

23rd January 2024

Whenever a new version of Julia is released, I have certain actions to perform. With Julia 1.10, installing and updating it has become more automated thanks to shell scripting or the use of WINGET, depending on your operating system. Because my environment predates this, I found that the manual route still works best for me, and I will continue to do that.

Returning to what needs doing after an update, this includes updating Julia packages. In the REPL, this involves dropping to the PKG subshell using the ] key if you want to avoid using longer commands or filling your history with what is less important for everyday usage.

Previously, I often ran code to find a package was missing after updating Julia, so the add command was needed to reinstate it. That may raise its head again, but there also is the up command for upgrading all packages that were installed. This could be a time saver when only a single command is needed for all packages and not one command for each package as otherwise might be the case.

Rendering Markdown into HTML using PHP

3rd December 2022

One of the good things about using virtual private servers for hosting websites instead of shared hosting or using a web application service like WordPress.com or Tumblr is that you get added control and flexibility. There was a time when HTML, CSS and client-side scripting were all that was available from the shared hosting providers that I was using back then. Then, static websites were my lot until it became possible to use Perl server side scripting. PHP predominates now, but Python or Ruby cannot be discounted either.

Being able to install whatever you want is a bonus as well, though it means that you also are responsible for the security of the containers that you use. There will be infrastructure security, but that of your own machine will be your own concern. Added power always means added responsibility, as many might say.

The reason that these thought emerge here is that getting PHP to render Markdown as HTML needs the installation of Composer. Without that, you cannot use the CommonMark package to do the required back-work. All the command that you see here will work on Ubuntu 22.04. First, you need to download Composer and executing the following command will accomplish this:

curl https://getcomposer.org/installer -o /tmp/composer-setup.php

Before the installation, it does no harm to ensure that all is well with the script before proceeding. That means that capturing the signature for the script using the following command is wise:

HASH=`curl https://composer.github.io/installer.sig`

Once you have the script signature, then you can check its integrity using this command:

php -r "if (hash_file('SHA384', '/tmp/composer-setup.php') === '$HASH') { echo 'Installer verified'; } else { echo 'Installer corrupt'; unlink('composer-setup.php'); } echo PHP_EOL;"

The result that you want is "Installer verified". If not, you have some investigating to do. Otherwise, just execute the installation command:

sudo php /tmp/composer-setup.php --install-dir=/usr/local/bin --filename=composer

With Composer installed, the next step is to run the following command in the area where your web server expects files to be stored. That is important when calling the package in a PHP script.

composer require league/commonmark

Then, you can use it in a PHP script like so:

define("ROOT_LOC",$_SERVER['DOCUMENT_ROOT']);
include ROOT_LOC . '/vendor/autoload.php';
use League\CommonMark\CommonMarkConverter;
$converter = new CommonMarkConverter();
echo $converter->convertToHtml(file_get_contents(ROOT_LOC . '<location of markdown file>));

The first line finds the absolute location of your web server file directory before using it when defining the locations of the autoload script and the required markdown file. The third line then calls in the CommonMark package, while the fourth sets up a new object for the desired transformation. The last line converts the input to HTML and outputs the result.

If you need to render the output of more than one Markdown file, then repeating the last line from the preceding block with a different file location is all you need to do. The CommonMark object persists and can be used like a variable without needing the reinitialisation to be repeated every time.

The idea of building a website using PHP to render Markdown has come to mind, but I will leave it at custom web pages for now. If an opportunity comes, then I can examine the idea again. Before, I had to edit HTML, but Markdown is friendlier to edit, so that is a small advance for now.

Using multi-line commenting in Perl to inactivate blocks of code during testing

26th December 2019

Recently, I needed to inactivate blocks of code in a Perl script while doing some testing. Since this is something that I often do in other computing languages, I sought the same in Perl. To accomplish that, I need to use the POD methodology. This meant enclosing the code as follows.

=start

<< Code to be inactivated by inclusion in a comment >>

=cut

While the =start line could use any word after the equality sign, it seems that =cut is required to close the multi-line comment. If this was actual programming documentation, then the comment block should include some meaningful text for use with perldoc. However, that was not a concern here because the commenting statements would be removed afterwards anyway. It also is good practice not to leave commented code in a production script or program to avoid any later confusion.

In my case, this facility allowed me to isolate the code that I had to alter and test before putting everything back as needed. It also saved time since I did not need to individually comment out every executable line because multiple lines could be inactivated at a time.

Using the IN operator in SAS Macro programming

8th October 2012

This useful addition came in SAS 9.2, and I am amazed that it isn’t enabled by default. To accomplish that, you need to set the MINOPERATOR option, unless someone has done it for you in the SAS AUTOEXEC or another configuration program. Thus, the safety first approach is to have code like the following:

options minoperator;

%macro inop(x);
    %if &x in (a b c) %then %do;
        %put Value is included;
    %end;
    %else %do;
        %put Value not included;
    %end;
%mend inop;

%inop(a);

Also, the default delimiter is the space, so if you need to change that, then the MINDELIMITER option needs setting. Adjusting the above code so that the delimiter now is the comma character gives us the following:

options minoperator mindelimiter=",";

%macro inop(x);
    %if &x in (a b c) %then %do;
        %put Value is included;
    %end;
    %else %do;
        %put Value not included;
    %end;
%mend inop;

%inop(a);

Without any of the above, the only approach is to have the following, and that is what we had to do for SAS versions before 9.2:

%macro inop(x);
    %if &x=a or &x=b or &x=c %then %do;
        %put Value is included;
    %end;
    %else %do;
        %put Value not included;
    %end;
%mend inop;

%inop(a);

While it may be clunky, it does work and remains a fallback in newer versions of SAS. Saying that, having the IN operator available makes writing SAS Macro code that little bit more swish, so it's a good thing to know.

On Making PROC REPORT Work Harder

1st September 2010

In the early years of my SAS programming career, there seemed to be just the one procedure to use if you wanted to create a summary table. That was TABULATE and it was great for generating columns according to the value of a variable such as the treatment received by a subject in a clinical study. To a point, it could generate statistics for you too, and I often used it to sum frequency and percentage variables. Since then, it seems to have been enhanced a little and surprised me with the statistics it could produce when I had a recent play. Here's the code:

proc tabulate data=sashelp.class;
    class sex;
    var age;
    table age*(n median*f=8. mean*f=8.1 std*f=8.1 min*f=8. max*f=8. lclm*f=8.1 uclm*f=8.1),sex
	  / misstext="0";
run;

When you compare that with the idea of creating one variable per column and then defining them in PROC REPORT as many do, it looks more elegant and the results aren't bad either, though they can be tweaked further from the quick example that I generated. That last comment brings me to the point that PROC REPORT seems to have taken over from TABULATE wherever I care to look these days, and I do ask myself if it is the right tool for that for which it is being used or if it is being used in the best way.

While using Data Step to create one variable per column in a PROC REPORT output doesn't strike me as the best way to write reusable code, there are ways to make PROC REPORT do more for you. For example, by defining GROUP, ACROSS and ANALYSIS columns in an output, you can persuade the procedure to do the summarising for you and there's some example code below with the comma nesting height under sex in the resulting table. Sums are created by default if you do this, and forgoing an analysis column definition means that you get a frequency table, not at all a useless thing in numerous instances.

proc report data=sashelp.class nowd missing;
    columns age sex,height;
    define age / group "Age";
    define sex / across "Sex";
    define height / analysis mean f=missing. "Mean Height";
run;

For those times when you need to create more heavily formatted statistics (summarising range as min-max rather showing min and max separately, for example), you might feel that the GROUP/ACROSS set-up's non-display of character values puts a stop to using that approach. However, I found that making every value combination unique and attaching a cell ID helps to work around the problem. Then, you can create a format control data set from the data like in the code below and create a format from that which you can apply to the cell ID's to display things as you need them. This method does make things more portable from situation to situation than adding or removing columns depending on the values of a classification variable.

proc sql noprint;
    create table cntlin as
        select distinct "fmtname" as fmtname, cellid as start, cellid as end, decode as label
            from report;
quit;

proc format lib=work cntlin=cnlin;
run;

Understanding Perl binding operators for pattern matching

20th May 2009

While this piece is as much an aide de memoire for myself as anything else, putting it here seems worthwhile if it answers questions for others. The binding operators, =~ or !~, come in handy when you are framing conditional statements in Perl using Regular Expressions, for example, testing whether x =~ /\d+/ or not. The =~ variant is also used for changing strings using the s/[pattern1]/[pattern2]/ regular expression construct (here, s stands for "substitute"). What has brought this to mind is that I wanted to ensure that something was done for strings that did not contain a certain pattern, and that's where the !~ binding operator came in useful; ^~ might have come to mind for some reason, but it wasn't what I needed.

When web hosting limits become too restrictive

27th April 2009

Fasthosts has, in their wisdom, decided to limit the execution time for ASP scripts to 15 seconds and 10 seconds for any others. I haven't used Perl sufficiently in this shared hosting set up to determine how that is affected. In contrast, I can share my experiences on the PHP side and you may have noticed occasional glitches. They have also disabled the set_time_limit PHP function, so you cannot easily address the matter yourself where you need to do it. You almost get the feeling that they don't trust the abilities, actions and oversight of their users. Personally, I reckon that the ten-second limit is too short and that something of the order of 20 or 30 seconds would be better. If it all gets too restrictive, I suppose that there are other providers, though I think that I would avoid resellers after a previous less than glorious experience. There's the dedicated server option too if I was feeling flush, not so likely given the economic times in which we live.

AND & OR, a cautionary tale

27th March 2009

The inspiration for this post is a situation where having the string "OR" or "AND" as an input to a piece of SAS Macro code, breaking a program that I had written. Here is a simplified example of what I was doing:

%macro test;
    %let doms=GE GT NE LT LE AND OR;
    %let lv_count=1;
    %do %while (%scan(&doms,&lv_count,' ') ne );
        %put &lv_count;
        %let lv_count=%eval(&lv_count+1);
    %end;
%mend test;

%test;

The loop proceeds well until the string "AND" is met and "OR" has the same effect. The result is the following message appears in the log:

ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was: %scan(&doms,&lv_count,' ') ne
ERROR: The condition in the %DO %WHILE loop, , yielded an invalid or missing value, . The macro will stop executing.
ERROR: The macro TEST will stop executing.

Both AND & OR (case doesn't matter, but I am sticking with upper case for sake of clarity) seem to be reserved words in a macro DO WHILE loop, while equality mnemonics like GE cause no problem. Perhaps, the fact that and equality operator is already in the expression helps. Regardless, the fix is simple:

%macro test;
    %let doms=GE GT NE LT LE AND OR;
    %let lv_count=1;
    %do %while ("%scan(&doms,&lv_count,' ')" ne "");
        %put &lv_count;
        %let lv_count=%eval(&lv_count+1);
    %end;
%mend test;

%test;

Now none of the strings extracted from the macro variable &DOMS will appear as bare words and confuse the SAS Macro processor, but you do have to make sure that you are testing for the null string ("" or '') or you'll send your program into an infinite loop, always a potential problem with DO WHILE loops so they need to be used with care. All in all, an odd-looking message gets an easy solution without recourse to macro quoting functions like %NRSTR or %SUPERQ.

  • The content, images, and materials on this website are protected by copyright law and may not be reproduced, distributed, transmitted, displayed, or published in any form without the prior written permission of the copyright holder. All trademarks, logos, and brand names mentioned on this website are the property of their respective owners. Unauthorised use or duplication of these materials may violate copyright, trademark and other applicable laws, and could result in criminal or civil penalties.

  • All comments on this website are moderated and should contribute meaningfully to the discussion. We welcome diverse viewpoints expressed respectfully, but reserve the right to remove any comments containing hate speech, profanity, personal attacks, spam, promotional content or other inappropriate material without notice. Please note that comment moderation may take up to 24 hours, and that repeatedly violating these guidelines may result in being banned from future participation.

  • By submitting a comment, you grant us the right to publish and edit it as needed, whilst retaining your ownership of the content. Your email address will never be published or shared, though it is required for moderation purposes.