TOPIC: VARIABLE
String replacement in BASH scripting
28th April 2023
During creation of new posts for a Hugo-deployed website, I found myself using the same directories again and again. Since I invariably ended up making typing mistakes when I did so, I fancied the idea of using shortcodes instead.
Because I wanted to turn the shortcode into the actual directory name, I chose to use text replacement in BASH scripting. Thankfully, this is simple and avoids regular expressions, which can bring their own problems. The essential syntax is as follows:
variable="${variable/search text/replacement}"
Within the value of the variable, the first occurrence of the search text is replaced with the replacement text; doubling the first slash replaces every occurrence instead. It is even possible to hold the search and replacement text in variables. In the example below, this is achieved using variables called original and replacement.
variable="${variable/$original/$replacement}"
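To make the difference between the single-slash and double-slash forms concrete, here is a minimal sketch. The path and the shortcode text are made up for illustration and are not taken from my own script.

```bash
#!/usr/bin/env bash
# Illustrative values only; the path and shortcode are assumptions for this sketch.
path="posts/TECH/draft-TECH.md"
original="TECH"
replacement="technology"

# Single slash: only the first match is replaced.
first_only="${path/$original/$replacement}"

# Double slash: every match is replaced.
all_matches="${path//$original/$replacement}"

echo "$first_only"
echo "$all_matches"
```

One thing to bear in mind is that the search text is treated as a shell pattern, so characters such as * and ? act as wildcards. Quoting the search variable inside the expansion, as in "${path/"$original"/$replacement}", should make it match literally.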
Doing this got me my translatable shortcodes and converted them into actual directory names for the hugo command to process. There may be other uses yet.
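The shortcode idea could be sketched along the following lines. This is not the script that I used: the function name expand_shortcode, the shortcodes and the directory names are all hypothetical, and the associative array assumes BASH 4 or later.

```bash
#!/usr/bin/env bash
# Hypothetical shortcodes and directories; none of these names come from the original script.
declare -A shortcodes=(
  [_tech_]="content/technology"
  [_trav_]="content/travel"
)

# Replace the first occurrence of each known shortcode in the given path.
expand_shortcode() {
  local path="$1" code
  for code in "${!shortcodes[@]}"; do
    # The search text is quoted so that it is matched literally.
    path="${path/"$code"/${shortcodes[$code]}}"
  done
  printf '%s\n' "$path"
}

expand_shortcode "_tech_/bash-string-replacement.md"
```

The expanded path could then be handed to whatever command needs the real directory name.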
Creating a data-driven informat in SAS
27th September 2019
Recently, I needed to create some example data with an extra numeric identifier variable that would be assigned according to the value of a character identifier variable. Not wanting to add another dataset merge or join to the code, I decided to create an informat from data. Initially, I looked into creating a format instead, but it did not accomplish what I wanted to do.
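data patient;
keep fmtname start end label type;  /* only the variables PROC FORMAT expects */
set test.dm;
by subject;
fmtname="PATIENT";  /* name of the informat */
start=subject;      /* character value to be read */
end=start;          /* single value, so the range ends where it starts */
label=patient;      /* numeric value that the informat returns */
type="I";           /* I = numeric informat */
run;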
The input data needed a little processing as shown above. The informat name was defined in the variable FMTNAME and the TYPE variable was assigned a value of I to make this a numeric informat; to make the character equivalent, a value of J would be assigned instead. The START and END variables declare the value range associated with the value of the LABEL variable that would become the actual value of the numeric identifier variable. The variable names are fixed because the next step will not work with different ones.
proc format lib=work cntlin=patient;
run;
quit;
To create the actual informat, the dataset is read by the FORMAT procedure with the CNTLIN option specifying the name of the input dataset and LIB defining the library where the format catalogue is stored. When this is complete, the informat is available for use with the INPUT function as shown in the code excerpt below.
data ae1;
set ae;
patient=input(subject,patient.);
run;
Dealing with variable length warnings in SAS 9.2
11th January 2012
A habit of mine is to put a LENGTH or ATTRIB statement between DATA and SET statements in a SAS data step to reset variable lengths. By default, it appears that this triggers truncation warnings in SAS 9.2 or SAS 9.3 when it didn't in previous versions. SAS 9.1.3, for instance, allowed you to have something like the following for shortening a variable length without issuing any messages at all:
data b;
length x $100;
set a;
run;
In this case, x could have a length of 200 previously and SAS 9.1.3 wouldn't have complained. Now, SAS 9.2 and 9.3 will issue a warning if the new length is less than the old length. This can be useful to know, but it can be changed using the VARLENCHK system option. Though the default value is WARN, it can be set to ERROR if you really want to ensure that there is no chance of truncation. Then, you get error messages and the program fails where it normally would run with warnings. Setting the value of the option to NOWARN restores the type of behaviour seen in SAS 9.1.3 and versions before that.
The SAS documentation says that the ability to change VARLENCHK can be restricted by an administrator, so you might need to deal with this situation in a more locked down environment. Then, one option would be to do something like the following:
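data b;
drop x;            /* discard the original, longer variable */
rename _x=x;       /* give the new variable the original name on output */
set a;
length _x $100;    /* new variable with the required length */
_x=strip(x);       /* copy the value without leading or trailing blanks */
run;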
While it's a bit more laborious than setting the VARLENCHK option to NOWARN, the idea is that you create a new variable of the right length and replace the old one with it. That gets rid of warnings or errors in the log and resets the variable length as needed. Of course, you have to ensure that there is no value truncation with either remedy. If any is found, then the dataset specification probably needs updating to accommodate the length of the values in the data. After all, there is no substitute for getting to know your data and doing your own checking should you decide to take matters into your own hands.
There is a use for the default behaviour, though. If you use a specification to define a shell for a dataset, then you will be warned when the shell shortens variable lengths. That allows you to adjust either the dataset or your program. Also, it provides additional information when you get variable length mismatch warnings when concatenating or merging datasets. There was a time when SAS wasn't so communicative in these situations and some investigation was needed to establish which variable was affected. Now, that has changed without removing the option to work differently if you so desire. Sometimes, what can seem like an added restriction can have its uses.