TOPIC: CONTROL FLOW
Catching keyboard interruptions in a Python script for a more orderly exit
17th April 2024A while back, I was using a Python script to watch a folder and process photos in there, whenever a new one was added. Eventually, I ended up with a few of these because I was unable to work out a way to get multiple folders watched in the same script.
In each of them, though, I needed a tidy way to exit a running script in place of the stream of consciousness that often emerges when you do such things to it. Because I knew what was happening anyway, I needed a script to terminate quietly and set to uncover a way to achieve this.
What came up was something like the code that you see below. While I naturally did some customisations, I kept the essential construct to capture keyboard interruption shortcuts, like the use of CTRL + C
in a Linux command line interface.
if __name__ == '__main__':
try:
main()
except KeyboardInterrupt:
print('Interrupted')
try:
sys.exit(130)
except SystemExit:
os._exit(130)
What is happening above is that the interruption operation is captured in a nested TRY/EXCEPT
block. The outer block catches the interruption, while the inner one runs through different ways of performing the script termination. For the first termination function call, you need to call the SYS
package and the second needs the OS
one, so you need to declare these at the start of your script.
Of the two, SYS.EXIT
is preferable because it invokes clean-up activities while OS._EXIT
does not, which might explain the "_" prefix in the second of these. In fact, calling SYS.EXIT
is not that different to issuing RAISE SYSTEMEXIT
instead because that lies underneath it. Here OS._EXIT
is the fallback for when SYS.EXIT
fails, and it is not all that desirable given the abrupt action that it invokes.
The exit code of 130 is fed to both, since that is what is issued when you terminate a running instance of an application on Linux anyway. Using 0 could negate any means of picking up what has happened if you have downstream processing. In my use case, everything was standalone, so that did not matter so much.
Changing the number of lines produced by the tail command in a Linux or UNIX session
25th April 2023Since I often use the tail command to look at the end of a log file and occasionally in combination with the watch command for constant updates, I got to wondering if the number of lines issued by the tail command could be changed. That is a simple thing to do with the -n switch. All you need is something like the following:
tail -n 20 logfile.log
Here the value of 20 is the number of lines produced when it would be 10 by default, and logfile.log gets replaced by the path and name of what you are examining. One thing to watch is that your terminal emulator can show all the lines being displayed. If you find that you are not seeing all the lines that you expect, then that might be the cause.
While you could find this by looking through the documentation, things do not always register with you during dry reading of something laden with lists of parameters or switches. That is an affliction with tools that do a lot and/or allow a lot of customisation.
Performing parallel processing in Perl scripting with the Parallel::ForkManager module
30th September 2019In a previous post, I described how to add Perl modules in Linux Mint, while mentioning that I hoped to add another that discusses the use of the Parallel::ForkManager
module. This is that second post, and I am going to keep things as simple and generic as they can be. There are other articles like one on the Perl Maven website that go into more detail.
The first thing to do is ensure that the Parallel::ForkManager
module is called by your script; having the line of code presented below near the top of the file will do just that. Without this step, the script will not be able to find the required module by itself and errors will be generated.
use Parallel::ForkManager;
Then, the maximum number of threads needs to be specified. While that can be achieved using a simple variable declaration, the following line reads this from the command used to invoke the script. It even tells a forgetful user what they need to do in its own terse manner. Here $0 is the name of the script and N is the number of threads. Not all these threads will get used and processing capacity will limit how many actually are in use, which means that there is less chance of overwhelming a CPU.
my $forks = shift or die "Usage: $0 N\n";
Once the maximum number of available threads is known, the next step is to instantiate the Parallel::ForkManager
object as follows to use these child processes:
my $pm = Parallel::ForkManager->new($forks);
With the Parallel::ForkManager
object available, it is now possible to use it as part of a loop. A foreach
loop works well, though only a single array can be used, with hashes being needed when other collections need interrogation. Two extra statements are needed, with one to start a child process and another to end it.
foreach $t (@array) {
my $pid = $pm->start and next;
<< Other code to be processed >>
$pm->finish;
}
Since there is often other processing performed by script, and it is possible to have multiple threaded loops in one, there needs to be a way of getting the parent process to wait until all the child processes have completed before moving from one step to another in the main script and that is what the following statement does. In short, it adds more control.
$pm->wait_all_children;
To close, there needs to be a comment on the advantages of parallel processing. Modern multicore processors often get used in single threaded operations, which leaves most of the capacity unused. Utilising this extra power then shortens processing times markedly. To give you an idea of what can be achieved, I had a single script taking around 2.5 minutes to complete in single threaded mode, while setting the maximum number of threads to 24 reduced this to just over half a minute while taking up 80% of the processing capacity. This was with an AMD Ryzen 7 2700X CPU with eight cores and a maximum of 16 processor threads. Surprisingly, using 16 as the maximum thread number only used half the processor capacity, so it seems to be a matter of performing one's own measurements when making these decisions.
Creating a data-driven informat in SAS
27th September 2019Recently, I needed to create some example data with an extra numeric identifier variable that would be assigned according to the value of a character identifier variable. Not wanting to add another dataset merge or join to the code, I decided to create an informat from data. Initially, I looked into creating a format instead, but it did not accomplish what I wanted to do.
data patient;
keep fmtname start end label type;
set test.dm;
by subject;
fmtname="PATIENT";
start=subject;
end=start;
label=patient;
type="I";
run;
The input data needed a little processing as shown above. The format name was defined in the variable FMTNAME
and the TYPE
variable was assigned a value of I
to make this a numeric informat; to make character equivalent, a value of J
was assigned. The START
and END
variables declare the value range associated with the value of the LABEL
variable that would become the actual value of the numeric identifier variable. The variable names are fixed because the next step will not work with different ones.
proc format lib=work cntlin=patient;
run;
quit;
To create the actual informat, the dataset is read by the FORMAT
procedure with the CNTLIN
parameter specifying the name of the input dataset and LIB
defining the library where the format catalogue is stored. When this in complete, the informat is available for use with an input function as shown in the code excerpt below.
data ae1;
set ae;
patient=input(subject,patient.);
run;
Fixing an update error in OpenMediaVault 4.0
10th June 2019For a time, I found that executing the command omv-update
in OpenMediaVault 4.0 produced the following Python errors appeared, among other more benign messages:
Exception ignored in: <function WeakValueDictionary.__init__.<locals>.remove at 0xb7099d64>
Traceback (most recent call last):
File "/usr/lib/python3.5/weakref.py", line 117, in remove
TypeError: 'NoneType' object is not callable
Exception ignored in: <function WeakValueDictionary.__init__.<locals>.remove at 0xb7099d64>
Traceback (most recent call last):
File "/usr/lib/python3.5/weakref.py", line 117, in remove
TypeError: 'NoneType' object is not callable
Not wanting a failed update, I decided that I needed to investigate this and found that /usr/lib/python3.5/weakref.py
required the following updates to lines 109 and 117, respectively:
def remove(wr, selfref=ref(self), _atomic_removal=_remove_dead_weakref):
_atomic_removal(d, wr.key)
To be more clear, the line beginning with "def" is how line 109 should appear, while the line beginning with _atomic_removal is how line 117 should appear. Once the required edits were made and the file closed, re-running omv-update
revealed that the problem was fixed and that is how things remain at the time of writing.
ERROR: This range is repeated, or values overlap: - .
15th September 2012This is another posting in an occasional series on SAS error and warning messages that aren't as clear as they'd need to be. What produced the message was my creation of a control data set that I then wished to use to create a data-driven (in)format. It was the PROC FORMAT step that issued the message, and I got no (in)format created. However, there were no duplicate entries in the control data set as the message suggested to me, so a little more investigation was needed.
What that revealed was that there might be one variable missing from the data set that I needed to have. The SAS documentation has defined FMTNAME
, START
and LABEL
as compulsory variables, with each of them containing the following: format name, initial value and displayed value. My intention was this: to create a numeric code variable for one containing character strings using my data-driven format, with then numbers specified within a character variable as it should be. What was missing then was TYPE
.
This variable can be one of the following values: C
for character formats, I
for numeric informats, J
for character informats, N
for numeric formats and P
for picture formats. Due to it being a conversion from character values to numeric ones, I set the values of TYPE
to I and used an input function to do the required operations. The code for successfully creating the informat is below:
proc sql noprint;
create table tpts as
select distinct "_vstpt" as fmtname,
"I" as type,
vstpt as start,
vstpt as end,
strip(put(vstptnum,best.)) as label
from test
where not missing(vstptnum);
quit;
proc format library=work cntlin=tpts;
run;
quit;
Though I didn't need to do it, I added an END variable too for the sake of completeness. In this case, the range is such that its start and end are the same and there are cases where that will not be the case, though I am not dwelling on those.
AND & OR, a cautionary tale
27th March 2009The inspiration for this post is a situation where having the string "OR" or "AND" as an input to a piece of SAS Macro code, breaking a program that I had written. Here is a simplified example of what I was doing:
%macro test;
%let doms=GE GT NE LT LE AND OR;
%let lv_count=1;
%do %while (%scan(&doms,&lv_count,' ') ne );
%put &lv_count;
%let lv_count=%eval(&lv_count+1);
%end;
%mend test;
%test;
The loop proceeds well until the string "AND" is met and "OR" has the same effect. The result is the following message appears in the log:
ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was: %scan(&doms,&lv_count,' ') ne
ERROR: The condition in the %DO %WHILE loop, , yielded an invalid or missing value, . The macro will stop executing.
ERROR: The macro TEST will stop executing.
Both AND & OR (case doesn't matter, but I am sticking with upper case for sake of clarity) seem to be reserved words in a macro DO WHILE loop, while equality mnemonics like GE cause no problem. Perhaps, the fact that and equality operator is already in the expression helps. Regardless, the fix is simple:
%macro test;
%let doms=GE GT NE LT LE AND OR;
%let lv_count=1;
%do %while ("%scan(&doms,&lv_count,' ')" ne "");
%put &lv_count;
%let lv_count=%eval(&lv_count+1);
%end;
%mend test;
%test;
Now none of the strings extracted from the macro variable &DOMS
will appear as bare words and confuse the SAS Macro processor, but you do have to make sure that you are testing for the null string ("" or '') or you'll send your program into an infinite loop, always a potential problem with DO WHILE
loops so they need to be used with care. All in all, an odd-looking message gets an easy solution without recourse to macro quoting functions like %NRSTR
or %SUPERQ
.
Killing those runaway processes that refuse to die
5th July 2008I must admit that there have been times when I logged off from my main Ubuntu box at home to dispatch a runaway process that I couldn't kill, and then log back in again. The standard signal being sent to the process by the very useful kill command just wasn't sending the nefarious CPU-eating nuisance the right kind of signal. Thankfully, there is a way to control the signal being sent and there is one that does what's needed:
kill -9 [ID of nuisance process]
For Linux users, there appears to be another option for terminating a process that doesn't need the ps
and grep
command combination: it's killall
. Generally, killall
terminates all processes and its own has no immunity to its quest. Hence, it's an administrator only tool with a very definite and perhaps rarely required use. The Linux variant is more useful because it also will terminate all instances of a named process at a stroke and has the same signal control as the kill command. It is used as follows:
killall -9 nuisanceprocess
I'll certainly be continuing to use both of the above; it appears that Wine needs termination like this at times and VMware Workstation lapsed into the same sort of antisocial behaviour while running a VM running a development version of Ubuntu's Intrepid Ibex (or 8.10 if you prefer). Anything that keeps you from constantly needing to restart Linux sessions on your PC has to be good.