Technology Tales

Catching keyboard interruptions in a Python script for a more orderly exit

17th April 2024

A while back, I was using a Python script to watch a folder and process photos in there, whenever a new one was added. Eventually, I ended up with a few of these because I was unable to work out a way to get multiple folders watched in the same script.

In each of them, though, I needed a tidy way to exit a running script in place of the stream of consciousness that often emerges when you do such things to it. Because I knew what was happening anyway, I needed a script to terminate quietly and set to uncover a way to achieve this.

What came up was something like the code that you see below. While I naturally did some customisations, I kept the essential construct to capture keyboard interruption shortcuts, like the use of CTRL + C in a Linux command line interface.

if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        print('Interrupted')
        try:
            sys.exit(130)
        except SystemExit:
            os._exit(130)

What is happening above is that the interruption operation is captured in a nested TRY/EXCEPT block. The outer block catches the interruption, while the inner one runs through different ways of performing the script termination. For the first termination function call, you need to call the SYS package and the second needs the OS one, so you need to declare these at the start of your script.

Of the two, SYS.EXIT is preferable because it invokes clean-up activities while OS._EXIT does not, which might explain the “_” prefix in the second of these. In fact, calling SYS.EXIT is not that different to issuing RAISE SYSTEMEXIT instead because that lies underneath it. Here OS._EXIT is the fallback for when SYS.EXIT fails, and it is not all that desirable given the abrupt action that it invokes.

The exit code of 130 is fed to both, since that is what is issued when you terminate a running instance of an application on Linux anyway. Using 0 could negate any means of picking up what has happened if you have downstream processing. In my use case, everything was standalone, so that did not matter so much.

Automating writing using R and Claude

16th April 2024

Automation of writing using AI has become prominent recently, especially since GPT came to everyone’s notice. It is more than automation of proofreading but of producing the content itself, as Mark Hinkle and Luke Matthews can testify. Figuring out how to use Generative AI needs more than one line prompts, so knowing what multi-line ones will work is what is earning six digit annual salaries for some.

Recently, I gave this a go when writing a post that used content from a Reddit post thread. The first step was to extract the content from the thread, and I found that I could use R to do this. That meant installing the RedditExtractoR package using the following command:

install.packages("RedditExtractoR")

Then, I created a short script containing the following lines of code with placeholders added in place of the actual locations:

library("RedditExtractoR")

write.csv(get_thread_content("<URL for Reddit post thread>"), "<location of text file>")

The first line above calls the RedditExtractoR package for use so that its get_thread_content function could be used to scape the thread’s textual content. This was then fed to write.csv for writing out a text file with content.

Once I had the text file, I could upload it to Anthropic’s Claude for the next steps. Firstly, I got it to give me a summary of the thread discussion before I asked it to give me the suggested solutions to the issue. Impressively, it capably provided me with the latter.

Now armed with these answers, I set to crafting the post from them. Claude did not do all the work for me, but it certainly helped with the writing. This is something that I fancy exploring further, especially given how business computing is likely to proceed in the next few years.

Getting rid of the “Get more security upgrades through Ubuntu Pro with ‘esm-apps’ enabled” message when performing a system update

15th April 2024

Not so long ago, I got the above message while running sudo apt upgrade on an Ubuntu Server system. This was not the first time that this kind of thing happened to me, so I started searching the web for a solution. You do get to see complaints about advertising, but these are never useful.

Accordingly, here are some possible ways of remediating the situation:

Execute the following commands to disable the responsible services, renaming the configuration file to prevent it from being used (deleting or editing the configuration file to remove the unwanted content are other options):
sudo systemctl mask apt-news.service
sudo systemctl mask esm-cache.service
sudo mv /etc/apt/apt.conf.d/20apt-esm-hook.conf /etc/apt/apt.conf.d/20apt-esm-hook.conf.disabled
Alternatively, simply remove the ubuntu-advantage-tools package, which contains the /etc/apt/apt.conf.d/20apt-esm-hook.conf file.
Another option is to remove the ubuntu-pro-client package.
Lastly, there also is the possibility of enabling ESM, though that was not desirable for me.

In my case, it may have been the penultimate option on the list that I chose. In any case, I was rid of the unwanted message.

Dealing with the following message issued when using Certbot on Apache: “Unable to find corresponding HTTP vhost; Unable to create one as intended addresses conflict; Current configuration does not support automated redirection”

12th April 2024

When doing something with Certbot on another website not so long ago, I encountered the above message when executing the following command (semicolons have been added to separate the lines):

sudo certbot --apache

The solution was to open /etc/apache2/sites-available/000-default.conf using nano and update the ServerName field (or the line containing this keyword) so it matched the address used for setting up Let’s Encrypt SSL certificates. The mention of Apache in the above does make the solution specific to this web server software, so you will need another solution if you meet this kind of problem when using Nginx or another web server.

Using a BASH command to count the files in a directory

12th March 2024

As part of my backup workflow, I maintain a machine running OpenMediaVault that I only power up when backups are to be performed. Typically, this often happens when I have new photography images to load, and I have a NAS that acts as an online backup system. The OpenMediaVault machine is a near-offline counterpart to the NAS for added safety.

Recently, I needed to check on the number of image files in a directory from an SSH session because of a need to create a new repository for 2024. Some files from this year had ended up in the 2023 one, and I needed to be sure that nothing from last year ended in the 2024 folder, or vice versa. Getting a file count from a trusted source was a quick way of doing exactly this.

Due to clumsiness with the NAS, I had to do this using the OpenMediaVault machine. While I could go mounting drives on an interim basis, it was quicker to work from a BASH session. The trick was to use the wc command for counting the lines output by an invocation of the ls command. An example follows:

ls -l | wc -l

The -l (as in l for Lima) switch forces wc to count lines, while the counterpart (same letter) for ls forces it to list the contents in long form, one item per line. Thus, counting the number of lines gets you the count of the number of files. The call to the ls command can be customised to add other things life the number of dot files, but the above was enough for my purposes. When the files in both 2023 directories matched, I was satisfied that all was in order.

AttributeError: module ‘PIL’ has no attribute ‘Image’

11th March 2024

One of my websites has an online photo gallery. This has been a long-term activity that has taken several forms over the years. Once HTML and JavaScript based, it then was powered by Perl before PHP and MySQL came along to take things from there.

While that remains how it works, the publishing side of things has used its own selection of mechanisms over the same time span. Perl and XML were the backbone until Python and Markdown took over. There was a time when ImageMagick and GraphicsMagick handled image processing, but Python now does that as well.

That was when the error message gracing the title of this post came to my notice. Everything was working well when executed in Spyder, but the message appears when I tried running things using Python on the command line. PIL is the abbreviated name for the Python 3 pillow package; there was one called PIL in the Python 2 days.

For me, pillow loads, resizes and creates new images, which is handy for adding borders and copyright/source information to each image as well as creating thumbnails. All this happens in memory and that makes everything go quickly, much faster than disk-based tools like ImageMagick and GraphicsMagick.

Of course, nothing is going to happen if the package cannot be loaded, and that is what the error message is about. Linux is what I mainly use, so that is the context for this scenario. What I was doing was something like the following in the Python script:

import PIL

Then, I referred to PIL.Image when I needed it, and this could not be found when the script was run from the command line (BASH). The solution was to add something like the following:

from PIL import Image

That sorted it, and I must have run into trouble with PIL.ImageFilter too, since I now load it in the same manner. In both cases, I could just refer to Image or ImageFilter as I required and without the dot syntax. However, you need to make sure that there is no clash with anything in another loaded Python package when doing this.

Upgrading from OpenMediaVault 6.x to OpenMediaVault 7.x

8th March 2024

Having an older PC to upgrade, I decided to install OpenMediaVault on there a few years ago after adding in 6 TB and 4 TB hard drives for storage, a Gigabit network card to speed up backups and a new BeQuiet! power supply to make it quieter. It has been working smoothly since then, and the release of OpenMediaVault 6.x had me wondering how to move to it.

Usefully, I enabled an SSH service for remote logins and set up an account for anything that I needed to do. This includes upgrades, taking backups of what is on my NAS drives, and even shutting down the machine when I am done with what I need to do with it.

Using an SSH session, the first step was to switch to the administrator account and issue the following command to ensure that my OpenMediaVault 6.x installation was as up-to-date as it could be:

omv-update

Once that had completed what it needed to do, the next step was to do the upgrade itself with the following command:

omv-release-upgrade

With that complete, it was time to reboot the system, and I fired up the web administration interface and spotted a kernel update that I applied. Again, the system was restarted, and further updates were noticed and these were applied, again through the web interface. The whole thing is based on Debian 12.x, but I am not complaining since it quietly does exactly what I need of it. There was one slight glitch when doing an update after the changeover, and that was quickly sorted.

Upgrading Julia packages

23rd January 2024

Whenever a new version of Julia is released, I have certain actions to perform. With Julia 1.10, installing and updating it has become more automated thanks to shell scripting or the use of WINGET, depending on your operating system. Because my environment predates this, I found that the manual route still works best for me, and I will continue to do that.

Returning to what needs doing after an update, this includes updating Julia packages. In the REPL, this involves dropping to the PKG subshell using the “]” key if you want to avoid using longer commands or filling your history with what is less important for everyday usage.

Previously, I often ran code to find a package was missing after updating Julia, so the add command was needed to reinstate it. That may raise its head again, but there also is the up command for upgrading all packages that were installed. This could be a time saver when only a single command is needed for all packages and not one command for each package as otherwise might be the case.

OWASP Top 10 for Large Language Model Applications

21st January 2024

OWASP stands for Open Web Application Security Project, and it is an online community dedicated to web application security. They are well known for their Top 10 Web Application Security Risks and late last year, they added a Top 10 for
Large Language Model (LLM) Applications.

Given that large language models made quite a splash last year, this was not before time. ChatGPT gained a lot of attention (OpenAI also has had DALL-E for generation of images for quite a while now), there are many others with Anthropic Claude and Perplexity also being mentioned more widely.

Figuring out what to do with any of these is not as easy as one might think. For someone more used to working with computer code, using natural language requests is quite a shift when you no longer have documentation that tells what can and what cannot be done. It is little wonder that prompt engineering has emerged as a way to deal with this.

Others have been plugging in LLM capability into chatbots and other applications, so security concerns have come to light, so far, I have not heard anything about a major security incident, but some are thinking already about how to deal with AI-suggested code that other already are using more and more.

Given all that, here is OWASP’s summary of their Top 10 for LLM Applications. This is a subject that is sure to draw more and more interest with the increasing presence of artificial intelligence in our everyday working and no-working lives.

LLM01: Prompt Injection

This manipulates an LLM through crafty inputs, causing unintended actions by the LLM. Direct injections overwrite system prompts, while indirect ones manipulate inputs from external sources.

LLM02: Insecure Output Handling

This vulnerability occurs when an LLM output is accepted without scrutiny, exposing backend systems. Misuse may lead to severe consequences such as Cross-Site Scripting (XSS), Cross-Site Request Forgery (CSRF), Server-Side Request Forgery (SSRF), privilege escalation, or remote code execution.

LLM03: Training Data Poisoning

This occurs when LLM training data is tampered, introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behaviour. Sources include Common Crawl, WebText, OpenWebText and books.

LLM04: Model Denial of Service

Attackers cause resource-heavy operations on LLMs, leading to service degradation or high costs. The vulnerability is magnified due to the resource-intensive nature of LLMs and the unpredictability of user inputs.

LLM05: Supply Chain Vulnerabilities

LLM application lifecycle can be compromised by vulnerable components or services, leading to security attacks. Using third-party datasets, pre-trained models, and plugins can add vulnerabilities.

LLM06: Sensitive Information Disclosure

LLMs may inadvertently reveal confidential data in its responses, leading to unauthorized data access, privacy violations, and security breaches. It’s crucial to implement data sanitization and strict user policies to mitigate this.

LLM07: Insecure Plugin Design

LLM plugins can have insecure inputs and insufficient access control. This lack of application control makes them easier to exploit and can result in consequences such as remote code execution.

LLM08: Excessive Agency

LLM-based systems may undertake actions leading to unintended consequences. The issue arises from excessive functionality, permissions, or autonomy granted to the LLM-based systems.

LLM09: Overreliance

Systems or people overly depending on LLMs without oversight may face misinformation, miscommunication, legal issues, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs.

LLM10: Model Theft

This involves unauthorized access, copying, or exfiltration of proprietary LLM models. The impact includes economic losses, compromised competitive advantage, and potential access to sensitive information.

Adding titles and footnotes to Excel files created using SAS

14th August 2023

Using the Excel and ExcelXP destinations in the Output Delivery System (ODS), SAS can generate reports as XLSX workbooks with one or more worksheets. Recently, I was updating a SAS Macro that created one of these and noticed that there were no footnotes. The fix was a simple: add to the options specified on the initial ODS Excel statement.

ods excel file="&outdir./&file_name..xlsx" options(embedded_titles="yes" embedded_footnotes="yes");

Notice in the code above that there are EMBEDDED_TITLES and EMBEDDED_FOOTNOTES options. Without both of these being set to YES, no titles or footnotes will appear in a given worksheet even if they have been specified in a program using TITLE or FOOTNOTE statements. In my case, it was the EMBEDDED_FOOTNOTES option that was missing, so adding that set things to rights.

The thing applies to the ExcelXP tag set as you will find from a code sample that SAS has shared on their website. That was what led me to the solution to what was happening in the Excel ODS destination in my case.

« Older Entries «