shell script | Technology Tales

File comparison using PowerShell

16^th August 2025

In the past, I have compared files on the Linux/UNIX command line as well as the legacy Windows command line. Recently, I decided to try it using PowerShell. Here is the command structure:

Compare-Object (Get-Content ".\[name of one text file]") (Get-Content ".\[name of another text file]") > [path and name of output file]

Admittedly, this is more verbose than the others that I have mentioned above. Nevertheless, it does the job and sends everything to a text file for review. The Compare-Object piece does the comparison once the Get-Content portions have read in the content.

Finding human balance in an age of AI code generation

12^th March 2025

Recently, I was asked about how I felt about AI. Given that the other person was not an enthusiast, I picked on something that happened to me, not so long ago. It involved both Perplexity and Google Gemini when I was trying to debug something: both produced too much code. The experience almost inspired a LinkedIn post, only for some of the thinking to go online here for now. A spot of brainstorming using an LLM sounds like a useful exercise.

Going back to the original question, it happened during a meeting about potential freelance work. Thus, I tapped into experiences with code generators over several decades. The first one involved a metadata-driven tool that I developed; users reported that there was too much imperfect code to debug with the added complexity that dealing with clinical study data brings. That challenge resurfaced with another bespoke tool that someone else developed, and I opted to make things simpler: produce some boilerplate code and let users take things from there. Later, someone else again decided to have another go, seemingly with more success.

It is even more challenging when you are insufficiently familiar with the code that is being produced. That happened to me with shell scripting code from Google Gemini that was peppered with some Awk code. There was no alternative but to learn a bit more about the language from Tutorials Point and seek out an online book elsewhere. That did get me up to speed, and I will return to these when I am in need again.

Then, there was the time when I was trying to get a Julia script to deal with Google Drive needing permissions to be set. This started Google Gemini into adding more and more error checking code with try catch blocks. Since I did not have the issue at that point, I opted to halt and wait for its recurrence. When it did, I opted for a simpler approach, especially with the gdrive CLI tool starting up a web server for completing the process of reactivation. While there are times when shell scripting is better than Julia for these things, I added extra robustness and user-friendliness anyway.

During that second task, I was using VS Code with the GitHub Copilot plugin. There is a need to be careful, yet that can save time when it adds suggestions for you to include or reject. The latter may apply when it adds conditional logic that needs more checking, while simple code outputting useful text to the console can be approved. While that certainly is how I approach things for now, it brings up an increasingly relevant question for me.

How do we deal with all this code production? In an environment with myriads of unit tests and a great deal of automation, there may be more capacity for handling the output than mere human inspection and review, which can overwhelm the limitations of a human context window. A quick search revealed that there are automated tools for just this purpose, possibly with their own learning curves; otherwise, manual working could be a better option in some cases.

After all, we need to do our own thinking too. That was brought home to me during the Julia script editing. To come up with a solution, I had to step away from LLM output and think creatively to come up with something simpler. There was a tension between the two needs during the exercise, which highlighted how important it is to learn not to be distracted by all the new technology. Being an introvert in the first place, I need that solo space, only to have to step away from technology to get that when it was a refuge in the first place.

For anyone with a programming hobby, they have to limit all this input to avoid being overwhelmed; learning a programming language could involve stripping out AI extensions from a code editor, for instance, LLM output has its place, yet it has to be at a human scale too. That perhaps is the genius of a chat interface, and we now have Agentic AI too. It is as if the technology curve never slackens, at least not until the current boom ends, possibly when things break because they go too far beyond us. All this acceleration is fine until we need to catch up with what is happening.

Clearing the Julia REPL

23^rd September 2024

During development, there are times when you need to clear the Julia REPL. It can become so laden with content that it becomes hard to perform debugging of your code. One way to accomplish this is issuing the CTRL + L keyboard shortcut while focus is within the REPL; you need to click on it first. Another is to issue the following in the REPL itself:

print("\033c")

Here \033 is an escape code in octal format. It is often used in terminal control sequences. The c character is what resets the terminal to its initial state. Printing this sequence is what does the clearance, and variations can be used to clear other kinds of console screens too. That makes it a more generic solution.

Dropping to an underlying shell using the ; character is another possibility. Then, you can use the clear or cls commands as needed; the latter is for Windows systems.

One last option is to define a Julia function for doing this:

function clear_console()
    run(`clear`)  # or `cls` for Windows
end

Calling the clear_console function then clears the screen programmatically, allowing for greater automation. The run function is the one that sends that command in backticks to the underlying shell for execution. Even using that alone should work too.

Unzipping more than one file at a time in Linux and macOS

10^th September 2024

To me, it sounded like a task for shell scripting, but I wanted to extract three zip archives in one go. They had come from Google Drive and contained different splits of the files that I needed, raw images from a camera. However, I found a more succinct method than the line of code that you see below (it is intended for the BASH shell):

for z in *.zip; do; unzip "$z"; done

That loops through each file that matches a glob string. All I needed was something like this:

unzip '*.zip'

Without embarking on a search, I got close but have not quoted the search string. Without the quoting, it was not working for me. To be sure that I was extracting more than I needed, I made the wildcard string more specific for my case.

Once the extraction was complete, I moved the files into a Lightroom Classic repository for working on them later. All this happened on an iMac, but the extraction itself should work on any UNIX-based operating system, so long as the shell supports it.

Upgrading Julia packages

23^rd January 2024

Whenever a new version of Julia is released, I have certain actions to perform. With Julia 1.10, installing and updating it has become more automated thanks to shell scripting or the use of WINGET, depending on your operating system. Because my environment predates this, I found that the manual route still works best for me, and I will continue to do that.

Returning to what needs doing after an update, this includes updating Julia packages. In the REPL, this involves dropping to the PKG subshell using the ] key if you want to avoid using longer commands or filling your history with what is less important for everyday usage.

Previously, I often ran code to find a package was missing after updating Julia, so the add command was needed to reinstate it. That may raise its head again, but there also is the up command for upgrading all packages that were installed. This could be a time saver when only a single command is needed for all packages and not one command for each package as otherwise might be the case.

Expanding the coding toolkit: Adding R and Python in a changing landscape

10^th April 2021

Over the years, I have taught myself a number of computing languages, with some coming in useful for professional work while others came in handy for website development and maintenance. The collection has grown to include HTML, CSS, XML, Perl, PHP and UNIX Shell Scripting. The ongoing pandemic allowed to me add two more to the repertoire: R and Python.

My interest in these arose from my work as an information professional concerned with standardisation and automation of statistical results delivery. To date, the main focus has been on clinical study data, but ongoing changes in the life sciences sector could mean that I may need to look further afield, so having extra knowledge never hurts. Though I have been a SAS programmer for more than twenty years, its predominance in the clinical research field is not what it was, which causes me to rethink things.

As it happens, I would like to continue working with SAS since it does so much and thoughts of leaving it after me bring sadness. It also helps to know what the alternatives might be and to reject some management hopes about any newcomers, especially regarding the amount of code being produced and the quality of graphs being created. Use cases need to be assessed dispassionately, even when emotions loom behind the scenes.

Since both R and Python bring large scripting ecosystems with active communities, the attraction of their adoption makes a deal of sense. SAS is comparable in the scale of its own ecosystem, though there are considerable differences and the platform is catching up when it comes to Data Science. While the aforementioned open-source languages may have had a head start, it appears that others are not standing still either. It is a time to have wider awareness, and online conference attendance helps with that.

The breadth of what is available for any programming language more than stymies any attempt to create a truly all encompassing starting point, and I have abandoned thoughts of doing anything like that for R. Similarly, I will not even try such a thing for Python. Consequently, this means that my sharing of anything learned will be in the form of discrete postings from time to time, especially given ho easy it is to collect numerous website links for sharing.

The learning has been facilitated by ongoing pandemic restrictions, though things are opening up a little now. The pandemic also has given us public data that can be used for practice, since much can be gained from having one's own project instead of completing exercises from a book. Having an interesting data set with which to work is a must, and COVID-19 data contain a certain self-interest as well, while one remains mindful of the suffering and loss of life that has been happening since the pandemic first took hold.

Searching file contents using PowerShell

25^th October 2018

Having made plenty of use of grep on the Linux/UNIX command and findstr on the legacy Windows command line, I wondered if PowerShell could be used to search the contents of files for a text string. Usefully, this turns out to be the case, but I found that the native functionality does not use what I have used before. The form of the command is given below:

Select-String -Path <filename search expression> -Pattern "<search expression>" > <output file>

While you can have the output appear on the screen, it always seems easier to send it to a file for subsequent use, and that is what I am doing above. The input to the -Path switch can be a filename or a wildcard expression, while that to the -Pattern can be a text string enclosed in quotes or a regular expression. Given that it works well once you know what to do, here is an example:

Select-String -Path *.sas -Pattern "proc report" > c:\temp\search.txt

The search.txt file then includes both the file information and the text that has been found for the sake of checking that you have what you want. What you do next is up to you.

Smarter file renaming using PowerShell

14^th November 2014

It appears that the Rename-Item commandlet in PowerShell is a very useful tool when it comes to smarter renaming of files. Even text substitution is a possibility, and what follows is an example that takes the output of the Dir command for listing the files in a directory and replaces hyphens with underscores in each one.

Dir | Rename-Item –NewName { $_.name –replace “-“,”_” }

The result is that something like the-file.txt becomes the_file.txt. This behaviour is reminiscent of the rename command found on Linux and UNIX systems, where regular expressions can be used, like in the following example that has the same result as the above:

rename 's/-/_/g' *

In both cases, you do need to be careful as to what files are in a directory for this, though the wildcard syntax on Linux or UNIX will be more familiar to anyone who has worked with files via almost any command line. Another thing to watch in the UNIX world is that * parses the whole directory structure, and that could be something that is not wanted for much of the time.

All of this is a far cry from the capabilities of the ren or rename command used in the days of MS-DOS and what has become the legacy Windows command line. Apart from simple renaming, any attempt at tweaking a filename through substitution ended up with the extra string getting appended to filenames when I tried it. Thus, the PowerShell option looks better in comparison.

Shell swapping in Windows: PowerShell and the legacy command prompt

28^th April 2010

Until the advent of PowerShell, Windows had been the poor relation when it came to working from the command line when compared with UNIX, Linux and so on. A recent bit of fiddling had me trying to run FTP from the legacy command prompt when I ran into problems with UNC address resolution (it's unsupported by the old technology) and mapping of network drives. It turned out that my error 85 was being caused by an unavailable drive letter that the net use command didn't reveal as being in use. Reassuringly, this wasn't a Vista issue that I couldn't circumvent.

During this spot of debugging, I tried running batch files in PowerShell and discovered that you cannot run them there like you would from the old command prompt. In fact, you need a line like the following:

cmd /c script.bat

In other words, you have to call cmd.exe like perl.exe, wscript.exe and cscript.exe for batch files to execute. If I had time, I might have got to exploring the use ps1 files for setting up PowerShell commandlets, but that is something that needs to wait until another time. What I discovered though is that UNC addressing can be used with PowerShell without the need for drive letter mappings, not a bad development at all. While on the subject of discoveries, I discovered that the following command opens up a command prompt shell from PowerShell without any need to resort to the Start Menu:

cmd /k

Entering the exit command returns you to the PowerShell command line again, and entering cmd /? reveals the available options for the command, so you need never be constrained by your own knowledge or its limitations.

Taming raw images with ImageMagick: A virtual workaround for Ubuntu 9.04

5^th May 2009

While using a command line tool like ImageMagick for image processing may sound a really counter-intuitive thing to do, there's no need to do everything on a case by case interactive basis. Image resizing and format conversion come to mind here. Helper programs are used behind the scenes too, with Ghostscript being used to create Postscript files, for example.

The subject of helper programs brings me to an issue that has hampered me recently. While I am aware that there are tools like F-Spot available, I am also wont to use a combination of shell scripting (BASH & KSH), Perl and ImageMagick for organising my digital photos. My preference for using Raw camera files (DNG & CRW) means that ImageMagick cannot access these without a little helper. In the case of Ubuntu, it's UFRaw. However, Jaunty Jackalope appears to have seen UFRaw updated to a version that is incompatible with the included version of ImageMagick (6.4.5 as opposed to 3.5.2 at the time of writing). The result is that the command issued by ImageMagick to UFRaw - issue the command man ufraw-batch to see the details - is not accepted by the included version of the latter, 0.15 if you're interested. It appears that an older release of UFRaw accepted the output device ppm16 (16-bit PPM files) but this should now be specified as ppm for the output device and 16 for the output depth. In a nutshell, where the parameter output-type did the lot, you now need both output-type and output-depth.

While I thought of decoupling things by using UFRaw to create 16-bit PPM files for processing by ImageMagick, it was to no avail. The identify command wouldn't return the date on which the image was taken. Though I even changed the type to 8-bit JPEG's with added EXIF information, no progress was made. In the end, a mad plan came to mind: creating a VirtualBox VM running Debian. The logic was that if Debian deserves its reputation for solidity, dependencies like ImageMagick and UFRaw shouldn't be broken, and I wasn't wrong. To make it work well, I needed to see if I could get Guest Additions installed on Debian. Out of the box, the supported kernel version must be at least 2.6.27 and Debian's is 2.6.26, so additional work was on the cards. First, GCC, Make and the correct kernel header files need to be installed. Once those are in place, the installation works smoothly and a restart sets the goodies in motion. To make the necessary shared folder to be available, a command like the following was executed:

mount -t vboxfs [Shared Folder name] [mount point]

Once that deed was done and ImageMagick instated, the processing that I have been doing for new DSLR images was reinstated. Ironically, Debian's version of ImageMagick, 6.3.7, is even older than Ubuntu's, but it works and that's the main thing. Since there is an Ubuntu bug report for this on Launchpad, I hope that it gets fixed at some point in the near future. However, that may mean awaiting 9.10 or Karmic Koala, so I'm glad to have this workaround for now.

« Older Entries «