Technology Tales

Adventures & experiences in contemporary technology

Removing duplicate characters from strings using BASH scripting

30th March 2023

Recently, I wanted to extract some text from Linux command output by word number, only for multiple spaces to make things less predictable. The solution was to remove the duplicate spaces. This can be done using sed, but that option adds the complexity of regular expressions. Instead, the tr command offers a neater approach. For removing duplicate spaces, the command takes the following form:

echo "test   test" | tr -s " "

Since I was piping some text to the command, that is what I have above. The tr command is intended to replace or delete characters and the -s switch is a shorthand for --squeeze-repeats. The actual character to be deduplicated is passed in quotes at the end; here, it is a space but it could be anything that is duplicated. The resulting text in this example becomes:

test test

After the processing, there is now only one space separating the two words, which is the solution that I sought. It certainly cut out any variability that I was encountering in my usage.
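
Going back to the original aim of extracting a word by its number, the squeezed output can be piped onward to cut. The following is only a sketch, with the field number being an arbitrary choice for illustration:

echo "test   test" | tr -s " " | cut -d " " -f 2

Here, cut treats the single space as the field delimiter and returns the second word, something that would have been unreliable with a variable number of spaces in the input.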

Resolving a clash between Homebrew and Python

22nd November 2022

For reasons that I cannot recall now, I installed the Hugo static website generator on my Linux system and web servers using Homebrew. The only reason that I can suggest is that it might have been a way to get the latest version at the time, since Linux Mint only does major releases every two years, keeping it in line with the long-term support editions of Ubuntu.

When Homebrew was installed, it changed the lookup path for command line executables by adding the following line to my .bashrc file:

eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"

This executed the following lines:

export HOMEBREW_PREFIX="/home/linuxbrew/.linuxbrew";
export HOMEBREW_CELLAR="/home/linuxbrew/.linuxbrew/Cellar";
export HOMEBREW_REPOSITORY="/home/linuxbrew/.linuxbrew/Homebrew";
export PATH="/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin${PATH+:$PATH}";
export MANPATH="/home/linuxbrew/.linuxbrew/share/man${MANPATH+:$MANPATH}:";
export INFOPATH="/home/linuxbrew/.linuxbrew/share/info:${INFOPATH:-}";

While the result suits Homebrew, it changed the setup of Python and its packages on my system. Eventually, this had undesirable consequences, like messing up how Spyder started, so I wanted to change it. There are other things that I have automated using Python and these were not working either.

One way that I have seen suggested is to execute the following command but I cannot vouch for this:

brew unlink python

What I did was to comment out the offending line in .bashrc and replace it with the following:

export PATH="$PATH:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin"

export HOMEBREW_PREFIX="/home/linuxbrew/.linuxbrew";
export HOMEBREW_CELLAR="/home/linuxbrew/.linuxbrew/Cellar";
export HOMEBREW_REPOSITORY="/home/linuxbrew/.linuxbrew/Homebrew";

export MANPATH="/home/linuxbrew/.linuxbrew/share/man${MANPATH+:$MANPATH}:";
export INFOPATH="${INFOPATH:-}/home/linuxbrew/.linuxbrew/share/info";

The first of these adds the Homebrew paths to the end of the PATH variable instead of the start, as was the case before. This means that system folders get searched for executables before the Homebrew ones. It also means that Python packages are loaded from my user area and not from the Homebrew one, as happened when Homebrew had things its own way. There are other things to remember with Python packages, such as not having one version installed at the system level and another at the user level, since these will conflict with one another.
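
To check which interpreter gets picked up after a change like this, commands along the following lines can help; the paths reported will naturally differ from system to system:

which -a python3
python3 -c 'import sys; print(sys.executable)'

The first lists every python3 executable found on the PATH in search order, while the second reports the one that actually gets run.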

So far, the result of the Homebrew changes is not unsatisfactory and I will watch for any rough edges that need addressing. If something comes up, then I will set things up in another way.

Converting QEMU disk images to VirtualBox images on Linux Mint 21

30th October 2022

Recently, VirtualBox gained fuller support for Windows 11 and I successfully set up a new Windows 11 virtual machine that I hope will supplant a Windows 10 counterpart in time. The setup itself was streamlined, but I ran into such stability issues that I set the new VM aside until a new version of VirtualBox got released. That has happened with the appearance of version 7.0.2, but Windows 11 remains prone to freezing on my Linux Mint machine. Thankfully, that now is much less frequent, but the need for added stability remains outstanding.

While I was thinking about trying out VirtualBox 7.0.0, I remembered a QEMU machine that I had running Windows 11. Though QEMU proved more limited than VirtualBox when it came to easy availability of functionality like moving data in and out of the virtual machine or support for sound, there was no problem with TPM support or system stability. Since it did contain some useful data, I wondered about converting its virtual hard disk to VirtualBox format, and it turns out to be easy to do. First, you need to install qemu-img and other utilities as follows:

sudo apt-get install qemu-utils

With that in place, executing a command like the following performs the required conversion. Here, the -O switch specifies the required output format, which is vdi in this case.

qemu-img convert -O vdi [virtual hard disk].qcow2 [virtual hard disk].vdi
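
Should there be any doubt about the format of the source image, the qemu-img info subcommand reports it along with the virtual and actual sizes; a quick check could look like this:

qemu-img info [virtual hard disk].qcow2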

While I have yet to mount it on the new VirtualBox Windows 11 virtual machine, it is good to have the old virtual hard disk available for doing so. The thought of using it as a boot drive in VirtualBox did enter my mind, but the required change of drivers and other incompatibilities dissuaded me from doing so.
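
For when the time comes to attach the converted disk, VBoxManage can do so from the command line. The following is only a sketch, with the virtual machine name and the storage controller name being assumptions that need to match the actual VirtualBox setup:

VBoxManage storageattach "Windows 11" --storagectl "SATA" --port 1 --device 0 --type hdd --medium [virtual hard disk].vdi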

Removing redundant kernels from Ubuntu

29th October 2022

Recently, a message appeared on some web servers that I have, exhorting me to upgrade to Ubuntu 22.04.1 using the do-release-upgrade command. In the interests of remaining current, I did just that, only to get another message like the following:

The upgrade needs a total of [amount of space with units] free space on disk `/boot`.
Please free at least an additional [amount of space with units] of disk space on `/boot`.
Empty your trash and remove temporary packages of former installations
using `sudo apt-get clean`.
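
Before going any further, it is worth seeing how much space actually is free on /boot, and the df command does that:

df -h /boot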

Using sudo apt-get clean did not resolve the problem so the advice given was of no use. The actual problem was that there were too many old kernels cluttering up /boot and searching around the web provided that wisdom. What also came up was a single command for fixing the problem. However, removing the wrong kernel can trash a system so I took a more cautious approach. First, I listed the kernels to be removed and checked that they did not include the currently running one. This was done with the following command (broken up over several lines for clarity using the backslash character to denote continuation) and running uname -r found the details of the running kernel:

dpkg -l linux-{image,headers}-"[0-9]*" \
| awk '/ii/{print $2}' \
| grep -ve "$(uname -r \
| sed -r 's/-[a-z]+//')"

The dpkg command listed the installed kernels, with awk, grep and sed filtering out unwanted sections of the text. The awk command takes the tabular output from dpkg and turns it into a list. The -v switch on the grep command gets the lines that do not match the search expression created by the sed command, while the -e switch makes grep look for patterns. The sed command removes the trailing letters from the output of the uname command, where the -r switch produces the kernel release details, to leave only the release number of the current kernel. On being satisfied that nothing untoward would happen, the full command below (also broken up over several lines for clarity using the backslash character to denote continuation) could be executed.

sudo apt purge $(dpkg -l linux-{image,headers}-"[0-9]*" \
| awk '/ii/{print $2}' \
| grep -ve "$(uname -r \
| sed -r 's/-[a-z]+//')")

This used apt to purge the unwanted kernels, thus freeing up enough space for the upgrade to continue. That happened without significant incident, though there were some remediations needed on the PHP side to get the website working smoothly again.

Using inventory files with Ansible

28th October 2022

This is the second post on Ansible following my main system's upgrade to Linux Mint 21. Then, I manually ran some Ansible playbooks only to spot messages that I had not noticed before. Here, I discuss two messages issued because of an issue with an inventory file, which is where one defines the lists of servers against which playbooks are executed. The default is called hosts and is located at /etc/ansible, but the system upgrade had renamed the existing one so Ansible could not find it. The solution was to take a copy and put it somewhere safer. Then, I needed to add the location of the new file to the affected ansible-playbook commands using the following construct:

ansible-playbook [playbook path] -i [inventory file path]

Before I did this, I was seeing messages including the text “Could not match supplied host pattern” or others with the following text:

[WARNING]: No inventory was parsed, only implicit localhost is available

[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'

The cause was the same in each case and attending to the inventory file got rid of the unwanted messages. The new file also should not be affected by system upgrades in the future.
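
For reference, a basic inventory file uses an INI-like layout of group names followed by hosts; the group and host names below are only placeholders:

[webservers]
server1.example.com
server2.example.com

Passing the path of such a file with the -i switch, as above, makes the named group available for use in the hosts line of a playbook.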

Fixing an Ansible warning about boolean type conversion

27th October 2022

My primary use for Ansible is doing system updates using the inbuilt apt module. Recently, I updated my main system to Linux Mint 21 and a few things like Ansible stopped working. Removing instances that I had added with pip3 sorted the problem but I then ran playbooks manually only for various warning messages to appear that I had not noticed before. What follows below is one of these.

[WARNING]: The value True (type bool) in a string field was converted to u'True' (type string). If this does not look like what you expect, quote the entire value to ensure it does not change.

The message is not so clear in some ways, not least because it had me looking for a boolean value of True when it should have been yes. A search on the web revealed something about the apt module that surprised me: the value of the upgrade parameter is a string when others like it take boolean values of yes or no. Thus, I had passed a bareword of yes when it should have been declared in quotes as "yes". To my mind, this is an inconsistency, but I have changed things anyway to get rid of the message.
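
For illustration, a task using the apt module might look like the following, with the quotes around yes for the upgrade parameter being the important part; the task name and the update_cache setting are just examples:

- name: Update package lists and upgrade installed packages
  apt:
    update_cache: yes
    upgrade: "yes"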

Changing the Ansible Vault editor from Vi to Nano

15th August 2022

Recently, I got to experimenting with Ansible after reading about the orchestration tool in a copy of Admin magazine. It came in handy for updating a few web servers that I have, as well as updating my main Linux workstation. For the former, automated entry of SSH passwords sufficed, but the same did not apply for sudo usage on my local machine. This meant that I needed to use Ansible Vault to store the administrator password, and doing so opened up a file in the Vi editor. Since I am not familiar with Vi and wanted to get things sorted quickly, I fancied using something more user-friendly like Nano.

Doing this meant adding the following line to .bashrc:

export EDITOR=nano

Saving and closing the file followed by reloading the session set me up for what was needed.
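
Reloading the session can be as simple as sourcing the file again before invoking Ansible Vault; the file name in the second command is just a placeholder:

source ~/.bashrc
ansible-vault create secrets.yml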

Controlling display of users on the logon screen in Linux Mint 20.3

15th February 2022

Recently, I tried using Commento with a static website that I was developing and this needed PostgreSQL rather than MySQL or MariaDB, which many content management tools use. That meant a learning curve that had me buying a book, as well as the creation of a system account for administering PostgreSQL. Accounts like these are not the kind of things that you want to be too visible, so I wanted to hide them.

Since Linux Mint uses AccountsService, you cannot use lightdm to do this (the comments in /etc/lightdm/users.conf suggest as much). Instead, you need to go to /var/lib/AccountsService/users and look for a file named after the user in question. If one exists, all that is needed is for you to add the following line under the [User] section:

SystemAccount=true

If there is no file present for the user in question, then you need to create one with the following lines in there:

[User]
SystemAccount=true
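
As a quick way of doing that from the command line, something like the following works; the postgres account name here is only an example and should be replaced with whatever account needs hiding:

printf '[User]\nSystemAccount=true\n' | sudo tee /var/lib/AccountsService/users/postgres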

Once the configuration files are set up as needed, AccountsService needs to be restarted and the following command does that deed:

sudo systemctl restart accounts-daemon.service

Logging out should reveal that the user in question is not listed on the logon screen as required.

A quick look at the 7G Firewall

17th October 2021

There is a simple principle to the 7G Firewall from Perishable Press: it is a set of mod_rewrite rules for the Apache web server that can be added to a .htaccess file, and there is a version for the Nginx web server as well. These check query strings, request Uniform Resource Identifiers (URIs), user agents, remote hosts, HTTP referrers and request methods for any anomalies and block those that appear dubious.

Unfortunately, I found that the rules heavily slowed down a website with which I tried them, so I am going to have to wait until that is moved to a faster system before I really can give them a go. This can be a problem with security tools, as I also found when adding a modsec jail to a Fail2Ban instance. As it happens, both sets of observations were made using the GTmetrix tool, so it seems that there is a trade-off between security and speed that needs to be assessed before adding anything to block unwanted web visitors.

Useful Python packages for working with data

14th October 2021

My response to changes in the technology stack used in clinical research is to develop some familiarity with programming and scripting platforms that complement and compete with SAS, a system with which I have been programming since 2000. One of these has been R but Python is another that has taken up my attention and I now also have Julia in my sights as well. There may be others to assess in the fullness of time.

While I first started to explore the Data Science world in the autumn of 2017, it was in the autumn of 2019 that I began to complete LinkedIn training courses on the subject. Good though they were, I find that I need to actually use a tool in order to understand it better. At that time, I did get to hear about Python packages like Pandas, NumPy, SciPy, Scikit-learn, Matplotlib, Seaborn and Beautiful Soup, though it took until the spring of this year for me to start gaining some hands-on experience with using any of these.

During the summer of 2020, I attended a BCS webinar on the CodeGrades initiative, a programming mentoring scheme inspired by the way classical musicianship is assessed. In fact, one of the main progenitors is a trained classical musician and teacher of classical music who turned to Python programming when starting a family so as to have a more stable income. The approach is that a student selects a project and works their way through it with mentoring and periodic assessments carried out in a gentle and discursive manner. Of course, the project has to be engaging for the learning experience to stay the course and that point came through in the webinar.

That is one lesson that resonates with me, with subjects as diverse as web server performance and the ongoing pandemic supplying data, and there are other sources of public data to examine as well before looking through my own personal archive gathered over the decades. Some subjects are uplifting while others are more foreboding, but the key thing is that they sustain interest and offer opportunities for new learning. Without being able to dream up new things to try, my knowledge of R and Python would not be as extensive as it is, and I hope that it will help with learning Julia too.

In the main, my own learning has been a solo effort with consultation of documentation along with web searches that have brought me to the likes of Real Python, Stack Abuse, Data Viz with Python and R and others for longer tutorials, as well as threads on Stack Overflow. Usually, the web searching begins when I need a steer on a particular topic or a way to resolve a particular error or warning message, but books always are worth reading even if that is the slower route. Those from the Dummies series or from O'Reilly have proved most useful so far, but I do need to read them more completely than I already have; it is all too tempting to go with the "program and search for solutions as you go" approach instead.

To get going, many choose the Anaconda distribution to get Jupyter notebook functionality, but I prefer a more traditional editor, so Spyder has been my tool of choice for Python programming, and there are others like PyCharm as well. Spyder itself is written in Python, so it can be installed using pip from PyPI like other Python packages. It has other dependencies like Pylint for code management activities, but these get installed behind the scenes.
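
Getting going can be as simple as installing a selection of the packages mentioned here with pip; the list below is only illustrative, and the --user switch keeps everything in the user area:

pip install --user spyder pandas numpy scipy scikit-learn matplotlib seaborn beautifulsoup4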

The packages that I first met in 2019 may be the mainstays for doing data science, but I have discovered others since then. It also seems that there is porosity between the worlds of R and Python, so you get some Python packages aping R packages, while R has the Reticulate package for executing Python code. There are Python counterparts to such Tidyverse staples as dplyr and ggplot2 in the form of Siuba and Plotnine, respectively. The syntax of these packages is not a direct copy of what is executed in R, but it is close enough to feel familiar, which adds user friendliness compared to Pandas or Matplotlib. The interoperability does not stop there, for there is SQLAlchemy for connecting to MySQL and other databases (PyMySQL is needed as well) and there also is SASPy for interacting with SAS Viya.

Python may not have the speed of Julia, but there are plenty of packages for working with larger workloads. Of these, Dask, Modin and RAPIDS all have their uses for dealing with data volumes that make Pandas code crawl. As if to prove that there are plenty of libraries for various forms of data analytics, data science, artificial intelligence and machine learning, there also are the likes of Keras, TensorFlow and NetworkX. These are just a selection of what is available, and there is no reason not to check out more. It may be tempting to stick with the most popular packages all the time, especially when they do so much, but it never hurts to keep an open mind either.

  • All the views that you find expressed on here in postings and articles are mine alone and not those of any organisation with which I have any association, through work or otherwise. As regards editorial policy, whatever appears here is entirely of my own choice and not that of any other person or organisation.

  • Please note that everything you find here is copyrighted material. The content may be available to read without charge and without advertising but it is not to be reproduced without attribution. As it happens, a number of the images are sourced from stock libraries like iStockPhoto so they certainly are not for abstraction.

  • With regards to any comments left on the site, I expect them to be civil in tone of voice and reserve the right to reject any that are either inappropriate or irrelevant. Comment review is subject to automated processing as well as manual inspection but whatever is said is the sole responsibility of the individual contributor.