Recently, I wanted to extract some text from the output of a Linux command by word number, only for multiple spaces to make things less predictable. The solution was to remove the duplicate spaces first. This can be done using sed, but that option adds the complexity of regular expressions. Instead, the tr command offers a neater approach. For removing duplicate spaces, the command takes the following form:
echo "test test" | tr -s " "
Since I was piping some text to the command, that is what I have above. The tr command is intended to replace or delete characters and the -s switch is a shorthand for --squeeze-repeats. The actual character to be deduplicated is passed in quotes at the end; here, it is a space but it could be anything that is duplicated. The resulting text in this example becomes:
test test
After the processing, there is now only one space separating the two words, which is the solution that I sought. It certainly cut out any variability that I was encountering in my usage.
Recently, wanting to execute one command using the text output of another got me wondering about picking out a block of characters by its position in a space-delimited list. All this needed to be done from the Linux command line or in a shell script. The output text took a form like the following:
text1 text2 text3 text4
What I wanted in my case was something like the third word above. The solution was to use the cut command with the -d (for delimiter) and -f (for field number) switches. The following yields text3 as the output:
echo "text1 text2 text3 text4" | cut -d " " -f 3
Here the delimiter is the space character, but it can be anything that is relevant for the string in question. Then, the “3” picks out the required block of text. For this to work, the text needs to be organised consistently and the delimiters must never be duplicated, though the tr trick from earlier deals with the latter as well.
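To show what I mean, squeezing any repeated spaces with tr before handing things over to cut still yields text3 in this made-up example:
echo "text1  text2   text3 text4" | tr -s " " | cut -d " " -f 3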
Recently, I encountered the following kind of message when reading an Excel file into SAS using PROC IMPORT:
ERROR: Error opening XLSX file -> xxx-.xlsx . It is either not an Excel spreadsheet or it is damaged. Error code=8000101D
Requested Input File Is Invalid
ERROR: Import unsuccessful. See SAS Log for details.
Naturally, a message like this makes you wonder about the state of the Excel file, but that was not the issue here because the file opened successfully in Excel and looked OK to me. After searching on the web, I found that it was a file permissions issue. The actual environment that I was using at the time was entimICE and I had forgotten to set up a link that granted read access to the file. Once that was added, the problem was resolved. In other systems, checking on file system permissions is needed even if the message seems to suggest that you are experiencing a file integrity problem.
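Where you do have shell access to the system in question, a quick listing of the file's permissions is a sensible first check; the path below is only a placeholder:
ls -l /path/to/spreadsheet.xlsx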
Earlier in the year, I upgraded my monitor to a 34-inch widescreen Iiyama XUB3493WQSU. At the time, I wondered what I was doing, even if I have grown used to it now. For one thing, it made the onscreen text too small, so I ended up having to scale things up in both Linux and Windows. The former proved to be more malleable than the latter, and that impression also applies to the main subject of this piece.
What I also found is that I needed to scale the user interface font sizes within Adobe Lightroom Classic running in a Windows virtual machine on VirtualBox. That can be done by going to Edit > Preferences in the menus, opening the Interface tab in the dialogue box that appears, changing the Font Size setting using the dropdown menu and confirming the change with the OK button.
However, the range of options is limited. Medium appears to be the default setting while the others include Small, Large, Larger and Largest. Large scales by 150%, Larger by 200% and Largest by 250%. Of these, Large was the setting that I chose though it always felt too big to me.
Out of curiosity, I decided to probe further only to find extra possibilities that could be selected by direct editing of a configuration file. This file can be found in C:\Users\[user account]\AppData\Roaming\Adobe\Lightroom\Preferences and is called Lightroom Classic CC 7 Preferences.agprefs. In there, you need to find the line containing AgPanel_baseFontSize and change the value enclosed within quotes and save the file. Taking a backup beforehand is wise even if the modification is not a major one.
The available choices are scale125, scale140, scale150, scale175, scale180, scale200 and scale250. Some of these may be recognisable as those available through the Lightroom Classic user interface. In my case, I chose the first on the list so the line in the configuration file became:
AgPanel_baseFontSize="scale125"
There may be good reasons for the additional options not being available through the user interface but things are working out OK for me for now. It is another tweak that helps me to get used to the larger screen size and its higher resolution.
While there are many tools that will build XML sitemaps, there is some satisfaction to be had in creating your own. This is in spite of there being a multitude of search engine optimisation plugins for content management systems like WordPress, or the sitemap generation built into static site generators like Hugo. Sometimes, building your own allows for added simplicity, something that I also found with recent efforts in WordPress theme development.
The sitemap XML protocol is simple enough to offer a short coding project. The basis was what Hugo generates, and I used Python to create the XML files. The only libraries that I needed were configparser, SQLAlchemy and pandas. The first two of these allowed databases to be queried, and the last on the list was used for data processing. Otherwise, it was a case of using what is built into the Python language, like file writing and looping.
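For reference, the protocol itself amounts to a urlset element containing one url entry per page, which is why it suits a short script; a minimal example with a placeholder address looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/sample-post/</loc>
    <lastmod>2022-10-01</lastmod>
  </url>
</urlset>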
Once the scripts were ready, they could be uploaded to web servers and executed by scheduled jobs using CRON to keep things up to date; a sample crontab entry follows the robots.txt snippet below. Along the way, I also uncovered a way to publicise the locations of the sitemap files to search engine bots using robots.txt. The structure of the instruction is the following:
User-agent: *
Sitemap: sitemap.xml
This announces the location of the sitemap file to all bots. In my case, I always included the full URL for the XML file, and that clearly varies by website location.
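As promised, the scheduling side is just an ordinary crontab entry; the timing and script path below are invented for the sake of illustration:
30 2 * * * /usr/bin/python3 /home/[user account]/scripts/build_sitemap.py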
One thing that I have been doing across my websites is to move away from using Google web fonts, following the advice on the Switching.Software website. While I have looked at free web font directories like 1001 Free Fonts or DaFont, they do not have the full range of weights and character sets that I desire, so I opted instead for the Google Webfonts Helper website. That not only offered copies of what Google has but also created a portion of CSS that I could add to a stylesheet on a website, making things more streamlined. At the same time, I also took the opportunity to change some of the fonts that were being used for the sake of added variety. Open Sans is good, but there are other acceptable sans-serif options like Mulish or Nunito as well, so these got used.
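The portion of CSS in question is a set of @font-face declarations along the following lines; the font file location depends on where you put the downloaded files, so the path here is only an example:
@font-face {
  font-family: 'Mulish';
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url('/fonts/mulish-regular.woff2') format('woff2');
}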
Earlier this year, I changed over two websites from dynamic versions using content management systems to static ones by using Hugo to build them from Markdown files. That meant that I needed to look at the editing of Markdown even if it is a fairly simple file format. For one thing, Grammarly can be incorporated into WordPress, so I did not want to lose something like that.
The latter point meant that I was steered away from plain text editors. Otherwise, there are online ones like StackEdit and Dillinger, but the Firefox Grammarly plugin only appears to work on the first of these, and even then only partially in my experience. Dillinger does offer connections to online file storage providers like Google, Dropbox and OneDrive, but I wanted to store files on my desktop for upload to a web server. It also works with GitHub, but I prefer to use another web hosting provider.
There are various specialised Markdown editors for desktop usage like Typora, ReText, Formiko or Ghostwriter, but I chose none of these. My actual choice may surprise many: it was Visual Studio Code. The availability of a Grammarly plug-in was what swayed it for me, even if it did need to be switched on for Markdown files. In many ways, it does not work as smoothly as elsewhere because it gets fooled by links and other code-like pieces of text. Also, having the ability to add words to a custom dictionary would be ideal. Some rule overriding is available, but I am not sure that everything is covered, even if the list of options is lengthy. Some time is needed to inspect all of them before I proceed any further. Thus far, things are working well enough for me.
In the BASH shell used on Linux and UNIX, the history command calls up a list of recent commands used and has many uses. There is a .bash_history file in the root of the user folder that logs and provides all this information so there are times when you need to exclude some commands from there but that is another story.
The Julia REPL environment works similarly to many operating system command line interfaces, so I wondered if there was a way to recall or refer to the history of commands issued. So far, I have not come across an equivalent to the BASH history command for the REPL itself, but the command history is retained in a file just like .bash_history. The location varies between operating systems, though. On Linux, it is ~/.julia/logs/repl_history.jl, while it is %USERPROFILE%\.julia\logs\repl_history.jl on Windows. While I tend to use scripts that I have written in VSCode rather than entering pieces of code in the REPL, the history retains its uses, and I am sharing it here for others. In the past, the location has changed, but these are the paths for Julia 1.8.2, the version that I have at the time of writing.
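Because it is a plain text file, the usual command line tools can search it; the search term in the following line is an arbitrary one for illustration:
grep -i "dataframes" ~/.julia/logs/repl_history.jl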
Having had a mishap that lost me some photos in the early days of my dalliance with digital photography, I have been far more careful since then and that now applies to other files as well. Doing regular backups is a must that you find reiterated by many different authors and the current computing climate makes doing that more vital than it ever was.
So, as well as having various local backups, I also have remote ones in the form of OneDrive, Dropbox and Google Drive. These more correctly are file synchronisation services, but disciplined use can make them useful as additional storage facilities in the interests of added resilience. There also are dedicated backup services that I have seen reviewed in the likes of PC Pro magazine, but I have yet to make use of those.
Part of my process for dealing with new digital photo files is to back them up to Google Drive and I did that with a Windows client in the early days but then moved to Insync running on Linux Mint. One drawback to the approach is that this hogs the upload bandwidth of an internet connection that has yet to move to fibre from copper cabling. Having fibre connections to a local cabinet helps but a 100 KiB/s upload speed is easily overwhelmed and digital photo file sizes keep increasing. It does not help that I insist on using more flexible raw formats like DNG, CR2 or CR3 either.
Making fewer images could help to cut the load but I still come away from an excursion with many files because I get so besotted with my surroundings. This means that upload sessions take numerous hours and can extend across calendar days. Ultimately, this makes my internet connection far less usable so I want to throttle upload speed much like what is possible in the Transmission BitTorrent client or in the Dropbox client. Unfortunately, this is not available in Insync so I have tried using the trickle command instead and an example is below:
trickle -d 2000 -u 50 insync
Here, the upload speed is limited to 50 KiB/s while the download speed is limited to 2000 KiB/s. In my case, the latter of these hardly matters while the former leaves me with acceptable internet usability. Insync does not work smoothly with this, however, so occasional restarts are needed to keep file uploads progressing and CPU load also is higher. As rough as the user experience feels, uploads can continue in parallel with other work.
One other option that I am exploring is the use of the command-line tool gdrive and this appears to work well with trickle. After downloading and installing the tool, getting going is a matter of issuing the following command and following the instructions:
gdrive about
On web servers, I even have the tool backing up things to Google Drive on a scheduled basis. Because of a Google Drive limitation that I have encountered not only with gdrive but also with Insync and Google’s own Windows Google Drive client, synchronisation can only happen between two new folders, one local and the other remote. Handily, gdrive supports the usual bash-style commands for working with remote directories, so something like the following will create a directory on Google Drive:
gdrive mkdir ttdc [ID for parent folder]
Here, the ID for the parent folder may be omitted but it can be obtained by going to Google Drive online and getting a link location by right-clicking on a folder and choosing the appropriate context menu item. This gets you something like the following and the required identifier is found between the last slash and the first question mark in the address string (so as not to share any real links, I made the address more general below):
https://drive.google.com/drive/folders/[remote folder ID]?usp=sharing
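As an aside, the cut command seen earlier can dig the identifier out of a copied link should you want to script that step; using the generalised address from above, it would go like this:
echo "https://drive.google.com/drive/folders/[remote folder ID]?usp=sharing" | cut -d "/" -f 6 | cut -d "?" -f 1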
Then, synchronisation uses a command like the following:
gdrive sync upload [local folder or file path] [remote folder ID]
There also is the option to do a one-way upload and this is the form of the command used:
gdrive upload [local folder or file path] -p [remote folder ID]
Because every file or folder object has its own ID on Google Drive, it is possible to create two objects on there that appear to have the same name, though that is sure to cause confusion even if you know what is happening. Each of the above can be throttled using trickle as well:
trickle -d 2000 -u 50 gdrive sync upload [local folder or file path] [remote folder ID]
trickle -d 2000 -u 50 gdrive upload [local folder or file path] -p [remote folder ID]
Handily, this works without the added drama seen with Insync and lends itself to scripting as well, so it could be something that I will incorporate into my current workflow. One thing that needs to be watched is file upload failures, but there may be ways to catch those and retry them, so that would be another thing that needs doing. This is built into Insync, and it would be a learning opportunity if I were to stick with gdrive instead.
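Should I go down that route, a simple shell loop that retries a failed upload a few times might be enough; what follows is only a sketch that reuses the placeholder paths from above:
# Try the upload up to three times, pausing between failed attempts
for attempt in 1 2 3; do
    trickle -d 2000 -u 50 gdrive upload [local folder or file path] -p [remote folder ID] && break
    echo "Upload attempt $attempt failed; retrying in 60 seconds..."
    sleep 60
done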
During the past week, I rebooted my system only to find that a number of things no longer worked and my Pi-hole DNS server was among them. Having exhausted other possibilities by testing out things on another machine, I was doing a status check when I spotted a line like the following in my system logs, which sent me investigating further:
cron[322]: (root) INSECURE MODE (mode 0600 expected) (crontabs/root)
It turned out to be more significant than I had expected because this was why every CRON job was failing, and that included the network setup needed by Pi-hole; a script is executed using the @reboot directive to accomplish this, and I got Pi-hole working again by executing it manually. The evening before, I did make some changes to file permissions under /var/www, but I was not expecting them to affect other parts of /var, though that may have something to do with some forgotten heavy-handedness. The cure was to issue a command like the following for execution in a terminal session:
sudo chmod -R 600 /var/spool/cron/crontabs/
Then, CRON itself needed starting since it had not been running at all, and executing this command did the needful without restarting the system:
sudo systemctl start cron
That outcome was confirmed by executing the following command, which produces terminal output that includes the welcome text “active (running)” highlighted in green:
sudo systemctl status cron
There was newly updated output from a frequently executing job that checks on web servers for me, but this was added confirmation. It was a simple fix for a perplexing situation that led me up all sorts of blind alleys before I alighted on the right solution to the problem.
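Incidentally, the @reboot mechanism mentioned above is nothing more exotic than a line like the following in root's crontab, with the script path being an invented one for illustration:
@reboot /usr/local/bin/pihole-network-setup.sh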