Technology Tales

Adventures & experiences in contemporary technology

Updating fail2ban filters for WordPress

18th April 2024

Not so long ago, WordPress warned me that some of its fail2ban filters were obsolete; the message appeared because I have the corresponding WP-fail2ban plugin installed and the software is present on the underlying Ubuntu Server system. The solution was to connect to the server by SSH and execute the following commands.

wget https://plugins.svn.wordpress.org/wp-fail2ban/trunk/filters.d/wordpress-hard.conf
wget https://plugins.svn.wordpress.org/wp-fail2ban/trunk/filters.d/wordpress-soft.conf
wget https://plugins.svn.wordpress.org/wp-fail2ban/trunk/filters.d/wordpress-extra.conf
sudo mv wordpress-*.conf /etc/fail2ban/filter.d/

The first three commands download the updated configuration files before the last one moves them to their final location. It is tempting to download the files directly to that location, but wget would then create new files with numbered suffixes instead of overwriting the old ones as required.
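
Since fail2ban only reads its filters at start-up or on a reload, it needs telling about the change afterwards. On a systemd-based Ubuntu Server, a restart like the one below should do it (reloading via fail2ban-client is an alternative):

sudo systemctl restart fail2ban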

Automating writing using R and Claude

16th April 2024

Automating writing using AI has become prominent recently, especially since GPT came to everyone’s notice. It goes beyond automated proofreading to producing the content itself, as Mark Hinkle and Luke Matthews can testify. Figuring out how to use generative AI takes more than one-line prompts, so knowing which multi-line prompts will work is what is earning six-digit annual salaries for some.

Recently, I gave this a go when writing a post that used content from a Reddit post thread. The first step was to extract the content from the thread, and I found that I could use R to do this. That meant installing the RedditExtractoR package using the following command:

install.packages("RedditExtractoR")

Then, I created a short script containing the following lines of code with placeholders added in place of the actual locations:

library("RedditExtractoR")

write.csv(get_thread_content("<URL for Reddit post thread>"), "<location of text file>")

The first line above loads the RedditExtractoR package so that its get_thread_content function can be used to scrape the thread’s textual content. The output is then fed to write.csv for writing out a text file with the content.
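
As an aside, get_thread_content returns thread metadata as well as the comments themselves, so a variant that writes out only the comments might look like the sketch below. The $comments element is my reading of the package’s output, so it is worth verifying before use:

library("RedditExtractoR")
# Sketch: write out only the comments rather than the whole thread object
thread <- get_thread_content("<URL for Reddit post thread>")
write.csv(thread$comments, "<location of text file>", row.names = FALSE)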

Once I had the text file, I could upload it to Anthropic’s Claude for the next steps. Firstly, I got it to give me a summary of the thread discussion before asking it for the suggested solutions to the issue. Impressively, it capably provided me with the latter.

Now armed with these answers, I set to crafting the post from them. Claude did not do all the work for me, but it certainly helped with the writing. This is something that I fancy exploring further, especially given how business computing is likely to proceed in the next few years.

Dealing with the following message issued when using Certbot on Apache: “Unable to find corresponding HTTP vhost; Unable to create one as intended addresses conflict; Current configuration does not support automated redirection”

12th April 2024

When doing something with Certbot on another website not so long ago, I encountered the above message when executing the following command:

sudo certbot --apache

The solution was to open /etc/apache2/sites-available/000-default.conf using nano and update the ServerName field (or the line containing this keyword) so that it matched the address used when setting up the Let’s Encrypt SSL certificates. The mention of Apache makes the solution specific to this web server software, so you will need another approach if you meet this kind of problem when using Nginx or another web server.
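
For illustration, the relevant part of 000-default.conf might look like this after editing, with www.example.com standing in for the actual address used with Let’s Encrypt:

<VirtualHost *:80>
    ServerName www.example.com
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html
</VirtualHost>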

String replacement in BASH scripting

28th April 2023

During the creation of new posts for a Hugo-deployed website, I found myself using the same directories again and again. Since I invariably ended up making typing mistakes when I did so, I fancied the idea of using shortcodes instead.

Because I wanted to turn each shortcode into the actual directory name, I chose text replacement in BASH scripting. Thankfully, this is simple and avoids the use of regular expressions, which can bring their own problems. The essential syntax is as follows:

variable="${variable/search text/replacement}"

Here, the first occurrence of the search text within the variable’s value is substituted with the replacement (doubling the first slash, as in ${variable//search text/replacement}, replaces every occurrence). It is even possible to supply the search and replacement text in variables. In the example below, this is achieved using variables called original and replacement.

variable="${variable/$original/$replacement}"

Doing this turned my shortcodes into actual directory names for the hugo command to process. There may be other uses yet.

Avoiding permissions, times or ownership failure messages when using rsync

22nd April 2023

The rsync command is one that I use heavily for doing backups and web publishing. The latter means that it is part of how I update websites built using Hugo, because new and updated files need uploading. The command also sees usage when uploading files to other websites. During one of these operations, and I am unsure now as to which type is relevant, I encountered errors about being unable to set permissions.

The cause was the encompassing -a option. This is a shorthand for -rltpgoD, and the individual options perform the following:

-r: recursive transfer, copying all contents within a directory hierarchy

-l: symbolic links copied as symbolic links

-t: preserve times

-p: preserve permissions

-g: preserve groups

-o: preserve owners

-D: preserve device and special files

The solution is to omit some of the options if they are inappropriate. The minimum is to drop the option for permissions preservation, but others may not apply between different servers either, especially when operating systems differ. Removing the options for preserving permissions, groups and owners results in something like this:

rsync -rltD [rest of command]
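
For illustration, a hypothetical Hugo upload using the trimmed set of options might look like this; the paths and host are placeholders:

rsync -rltD /home/user/hugo-site/public/ user@example.com:/var/www/example.com/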

While it can be good to have a more powerful command with the setting of a single option, it can mean trying to do too much. Another way to avoid permissions and similar errors is to have consistency between source and destination file systems, but that is not always possible.

Why all the commas?

4th December 2022

In recent times, I have been making use of Grammarly for proofreading what I write for online consumption. That has applied as much to what I write in Markdown form as to what is authored using content management systems like WordPress and Textpattern.

The free version does nag you to upgrade to a paid subscription, but that is not my main irritation. That would be its inflexibility: you cannot turn off rules that you find intrusive, at least in the free version. This comment is particularly applicable to the unofficial plugin that you can install in Visual Studio Code. To me, the add-on for Firefox feels less scrupulous.

There are other options though, and one that I have encountered is LanguageTool. This also offers a Firefox add-on, but there are others too, not only for other browsers but also for Microsoft Word. Recent versions of LibreOffice Writer can connect to a LanguageTool server using built-in functionality as well. There are also dedicated editors for iOS, macOS and Windows.

The one operating system that does not get specific add-on support is Linux, but there is another option there: an embedded HTTP server that I installed using Homebrew and set to start automatically using cron. This really helps when using the LanguageTool Linter extension in Visual Studio Code, because it can connect to that instead of the public API, which bans your IP address if you overuse it. The extension is also configurable, with the ability to add exceptions (both grammatical and spelling), though I appear to have enabled smart formatting only to have it mess up quotes in a Markdown file, which then caused Hugo rendering to fail.
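
For anyone wanting to replicate the server arrangement, the following is a sketch of what I mean; the languagetool-server launcher and its --port switch are my recollection of what the Homebrew formula provides, so check both before relying on them:

# Install LanguageTool along with its embedded HTTP server
brew install languagetool
# Crontab entry (added via crontab -e) to start the server at boot
@reboot languagetool-server --port 8081 >/dev/null 2>&1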

Like Grammarly, LanguageTool has an online editor that offers more if you choose an annual subscription. That subscription is cheaper than Grammarly’s, which caused me to go for it instead, gaining rephrasing suggestions both in the online editor and through a browser add-on. It is better not to get nagged all the time…

The title may surprise you, but I have been using co-ordinating conjunctions without commas for as long as I can remember. Both Grammarly and LanguageTool pick up on these, so I had to do some investigating to find this gap in my education, especially since LanguageTool is so good at finding them. What I also found is how repetitive my writing style can be, which means that rephrasing has been needed too. That, after all, is the point of a proofreading tool, though it can rankle if you have fixed opinions about grammar or enjoy creative writing.

Putting out-of-copyright texts from other authors through these tools triggers all kinds of messages, but you just have to ignore them. Turning off checks needs care, even if turning them on again is easy to do. There is, however, the danger that artificial intelligence tools could make writing too uniform, since there is only so much that these technologies can do. They should make you look at your text more intently, though, which is never a bad thing because computers still struggle with meaning.

Rendering Markdown into HTML using PHP

3rd December 2022

One of the good things about using virtual private servers for hosting websites, in preference to shared hosting or a web application service like WordPress.com or Tumblr, is that you get added control and flexibility. There was a time when HTML, CSS and client-side scripting were all that was available from the shared hosting providers that I was using. Then, static websites were my lot until it became possible to use Perl server-side scripting. PHP predominates now, but Python or Ruby cannot be discounted either.

Being able to install whatever you want is a bonus as well, though it means that you are also responsible for the security of the containers that you use. There will be infrastructure security, but that of your own machine will be your own concern. Added power always means added responsibility, as many might say.

The reason that these thoughts emerge here is that getting PHP to render Markdown as HTML needs the installation of Composer. Without that, you cannot use the CommonMark package to do the required background work. All the commands that you see here will work on Ubuntu 22.04. First, you need to download Composer; executing the following command will accomplish this:

curl https://getcomposer.org/installer -o /tmp/composer-setup.php

Before the installation, it does no harm to ensure that all is well with the script before proceeding. That means that capturing the signature for the script using the following command is wise:

HASH=`curl https://composer.github.io/installer.sig`

Once you have the script signature, then you can check its integrity using this command:

php -r "if (hash_file('SHA384', '/tmp/composer-setup.php') === '$HASH') { echo 'Installer verified'; } else { echo 'Installer corrupt'; unlink('composer-setup.php'); } echo PHP_EOL;"

The result that you want is “Installer verified”. If not, you have some investigating to do. Otherwise, just execute the installation command:

sudo php /tmp/composer-setup.php --install-dir=/usr/local/bin --filename=composer

With Composer installed, the next step is to run the following command in the area where your web server expects files to be stored. That is important when calling the package in a PHP script.

composer require league/commonmark

Then, you can use it in a PHP script like so:

define("ROOT_LOC",$_SERVER['DOCUMENT_ROOT']);
include ROOT_LOC . '/vendor/autoload.php';
use League\CommonMark\CommonMarkConverter;
$converter = new CommonMarkConverter();
echo $converter->convertToHtml(file_get_contents(ROOT_LOC . '<location of markdown file>'));

The first line finds the absolute location of your web server’s file directory, which is then used when defining the locations of the autoload script and the required Markdown file. The third line calls in the CommonMark package, while the fourth sets up a new object for the desired transformation. The last line converts the input to HTML and outputs the result.

If you need to render the output of more than one Markdown file, then repeating the last line from the preceding block with a different file location is all that you need to do. The CommonMark object persists, so it can be reused without repeating the initialisation every time.
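
In other words, a second file would be handled by something like this, reusing the $converter object created above (the placeholder works as before):

echo $converter->convertToHtml(file_get_contents(ROOT_LOC . '<location of second markdown file>'));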

The idea of building a website using PHP to render Markdown has come to mind, but I will leave it at custom web pages for now. If an opportunity comes, then I can examine the idea again. Before, I had to edit HTML, but Markdown is friendlier to edit, so that is a small advance for now.

Moves to Hugo

30th November 2022

What amazes me is how things can become more complicated over time. As long as you knew HTML, CSS and JavaScript, building a website once was not so onerous, provided web browsers played ball with it. Since then, things have got easier to use but more complex at the same time. One example is WordPress: in the early days, themes were much simpler than they are now. The web has also got more insecure over time, and that adds to complexity as well. It sometimes feels as if there is a choice to make between ease of use and simplicity.

It is against that background that I reassessed the technology that I was using on my public transport and Irish history websites. The former used WordPress, while the latter used Drupal. The irony was that the simpler website was using the more complex platform, so the move to something simpler was probably overdue. Alternatives to WordPress were surveyed for the first of the pair, but none had quite the flexibility, pervasiveness and ease of use that WordPress offers.

There is another approach that has been gaining notice recently. One part of this is the use of Markdown for web publishing. This is a simple and distraction-free plain text format that can be transformed into something more readable. It sees usage in blogs hosted on GitHub, but also facilitates the generation of static websites. The clutter is absent for those who have no need of the Gutenberg Editor on WordPress.

With the content written in Markdown, it can be fed to a static website generator like Hugo. Using defined templates and fixed assets like CSS together with images and other static files, it can slot the content into HTML files very speedily since it is written in the Go programming language. Once you get acclimatised, there are no folder structures that cannot be used, so you get full flexibility in how you build out your website. Sitemaps and RSS feeds can be built at the same time, both using the same input as the HTML files.

In a nutshell, it automates what once needed manual effort with a code editor or a visual web page editor. The use of HTML snippets and layouts means that there is no necessity for hand-coding content, like there was at the start of the web. It also helps that Bootstrap can be built in using Node, so that gives a basis for any styling. Then, SCSS can take care of things, giving even more automation.

Given that there is no database involved in any of this, the required information has to be stored somewhere, and neither the Markdown content nor the layout files contain all that is needed. The main site configuration is defined in a single TOML file, and you can have a single one of these for every publishing destination; I have development and production servers, which makes this a very handy feature. Otherwise, every Markdown file needs a YAML header where titles, template references, publishing status and other similar information gets defined. The layouts then are linked to their components, and control logic and other advanced functionality can be added too.
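
As an illustration, a minimal YAML header along these lines sits at the top of each Markdown file; the values here are made up, but title, date, draft and layout are standard Hugo front matter fields:

---
title: "A Hypothetical Post"
date: 2022-11-30
draft: false
layout: "single"
---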

Because static files are being created, it does mean that site searching and commenting or contact pages cannot work as they would on a dynamic web platform. Often, external services are plugged in using JavaScript. One that I use for contact forms is Getform.io. Zapier has also had its uses, watching the RSS feed and tweeting site updates on Twitter when new content gets added. Though I made different choices, Disqus can be used for comments and Algolia for site searching. Generally, though, you can find yourself needing to pay, particularly if you need to remove advertising or gain advanced features.

Some comments service providers offer open-source self-hosted options, but I found these difficult to set up and ended up not offering commenting at all. That was after I tried out Cactus Comments, only to find that it was not discriminating between pages, so it showed the same comments everywhere. There are numerous alternatives like Remark42, Hyvor Talk, Commento, FastComments, Utterances, Isso, Mouthful, Muut and HyperComments, but trying them all out was too time-consuming for what commenting was worth to me. It also explains why some static websites even send readers to Twitter if they have something to say, though I have not followed this way of working.

For searching, I added a JavaScript/JSON self-hosted component to the transport website, and it works well. However, it adds to the size of what a browser needs to download. That is not a major issue for desktop browsers, but the situation with mobile browsers is such that it has a sizeable effect. Testing with PageSpeed and Lighthouse highlighted this, even if I left things as they are. The solution works well in any case.

One thing that I have yet to work out is how to edit or add content while away from home. Editing files over an SSH connection is as much a possibility as setting up a Hugo publishing environment on a laptop. After that, there is the question of using a tablet or phone, since content management systems make everything web-based. These are points that I have yet to explore.

As is natural with a code-based solution, there is a learning curve with Hugo. Reading a book provided some orientation, and looking on the web resolved many conundrums. There is good documentation on the project website, while forum discussions turn up in many a web search. After some research, there was next to nothing that could not be done in some way.

Migration of content takes some forethought and took quite a bit of time, though there was an opportunity to carry out some housekeeping as well. The history website was small, so copying and pasting sufficed. For the transport website, I used Python to convert what was on the database into Markdown files before refining the result. That provided some automation, but left a lot of work to be done afterwards.

The results were satisfactory, and I like the associated simplicity and efficiency. That Hugo works so fast means that it can handle large websites, so it is scalable. The new Markdown method for content production has not been problematic so far, apart from the need to make it more portable, and it helps that I found a setup that works for me. This also avoids any potential dealbreakers that continued development of publishing platforms like WordPress or Drupal could bring. For the former, I hope to remain with the Classic Editor indefinitely, but I now have another option in case things go too far.

Building a sitemap in XML

24th November 2022

While there are many tools that will build XML sitemaps, there is some satisfaction to be had in creating your own. This is in spite of there being a multitude of search engine optimisation plugins for content management systems like WordPress, or what is built into static site generators like Hugo. Sometimes, building your own allows for added simplicity, and that is shared with recent efforts in WordPress theme development.

The sitemap XML protocol is simple enough to offer a short coding project. The basis was what Hugo generates, and I used Python to create the XML files. The only libraries that I needed were configparser, SQLAlchemy and pandas. The first two of these allowed databases to be queried, and the last on the list was used for data processing. Otherwise, it was a case of using what is built into the Python language, like file writing and looping.
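
Leaving aside the database querying, the XML-writing part needs nothing beyond the standard language features. Here is a minimal sketch using a made-up list of page URLs in place of what came from the database:

# Minimal sketch: write a sitemap for a made-up list of page URLs
urls = ["https://www.example.com/", "https://www.example.com/about/"]
with open("sitemap.xml", "w") as sitemap:
    sitemap.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    sitemap.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
    for url in urls:
        sitemap.write(f"  <url><loc>{url}</loc></url>\n")
    sitemap.write("</urlset>\n")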

Once the scripts were ready, they could be uploaded to web servers and executed by scheduled jobs using cron to keep things up to date. Along the way, I also uncovered a way to publicise the locations of the sitemap files to search engine bots using robots.txt. The structure of the instruction is the following:

User-agent: *
Sitemap: sitemap.xml

This announces the location of the sitemap file to all bots. In my case, I always included the full URL for the XML file, and that clearly varies by website location.
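
With a placeholder domain standing in for a real one, the result looks like this:

User-agent: *
Sitemap: https://www.example.com/sitemap.xml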

Self hosting Google fonts

14th November 2022

One thing that I have been doing across my websites is to move away from using Google web fonts, following the advice on the Switching.Software website. While I have looked at free web font directories like 1001 Free Fonts or DaFont, they do not have the full range of weights and character sets that I desire, so I opted instead for the Google Webfonts Helper website. That not only offered copies of what Google has, but also created a portion of CSS that I could add to a stylesheet on a website, making things more streamlined. At the same time, I took the opportunity to change some of the fonts that were being used for the sake of added variety. Open Sans is good, but there are other acceptable sans-serif options like Mulish or Nunito as well, so these got used.
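
The CSS that the helper generates is a set of @font-face declarations along these lines; the file path here is a placeholder for wherever the downloaded font files are stored:

/* Regular weight of Mulish, self-hosted */
@font-face {
  font-family: 'Mulish';
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url('../fonts/mulish-regular.woff2') format('woff2');
}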
