Technology Tales

Adventures in consumer and enterprise technology

TOPIC: HUGO

Removing query strings from any URL on an Nginx-powered website

12th April 2025

My public transport website is produced using Hugo and is hosted on a web server with Nginx. Usually, I use Apache, so this is an exception. When Google highlighted some duplication caused by unneeded query strings, I set to work. However, Nginx does not read .htaccess files or use mod_rewrite, so URL manipulations like redirection cannot be set up in the way that they are with Apache. Thus, such clauses have to go somewhere else and take a different form.

In my case, the configuration file to edit is /etc/nginx/sites-available/default because that was what was enabled. Once I had that open, I needed to find the following block:

location / {
        # First attempt to serve request as file, then
        # as directory, then fall back to displaying a 404.
        try_files $uri $uri/ =404;
}

Because I have one section for port 80 and another for port 443, there were two locations that I needed to update due to duplication, though I may have got away without altering the second of these. After adding the redirection clause, the block became:

location / {
        # First attempt to serve request as file, then
        # as directory, then fall back to displaying a 404.
        try_files $uri $uri/ =404;

        # Remove query strings only when necessary
        if ($args) {
                rewrite ^(.*)$ $1? permanent;
        }
}

The result of the addition is a permanent (301) redirection whenever there are arguments passed in a query string. The initial ^(.*)$ portion captures the requested URI, and the $1? portion is the rewritten URL: the captured part followed by a question mark that tells Nginx not to append the original query string. In other words, the request is redirected from the original address to a new one containing only the part preceding the question mark.
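As a rough illustration of what the rule does to an address, the same transformation can be sketched in Python. This plays no part in the Nginx setup itself; the function name strip_query and the example address are assumptions made purely for demonstration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL without its query string, mirroring the effect of `$1?`."""
    parts = urlsplit(url)
    # Keep the scheme, host and path, but replace the query with an empty string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))


if __name__ == "__main__":
    # Hypothetical address, used only for illustration.
    print(strip_query("https://example.com/timetables/?utm_source=feed"))
```

An address without a query string passes through unchanged, which matches the way that the if ($args) test limits the redirect to requests that carry arguments.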

Handily, Nginx allows you to test your updated configuration using the following command:

sudo nginx -t

That helped me with some debugging. Once all was in order, I needed to reload the service by issuing this command:

sudo systemctl reload nginx

With Apache, there is no need to restart the service after updating the .htaccess file, which adds some convenience. The different locations also mean some care with backups when upgrading the operating system or moving from one server to another. Apart from that, all works well, proving that there can be different ways to complete the same task.

Displaying superscripted text in Hugo website content

6th January 2025

In a previous post, there was a discussion about displaying ordinal publishing dates with superscripted suffixes in Hugo and WordPress. Here, I go further with inserting superscripted text into Markdown content. Because of the default setup for the Goldmark Markdown renderer, it is not as simple as adding <sup>...</sup> constructs to your Markdown source file. That will generate a warning like this:

WARN Raw HTML omitted while rendering "[initial location of Markdown file]"; see https://gohugo.io/getting-started/configuration-markup/#rendererunsafe
You can suppress this warning by adding the following to your site configuration:
ignoreLogs = ['warning-goldmark-raw-html']

Because JavaScript can be added using HTML tags, there is an added security hazard that could be overlooked if you switch off the warning as suggested. Also, Goldmark does not interpret Markdown specifications of superscripting without an extension whose incorporation needs some familiarity with Go development.

That leaves using a Shortcode. These go into layouts/shortcodes under your theme area; the file containing mine got called super.html. The content is the following one-liner:

<sup>{{ .Get 0 | markdownify }}</sup>

This then is what is added to the Markdown content:

{{< super "th" >}}

What happens here is that the Shortcode picks up the content within the quotes and encapsulates it with the HTML superscript tags to give the required result. This approach can be extended for subscripts and other similar ways of rendering text, too. All that is required is a use case, and the rest can be put in place.

Adding superscripts to ordinal publishing dates for entries in Hugo and WordPress

5th January 2025

These web publishing tools differ and so do the solutions, yet the use case is the same: displaying ordinal dates for entries in a blog or website. Also, the wish is to have the ordinal suffix superscripted as in normal English usage. Let us take each platform in turn.

Hugo

Given that it is written in Go, it is little surprise that Hugo uses Go's time formatting syntax. Thus, my starting point was as follows:

{{ .Date.Format "15:04, January 2, 2006" }}

The result from the above looks like this: 20:56, January 2, 2025. Unfortunately, Go does not support ordinal dates in its time formatting, so adding them needs more extensive conditional logic like what you see below. The default suffix is th, while st is added for the first, twenty-first and thirty-first days of the month, nd is added for the second and twenty-second days and rd is added for the third and twenty-third days. This is how things now look:

{{ .Date.Format "15:04, January 2" }}{{ if eq (.Date.Format "2") "2" }}nd{{ else if eq (.Date.Format "2") "22" }}nd{{ else if eq (.Date.Format "2") "1" }}st{{ else if eq (.Date.Format "2") "21" }}st{{ else if eq (.Date.Format "2") "31" }}st{{ else if eq (.Date.Format "2") "3" }}rd{{ else if eq (.Date.Format "2") "23" }}rd{{ else }}th{{ end }}, {{ .Date.Format "2006" }}

That gives you something like 20:56, January 2nd, 2025. The handy thing about Hugo is that it primarily is an HTML output engine, so adding superscripting tags in the right places, like below, superscripts the ordinal suffixes as needed. Here, the tags are shown in bold for emphasis.

{{ .Date.Format "15:04, January 2" }}<sup>{{ if eq (.Date.Format "2") "2" }}nd{{ else if eq (.Date.Format "2") "22" }}nd{{ else if eq (.Date.Format "2") "1" }}st{{ else if eq (.Date.Format "2") "21" }}st{{ else if eq (.Date.Format "2") "31" }}st{{ else if eq (.Date.Format "2") "3" }}rd{{ else if eq (.Date.Format "2") "23" }}rd{{ else }}th{{ end }}</sup>, {{ .Date.Format "2006" }}

Once you have that, this is how things appear: 20:56, January 2nd, 2025, which gets the job done.
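For anyone wanting to check the suffix rules away from a Hugo template, the same decision can be sketched in Python. This is only an illustration of the logic, not something that Hugo itself uses, and the function names ordinal_suffix and format_with_ordinal are assumptions of mine.

```python
from datetime import date


def ordinal_suffix(day: int) -> str:
    """Return the English ordinal suffix for a day of the month (1 to 31)."""
    # 11, 12 and 13 take "th" even though they end in 1, 2 and 3.
    if 11 <= day % 100 <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def format_with_ordinal(d: date) -> str:
    """Format a date in the style 'January 2nd, 2025' (month name per the current locale)."""
    return f"{d.strftime('%B')} {d.day}{ordinal_suffix(d.day)}, {d.year}"


if __name__ == "__main__":
    print(format_with_ordinal(date(2025, 1, 2)))
```

Working from the last digit keeps the rule short, with the eleventh to the thirteenth days being the only exceptions that need a separate check.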

WordPress

Hugo produces the website for you to upload it to the web server; it is static after that. In contrast, WordPress is based on PHP and dynamically renders a web page when it is requested. That means that components are generated at that time. Thus, the following code snippet outputs a date when a website post or page has been published:

<?php the_time('jS F Y') ?>

PHP time and date formatting does account for ordinal dates, unlike what you get in Go. Here, the jS F Y format controls how the date gets displayed. The codes do the following: j outputs the day in the month without a leading zero, S adds the ordinal suffix, F adds the full month name and Y adds the four-digit year. The result then is something like this: 1st January 2025. To superscript the ordinal suffix, the following change is needed (the addition is emboldened for emphasis):

<?php the_time('j\<\s\u\p\>S\<\/\s\u\p\> F Y') ?>

Here, the HTML superscript tags are inserted into the format with every character escaped by a leading backslash (\). While PHP does generate HTML, the escaping is needed because letters such as s and u are themselves date format codes and would otherwise be replaced with values instead of being output as written. Security considerations like preventing cross-site scripting also matter, though maybe not so much in this context. Regardless of the technicality, the result becomes 1st January 2025, which is what was sought.

Avoiding permissions, times or ownership failure messages when using rsync

22nd April 2023

The rsync command is one that I use heavily for doing backups and web publishing. The latter means that it is part of how I update websites built using Hugo because new and/or updated files need uploading. The command also sees usage when uploading files onto other websites. During one of these operations, and I am unsure now as to which type is relevant, I encountered errors about being unable to set permissions.

The cause was the encompassing -a option. This is a shorthand for -rltpgoD, and the individual options perform the following:

-r: recursive transfer, copying all contents within a directory hierarchy

-l: symbolic links copied as symbolic links

-t: preserve times

-p: preserve permissions

-g: preserve groups

-o: preserve owners

-D: preserve device and special files

The solution is to omit some of the options if they are inappropriate. The minimum is to omit the option for permissions preservation, but others may not apply between different servers either, especially when operating systems differ. Removing the options for preserving permissions, groups and owners results in something like this:

rsync -rltD [rest of command]

While it can be good to have a more powerful command with the setting of a single option, it can mean trying to do too much. Another way to avoid permissions and similar errors is to have consistency between source and destination file systems, but that is not always possible.
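Since -a is only a bundle of single-letter options, working out the reduced set can be sketched in a few lines of Python. This is a minimal illustration that builds a command line without running it; the function names and the example source and destination are hypothetical.

```python
import shlex

# The letters that -a expands to, paired with what each one preserves or does.
ARCHIVE_FLAGS = {
    "r": "recursion",
    "l": "symbolic links",
    "t": "times",
    "p": "permissions",
    "g": "groups",
    "o": "owners",
    "D": "device and special files",
}


def rsync_flags(omit: str = "pgo") -> str:
    """Build a short-option string from the components of -a, leaving out `omit`."""
    kept = [flag for flag in ARCHIVE_FLAGS if flag not in omit]
    return "-" + "".join(kept)


def rsync_command(source: str, destination: str, omit: str = "pgo") -> str:
    """Return a shell-ready rsync command line; nothing is executed here."""
    return shlex.join(["rsync", rsync_flags(omit), source, destination])


if __name__ == "__main__":
    # Hypothetical source and destination, for illustration only.
    print(rsync_command("public/", "user@host:/var/www/site/"))
```

Passing a different value for omit, such as "p" alone, gives the minimum change mentioned above.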

Why all the commas?

4th December 2022

In recent times, I have been making use of Grammarly for proofreading what I write for online consumption. That has applied as much to what I write in Markdown form, as it has with what is authored using content management systems like WordPress and Textpattern.

The free version does nag you to upgrade to a paid subscription, but that is not my main irritation. That would be its inflexibility because you cannot turn off rules that you think intrusive, at least in the free version. This comment is particularly applicable to the unofficial plugin that you can install in Visual Studio Code. To me, the add-on for Firefox feels less scrupulous.

There are other options though, and one that I have encountered is LanguageTool. This also offers a Firefox add-on, but there are others not only for other browsers but also Microsoft Word. Recent versions of LibreOffice Writer can connect to a LanguageTool server using in-built functionality, too. There are also dedicated editors for iOS, macOS or Windows.

The one operating system that does not get specific add-on support is Linux, but there is another option there. That uses an embedded HTTP server that I installed using Homebrew and set to start automatically using cron. This really helps when using the LanguageTool Linter extension in Visual Studio Code because it can connect to that instead of the public API, which bans your IP address if you overuse it. The extension is also configurable with the ability to add exceptions (both grammatical and spelling), though I appear to have enabled smart formatting only to have it mess up quotes in a Markdown file that then caused Hugo rendering to fail.

Like Grammarly, there is an online editor that offers more if you choose an annual subscription. That is cheaper than the one from Grammarly, so that caused me to go for that instead to get rephrasing suggestions both in the online editor and through a browser add-on. It is better not to get nagged all the time...

The title may surprise you, but I have been using co-ordinating conjunctions without commas for as long as I can remember. Both Grammarly and LanguageTool pick up on these, so I had to do some investigation to find a gap in my education, especially since LanguageTool is so good at finding them. What I also found is how repetitive my writing style can be, which also means that rephrasing has been needed. That, after all, is the point of a proofreading tool, and it can rankle if you have fixed opinions about grammar or enjoy creative writing.

Putting in some out-of-copyright texts from other authors triggers all kinds of messages, but you just have to ignore these. Turning off checks needs care, even if turning them on again is easy to do. There is, however, the danger that artificial intelligence tools could make writing too uniform, since there is only so much that these technologies can do. They should make you look at your text more intently, though, which is never a bad thing because computers still struggle with meaning.

Moves to Hugo

30th November 2022

What amazes me is how things can become more complicated over time. Once you knew HTML, CSS and JavaScript, building a website was not that onerous, provided that web browsers played ball with it. Since then, things have got easier to use but more complex at the same time. One example is WordPress: in the early days, themes were much simpler than they are now. The web also has got more insecure over time, and that adds to complexity as well. It sometimes feels as if there is a choice to make between ease of use and simplicity.

It is against that background that I reassessed the technology that I was using on my public transport and Irish history websites. The former used WordPress, while the latter used Drupal. The irony was that the simpler website was using the more complex platform, so the act of going simpler probably was not before time. Alternatives to WordPress were being surveyed for the first of the pair, but none had quite the flexibility, pervasiveness and ease of use that WordPress offers.

There is another approach that has been gaining notice recently. One part of this is the use of Markdown for web publishing. This is a simple and distraction-free plain text format that can be transformed into something more readable. It sees usage in blogs hosted on GitHub, but also facilitates the generation of static websites. The clutter is absent for those who have no need of the Gutenberg Editor on WordPress.

With the content written in Markdown, it can be fed to a static website generator like Hugo. Using defined templates and fixed assets like CSS together with images and other static files, it can slot the content into HTML files very speedily since it is written in the Go programming language. Once you get acclimatised, there are no folder structures that cannot be used, so you get full flexibility in how you build out your website. Sitemaps and RSS feeds can be built at the same time, both using the same input as the HTML files.

In a nutshell, it automates what once needed manual effort using a code editor or a visual web page editor. The use of HTML snippets and layouts means that there is no necessity for hand-coding content, like there was at the start of the web. It also helps that Bootstrap can be built in using Node, so that gives a basis for any styling. Then, SCSS can take care of things, giving even more automation.

Given that there is no database involved in any of this, the required information has to be stored somewhere, and neither the Markdown content nor the layout files contain all that is needed. The main site configuration is defined in a single TOML file, and you can have one of these for every publishing destination; I have development and production servers, which makes this a very handy feature. Otherwise, every Markdown file needs a YAML header where titles, template references, publishing status and other similar information get defined. The layouts then are linked to their components, and control logic and other advanced functionality can be added too.

Because static files are being created, it does mean that site searching, commenting and contact pages cannot work like they would on a dynamic web platform. Often, external services are plugged in using JavaScript. One that I use for contact forms is Getform.io. Then, Zapier has had its uses in using the RSS feed to tweet site updates on Twitter when new content gets added. Though I made different choices, Disqus can be used for comments and Algolia for site searching. Generally, though, you can find yourself needing to pay, particularly if you need to remove advertising or gain advanced features.

Some commenting service providers offer open source self-hosted options, but I found these difficult to set up and ended up not offering commenting at all. That was after I tried out Cactus Comments only to find that it was not discriminating between pages, so it showed the same comments everywhere. There are numerous alternatives like Remark42, Hyvor Talk, Commento, FastComments, Utterances, Isso, Mouthful, Muut and HyperComments, but trying them all out was too time-consuming for what commenting was worth to me. It also explains why some static websites even send readers to Twitter if they have something to say, though I have not followed this way of working.

For searching, I added a JavaScript/JSON self-hosted component to the transport website, and it works well. However, it adds to the size of what a browser needs to download. That is not a major issue for desktop browsers, but the situation with mobile browsers is such that it has a sizeable effect. Testing with PageSpeed and Lighthouse highlighted this, even if I left things as they are. The solution works well in any case.

One thing that I have yet to work out is how to edit or add content while away from home. Editing files using an SSH connection is as much a possibility as setting up a Hugo publishing setup on a laptop. After that, there is the question of using a tablet or phone, since content management systems make everything web based. These are points that I have yet to explore.

As is natural with a code-based solution, there is a learning curve with Hugo. Reading a book provided some orientation, and looking on the web resolved many conundrums. There is good documentation on the project website, while forum discussions turn up on many a web search. Following any research, there was next to nothing that could not be done in some way.

Migration of content takes some forethought and took quite a bit of time, though there was an opportunity to carry out some housekeeping as well. The history website was small, so copying and pasting sufficed. For the transport website, I used Python to convert what was in the database into Markdown files before refining the result. That provided some automation, but left a lot of work to be done afterwards.
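To give a flavour of the file-writing end of such a conversion, here is a minimal sketch in Python of producing a Markdown file with a YAML header. It is not the script that was actually used: the function names are hypothetical, the database query is left out entirely, and the title, date and draft keys are simply common Hugo front matter fields chosen for illustration.

```python
from pathlib import Path


def to_markdown(title: str, date: str, body: str, draft: bool = False) -> str:
    """Combine a YAML front matter header with the body text of an entry."""
    # Backslashes and double quotes in the title are escaped to keep the YAML valid.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    header = [
        "---",
        f'title: "{safe_title}"',
        f"date: {date}",
        f"draft: {str(draft).lower()}",
        "---",
    ]
    return "\n".join(header) + "\n\n" + body.strip() + "\n"


def write_post(folder: Path, slug: str, text: str) -> Path:
    """Write the Markdown text to <folder>/<slug>.md and return the file path."""
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{slug}.md"
    target.write_text(text, encoding="utf-8")
    return target
```

In practice, each database row would be passed through something like to_markdown and then written out with write_post, with the manual refinement coming afterwards.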

The results were satisfactory, and I like the associated simplicity and efficiency. That Hugo works so fast means that it can handle large websites, so it is scalable. The new Markdown method for content production is not problematical so far apart from the need to make it more portable, and it helps that I found a setup that works for me. This also avoids any potential dealbreakers that continued development of publishing platforms like WordPress or Drupal could bring. For the former, I hope to remain with the Classic Editor indefinitely, but now have another option in case things go too far.

Building a sitemap in XML

24th November 2022

While there are many tools that will build XML site maps, there is some satisfaction to be had in creating your own. This is despite there being a multitude of search engine optimisation plugins for content management systems like WordPress or what is built into static site generators like Hugo. Sometimes, building your own allows for added simplicity, and that is shared with recent efforts in WordPress theme development.

The sitemap XML protocol is simple enough to offer a short coding project. The basis was what Hugo generates, and I used Python to create the XML files. The only libraries that I needed were configparser, SQLAlchemy and pandas. The first two of these allowed databases to be queried, and the last on the list was used for data processing. Otherwise, it was a case of using what is built into the Python language, like file writing and looping.
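What follows is a minimal sketch of the XML-writing step only, using the xml.etree.ElementTree module from the standard library rather than the configparser, SQLAlchemy and pandas combination mentioned above. The function name build_sitemap and the example.com addresses are assumptions for illustration, and the list of pages stands in for whatever a database query would return.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries) -> str:
    """Return sitemap XML for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod is optional in the protocol, so it is skipped when missing.
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical pages; real ones would come from the website's database.
    pages = [
        ("https://example.com/", "2022-11-24"),
        ("https://example.com/about/", None),
    ]
    print(build_sitemap(pages))
```

Letting ElementTree produce the markup means that characters like ampersands in addresses get escaped without any extra effort.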

Once the scripts were ready, they could be uploaded to web servers and executed by scheduled jobs using cron to keep things up to date. Along the way, I also uncovered a way to publicise the locations of the sitemap files to search engine bots using robots.txt. The structure of the instruction is the following:

User-agent: *
Sitemap: sitemap.xml

This announces the location of the sitemap file to all bots. In my case, I always included the full URL for the XML file in place of the bare file name shown above, which is what the sitemap protocol expects, and that clearly varies by website location.
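One way to check that a robots.txt file declares a sitemap as intended is to parse it with the urllib.robotparser module from the Python standard library, whose site_maps() method needs Python 3.8 or later. This is a sketch under that assumption; the function name sitemaps_in_robots and the example.com address are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt: str) -> list:
    """Return the Sitemap URLs declared in the given robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when there is no Sitemap line at all.
    return parser.site_maps() or []


if __name__ == "__main__":
    # Hypothetical robots.txt content with a full URL for the sitemap.
    content = "User-agent: *\nSitemap: https://example.com/sitemap.xml\n"
    print(sitemaps_in_robots(content))
```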

Resolving a clash between Homebrew and Python

22nd November 2022

For reasons that I cannot recall now, I installed the Hugo static website generator on my Linux system and web servers using Homebrew. The only reason that I can suggest is that it might have been a way to get the latest version at the time because Linux Mint only does major changes like that every two years, keeping it in line with long-term support editions of Ubuntu.

When Homebrew was installed, it changed the lookup path for command line executables by adding the following line to my .bashrc file:

eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"

This executed the following lines:

export HOMEBREW_PREFIX="/home/linuxbrew/.linuxbrew";
export HOMEBREW_CELLAR="/home/linuxbrew/.linuxbrew/Cellar";
export HOMEBREW_REPOSITORY="/home/linuxbrew/.linuxbrew/Homebrew";
export PATH="/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin${PATH+:$PATH}";
export MANPATH="/home/linuxbrew/.linuxbrew/share/man${MANPATH+:$MANPATH}:";
export INFOPATH="/home/linuxbrew/.linuxbrew/share/info:${INFOPATH:-}";

While the result suits Homebrew, it changed the setup of Python and its packages on my system. Eventually, this had undesirable consequences, like messing up how Spyder started, so I wanted to change this. There are other things that I have automated using Python and these were not working either.

One way that I have seen suggested is to execute the following command, but I cannot vouch for this:

brew unlink python

What I did was to comment out the offending line in .bashrc and replace it with the following:

export PATH="$PATH:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin"

export HOMEBREW_PREFIX="/home/linuxbrew/.linuxbrew";
export HOMEBREW_CELLAR="/home/linuxbrew/.linuxbrew/Cellar";
export HOMEBREW_REPOSITORY="/home/linuxbrew/.linuxbrew/Homebrew";

export MANPATH="/home/linuxbrew/.linuxbrew/share/man${MANPATH+:$MANPATH}:";
export INFOPATH="${INFOPATH:+$INFOPATH:}/home/linuxbrew/.linuxbrew/share/info";

The first command adds Homebrew paths to the end of the PATH variable rather than the beginning, which was the previous arrangement. This ensures system folders are searched for executable files before Homebrew folders. It also means Python packages load from my user area instead of the Homebrew location, which happened under Homebrew's default configuration. When working with Python packages, remember not to install one version at the system level and another in your user area, as this creates conflicts.
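To see the effect of the ordering, it can help to list every copy of a command in the order in which the folders in PATH are searched. The following is a small diagnostic sketch in Python rather than anything that Homebrew provides, and the function name find_on_path is my own.

```python
import os


def find_on_path(command: str, path: str) -> list:
    """List every executable called `command` in PATH order; the first one wins."""
    matches = []
    for folder in path.split(os.pathsep):
        candidate = os.path.join(folder, command)
        # Empty PATH entries are skipped; only executable regular files count.
        if folder and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    # The first location printed is the one that the shell would run.
    for location in find_on_path("python3", os.environ.get("PATH", "")):
        print(location)
```

With the Homebrew folders at the end of PATH, any system copy of a command appears ahead of the Homebrew one in this list.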

So far, the result of the Homebrew changes is not unsatisfactory, and I will watch for any rough edges that need addressing. If something comes up, then I will set things up in another way.

A desktop Markdown editing environment

8th November 2022

Earlier this year, I changed over two websites from dynamic versions using content management systems to static ones by using Hugo to build them from Markdown files. That meant that I needed to look at the editing of Markdown, even if it is a fairly simple file format. For one thing, Grammarly can be incorporated into WordPress, so I did not want to lose something like that.

The latter point meant that I was steered away from plain text editors. Otherwise, there are online ones like StackEdit and Dillinger, but the Firefox Grammarly plugin only appears to work on the first of these, and even then, only partially in my experience. While Dillinger does offer connections to online file storage providers like Google, Dropbox and OneDrive, I wanted to store files on my desktop for upload to a web server. It also works with GitHub, but I prefer to use another web hosting provider.

There are various specialised Markdown editors for desktop usage like Typora, ReText, Formiko or Ghostwriter, yet I chose none of these. My actual choice may surprise many: it was Visual Studio Code. The availability of a Grammarly plug-in was what swayed it for me, even if it did need to be switched on for Markdown files. In many ways, it does not work as smoothly as elsewhere because it gets fooled by links and other code-like pieces of text. Also, having the added ability to add words to a custom dictionary would be ideal. Some rule overriding is available, but I am not sure that everything is covered, even if the list of options is lengthy. Some time is needed to inspect all of them before I proceed any further. Thus far, things are working well enough for me.
