Technology Tales

Adventures in consumer and enterprise technology

TOPIC: WEB SERVER

How complexity can blind you

17th August 2025

Visitors may not have noticed it, but I was having a lot of trouble with this website. Intermittent slowdowns beset any attempt to add new content or perform any other administration. This was not happening on any other web portal that I had, even one sharing the same publishing software.

Even so, WordPress got the blame at first, at least until deactivating plugins made no difference. Then, it was the turn of the web server, resulting in a move to something more powerful and a switch from Apache to Nginx at the same time. Redis caching was another suspect, especially when things got in a twist on the new instance. As if that were not enough, MySQL came in for some scrutiny too.

Finally, another suspect emerged: Cloudflare. Either some settings got mangled or something else was happening, but cutting out that intermediary was enough to make things fly again. Now, I use bunny.net for DNS duties instead, and the simplification has helped enormously; the previous layering was no help with debugging. With a bit of care, I might add some other tools behind the scenes while taking things slowly to avoid confusion in the future.

Removing query strings from any URL on an Nginx-powered website

12th April 2025

My public transport website is produced using Hugo and is hosted on a web server running Nginx. Usually, I use Apache, so this is an exception. When Google highlighted some duplication caused by unneeded query strings, I set to work. However, doing anything with URLs, such as redirection, cannot rely on a .htaccess file or mod_rewrite on Nginx. Thus, such clauses have to go somewhere else and take a different form.

In my case, the configuration file to edit is /etc/nginx/sites-available/default because that was what was enabled. Once I had that open, I needed to find the following block:

location / {
        # First attempt to serve request as file, then
        # as directory, then fall back to displaying a 404.
        try_files $uri $uri/ =404;
}

Because I have one server section for port 80 and another for port 443, there were two location blocks that needed updating because of the duplication, though I may have got away without altering the second of these. After adding the redirection clause, the block became:

location / {
        # First attempt to serve request as file, then
        # as directory, then fall back to displaying a 404.
        try_files $uri $uri/ =404;

        # Remove query strings only when necessary
        if ($args) {
                rewrite ^(.*)$ $1? permanent;
        }
}

The result of the addition is a permanent (301) redirection whenever arguments are passed in a query string. The initial ^(.*)$ portion captures the requested path, and $1? is the rewritten URL with the query string dropped; the trailing question mark is what discards it. In other words, the redirect goes from the original address to a new one containing only the part preceding the question mark.

Handily, Nginx allows you to test your updated configuration using the following command:

sudo nginx -t

That helped me with some debugging. Once all was in order, I needed to reload the service by issuing this command:

sudo systemctl reload nginx
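
Once the reload has happened, a request with a query string attached should come back as a 301 pointing at the clean address. A quick way to check this uses curl with a made-up address standing in for a real page:

curl -sI "https://www.example.com/stations/?utm_source=feed" | grep -iE "^HTTP|^location"

Seeing a 301 status and a location header without the question mark confirms that the rule is doing what it should.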

With Apache, there is no need to restart the service after updating a .htaccess file, which adds some convenience. The different file locations also mean taking some care with backups when upgrading the operating system or moving from one server to another. Apart from that, all works well, proving that there can be different ways to complete the same task.
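
For anyone wanting to do the same on Apache, a rough .htaccess counterpart of the Nginx rule would look something like the lines below. This is a sketch rather than something taken from one of my own configurations, so treat it as a starting point:

RewriteEngine on
# Only act when a query string is present
RewriteCond %{QUERY_STRING} .
# The trailing ? in the substitution drops the query string
RewriteRule ^ %{REQUEST_URI}? [R=301,L]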

Making the LanguageTool embedded HTTP Server work on Windows 11

11th August 2024

My choice of Markdown editor is VS Code or VSCodium, the latter being a fork of the former with Microsoft telemetry removed. In either case, I use the LanguageTool Linter extension for the required grammar and spelling checks. Pointing that to the remote web service offered by LanguageTool could get punitive, even if I am a subscriber. Thus, I use a locally installed equivalent instead.

In my usual Linux system, that is how I work. However, I have replicated the set-up on a Windows laptop for added flexibility. The server needs the JRE, so that was downloaded from the Oracle website and then installed. The next step is to download the LanguageTool embedded HTTP server zip file and decompress it to a chosen location. To run the server, a command like the following is issued from Windows Terminal (the single line may break over two here):

java -cp "[Chosen Location]\LanguageTool-stable\LanguageTool-6.4\languagetool-server.jar" org.languagetool.server.HTTPServer --port 8081 --allow-origin

That is enough to get things going because it fulfils the default settings of the LanguageTool Linter extension in VS Code or VSCodium. The fastText application is unavailable for Windows, so I did without it. So far, things are operating acceptably, even if there is a way to address more memory should that be required.
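
Should more memory ever be needed, the standard JVM heap option can be slotted into the same command. The example below caps the heap at 2 GB, a figure chosen purely for illustration:

java -Xmx2g -cp "[Chosen Location]\LanguageTool-stable\LanguageTool-6.4\languagetool-server.jar" org.languagetool.server.HTTPServer --port 8081 --allow-origin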

Dealing with the following message issued when using Certbot on Apache: "Unable to find corresponding HTTP vhost; Unable to create one as intended addresses conflict; Current configuration does not support automated redirection"

12th April 2024

When doing something with Certbot on another website not so long ago, I encountered the above message when executing the following command:

sudo certbot --apache

The solution was to open /etc/apache2/sites-available/000-default.conf using nano and update the ServerName field (or the line containing this keyword) so it matched the address used for setting up Let's Encrypt SSL certificates. The mention of Apache in the above does make the solution specific to this web server software, so you will need another solution if you meet this kind of problem when using Nginx or another web server.
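
For illustration, the relevant part of 000-default.conf ends up looking something like the excerpt below, with example.com standing in for the real domain that was used when requesting the certificate:

<VirtualHost *:80>
        # Must match the domain for which the certificate is being issued
        ServerName www.example.com
        DocumentRoot /var/www/html
</VirtualHost>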

Removing redundant kernels from Ubuntu

29th October 2022

Recently, a message appeared on some web servers that I have, exhorting me to upgrade to Ubuntu 22.04.1 using the do-release-upgrade command. In the interests of remaining current, I did just that, only to get another message like the following:

The upgrade needs a total of [amount of space with units] free space on disk `/boot`.
Please free at least an additional [amount of space with units] of disk space on `/boot`.
Empty your trash and remove temporary packages of former installations
using `sudo apt-get clean`.

Using sudo apt-get clean did not resolve the problem, so the advice given was of no use. The actual problem was that too many old kernels were cluttering up /boot, a piece of wisdom that came from searching around the web. What also came up was a single command for resolving the problem. However, removing the wrong kernel can wreck a system, so I took a more cautious approach. First, I listed the kernels to be removed and checked that they did not include the currently running one, whose details were found by running uname -r. The listing was done with the following command, broken over several lines for clarity with the backslash character denoting continuation:

dpkg -l linux-{image,headers}-"[0-9]*" \
| awk '/ii/{print $2}' \
| grep -ve "$(uname -r \
| sed -r 's/-[a-z]+//')"

The dpkg command listed the installed kernels, with awk, grep and sed filtering out the unwanted parts of the text. The awk command takes the tabular output from dpkg and turns it into a simple list of package names. The -v switch on the grep command keeps only the lines that do not match the search expression created by the sed command, while the -e switch supplies that expression as the pattern. The sed command strips the trailing suffix from the output of the uname command, whose -r switch produces the kernel release details, leaving only the version number of the running kernel (for example, 5.15.0-52-generic becomes 5.15.0-52). On being satisfied that nothing untoward would happen, the full command below (also broken over several lines using the backslash character to denote continuation) could be executed.

sudo apt purge $(dpkg -l linux-{image,headers}-"[0-9]*" \
| awk '/ii/{print $2}' \
| grep -ve "$(uname -r \
| sed -r 's/-[a-z]+//')")

This got apt to purge the unwanted kernels, freeing up enough space for the upgrade to continue. That happened without significant incident, though some remediation was needed on the PHP side to get the website working smoothly again.
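
For anyone repeating the process, it is worth confirming that space really has been released before rerunning do-release-upgrade; a quick look at the partition does that:

df -h /boot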

A quick look at the 7G Firewall

17th October 2021

There is a simple principle behind the 7G Firewall from Perishable Press: it is a set of mod_rewrite rules for the Apache web server that can be added to a .htaccess file, and there is a version for the Nginx web server as well. These check query strings, request URIs (Uniform Resource Identifiers), user agents, remote hosts, HTTP referrers and request methods for anomalies, blocking any requests that appear dubious.

Unfortunately, I found that the rules heavily slowed down a website with which I tried them, so I am going to have to wait until that is moved to a faster system before I really can give them a go. This can be a problem with security tools, as I also found with adding a modsec jail to a Fail2Ban instance. As it happens, both sets of observations were made using the GTMetrix tool, making it appear that there is a trade-off between security and speed that needs to be assessed before adding anything to block unwanted web visitors.

Using .htaccess to control hotlinking

10th October 2020

There are times when blogs cease to exist and the only place to find the content is on the Wayback Machine. Even then, it is in danger of being lost completely. One such example is the subject of this post.

Though this website makes use of the facilities of Cloudflare for various functions, including the blocking of image hotlinking, the same outcome can be achieved using .htaccess files on Apache web servers. It may work on Nginx to a point too, but there the rules ought to go into the server's own configuration files rather than a .htaccess file, an approach that some frown upon anyway. In any case, the lines that need adding to .htaccess are listed below; the web address needs to include your own domain in place of the dummy example provided:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain.com(/)?.*$ [NC]
RewriteRule .*\.(gif|jpe?g|png|bmp)$ - [F,NC]

The first line activates the mod_rewrite engine, something you might have already done elsewhere; for any of this to work, the module must be enabled in your Apache configuration, and you need permission to use these directives in .htaccess files. The next two lines examine the HTTP referrer string: the first leaves requests with an empty referrer alone, while the second permits images to be served only when the referrer is your own web domain, not someone else's. To include additional domains, copy that second condition and change the web address as needed. Any new lines should be placed before the final line, which specifies the file extensions that are blocked for other web addresses.

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain.com(/)?.*$ [NC]
RewriteRule \.(gif|jpe?g|png|bmp)$ /images/image.gif [L,NC]

Another variant of the previous code involves changing the last line to serve a default image instead, showing others what has happened. That may not reduce bandwidth usage as much as blocking outright, but it does make the interception obvious to whoever is hotlinking.
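
One way to check that either variant is working is to ask for an image while pretending to come from another site. curl can supply a fake referrer for that purpose; the addresses below are only placeholders:

curl -I -e "https://some-other-site.example/" "http://www.yourdomain.com/images/photo.jpg"

A 403 response indicates that the first variant is blocking the request, while the second should answer with the replacement image instead.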

Running cron jobs using the www-data system account

22nd December 2018

When you set up your own web server or use a private server (virtual or physical), you will find that web servers run under the www-data account. That means website files need to be accessible to that system account, if not owned by it outright. The latter is mandatory if you want WordPress to be able to update itself without needing FTP details.

It also means that you probably need scheduled jobs to be executed using the privileges possessed by the www-data account. For instance, I use WP-CLI to automate spam removal and updates to plugins, themes and WordPress itself. Spam removal can be done without the www-data account, but the updates need file access and cannot be completed without this. Therefore, I got interested in setting up cron jobs to run under that account and the following command helps to address this:

sudo -u www-data crontab -e

For that to work, your own account needs to be listed in /etc/sudoers or be assigned to the sudo group in /etc/group. If it is either of those, then entering your own password will open the cron file for www-data, and it can be edited as for any other account. Closing and saving the session will update cron with the new job details.
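
As an illustration of what might go into that file, the entries below use WP-CLI to update plugins and themes overnight. The installation path is only a placeholder and the timings are a matter of taste:

# Update plugins and themes nightly; adjust the path to suit the installation
30 2 * * * cd /var/www/example.com && wp plugin update --all --quiet
40 2 * * * cd /var/www/example.com && wp theme update --all --quiet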

In fact, the same approach can be taken for a variety of commands where files can only be accessed using www-data. This includes copying, pasting and deleting files, as well as executing WP-CLI commands. WP-CLI itself issues a striking warning if you run a command using the root account, a pervasive temptation given what that account allows; any alternative has to be better from a security standpoint.

Moving a website from shared hosting to a virtual private server

24th November 2018

This year has seen some optimisation being applied to my web presences, guided by the results of GTMetrix scans. It was then that I realised how slow things were, so server loads were reduced. Anything that slowed response times, such as WordPress plugins, got removed. Usage of Matomo also was curtailed in favour of Google Analytics, while HTML, CSS and JS minification followed. Something that had yet to happen was a search for a faster server. Now, another website has been moved onto a virtual private server (VPS) to see how that would go.

Speed was not the only consideration, since security was a factor too. After all, a VPS is more locked away from other users than a folder on a shared server. There also is the added sense of control, so Let's Encrypt SSL certificates can be added using the Electronic Frontier Foundation's Certbot. That avoids the expense of using an SSL certificate provided through my shared hosting provider, and a successful transition for my travel website may mean that this one undergoes the same move.

For the VPS, I chose Ubuntu 18.04 as its operating system, and it came with the LAMP stack already in place. Having offline development websites, the mix of Apache, MySQL and PHP is more familiar to me than anything using Nginx or Python. It also means that .htaccess files become more useful than they were on my previous Nginx-based platform. Having full access to the operating system using SSH helps too and should mean that I have fewer calls on technical support, since I can do more for myself. Any extra tinkering should not affect others either, since this type of setup is well known to me and having an offline counterpart means that anything riskier gets tried there beforehand.

Naturally, there were niggles to overcome with the move. The first to fix was to make the MySQL instance accept calls from outside the server so that I could migrate data there from elsewhere, and I even got my shared hosting setup to start using the new database to see what performance boost it might give. To make all this happen, I first found the location of the relevant my.cnf configuration file using the following command:

find / -name my.cnf

Once I had the right file, I commented out the following line and restarted the database service afterwards with the second command below; that put a stop to any error 111 (connection refused) messages:

bind-address = 127.0.0.1
service mysql restart
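
A quick way to confirm that the change has taken effect is to look at what the daemon is listening on; seeing 0.0.0.0:3306 rather than 127.0.0.1:3306 means that outside connections will at least reach the service:

sudo ss -tlnp | grep 3306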

After that, things worked as required, and I moved on to another matter: uploading the requisite files. That meant installing an FTP server, so I chose proftpd since I knew it well from previous tinkering. Once that was in place, file transfer commenced.
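
For completeness, getting proftpd onto Ubuntu is a single command, and the package usually asks during installation whether to run it standalone or from inetd:

sudo apt install proftpd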

When that was done, I could do some testing to see if I had an active web server that loaded the website. Along the way, I also enabled some Apache modules like mod_rewrite using the a2enmod command, restarting Apache each time another module was added.
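
Taking mod_rewrite as the example, the pair of commands goes like this:

sudo a2enmod rewrite
sudo systemctl restart apache2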

Then, I discovered that Textpattern needed php7.2-xml installed, so the following command was executed to do this:

apt install php7.2-xml

Then, the following line was uncommented in the correct php.ini configuration file, which I found using the same method as described above for the my.cnf configuration. That was followed by yet another Apache restart:

extension=php_xmlrpc.dll

Addressing the above issues yielded enough success for me to change the IP address in my Cloudflare dashboard so it pointed at the VPS and not the shared server. The changeover happened seamlessly without having to await DNS updates as once would have been the case. It had the added advantage of making both WordPress and Textpattern work fully.

With everything working to my satisfaction, I then followed the instructions on Certbot to set up my new Let's Encrypt SSL certificate. Aside from a tweak to a configuration file and another Apache restart, the process was more automated than I had expected, so I was ready to embark on some fine-tuning to embed the new security arrangements. That meant updating .htaccess files and Textpattern has its own, so the following addition was needed there:

RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

This complemented what was already in the main .htaccess file, and WordPress allows you to include http(s) in the address it uses, so that was another task completed. The general .htaccess only needed the following lines to be added:

RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.assortedexplorations.com/$1 [R,L]

What all these achieve is to redirect insecure connections to secure ones for every visitor to the website. After that, internal hyperlinks without https needed updating along with any forms so that a padlock sign could be shown for all pages.

With the main work completed, it was time to sort out a lingering niggle regarding the appearance of an FTP login page every time a WordPress installation or update was requested. The main solution was to make the web server account the owner of the files and directories, but the following line was added to wp-config.php as part of the fix even if it probably is not necessary:

define('FS_METHOD', 'direct');
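
The ownership change itself is a one-liner, with the path below being only an example; it should point at whatever the virtual host actually serves:

sudo chown -R www-data:www-data /var/www/example.com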

There also was the non-operation of WP Cron and that was addressed using WP-CLI and a script from Bjorn Johansen. To make double sure of its effectiveness, the following was added to wp-config.php to turn off the usual WP-Cron behaviour:

define('DISABLE_WP_CRON', true);
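
Whether done with his script or with WP-CLI directly, the general idea is to let the system scheduler trigger any due events instead. A minimal crontab entry along the lines below does that, with the path and interval again being placeholders:

*/15 * * * * cd /var/www/example.com && wp cron event run --due-now >/dev/null 2>&1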

Intriguingly, WP-CLI offers a long list of possible commands that are worth investigating. A few have been examined, but more await attention.

Before those, I still need to get my new VPS to send emails. So far, sendmail has been installed, the hostname changed from localhost and the server restarted. More investigation is needed, but what I have now is faster than what was there before, so the effort has been rewarded already.

Relocating the Apache web server document root directory in Fedora 12

9th April 2010

So as not to deface anything that is available online on the web, I have a tendency to set up an offline Apache server on a home PC to do any tinkering away from the eyes of the unsuspecting public. Though Ubuntu is my mainstay for home computing, I do have a PC with Fedora installed, and I had been trying, unsuccessfully, for a few months to get an Apache instance on it to start automatically. While I can start it by running the following command as root, I would rather not have more manual steps than are necessary.

httpd -k start

The command used by the system at start-up is different and, even when run manually as root, it failed with messages saying that it could not find the directory where the web server files are stored. Here it is:

service httpd start

The default document root location on any Linux distribution that I have seen is /var/www, and all is very well with this, but it is not a safe place to leave things if a re-installation is ever needed. Having once had to wipe /var, even though it sat on a separate disk or partition, for the sake of one installation, it does not look so persistent to me. In contrast, you can safeguard /home by having it on another disk or in a dedicated partition, which means that it can be retained even when you change the distro that you are using. Thus, I have got into the habit of keeping the web server document root in my home area, and that is where I had been seeing the problem.

Because of the access message, I tried using chmod and chgrp, but to no avail. The remedy has to do with reassigning the security contexts used by SELinux. In Fedora, Apache will not work with the context user_home_t that is usually associated with home directories, but needs httpd_sys_content_t instead. To find out what contexts are associated with particular folders, issue the following command:

ls -Z

The final solution was to create a user account whose home directory hosts the root of the web server file system, called www in my case. Then, I executed the following command as root to get things going:

chcon -R -h -t httpd_sys_content_t /home/www

It appears that even the root of the home directory has to have an appropriate security context (/home has home_root_t, so that might do the needful too). Without that, nothing will work, even if all is well at the next level down. The switches for the chcon command translate as follows:

-R : recursive; applies changes to all files and folders within a directory.

-h : changes apply only to symbolic links and not to where they refer in the file system.

-t : alters context type.
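
One caveat worth adding is that contexts set with chcon can be undone if the file system ever gets relabelled. The longer-lasting route is to record a rule with semanage and then apply it with restorecon, something like the following when run as root (the semanage tool comes from the policycoreutils packages):

semanage fcontext -a -t httpd_sys_content_t "/home/www(/.*)?"
restorecon -R -v /home/www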

It took a while for all of this stuff about SELinux security contexts to percolate through to the point where I was able to solve the problem. A spot of further inspiration was needed too and even guided my search for the information that I needed. It is well worth trying Linux Home Networking if you need further details. Though there are references to an earlier release of Fedora, the content still applies to later versions, up to the current release, if my experience is typical.
