Updating fail2ban filters for WordPress
18th April 2024

Not so long ago, WordPress warned me that some of its Fail2ban filters were obsolete. I have the corresponding WP-fail2ban plugin installed, and the Fail2ban software itself is present on the underlying Ubuntu Server system. The solution was to connect to the server by SSH and execute the following commands:
wget https://plugins.svn.wordpress.org/wp-fail2ban/trunk/filters.d/wordpress-hard.conf
wget https://plugins.svn.wordpress.org/wp-fail2ban/trunk/filters.d/wordpress-soft.conf
wget https://plugins.svn.wordpress.org/wp-fail2ban/trunk/filters.d/wordpress-extra.conf
sudo mv wordpress-*.conf /etc/fail2ban/filter.d/
The first three commands download the updated configuration files, and the last moves them to their final location. It is tempting to download the files directly to that final location, only for wget to create new copies alongside the old files instead of overwriting them as required.
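If writing straight to the destination is preferred after all, the -O option tells wget to write to a named file and overwrite anything already there, so a sketch of that approach (run with sudo because the target directory is protected) would be:

sudo wget -O /etc/fail2ban/filter.d/wordpress-hard.conf https://plugins.svn.wordpress.org/wp-fail2ban/trunk/filters.d/wordpress-hard.conf

The same form applies to the soft and extra filter files.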
Moves to Hugo
30th November 2022

What amazes me is how things can become more complicated over time. As long as you knew HTML, CSS and JavaScript, building a website was not so onerous, provided that web browsers played ball with what you created. Since then, things have become easier to use yet more complex at the same time. One example is WordPress: in the early days, themes were much simpler than they are now. The web also has become more insecure over time, and that adds to the complexity as well. It sometimes feels as if there is a choice to be made between ease of use and simplicity.
It is against that background that I reassessed the technology that I was using on my public transport and Irish history websites. The former used WordPress, while the latter used Drupal. The irony was that the simpler website was using the more complex platform, so the act of going simpler probably was not before time. Alternatives to WordPress were being surveyed for the first of the pair, but none had quite the flexibility, pervasiveness and ease of use that WordPress offers.
There is another approach that has been gaining notice recently. One part of this is the use of Markdown for web publishing. This is a simple and distraction-free plain text format that can be transformed into something more readable. It sees usage in blogs hosted on GitHub, but also facilitates the generation of static websites. The clutter is absent for those who have no need of the Gutenberg Editor on WordPress.
With the content written in Markdown, it can be fed to a static website generator like Hugo. Using defined templates and fixed assets like CSS together with images and other static files, it can slot the content into HTML files very speedily since it is written in the Go programming language. Once you get acclimatised, there are no folder structures that cannot be used, so you get full flexibility in how you build out your website. Sitemaps and RSS feeds can be built at the same time, both using the same input as the HTML files.
In a nutshell, it automates what once needed manual effort with a code editor or a visual web page editor. The use of HTML snippets and layouts means that there is no necessity for hand-coding content, like there was at the start of the web. It also helps that Bootstrap can be built in using Node, so that gives a basis for any styling. Then, SCSS can take care of the rest, giving even more automation.
Given that there is no database involved in any of this, the required information has to be stored somewhere, and neither the Markdown content nor the layout files contain all that is needed. The main site configuration is defined in a single TOML file, and you can have one of these for each publishing destination; I have development and production servers, which makes this a very handy feature. Otherwise, every Markdown file needs a YAML header where titles, template references, publishing status and other similar information gets defined. The layouts then are linked to their components, and control logic and other advanced functionality can be added too.
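To give a flavour of what these look like, here is an illustrative YAML header followed by a fragment of a TOML site configuration; every name and value is a placeholder rather than anything from my own setup:

---
title: "A Sample Page"
date: 2022-11-30
draft: false
layout: "single"
---

baseURL = "https://www.example.com/"
title = "Example Site"
theme = "example-theme"

The YAML block sits at the top of each Markdown file, while the TOML lines would live in something like config.toml, with one such file per publishing destination if needed.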
Because static files are being created, it does mean that site searching, commenting or contact pages cannot work like they would on a dynamic web platform. Often, external services are plugged in using JavaScript. One that I use for contact forms is Getform.io. Zapier also has had its uses, taking the RSS feed and tweeting site updates on Twitter when new content gets added. Though I made different choices, Disqus can be used for comments and Algolia for site searching. Generally, though, you can find yourself needing to pay, particularly if you need to remove advertising or gain advanced features.
Some comments service providers offer open source self-hosted options, but I found these difficult to set up and ended up not offering commenting at all. That was after I tried out Cactus Comments only to find that it was not discriminating between pages, so it showed the same comments everywhere. There are numerous alternatives like Remark42, Hyvor Talk, Commento, FastComments, Utterances, Isso, Mouthful, Muut and HyperComments but trying them all out was too time-consuming for what commenting was worth to me. It also explains why some static websites even send readers to Twitter if they have something to say, though I have not followed this way of working.
For searching, I added a JavaScript/JSON self-hosted component to the transport website, and it works well. However, it adds to the size of what a browser needs to download. That is not a major issue for desktop browsers, but it has a sizeable effect on mobile ones. Testing with PageSpeed and Lighthouse highlighted this, even if I left things as they were. The solution works well in any case.
One thing that I have yet to work out is how to edit or add content while away from home. Editing files using an SSH connection is as much a possibility as setting up a Hugo publishing setup on a laptop. After that, there is the question of using a tablet or phone, since content management systems make everything web based. These are points that I have yet to explore.
As is natural with a code-based solution, there is a learning curve with Hugo. Reading a book provided some orientation, and looking on the web resolved many conundrums. There is good documentation on the project website, while forum discussions turn up on many a web search. Following any research, there was next to nothing that could not be done in some way.
Migration of content takes some forethought and took quite a bit of time, though there was an opportunity to carry out some housekeeping as well. The history website was small, so copying and pasting sufficed. For the transport website, I used Python to convert what was in the database into Markdown files before refining the result. That provided some automation, but left a lot of work to be done afterwards.
The results were satisfactory, and I like the associated simplicity and efficiency. That Hugo works so fast means that it can handle large websites, so it is scalable. The new Markdown method for content production has not been problematic so far, apart from the need to make it more portable, and it helps that I found a setup that works for me. This also avoids any potential dealbreakers that continued development of publishing platforms like WordPress or Drupal could bring. For the former, I hope to remain with the Classic Editor indefinitely, but now have another option in case things go too far.
Building a sitemap in XML
24th November 2022

While there are many tools that will build XML sitemaps, there is some satisfaction to be had in creating your own. This is in spite of there being a multitude of search engine optimisation plugins for content management systems like WordPress, or what is built into static site generators like Hugo. Sometimes, building your own allows for added simplicity, and that is shared with recent efforts in WordPress theme development.
The sitemap XML protocol is simple enough to offer a short coding project. The basis was what Hugo generates, and I used Python to create the XML files. The only libraries that I needed were configparser, SQLAlchemy and pandas: the first two of these allowed databases to be queried, while the last on the list was used for data processing. Otherwise, it was a case of using what is built into the Python language, like file writing and looping.
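As a rough sketch of the approach rather than a copy of my actual scripts, with connection details, table and column names all being placeholders, the flow goes something like this:

import configparser
import pandas as pd
from sqlalchemy import create_engine

# Read database credentials from an INI-style file (placeholder path and keys)
config = configparser.ConfigParser()
config.read("site.ini")
db = config["database"]

# Query the pages to be listed (placeholder query; assumes the PyMySQL driver is installed)
engine = create_engine(f"mysql+pymysql://{db['user']}:{db['password']}@{db['host']}/{db['name']}")
pages = pd.read_sql("SELECT url, last_modified FROM pages", engine)

# Loop over the results and write the sitemap XML by hand
with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
    for row in pages.itertuples():
        f.write(f"  <url><loc>{row.url}</loc><lastmod>{row.last_modified}</lastmod></url>\n")
    f.write("</urlset>\n")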
Once the scripts were ready, they could be uploaded to web servers and executed by scheduled cron jobs to keep things up to date. Along the way, I also uncovered a way to publicise the locations of the sitemap files to search engine bots using robots.txt. The structure of the instruction is the following:
User-agent: *
Sitemap: sitemap.xml
This means that it announces to all bots the location of the sitemap file. In my case, I always included the full URL for the XML file and that clearly varies by website location.
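With a placeholder domain standing in for the real one, the line then takes a form like this:

Sitemap: https://www.example.com/sitemap.xml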
Redirecting a WordPress site to its home page when its loop finds no posts
5th November 2022

Since I created a bespoke theme for this site, I have been tweaking things as I go. The basis came from the WordPress Theme Developer Handbook, which gave me a simpler starting point, shorn of all sorts of complexity that is encountered with other themes. Naturally, this means that there are little rough edges that need tidying over time.
One of these is dealing with errors on the site, such as when content is not found. This could be a wrong address or a search query that finds no matching posts. When that happens, there is a redirection to the home page using some simple JavaScript within the loop fallback code, enclosed within script start and end tags (including the whole code here would trigger the action from this post, so it cannot be shown):
location.href="[blog home page ]";
The bloginfo function can be used with the url keyword to find the home page, so this does not get hard-coded. For now, this works so long as JavaScript is enabled, but a more robust approach may come in time. It is not possible to do a PHP redirect because of the nature of HTTP: once headers have been sent, it is not possible to do server-side redirects. At that stage, things become client-side, so using JavaScript is one way to go instead.
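Purely as a sketch of the shape such a fallback can take, under an assumed template structure rather than the exact code from this site, the loop fallback might look like this:

<?php if ( have_posts() ) : ?>
    <?php /* the usual loop output goes here */ ?>
<?php else : ?>
    <script>location.href = "<?php bloginfo( 'url' ); ?>";</script>
<?php endif; ?>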
Adding a new domain or subdomain to an SSL certificate using Certbot
11th June 2019

On checking the Site Health page of a WordPress blog, I saw errors that pointed to a problem with its SSL setup. The www subdomain was not included in the site’s certificate and was causing PHP errors as a result, though they had no major effect on what visitors saw. Still, it was best to get rid of them, so I needed to update the certificate. Execution of a command like the following did the job:
sudo certbot --expand -d example.com,www.example.com
Using a Let’s Encrypt certificate meant that I could use the certbot command, since that already was installed on the server. The --expand and -d switches ensured that the listed domains were added to the certificate to sort out the observed problem. In the above, a dummy domain name is used, but this was replaced by the real one to produce the desired effect.
Running cron jobs using the www-data system account
22nd December 2018

When you set up your own web server or use a private server (virtual or physical), you will find that web servers run using the www-data account. That means that website files need to be accessible to that system account, if not owned by it. The latter is mandatory if you want WordPress to be able to update itself without needing FTP details.
It also means that you probably need scheduled jobs to be executed using the privileges possessed by the www-data account. For instance, I use WP-CLI to automate spam removal and updates to plugins, themes and WordPress itself. Spam removal can be done without the www-data account, but the updates need file access and cannot be completed without it. Therefore, I got interested in setting up cron jobs to run under that account, and the following command helps to address this:
sudo -u www-data crontab -e
For that to work, your own account needs to be listed in /etc/sudoers or be assigned to the sudo group in /etc/group. If either of those applies, then entering your own password will open the cron file for www-data, and it can be edited as for any other account. Saving and closing the session will update cron with the new job details.
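As a sketch with illustrative times and paths rather than my real schedule, entries in the www-data crontab for WP-CLI housekeeping could look like this:

# Illustrative times and paths; adjust for the actual installation
30 2 * * * /usr/local/bin/wp core update --path=/var/www/example.com
40 2 * * * /usr/local/bin/wp plugin update --all --path=/var/www/example.com
50 2 * * * /usr/local/bin/wp theme update --all --path=/var/www/example.com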
In fact, the same approach can be taken for a variety of commands where files can only be accessed using www-data. This includes copying, moving and deleting files, as well as executing WP-CLI commands. The latter issues a striking message if you run a command using the root account, a pervasive temptation given what it allows. Any alternative to that has to be better from a security standpoint.
Moving a website from shared hosting to a virtual private server
24th November 2018

This year has seen some optimisation being applied to my web presences, guided by the results of GTmetrix scans. It was then that I realised how slow things were, so server loads were reduced. Anything that slowed response times, such as WordPress plugins, got removed. Usage of Matomo also was curtailed in favour of Google Analytics, while HTML, CSS and JS minification followed. What had yet to happen was a search for a faster server. Now, another website has been moved onto a virtual private server (VPS) to see how that would go.
Speed was not the only consideration since security was a factor too. After all, a VPS is more locked away from other users than a folder on a shared server. There also is the added sense of control, so Let’s Encrypt SSL certificates can be added using the Electronic Frontier Foundation’s Certbot. That avoids the expense of using an SSL certificate provided through my shared hosting provider and a successful transition for my travel website may mean that this one undergoes the same move.
For the VPS, I chose Ubuntu 18.04 as its operating system, and it came with the LAMP stack already in place. Having offline development websites, the mix of Apache, MySQL and PHP is more familiar to me than anything using Nginx or Python. It also means that .htaccess files become more useful than they were on my previous Nginx-based platform. Having full access to the operating system using SSH helps too and should mean that I have fewer calls on technical support, since I can do more for myself. Any extra tinkering should not affect others either, since this type of setup is well known to me, and having an offline counterpart means that anything riskier is tried there beforehand.
Naturally, there were niggles to overcome with the move. The first to fix was to make the MySQL instance accept calls from outside the server so that I could migrate data there from elsewhere, and I even got my shared hosting setup to start using the new database to see what performance boost it might give. To make all this happen, I first found the location of the relevant my.cnf configuration file using the following command:
find / -name my.cnf
Once I had the right file, I commented out the following line that it contained and restarted the database service afterwards using the second command below, which stopped the appearance of any error 111 messages:
bind-address = 127.0.0.1
service mysql restart
After that, things worked as required and I moved onto another matter: uploading the requisite files. That meant installing an FTP server, so I chose proftpd since I knew that well from previous tinkering. Once that was in place, file transfer commenced.
When that was done, I could do some testing to see if I had an active web server that loaded the website. Along the way, I also enabled some Apache modules like mod_rewrite using the a2enmod command, restarting Apache each time I enabled another module.
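For example, enabling the rewrite module and restarting the web server went along these lines:

sudo a2enmod rewrite
sudo service apache2 restart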
Then, I discovered that Textpattern needed php7.2-xml installed, so the following command was executed to do this:
apt install php7.2-xml
Then, the following line was uncommented in the correct php.ini configuration file that I found using the same method as that described already for the my.cnf configuration, and that was followed by yet another Apache restart:
extension=php_xmlrpc.dll
Addressing the above issues yielded enough success for me to change the IP address in my Cloudflare dashboard so it pointed at the VPS and not the shared server. The changeover happened seamlessly without having to await DNS updates as once would have been the case. It had the added advantage of making both WordPress and Textpattern work fully.
With everything working to my satisfaction, I then followed the instructions on Certbot to set up my new Let’s Encrypt SSL certificate. Aside from a tweak to a configuration file and another Apache restart, the process was more automated than I had expected, so I was ready to embark on some fine-tuning to embed the new security arrangements. That meant updating .htaccess files, and Textpattern has its own, so the following addition was needed there:
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
This complemented what was already in the main .htaccess file, and WordPress allows you to include http(s) in the address it uses, so that was another task completed. The general .htaccess file only needed the following lines to be added:
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.assortedexplorations.com/$1 [R,L]
What all these achieve is to redirect insecure connections to secure ones for every visitor to the website. After that, internal hyperlinks without https needed updating along with any forms so that a padlock sign could be shown for all pages.
With the main work completed, it was time to sort out a lingering niggle regarding the appearance of an FTP login page every time a WordPress installation or update was requested. The main solution was to make the web server account the owner of the files and directories, but the following line was added to wp-config.php as part of the fix even if it probably is not necessary:
define('FS_METHOD', 'direct');
There also was the non-operation of WP-Cron, and that was addressed using WP-CLI and a script from Bjorn Johansen. To make doubly sure of its effectiveness, the following was added to wp-config.php to turn off the usual WP-Cron behaviour:
define('DISABLE_WP_CRON', true);
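The script itself is Bjorn Johansen’s, but for illustration, a cron entry that runs any due WP-Cron events directly with WP-CLI would look something like this, with the schedule and paths being placeholders:

*/10 * * * * /usr/local/bin/wp cron event run --due-now --path=/var/www/example.com --quiet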
Intriguingly, WP-CLI offers a long list of possible commands that are worth investigating. A few have been examined, but more await attention.
Before those, I still need to get my new VPS to send emails. So far, sendmail has been installed, the hostname changed from localhost and the server restarted. More investigations are needed, but what I have now is faster than what was there before, so the effort has been rewarded already.
Overriding replacement of double or triple hyphenation in WordPress
7th June 2016

On here, I have posts with example commands that include double hyphens, and they have been displayed merged together, something that resulted in a comment posted by a visitor to this part of the web. All the while, I had been blaming the fonts that I was using, only for it to be the fault of WordPress itself.
Changing multiple dashes to something else has long been a feature of Word’s autocorrect, but I never expected to see WordPress aping that behaviour, and it has been doing so for a few years now. The culprit is wptexturize, and that cannot simply be disabled, for it does many other useful things.
What happens is that the wptexturize filter changes '--' (double hyphens) to '–' (&#8211; in web entity encoding) and '---' (triple hyphens) to '—' (&#8212; in web entity encoding). The solution is to add another filter to the content that changes these back to the way they were, and the following code does this:
add_filter( 'the_content', 'mh_un_en_dash', 50 );

function mh_un_en_dash( $content ) {
    // Swap the dash entities produced by wptexturize back to double and triple hyphens
    $content = str_replace( '&#8211;', '--', $content );
    $content = str_replace( '&#8212;', '---', $content );
    return $content;
}
The first line of the segment adds the new filter that uses the function defined below it. The str_replace lines do the required substitution before the function returns the post content for display in the web page. The whole code block can be used to create a plugin or be placed in the theme’s functions.php file. Either way, things appear without the substitution confusing your readers. It makes me wonder if a bug report has been created for this, because the behaviour looks odd to me.
Turning off Advanced Content Filtering in CKEditor
3rd February 2015

On one of my websites, I use Textpattern with CKEditor for editing of articles on there. This was working well until I upgraded CKEditor to a version numbered 4.1 or newer, because it started to change the HTML in my articles when I did not want it to do so, especially when it broke the appearance of things. A search on Google revealed an unhelpful forum exchange that produced no solution to the issue, so I decided to share one on here once I found it.
What I needed to do was switch off what is known as Advanced Content Filtering. It can be tuned, but I felt that would take too much time, so I implemented something like what you see below in the config.js file within the ckeditor folder:
CKEDITOR.editorConfig = function( config ) {
config.allowedContent = true;
};
All settings go within the outer function wrapper, and setting the config.allowedContent property to true in there sorted my problem as I wanted. Now, any HTML remains untouched and I am happy with the outcome. It might be better for features like Advanced Content Filtering to be switched off by default and turned on by those with the time and need for it, much like one of the principles adopted by the WordPress project. Still, having any off switch is better than none at all.
Turning off the full height editor option in WordPress 4.0
10th September 2014

Though I keep a little eye on WordPress development, it is nowhere near as rigorous as when I submitted a patch that got me a mention on the contributor list of a main WordPress release. That may explain how the full-height editor setting, which is turned on by default, passed me by without my taking much notice of it.
WordPress has become so mature now that I almost do not expect major revisions like the overhaul received by the administration back-end in 2008. That second interface was got so right that it is still with us, even though there were concerns in my mind at the time as to how usable it would be. Sometimes, those initial suspicions can come to nothing.
However, WordPress 4.0 brought a major change to the editor, and I unfortunately am not sure that it is successful. A full-height editor sounds a good idea in principle, but I found some rough edges to its present implementation that leave me wondering whether any UX person got to review it. The first reason is that scrolling becomes odd, with the editor’s toolbar becoming fixed when you scroll down far enough on an editing screen. The sidebar scrolling then is out of sync with the editor box, which creates a very odd sensation. Having keyboard shortcuts like CTRL+HOME and CTRL+END not working as they should only convinced me that the new arrangement was not for me and I wanted to turn it off.
A search with Google turned up nothing of note, so I took to the WordPress.org forum to see if I could get any joy. That revealed that I should have thought of looking in the screen options dropdown box for an option called “Expand the editor to match the window height” so that I could clear its tickbox. Because the Visual Editor control appears on the user profile screen, I had looked there and found nothing, so the logic of how things are set up is sub-optimal. Maybe the latter option needs to become a screen option now too. Thankfully, the window height editor option only needs setting once for both posts and pages, so you are covered for all eventualities at once.
With a distraction-free editing option already available, I am not sure why someone went for the full-height editor too. If WordPress wants to stick with this, it does need more refinement so that it behaves more conventionally. Personally, I would not build a website with that kind of ill-synchronised scrolling effect, so it is something that needs work, as does the location of the Visual Editor setting. It could be that both settings need to sit at the user level, rather than one being above that level while the other is at it. Until I got the actual solution, I was faced with using distraction-free mode all the time and also installed the WP Editor plugin. That remains in place due to its code highlighting, even if dropping into code view always triggers the need to create a new revision. Despite that, all is better in the end.