Technology Tales

Adventures & experiences in contemporary technology

Generating custom log messages in R

24th April 2023

While I have been exploring the use of R on a private basis during the last few years, a recent opportunity allowed me to use this exposure at work. This took the form of creating a utility script for use by others. To keep things lightweight, I did not go down the packaging route, but that may come later, possibly for something else.

However, anything used by others needs input checking and comprehensible feedback should anything go wrong. For me, that meant looking at the message, warning and stop functions. The last of these aborts script execution when there is a critical error while the other two do not do that. The message function is for informative user input while the warning function suggests things that may need their attention.

Each function takes string input and sends this to the terminal or log. They also can combine different pieces of text in the style of the paste0 function and can take the text output of other functions as input. Used in combination with conditional logic or error handling, they can help a user track down what went wrong without their needing to ask a script developer. Anything that helps anyone else to help themselves has to be good.

Moves to Hugo

30th November 2022

What amazes me is how things can become more complicated over time. As long as you knew HTML, CSS and JavaScript, building a website was not as onerous as long as web browsers played ball with it. Since then, things have got easier to use but more complex at the same time. One example is WordPress: in the early days, themes were much simpler than they are now. The web also has got more insecure over time, and that adds to complexity as well. It sometimes feels as if there is a choice to make between ease of use and simplicity.

It is against that background that I reassessed the technology that I was using on my public transport and Irish history websites. The former used WordPress, while the latter used Drupal. The irony was that the simpler website was using the more complex platform, so the act of going simpler probably was not before time. Alternatives to WordPress were being surveyed for the first of the pair, but none had quite the flexibility, pervasiveness and ease of use that WordPress offers.

There is another approach that has been gaining notice recently. One part of this is the use of Markdown for web publishing. This is a simple and distraction-free plain text format that can be transformed into something more readable. It sees usage in blogs hosted on GitHub, but also facilitates the generation of static websites. The clutter is absent for those who have no need of the Gutenberg Editor on WordPress.

With the content written in Markdown, it can be fed to a static website generator like Hugo. Using defined templates and fixed assets like CSS together with images and other static files, it can slot the content into HTML files very speedily since it is written in the Go programming language. Once you get acclimatised, there are no folder structures that cannot be used, so you get full flexibility in how you build out your website. Sitemaps and RSS feeds can be built at the same time, both using the same input as the HTML files.

In a nutshell, it automates what once needed manual effort used a code editor or a visual web page editor. The use of HTML snippets and layouts means that there is no necessity for hand-coding content, like there was at the start of the web. It also helps that Bootstrap can be built in using Node, so that gives a basis for any styling. Then, SCSS can take care of things, giving even more automation.

Given that there is no database involved in any of this, the required information has to be stored somewhere, and neither the Markdown content nor the layout files contain all that is needed. The main site configuration is defined in a single TOML file, and you can have a single one of these for every publishing destination; I have development and production servers, which makes this a very handy feature. Otherwise, every Markdown file needs a YAML header where titles, template references, publishing status and other similar information gets defined. The layouts then are linked to their components, and control logic and other advanced functionality can be added too.

Because static files are being created, it does mean that site searching and commenting, or contact pages cannot work like they would on a dynamic web platform. Often, external services are plugged in using JavaScript. One that I use for contact forms is Getform.io. Then, Zapier has had its uses in using the RSS feed to tweet site updates on Twitter when new content gets added. Though I made different choices, Disqus can be used for comments and Algolia for site searching. Generally, though, you can find yourself needing to pay, particularly if you need to remove advertising or gain advanced features.

Some comments service providers offer open source self-hosted options, but I found these difficult to set up and ended up not offering commenting at all. That was after I tried out Cactus Comments only to find that it was not discriminating between pages, so it showed the same comments everywhere. There are numerous alternatives like Remark42, Hyvor Talk, Commento, FastComments, Utterances, Isso, Mouthful, Muut and HyperComments but trying them all out was too time-consuming for what commenting was worth to me. It also explains why some static websites even send readers to Twitter if they have something to say, though I have not followed this way of working.

For searching, I added a JavaScript/JSON self-hosted component to the transport website, and it works well. However, it adds to the size of what a browser needs to download. That is not a major issue for desktop browsers, but the situation with mobile browsers is such that it has a sizeable effect. Testing with PageSpeed and Lighthouse highlighted this, even if I left things as they are. The solution works well in any case.

One thing that I have yet to work out is how to edit or add content while away from home. Editing files using an SSH connection is as much a possibility as setting up a Hugo publishing setup on a laptop. After that, there is the question of using a tablet or phone, since content management systems make everything web based. These are points that I have yet to explore.

As is natural with a code-based solution, there is a learning curve with Hugo. Reading a book provided some orientation, and looking on the web resolved many conundrums. There is good documentation on the project website, while forum discussions turn up on many a web search. Following any research, there was next to nothing that could not be done in some way.

Migration of content takes some forethought and took quite a bit of time, though there was an opportunity to carry some housekeeping as well. The history website was small, so copying and pasting sufficed. For the transport website, I used Python to convert what was on the database into Markdown files before refining the result. That provided some automation, but left a lot of work to be done afterwards.

The results were satisfactory, and I like the associated simplicity and efficiency. That Hugo works so fast means that it can handle large websites, so it is scalable. The new Markdown method for content production is not problematical so far apart from the need to make it more portable, and it helps that I found a setup that works for me. This also avoids any potential dealbreakers that continued development of publishing platforms like WordPress or Drupal could bring. For the former, I hope to remain with the Classic Editor indefinitely, but now have another option in case things go too far.

Building a sitemap in XML

24th November 2022

While there are many tools that will build XML site maps, there is some satisfaction to be had in creating your own. This is in spite of there being a multitude of search engine optimisation plugins for content management systems like WordPress or what is built into static site generators like Hugo. Sometimes, building your own allows for added simplicity and that is shared with recent efforts in WordPress theme development.

The sitemap XML protocol is simple enough to offer a short coding project. The basis was what Hugo generates and I used Python to create the XML files. The only libraries that I needed were configparser, SQLAlchemy and pandas. the first two of these allowed databases to be queried and the last on the list was used for data processing. Otherwise, it was a case of using what is built into the Python language like file writing and looping.

Once the scripts were ready, they could be uploaded to web servers and executed by scheduled jobs using CRON to keep things up to date. along the way, I also uncovered a way to publicise the locations of the sitemap files to search engine bots using robots.txt.  The structure of the instruction is the following:

User-agent: *
Sitemap: sitemap.xml

This means that it announces to all bots the location of the sitemap file. In my case, I always included the full URL for the XML file and that clearly varies by website location.

Self hosting Google fonts

14th November 2022

One thing that I have been doing across my websites is to move away from using Google web fonts using the advice on the Switching.Software website. While I have looked at free web font directories like 1001 Free Fonts or DaFont, they do not have the full range of bolding and character sets that I desire so I opted instead for the Google Webfonts Helper website. That not only offered copies of what Google has but also created a portion of CSS that I could add to a stylesheet on a website, making things more streamlined. At the same time, I also took the opportunity to change some of the fonts that were being used for sake of added variety. Open Sans is good but there are other acceptable sans-serif options like Mulish or Nunita as well, so these got used.

Some online writing tools

15th October 2021

Every week, I get an email newsletter from Woody’s Office Watch. This was something to which I started subscribing in the 1990’s but I took a break from it for a good while for reasons that I cannot recall and returned to it only in recent years. This week’s issue featured a list of online paraphrasing tools that are part of what is offered by Quillbot, Paraphraser, Dupli Checker and Pre Post Seo. Each got their own reviews in the newsletter so I will just outline other features in this posting.

In Quillbot’s case, the toolkit includes a grammar checker, summary generator, and citation generator. In addition to the online offering, there are extensions for Microsoft Word, Google Chrome, and Google Docs. In addition to the free version, a paid subscription option is available.

In spite of the name, Paraphraser is about more than what the title purports to do. There is article rewriting, plagiarism checking, grammar checking and text summarisation. Because there is no premium version, the offering is funded by advertising and it will not work with an ad blocker enabled. The mention of plagiarism suggests a perhaps murkier side to writing that cuts both ways: one is to avoid copying other work while another is the avoidance of groundless accusations of copying.

It was appear that the main role of Dupli Checker is to avoid accusations of plagiarism by checking what you write yet there is a grammar checker as well as a paraphrasing tool on there too. When I tried it, the English that it produced looked a little convoluted and there is a lack of fluency in what is written on its website as well. Together with a free offering that is supported by ads that were not blocked by my ad blocker, there are premium subscriptions too.

In web publishing, they say that content is king so the appearance of an option using the acronym for Search Engine Optimisation in it name may not be as strange as it might as first glance. There are numerous tools here with both free and paid tiers of service. While paraphrasing and plagiarism checking get top billing in the main menu on the home page, further inspection reveals that there is a lot more to check on this site.

In writing, inspiration is a fleeting and ephemeral quantity so anything that helps with this has to be of interest. While any rewriting of initial content may appear less smooth than the starting point, any help with the creation process cannot go amiss. For that reason alone, I might be tempted to try these tools from time to time and they might assist with proof reading as well because that can be a hit and miss affair for some.

 

Using .htaccess to control hotlinking

10th October 2020

There are times when blogs cease to exist and the only place to find the content is on the Wayback Machine. Even then, it is in danger of being lost completely. One such example is the subject of this post.

Though this website makes use of the facilities of Cloudflare for various functions that include the blocking of image hotlinking, the same outcome can be achieved using .htaccess files on Apache web servers. It may work on Nginx to a point too but there are other configuration files that ought to be updated instead of using a .htaccess when some frown upon the approach. In any case, the lines that need adding to .htaccess are listed below though the web address needs to include your own domain in place of the dummy example provided:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain.com(/)?.*$ [NC]
RewriteRule .*\.(gif|jpe?g|png|bmp)$ [F,NC]

The first line turns on the mod_rewrite engine and you may have that done anyway. Of course, the module needs enabling in your Apache configuration for this to work and you have to be allowed to perform the required action as well. This means changing the Apache configuration files. The next pair of lines look at the HTTP referer strings and the third one only allows images to be served from your own web domain and not others. To add more, you need to copy the third line and change the web address accordingly. Any new lines need to precede the last line that defines the file extensions that are to be blocked to other web addresses.

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain.com(/)?.*$ [NC]
RewriteRule \.(gif|jpe?g|png|bmp)$ /images/image.gif [L,NC]

Another variant of the previous code involves changing the last line to display a default image showing others what is happening. That may not reduce the bandwidth usage as much as complete blocking but it may be useful for telling others what is happening.

Turning off push notifications in Firefox 46

7th May 2016

A new feature came with Firefox 44 that only recently started to come to my notice with Yahoo Mail offering to set up browser notifications for every time when a new email arrives there. This is something that I did not need and yet I did not get the option to switch it off permanently for that website so I was being nagged every time I when to check on things for that email address, an unneeded irritation. Other websites offered to set up similar push notifications but you could switch these off permanently so it is a site by site function unless you take another approach.

That is to open a new browser tab and enter about:config in the address bar before hitting the return key. If you have not done this before, a warning message will appear but you can dismiss this permanently.  Once beyond that, you are presented with a searchable list of options and the ones that you need are dom.webnotifications.enabled and dom.webnotifications.serviceworker.enabled. By default, the corresponding entries in the Value column will be true. Double-clicking on each one will set it to false and you should not see any more offers of push notifications that allow you get alerts from web services like Yahoo Mail so your web browsing should suffer less of these intrusions.

Changing to web fonts

12th February 2012

While you can add Windows fonts to Linux installations, I have found that their display can be flaky to say the least. Linux Mint and Ubuntu display them as sharp as I’d like but I have struggled to get the same sort of results from Arch Linux while I am not so sure about Fedora or openSUSE either.

That has caused me to look at web fonts for my websites with Google Web Fonts doing what I need with both Open Sans and Arimo doing what I need so far. There have been others with which I have dallied, such as Droid Sans, but these are the ones on which I have settled for now. Both are in use on this website now and I added calls for them to the web page headers using the following code (lines are wrapping due to space constraints):

<link href="http://fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,400,300,600,700" rel="stylesheet" type="text/css">
<link href='http://fonts.googleapis.com/css?family=Arimo:400,400italic,700,700italic' rel='stylesheet' type='text/css'>

With those lines in place, it then is a matter of updating font-family and font declarations in CSS style sheets with “Open Sans” or “Arimo” as needed while keeping alternatives defined in case the Google font service goes down for whatever reason. A look at a development release of the WordPress Twenty Twelve theme caused me to come across Open Sans and I like it for its clean lines and Arimo, which was found by looking through the growing Google Web Fonts catalogue, is not far behind. Looking through that catalogue now causes for me a round of indecision since there is so much choice. For that reason, I think it better to be open to the recommendations of others.

Another look at Drupal

20th January 2010

Early on in the first year of this blog, I got to investigating the use of Drupal for creating an article-based subsite. In the end, the complexities of its HTML and CSS thwarted my attempts to harmonise the appearance of web pages with other parts of the same site and I discontinued my efforts. In the end, it was Textpattern that suited my needs and I have stuck with that for the aforementioned subsite. However, I recently spotted someone very obviously using Drupal in its out of the box state for a sort of blog (there is even an extension for importing WXR files containing content from a WordPress blog); they even hadn’t removed the Drupal logo. With my interest rekindled, I took another look for the sake of seeing where things have gone in the last few years. Well, first impressions are that it now looks like a blogging tool with greater menu control and the facility to define custom content types. There are plenty of nice themes around too though that highlights an idiosyncrasy in the sense that content editing is not fully integrated into the administration area where I’d expect it to be. The consequence of this situation is that pages, posts (or story as the content type is called) or any content types that you have defined yourself are created and edited with the front page theme controlling the appearance of the user interface. It is made even more striking when you use a different theme for the administration screens. That oddity aside, there is a lot to recommend Drupal though I’d try setting up a standalone site with it rather than attempting to shoehorn it as a part of an existing one like what I was trying when I last looked.

Investigating the real-time web

10th January 2010

Admittedly, I have been keeping away from Twitter and its kind for a while now but the current run of cold weather in the Britain and Ireland has alerted me to its usefulness and I have given the thing a go. With public transport operator website heaving over the last week, the advantages of microblogging became more than apparent, thanks in no small part to the efforts of Centrebus, National Rail Enquiries and the U.K. Met Office. The pithy nature of any messages saves the effort needed to compile a longer blog post and to read it afterwards. This aspect makes it invaluable for those times when all that needs to be communicated is short and sweet. Anything that cuts down on the information tide that hits all of us every day cam only be a good thing.

Along with Twitter, there is a whole suite of tools available for various bits and pieces. First off, there’s integration with WordPress courtesy of plugins like Alex King’s Twitter Tools. After that, there are numerous web applications for taming the beast. Though I only can say that I scratched the surface of what’s available, I have come accross HootSuite and Twitterfeed. The former is a console for managing more than one Twitter account at once while also offering the facility to do the same for Facebook, LinkedIn, WordPress.com and others too. Twitterfeed may be more limited in scope with offering to turn RSS feeds into tweets but it has its place too. HootSuite might have something similar for WordPress but Twitterfeed is a good more universal in its sweep. Naturally, there’s more out there than these two but I am not trying to be exhaustive here. If I make use of any other such services, I even might get inspired to mention them on here.

  • All the views that you find expressed on here in postings and articles are mine alone and not those of any organisation with which I have any association, through work or otherwise. As regards editorial policy, whatever appears here is entirely of my own choice and not that of any other person or organisation.

  • Please note that everything you find here is copyrighted material. The content may be available to read without charge and without advertising but it is not to be reproduced without attribution. As it happens, a number of the images are sourced from stock libraries like iStockPhoto so they certainly are not for abstraction.

  • With regards to any comments left on the site, I expect them to be civil in tone of voice and reserve the right to reject any that are either inappropriate or irrelevant. Comment review is subject to automated processing as well as manual inspection but whatever is said is the sole responsibility of the individual contributor.