Technology Tales

Adventures & experiences in contemporary technology

Building a sitemap in XML

24th November 2022

While there are many tools that will build XML site maps, there is some satisfaction to be had in creating your own. This is in spite of there being a multitude of search engine optimisation plugins for content management systems like WordPress or what is built into static site generators like Hugo. Sometimes, building your own allows for added simplicity and that is shared with recent efforts in WordPress theme development.

The sitemap XML protocol is simple enough to offer a short coding project. The basis was what Hugo generates and I used Python to create the XML files. The only libraries that I needed were configparser, SQLAlchemy and pandas. the first two of these allowed databases to be queried and the last on the list was used for data processing. Otherwise, it was a case of using what is built into the Python language like file writing and looping.

Once the scripts were ready, they could be uploaded to web servers and executed by scheduled jobs using CRON to keep things up to date. along the way, I also uncovered a way to publicise the locations of the sitemap files to search engine bots using robots.txt.  The structure of the instruction is the following:

User-agent: *
Sitemap: sitemap.xml

This means that it announces to all bots the location of the sitemap file. In my case, I always included the full URL for the XML file and that clearly varies by website location.

Self hosting Google fonts

14th November 2022

One thing that I have been doing across my websites is to move away from using Google web fonts using the advice on the Switching.Software website. While I have looked at free web font directories like 1001 Free Fonts or DaFont, they do not have the full range of bolding and character sets that I desire so I opted instead for the Google Webfonts Helper website. That not only offered copies of what Google has but also created a portion of CSS that I could add to a stylesheet on a website, making things more streamlined. At the same time, I also took the opportunity to change some of the fonts that were being used for sake of added variety. Open Sans is good but there are other acceptable sans-serif options like Mulish or Nunita as well, so these got used.

Some online writing tools

15th October 2021

Every week, I get an email newsletter from Woody’s Office Watch. This was something to which I started subscribing in the 1990’s but I took a break from it for a good while for reasons that I cannot recall and returned to it only in recent years. This week’s issue featured a list of online paraphrasing tools that are part of what is offered by Quillbot, Paraphraser, Dupli Checker and Pre Post Seo. Each got their own reviews in the newsletter so I will just outline other features in this posting.

In Quillbot’s case, the toolkit includes a grammar checker, summary generator, and citation generator. In addition to the online offering, there are extensions for Microsoft Word, Google Chrome, and Google Docs. In addition to the free version, a paid subscription option is available.

In spite of the name, Paraphraser is about more than what the title purports to do. There is article rewriting, plagiarism checking, grammar checking and text summarisation. Because there is no premium version, the offering is funded by advertising and it will not work with an ad blocker enabled. The mention of plagiarism suggests a perhaps murkier side to writing that cuts both ways: one is to avoid copying other work while another is the avoidance of groundless accusations of copying.

It was appear that the main role of Dupli Checker is to avoid accusations of plagiarism by checking what you write yet there is a grammar checker as well as a paraphrasing tool on there too. When I tried it, the English that it produced looked a little convoluted and there is a lack of fluency in what is written on its website as well. Together with a free offering that is supported by ads that were not blocked by my ad blocker, there are premium subscriptions too.

In web publishing, they say that content is king so the appearance of an option using the acronym for Search Engine Optimisation in it name may not be as strange as it might as first glance. There are numerous tools here with both free and paid tiers of service. While paraphrasing and plagiarism checking get top billing in the main menu on the home page, further inspection reveals that there is a lot more to check on this site.

In writing, inspiration is a fleeting and ephemeral quantity so anything that helps with this has to be of interest. While any rewriting of initial content may appear less smooth than the starting point, any help with the creation process cannot go amiss. For that reason alone, I might be tempted to try these tools from time to time and they might assist with proof reading as well because that can be a hit and miss affair for some.

 

Using .htaccess to control hotlinking

10th October 2020

There are times when blogs cease to exist and the only place to find the content is on the Wayback Machine. Even then, it is in danger of being lost completely. One such example is the subject of this post.

Though this website makes use of the facilities of Cloudflare for various functions that include the blocking of image hotlinking, the same outcome can be achieved using .htaccess files on Apache web servers. It may work on Nginx to a point too but there are other configuration files that ought to be updated instead of using a .htaccess when some frown upon the approach. In any case, the lines that need adding to .htaccess are listed below though the web address needs to include your own domain in place of the dummy example provided:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain.com(/)?.*$ [NC]
RewriteRule .*\.(gif|jpe?g|png|bmp)$ [F,NC]

The first line turns on the mod_rewrite engine and you may have that done anyway. Of course, the module needs enabling in your Apache configuration for this to work and you have to be allowed to perform the required action as well. This means changing the Apache configuration files. The next pair of lines look at the HTTP referer strings and the third one only allows images to be served from your own web domain and not others. To add more, you need to copy the third line and change the web address accordingly. Any new lines need to precede the last line that defines the file extensions that are to be blocked to other web addresses.

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain.com(/)?.*$ [NC]
RewriteRule \.(gif|jpe?g|png|bmp)$ /images/image.gif [L,NC]

Another variant of the previous code involves changing the last line to display a default image showing others what is happening. That may not reduce the bandwidth usage as much as complete blocking but it may be useful for telling others what is happening.

Turning off push notifications in Firefox 46

7th May 2016

A new feature came with Firefox 44 that only recently started to come to my notice with Yahoo Mail offering to set up browser notifications for every time when a new email arrives there. This is something that I did not need and yet I did not get the option to switch it off permanently for that website so I was being nagged every time I when to check on things for that email address, an unneeded irritation. Other websites offered to set up similar push notifications but you could switch these off permanently so it is a site by site function unless you take another approach.

That is to open a new browser tab and enter about:config in the address bar before hitting the return key. If you have not done this before, a warning message will appear but you can dismiss this permanently.  Once beyond that, you are presented with a searchable list of options and the ones that you need are dom.webnotifications.enabled and dom.webnotifications.serviceworker.enabled. By default, the corresponding entries in the Value column will be true. Double-clicking on each one will set it to false and you should not see any more offers of push notifications that allow you get alerts from web services like Yahoo Mail so your web browsing should suffer less of these intrusions.

Changing to web fonts

12th February 2012

While you can add Windows fonts to Linux installations, I have found that their display can be flaky to say the least. Linux Mint and Ubuntu display them as sharp as I’d like but I have struggled to get the same sort of results from Arch Linux while I am not so sure about Fedora or openSUSE either.

That has caused me to look at web fonts for my websites with Google Web Fonts doing what I need with both Open Sans and Arimo doing what I need so far. There have been others with which I have dallied, such as Droid Sans, but these are the ones on which I have settled for now. Both are in use on this website now and I added calls for them to the web page headers using the following code (lines are wrapping due to space constraints):

<link href="http://fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,400,300,600,700" rel="stylesheet" type="text/css">
<link href='http://fonts.googleapis.com/css?family=Arimo:400,400italic,700,700italic' rel='stylesheet' type='text/css'>

With those lines in place, it then is a matter of updating font-family and font declarations in CSS style sheets with “Open Sans” or “Arimo” as needed while keeping alternatives defined in case the Google font service goes down for whatever reason. A look at a development release of the WordPress Twenty Twelve theme caused me to come across Open Sans and I like it for its clean lines and Arimo, which was found by looking through the growing Google Web Fonts catalogue, is not far behind. Looking through that catalogue now causes for me a round of indecision since there is so much choice. For that reason, I think it better to be open to the recommendations of others.

Another look at Drupal

20th January 2010

Early on in the first year of this blog, I got to investigating the use of Drupal for creating an article-based subsite. In the end, the complexities of its HTML and CSS thwarted my attempts to harmonise the appearance of web pages with other parts of the same site and I discontinued my efforts. In the end, it was Textpattern that suited my needs and I have stuck with that for the aforementioned subsite. However, I recently spotted someone very obviously using Drupal in its out of the box state for a sort of blog (there is even an extension for importing WXR files containing content from a WordPress blog); they even hadn’t removed the Drupal logo. With my interest rekindled, I took another look for the sake of seeing where things have gone in the last few years. Well, first impressions are that it now looks like a blogging tool with greater menu control and the facility to define custom content types. There are plenty of nice themes around too though that highlights an idiosyncrasy in the sense that content editing is not fully integrated into the administration area where I’d expect it to be. The consequence of this situation is that pages, posts (or story as the content type is called) or any content types that you have defined yourself are created and edited with the front page theme controlling the appearance of the user interface. It is made even more striking when you use a different theme for the administration screens. That oddity aside, there is a lot to recommend Drupal though I’d try setting up a standalone site with it rather than attempting to shoehorn it as a part of an existing one like what I was trying when I last looked.

Investigating the real-time web

10th January 2010

Admittedly, I have been keeping away from Twitter and its kind for a while now but the current run of cold weather in the Britain and Ireland has alerted me to its usefulness and I have given the thing a go. With public transport operator website heaving over the last week, the advantages of microblogging became more than apparent, thanks in no small part to the efforts of Centrebus, National Rail Enquiries and the U.K. Met Office. The pithy nature of any messages saves the effort needed to compile a longer blog post and to read it afterwards. This aspect makes it invaluable for those times when all that needs to be communicated is short and sweet. Anything that cuts down on the information tide that hits all of us every day cam only be a good thing.

Along with Twitter, there is a whole suite of tools available for various bits and pieces. First off, there’s integration with WordPress courtesy of plugins like Alex King’s Twitter Tools. After that, there are numerous web applications for taming the beast. Though I only can say that I scratched the surface of what’s available, I have come accross HootSuite and Twitterfeed. The former is a console for managing more than one Twitter account at once while also offering the facility to do the same for Facebook, LinkedIn, WordPress.com and others too. Twitterfeed may be more limited in scope with offering to turn RSS feeds into tweets but it has its place too. HootSuite might have something similar for WordPress but Twitterfeed is a good more universal in its sweep. Naturally, there’s more out there than these two but I am not trying to be exhaustive here. If I make use of any other such services, I even might get inspired to mention them on here.

DePo Masthead

6th November 2009

There is a place on WordPress.com where I share various odds and ends about public transport in the U.K. It’s called On Trains and Buses and I try not to go tinkering with the design side of things too much. You only have the ability to change the CSS and my previous experience of doing that with this edifice while it lived on there taught me not to expect too much even if there are sandbox themes for anyone to turn into something presentable, not that I really would want to go doing that in full view of everyone (doing if offline first and copying the CSS afterwards when it’s done is my preferred way of going about it). Besides, I wanted to see how WordPress.com fares these days anyway.

While my public transport blog just been around for a little over a year, it’s worn a few themes over that time, ranging from the minimalist The Journalist v1.9 and Vigilance through to Spring Reloaded. After the last of these, I am back to minimalist again with DePo Masthead, albeit with a spot of my own colouring to soften its feel a little. I must admit growing to like it but it came to my attention that it was a bespoke design from Derek Powazek that Automattic’s Noel Jackson turned into reality. The result would appear that you cannot get it anywhere but from the WordPress.com Subversion theme repository. For those not versed in the little bit of Subversion action that is needed to get it, I did it for you and put it all into a zip file without making any changes to the original, hoping that it might make life easier for someone.

Download DePo Masthead

Going mobile

20th October 2009

Now that the mobile web is upon us, I have been wondering about making my various web presences more friendly for users of that platform and my interest has been piqued especially by the recent addition of such capability to WordPress.com. With that in mind, I grabbed the WordPress Mobile Edition plugin and set it to work, both on this blog and my outdoors one. Well, the results certainly seem to gain a seal of approval from mobiReady so that’s promising. It comes with a version of the Carrington Mobile theme but you need to pop that into the themes directory on your web server yourself for WordPress’ plugin installation routines won’t do that for you. It could be interesting to see how things go from here and the idea of creating my own theme while using the plugin for redirection honours sounds like a way forward;I have found the place where I can make any changes as needed. Home made variants of the methodology may find a use with my photo gallery and Textpattern sub-sites.

  • All the views that you find expressed on here in postings and articles are mine alone and not those of any organisation with which I have any association, through work or otherwise. As regards editorial policy, whatever appears here is entirely of my own choice and not that of any other person or organisation.

  • Please note that everything you find here is copyrighted material. The content may be available to read without charge and without advertising but it is not to be reproduced without attribution. As it happens, a number of the images are sourced from stock libraries like iStockPhoto so they certainly are not for abstraction.

  • With regards to any comments left on the site, I expect them to be civil in tone of voice and reserve the right to reject any that are either inappropriate or irrelevant. Comment review is subject to automated processing as well as manual inspection but whatever is said is the sole responsibility of the individual contributor.