Technology Tales

Adventures & experiences in contemporary technology

Self-hosted web analytics tracking

24th April 2009

It amazes me now to think how little tracking I used to do on my various web “experiments” only a few short years ago. However, there was a time when a mere web counter, perhaps displayed on web pages themselves, was enough to yield some level of satisfaction, or dissatisfaction in many a case. Things have come a long way since then and we now seem to have analytics packages all around us. In fact, we don’t even have to dig into our pockets to get our hands on the means to peruse this sort of information either.

At this point, I need to admit that I am known to make use of a few simultaneously, though thoughts about reducing their number are coming to mind; there'll be more on that later. Given that this site is hosted using WordPress software, it should come as no surprise that Automattic's own plugin has been set into action to see how things are going. The main focus is on the total number of visits by day, week and month, with a breakdown showing what pages are doing well as well as an indication of how people came to the site and what links they followed while there. Don't go expecting details of your visitors, such as the software that they are using or the country from which they are accessing the site, with this minimalist option and satisfaction should head your way.

There is next to no way of discussing the subject of website analytics without mentioning Google's comprehensive offering in the area. You have to admit that it is comprehensive, with perhaps the only bugbear being the lack of live tracking. That need has been addressed very effectively by Woopra, even if its WordPress plugin will not work with IE6. Otherwise, you need the desktop application (being written in Java, it's a cross-platform affair and I have had it going in both Windows and Linux) but that works well too. Apart maybe from the lack of campaign tracking, Woopra supplies as good as all of the information that its main competitor provides. It certainly does what I would need from it.

However, while they can be free as in beer, there are some costs associated with using external services like Google Analytics and Woopra. Their means of tracking your web pages for you is by executing a piece of JavaScript that needs to be added to every page. If you have everything set to use a common header or footer page, that shouldn't be too laborious and there are plugins for publishing platforms like WordPress too. This way of working means that if anyone has JavaScript disabled, or decides not to enable JavaScript for the requisite hosts while using the NoScript extension with Firefox, then your numbers are scuppered. Saying that, the same concern probably applies to any JavaScript code that you may want to execute, but there's another cost again: the calls to external websites can, even with the best attention in the world, slow down the loading of your own pages. Not only is additional JavaScript being run but there is also the latency caused by servers having to communicate across the web.
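To give a flavour of what gets added, the classic Google Analytics snippet of the time was broadly of the following shape; I am quoting it from memory, so treat it as a sketch rather than gospel, and the UA code is a placeholder for your own profile ID:

<script type="text/javascript">
// pull in ga.js from Google's servers, switching to SSL when the page itself is secure
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
// record a page view against the placeholder profile ID
var pageTracker = _gat._getTracker("UA-XXXXXXX-X");
pageTracker._trackPageview();
</script>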

A self-hosted analytics package would avoid the latter and I found one recently through Lifehacker. Amazingly, it has been around for a while and I hadn't known about it, but I can't say that I was actively looking for it either. Piwik, the successor to phpMyVisites, is the name of my discovery and it seems not too immature either. In fact, I'd venture that it does next to everything that Google Analytics does. While I'd prefer that it used PHP, JavaScript is its means of tracking web pages too. Nevertheless, page loading is still faster than with Google Analytics and/or Woopra, and Firefox/NoScript users would only have to allow JavaScript for one site too. If you have had experience with installing PHP/MySQL powered publishing platforms like WordPress, Textpattern and suchlike, then putting Piwik in place is no ordeal. You may find yourself changing folder access, but uploading the required files, specifying database credentials and adding an administration user is all fairly standard stuff. I have the thing tracking this edifice as well as my outdoor activities (hillwalking/cycling/photography) web presence and I cannot say that I have any complaints, so we'll see how it goes from here.

A new feature request for Textpattern?

22nd April 2009

Having been doing some updates to articles in A Wanderer's Miscellany, an idea that would make life easier when working with old articles in Textpattern has come to mind. Currently, there is no way to navigate through pages in the administration area other than using the search or previous/next functionality. I have gotten to thinking that being able to subset articles by section or category using drop-down menus would be a good way forward. A search for a suitable plugin was set in train but it yielded nothing of immediate use (amazingly, no one has given it a go thus far), hence the thinking regarding a new feature request. There is a place on the Txp support forum for exactly this kind of thing and I am in the throes of plucking up the courage to go for it. Apparently, some code cutting of my own would grease the wheels for the progression of any such thing, but I remain unsure as to how far I want to go down that route, so a bare request might be what they get. Moving to another platform might be an alternative proposition, but I see little reason for doing so when what I have otherwise works well for what I want it to do. That Textpattern feature request might just come into being…

Ubuntu upgrades: do a clean installation or use Update Manager?

9th April 2009

Part of some recent "fooling" brought on by the investigation of what turned out to be a duff DVD writer was a fresh installation of Ubuntu 8.10 on my main home PC. It might have brought on a certain amount of upheaval but it was nowhere near as severe as that following the same sort of thing with a Windows system. A few hours was all that was needed, but the question arose as to whether it is better to do an upgrade every time a new Ubuntu release is unleashed on the world or to go for a complete virgin installation instead. With Ubuntu 9.04 in the offing, that question takes on a more immediate significance than it otherwise might do.

Various tricks make the whole reinstallation idea more palatable. For instance, many years of Windows usage have taught me the benefits of separating system and user files. The result is that my home directory lives on a different disk to my operating system files. Add to that the experience of being able to reuse that home drive across different Linux distros and even swapping from one distro to another becomes feasible. From various changes to my secondary machine, I can vouch that this works for Ubuntu, Fedora and Debian; the latter is what currently powers the said PC. You might have to use superuser powers to attend to ownership and access issues, but the portability is certainly there and it applies to anything kept on other disks too.
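As a rough illustration of the arrangement, and assuming a second drive at /dev/sdb1 with a user called john (both are assumptions for the example), the relevant pieces might look something like this:

# /etc/fstab entry keeping /home on its own disk
/dev/sdb1   /home   ext3   defaults   0   2

# after switching distro, realign ownership if the numeric user ID has changed
sudo chown -R john:john /home/john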

Naturally, there's always the possibility of losing programs that you have had installed, but losing the clutter can be liberating too. However, assembling a script made up of one or more apt-get install commands can allow you to get many things back at a stroke. For example, I have a test web server (Apache/MySQL/PHP/Perl) set up so this would be how I'd get everything back in place before beginning further configuration. It might be no bad idea to back up your collection of software sources either; I have yet to add all of the ones that I have been using back into Synaptic. Then there are closed source packages such as VirtualBox (yes, I know that there is an open source edition) and Adobe Reader. After reinstating the former, all my virtual machines were available for me to use again without further ado. Restoring the latter allowed me to grab version 9.1 (probably more secure anyway) and it inveigles itself into Firefox now too, so the number of times that I need to go through the download shuffle before seeing the contents of a PDF is much reduced, though not completely eliminated, by the Windows-like ability to see a PDF loaded in a browser tab. Moving from software to hardware for a moment, it looks like any bespoke actions, such as my activating an Epson Perfection 4490 Photo scanner, need to be repeated but that was all that I needed to do. Getting things back into order is not so bad but you need to allow a modicum of time for this.
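A minimal sketch of such a restoration script, with the package names being no more than examples of a typical Apache/MySQL/PHP/Perl setup on Ubuntu of that era, might run along these lines:

#!/bin/sh
# reinstate a basic test web server after a clean installation
sudo apt-get update
sudo apt-get install -y apache2 mysql-server php5 libapache2-mod-php5 php5-mysql perl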

What I have discussed so far are what might be categorised as the common or garden aspects of a clean installation, but I have seen some behaviours that make me wonder if the usual Ubuntu upgrade path is sufficiently complete in its refresh of your system. The counterpoint to all of this is that I may not have been looking for some of these things before now. That may apply to my noticing that DSLR support seems to be better, with my Canon and Pentax cameras both being picked up and mounted for me as soon as they are connected to a PC, the caveat being that they must themselves be powered on for this to happen. Another surprise that may be new is that the BBC iPlayer's Listen Again works without further work from the user, a very useful development. It very clearly wasn't that way before I took the invasive route. My previous tweaking might have prevented the in situ upgrade from doing its thing, but I do see the point of not upsetting people's systems with an overly aggressive update process, even if it means that some advances do not make themselves known.

So what's my answer regarding which way to go once Ubuntu Jaunty Jackalope appears? For the sake of avoiding initial disruption, I'd be inclined to go down the Update Manager route first while reserving the right to do a fresh installation later on. All in all, I am left with the gut feeling that the jury is still out on this one.

Investigating Textpattern

9th March 2009

With the profusion of Content Management Systems out there, open source and otherwise, my curiosity has been aroused for a while now. In fact, Automattic’s aspirations for WordPress (the engine powering this blog) now seem to go beyond blogging and include wider CMS-style usage. Some may even have put the thing to those kinds of uses but I am of the opinion that it has a way to go yet before it can put itself on a par with the likes of Drupal and Joomla!.

Speaking of Drupal, I decided to give it a go a while back and came away with the impression that it's a platform for an entire website. At the time, I was attracted by the idea of having one part of a website on Drupal and another using WordPress, but the complexity of the CSS in the Drupal template thwarted my efforts and I desisted. The heavy coupling between template and back end cut down on the level of flexibility too. That mix of different platforms might seem odd in architectural terms, but my main website also had a custom PHP/MySQL driven photo gallery and migrating everything into Drupal wasn't something that I was planning. In hindsight, I might have been trying to get Drupal to perform a role for which it was never meant, so I am not holding its non-fulfilment of my requirements against it. Drupal may have changed since I last looked at it but I decided to give an alternative a go regardless.

Towards the end of last year, I began to look at Textpattern (otherwise known as Txp) in the same vein and it worked well enough after a little effort that I was able to replace what was once a visitor dossier with a set of travel jottings. In some respects, Textpattern might feel less polished when you start to compare it with alternatives like WordPress or Drupal but the inherent flexibility of its design leaves a positive impression. In short, I was happy to see that it allowed me to achieve what I wanted to do.

If I remember correctly, Textpattern's default configuration is that of a blog and it can be used for that purpose. So, I got in some content and started to morph the thing into what I had in mind. My ideas weren't entirely developed, so some of that was going on while I went about bending Txp to my will. Most of that involved tinkering in the Presentation part of the Txp interface though. It differs from WordPress in that the design information, like (X)HTML templates and CSS, is stored in the database rather than in the file system à la WP. Txp also has its own tag language (with Textile handling the markup of article text) and, though it offers conditional tags, I find that encasing PHP in <txp:php></txp:php> tags is a more succinct way of doing things; only pure PHP code can be used in this way and not a mixture of such in <?php ?> tags and (X)HTML. A look at the tool's documentation together with perusal of Apress' Textpattern Solutions got me going in this new world (it was thus for me, anyway). The mainstay of the template system is the Page and each Section can use a different Page. Pages can share components and, in Txp, these get called Forms. These are included in a Page using Textpattern tags of the form <txp:output_form form="form1" />. Style information is edited in another section and you can have several style sheets too.
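By way of illustration, a cut-down Page might look something like the following; the Form names are examples of my own rather than anything that ships with Txp:

<txp:output_form form="header" />
<div id="content">
<!-- list the most recent articles for the section being viewed -->
<txp:article limit="5" />
<txp:php>echo 'Generated on '.date('j F Y');</txp:php>
</div>
<txp:output_form form="footer" />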

The Txp Presentation system is made up of Sections, Pages, Forms and Styles. The first of these might appear to be in the wrong place, with a home under the Content tab seeming more appropriate, but the ability to attach different page templates to different sections explains why their configuration sits where it does in Textpattern; the ability to show or hide sections might have something to do with it too. As it happens, I have used the same template for all bar the front page of the site and got it to display single or multiple articles as appropriate using the Category system. It may be a hack but it appears to work well in practice. Being able to make a page template work in the way that you require really offers a great amount of flexibility and I have gone with one sidebar rather than two as found in the default set-up.
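The single-or-multiple behaviour can be had from Txp's conditional tags; a sketch of the idea, with the Form names again being my own examples, would be:

<txp:if_individual_article>
<txp:article form="single" />
<txp:else />
<txp:article form="listing" limit="10" />
</txp:if_individual_article>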

Txp also has a facility to add plugins (look in the Admin section of the UI) and this is very different from WordPress in that installation involves the loading of an encoded text file, probably for the sake of maintaining the security and integrity of your installation. I added the navigation facility for my sidebar and breadcrumb links in this manner, and back end stuff like the TinyMCE editor and Akismet came as plugins too. There may not be as many of these for Textpattern but the ones that I found were enough to fulfil my needs. If there are plugin configuration pages in the administration interface, you will find these under the Extensions tab.

To get the content in, I went with the more laborious copy, paste and amend route. Given that I was coming from the plain PHP/XHTML way of doing things, the import functionality was never going to do much for me with its focus on Movable Type, WordPress, Blogger and b2. The fact that you can only import content into a particular section may displease some too. Peculiarly, there is no easy facility for Textpattern-to-Textpattern transfers apart from doing a MySQL database copy. Some alternatives to this were suggested but none seemed to work as well as the basic MySQL route. TinyMCE made editing easier once I went and turned off Textile processing of the article text. This was done on a case-by-case basis because I didn't want to have to deal with any unintended consequences arising from turning it off at a global level.
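For the record, the MySQL route amounts to little more than a dump from one database and a reload into the other; the database names and user below are placeholders:

mysqldump -u txpuser -p old_txp_db > txp_backup.sql
mysql -u txpuser -p new_txp_db < txp_backup.sql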

While on the subject of content, this is also the part of the interface where you manage files and graphics, along with administering things like comments, categories and links (think blogroll from WordPress). Of these, it is the comment and link facilities that I don't use, and I have even turned comments off in the Txp preferences. I use categories to bundle together similar articles for appearance on the same page and am getting to use the image and file management side of things as time goes on.

All in all, it seems to work well, even if I wouldn't recommend it to many of those at whom WordPress is aimed. My reason for saying that is that it is a technical tool and is best used if you are prepared to get your hands dirtier with code cutting than other alternatives would require. I, for one, don't mind that at all because working in that manner might actually suit me. Nevertheless, not all users of the system need to have the same level of knowledge or access and it is possible to set up users with different permissions to limit their exposure to the innards of the administration. In line with Textpattern's being a publishing tool, you get roles such as Publisher (administrator in other platforms), Managing Editor, Copy Editor, Staff Writer, Freelancer, Designer and None. Those names may mean more to others but I have yet to check out what those access levels entail because I use it on a single-user basis.

There may be omissions from Txp, like graphical presentation of visitor statistics in place of the listings that are there now, and the administration interface might do with a little polish, but it does what I want from it and that makes those other considerations less important. That more cut-down feel makes it that little bit more useful in my view and the fact that I have created A Wanderer's Miscellany may help to prove the point. You might even care to take a look at it to see what can be done and I am sure that it doesn't come close to exhausting the talents of Textpattern. I can only hope that I have done justice to it in this post.

Work locally, update remotely

4th December 2008

Here’s a trick that might have its uses: using a local WordPress instance to update your online blog (yes, there are plenty of applications that promise to edit your online blog but these need file permissions to the likes of xmlrpc.php to be opened up). Along with the right database access credentials and the ability to log in remotely, adding the following two lines to wp-config.php does the trick:

define('WP_SITEURL', 'http://localhost/blog');

define('WP_HOME', 'http://localhost/blog');

These two constants override what is in the database and allow you to update the online database from your own PC using WordPress running on a local web server (Apache or otherwise). One thing to remember here is that the online and offline directory structures need to match. For example, if your online WordPress files are in blog in the root of the online web server file system (typically htdocs for Linux), then they need to be contained in the same directory in the root of the offline server too. Otherwise, things could get confusing and perhaps messy. Another thing to consider is that you are modifying your online blog, so the usual rules about care and attention apply, particularly with respect to using the same version of WordPress both locally and remotely. This is especially a concern if you, like me, run development versions of WordPress to see if there are any upheavals ahead of us, like the overhaul that is coming in with WordPress 2.7.
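For completeness, the database settings in the same local wp-config.php then point at the remote server rather than localhost; the values here are placeholders rather than anything to copy verbatim:

define('DB_NAME', 'onlineblog');
define('DB_USER', 'bloguser');
define('DB_PASSWORD', 'changeme');
define('DB_HOST', 'mysql.onlinewebsite.com');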

An alternative use of this same trick is to keep a local copy of your online database in case of any problems while using a local WordPress instance to work with it. I used to have to edit the database backup directly (on my main Ubuntu system), first with GEdit but then using a sed command like the following:

sed -e s/www\.onlinewebsite\.com/localhost/g backup.sql > backup_l.sql

The -e switch applies the regular expression substitution that follows it to the input, with the output being directed to a new file. It's slicker than the interactive GEdit route but has been made redundant by defining constants for a local WordPress installation as described above.

Controlling the post revision feature in WordPress 2.6

21st July 2008

This may seem esoteric to some but I like to be in control of the technology that I use. So, when Automattic added post revision retention to WordPress 2.6, I had my reservations about how much it would clutter my database with things that I didn't need. Thankfully, there is a way to control the feature, but you won't find the option in the administration screens (they seem to view this as an advanced setting and so don't want to be adding clutter to the interface for the sake of something that only a few might ever use); you have to edit wp-config.php yourself to add it. Here are the lines that can be added and the effects that they have:

Code: define('WP_POST_REVISIONS','0');

Effect: turns off post revision retention

Code: define('WP_POST_REVISIONS','-1');

Effect: turns it on (the default setting)

Code: define('WP_POST_REVISIONS','2');

Effect: only retains two previous versions of a post (the number can be whatever you want, so long as it's an integer greater than zero).

Update (2008-07-23):

There is now a plugin from Dion Hulse that does the above for you and more.

JavaScript: write it yourself or use a library?

3rd July 2008

I must admit that I have never been a great fan of JavaScript. For one thing, its need to interact with browser objects places you at the mercy of the purveyors of such pieces of software. Debugging is another fine art that can seem opaque to the uninitiated, since the amount and quality of the logging is determined by an interpreter that isn't provided by the language's overseers. All in all, it seems to present a steep and obstacle-strewn learning curve to newcomers. As it happens, I have always found server side scripting languages like PHP and Perl to be more to my taste and I have no aversion at all to writing SQL.

In the late 1990s, when I was still using free web hosting, JavaScript probably was the best option for my then new online photo gallery. Whatever the truth of that, it certainly was the way that I went. Learning Java or Flash might have been useful, but I never managed to devote sufficient time to the task, so JavaScript turned out to be the way forward until I got a taste of server side scripting. Moving to paid hosting allowed that to develop and the JavaScript option took a back seat.

Based on my experience of the browser wars and working with JavaScript throughout their existence, I was more than a little surprised at the buzz surrounding AJAX. Ploughing part of the way through WROX’s Beginning AJAX did nothing to sell the technology to me; it came across as a very dry jargon-blighted read. Nevertheless, I do see the advantages of web applications being as responsive as their desktop equivalents but AJAX doesn’t always guarantee this; as someone that has seen such applications crawling on IE6, I can certainly vouch for this. In fact, I suspect that may be behind the appearance of technologies such as AIR and Silverlight so JavaScript may get usurped yet again, just like my move to a photo gallery powered on the server side.

Even with these concerns, using JavaScript to add a spot more interactivity is never a bad thing even if it can be overdone, hence the speed problems that I have witnessed. In fact, I have been known to use DOM scripting but I need to have the use in mind before I can experiment with a technology; I cannot do it the other way around. Nevertheless, I am keen to see what JavaScript libraries such as jQuery and Prototype might have to offer (both have been used in WordPress). I have happened on their respective websites so they might make good places to start and who knows where my curiosity might take me?

A case of “peekaboo” behaviour in Internet Explorer

1st July 2008

Recently, I changed the engine of my online photo gallery to a speedier PHP/MySQL-based affair from its PHP/Perl/XML-powered predecessor. On the server side, all was well, but a peculiar display issue turned up in Internet Explorer (6, 7 & 8 were afflicted by this behaviour) where photo caption text on the thumbnail gallery pages was being displayed erratically.

As far as I can gather, the trigger for the behaviour was that the thumbnail block was placed within a DIV floated using CSS that touched another DIV that cleared the floating behaviour. I used a table to hold the images and their associated captions in place. Furthermore, each caption was also a hyperlink nested within a set of P tags.

The remedy was to set the CSS display property for the affected XHTML tag to a value of "inline-block". Within a DIV, TABLE, TR, TD, P and A tag hierarchy, finding the right tag where the CSS property in question has the desired effect took some doing. As it happened, it was the tag at the bottom of the stack, the one for the hyperlink, that needed the fix.
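For the sake of illustration, the eventual rule was along these lines, though the selector is only a stand-in for whatever matches your own thumbnail markup:

/* hypothetical selector for the caption hyperlinks within the thumbnail table */
#thumbnails td p a { display: inline-block; }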

Of course, it’s all very fine fixing something for one browser but it’s worthless if it breaks the presentation in other browsers. In that vein, I did some testing in Opera, Firefox, Seamonkey and Safari to check if all was well and it was. There may be older browsers, like versions of IE prior to 6, where things don’t appear as intended but I get the impression from my visitor statistics that the newer variants hold sway anyway. All in all, it was a useful lesson learnt and that’s never a bad thing.

A reasonable requirement of an IDE

20th May 2008

I have been having a play with the NetBeans IDE Early Access for PHP and, while it has a lot to offer, one impression remains uppermost in my mind: it is so slow. I might have a project with a lot of files in it, but start-up takes an age because of project checking. Other functionality such as text searching is far from speedy either. The sluggishness probably arises from this release being very early in its life cycle; it reminds me of how slow older versions of the IDE felt for Java development, even if this is slower again. For PHP development, I'll be giving NetBeans a while to mature before taking another look at it.

On a similar note, I recently dispatched Quanta Plus from my system for its sluggish start-ups and will not return to it because other alternatives such as Bluefish and Eclipse PDT fit my needs much better. I like my editors to be slick and responsive and Quanta has been around long enough for any slowness to have been knocked out of it. However, I get the feeling that the extras have added bloat, while I expect any additional functionality that I never use not to get in my way. It is for the latter reason that I was always able to get on with Dreamweaver and even run it on Ubuntu using WINE. If I really wanted a stripped-down yet functional editor, Gedit would do most of what I need -- it colour codes syntax for a variety of languages for a start -- but it's always handy to have a file system explorer window incorporated and I value any syntax checking and auto-completion as well. So, it looks as if Eclipse and Bluefish could be serving my needs for a while to come, alongside some use of Dreamweaver for online editing of website files.

The dangers of overriding JavaScript onload event handlers

17th April 2008

I gave myself a right old fright while tinkering with my hillwalking and photo gallery website. The problem stemmed from my use of window.onload to set up behaviours for web pages in a visitor information directory on the site. A lapse of concentration allowed me to associate an onload event handler with the body tags of the pages using a common header PHP script; another lapse also meant that my mistake was on public view for all to see because I uploaded files before I spotted the problem.

The result was that I was left wondering why the window.onload pieces weren't working at all, something that seriously broke the pages. The mists of panic and bewilderment were cleared in good time by the realisation of what had happened: the body tag onload had overridden the window onload and rendered it inactive. I don't know from where the thought arrived, but it was the one that resolved the problems that I was seeing; it might be that I had met it before in the dim and not-so-distant past.
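A stripped-down illustration of the clash would be something like the following, with the function names made up for the purpose:

<script type="text/javascript">
// assigned in a common header script: this is the handler that silently gets lost
window.onload = function() { setUpVisitorPages(); };
</script>
<body onload="showWelcome();"> <!-- this assignment wins, so setUpVisitorPages() never runs -->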

Having your pages degrade gracefully for when a visitor has not enabled JavaScript, or for when you are foolish enough to break something like I did, is definitely an asset, a point brought home to me by my salutary experience. I am not sure why I was willing to run the risk that I did, but it now looks as if I need to add improved graceful degradation to the to-do list.
