Technology Tales

Adventures in consumer and enterprise technology

TOPIC: PHP

HTML Tidy for Windows

22nd March 2007

Drupal has modules (Import HTML and its helper Static HTML together make up one option) for importing static (X)HTML pages into its database, and these need HTML Tidy to work. Since I am playing with the thing on Windows, I went out and snagged the version for that OS. Being either lazy or bloody-minded, I tried an XHTML page with PHP code embedded in it and, needless to say, the thing choked. I must try it with plain XHTML instead.
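One way to get from a PHP-laden page to plain XHTML is to strip out the PHP blocks before handing the file to HTML Tidy. What follows is only a rough sketch in Python, not anything that Import HTML or Tidy does itself: the strip_php name is made up for illustration, and the pattern assumes that no PHP string literal contains the "?>" sequence.

```python
import re

# Matches <?php ... ?>, <?= ... ?> and short <? ... ?> blocks, including an
# unterminated block that runs to the end of the file. The negative lookahead
# leaves an XML declaration (<?xml ... ?>) alone.
PHP_BLOCK = re.compile(r"<\?(?!xml\b)(?:php\b|=)?.*?(?:\?>|\Z)", re.DOTALL)


def strip_php(source: str) -> str:
    """Return the page text with embedded PHP blocks removed (hypothetical helper)."""
    return PHP_BLOCK.sub("", source)


page = (
    '<?xml version="1.0"?>\n'
    "<html><body><h1><?php echo $title; ?></h1>"
    "<p>Static text</p></body></html>"
)
cleaned = strip_php(page)
print(cleaned)
```

Whatever the PHP would have printed is lost, of course, so this only suits pages where the embedded code is incidental to the content being imported.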

Open source CMS options

18th March 2007

After reading an article in the latest issue of PC Plus, I got curious about the world of content management systems again. I went over to OpenSourceCMS to sample the CMS demos that they host there. Mambo and Joomla! (I wish they would drop that exclamation mark; it messes up automated grammar checking) are fully fledged CMSs and look impressive too, though how they would fit into my online presence is something of an open question. PHP-Nuke uses themes, which is an attraction; I am already used to that mindset thanks to WordPress. While Drupal is seemingly less slick than the others, that could be an attraction in itself; it does offer themes, but no rich text editing is available.

Though all of the above are built on top of PHP/MySQL, I ignored them for some reason when I last looked at open-source content management systems. That does seem a strange thing to do, but this was a while ago and the moderate cost of adding database functionality to my website was not something that I was willing to pay, though I have done so since for HennessyBlog.

Therefore, I ended up seeing what Plone (built on Zope and using the Python programming language) could do. What I had in mind at the time was a replacement for a Perl-powered photo gallery, and a CMS was never going to fit the bill; it still doesn’t. In any case, Plone left me with the impression that it was an all-or-nothing affair when I like coexistence of website components on a single server. Things may have changed since then, so giving it another go remains an option.

Now that I have decided to have a look at Drupal, the emphasis this time is not on using it as a photo gallery platform; if I wanted that, I’d go with the API for something like Flickr or Zooomr. This time, the emphasis is on using a CMS to manage the visitor information directories on my website. It does coexist with the other website components, including WordPress and the aforementioned bespoke-built photo gallery. Interestingly, Drupal does offer blogging functionality if I wanted it.

Setup involved a spot of work with MySQL before moving on to other things:

mysql -u adminuserid -p # logging in (run from the shell; prompts for the password)
create database drupal; /* creating new database */
grant ALL on drupal.* to adminuser identified by '**********'; /* granting access to new database */
quit; /* exiting */

Because it is easier to see what’s going on (not going wrong, hopefully), I prefer working with MySQL on the command line. For some reason, Drupal comes only in tar.gz archives, so I extracted one into the web server directory and opened up the site in Firefox. Installation only requires the set-up of database access and is soon completed. A few things turned up in the status report that needed attention. Cron can be run manually. The PHP Unicode and GD library extensions needed activating: editing php.ini to remove the commenting semicolons enabled them, restarting Apache made them available, and PHP’s gd_info function is a real help in testing this. Finally, uploads needed a place to be stored, so a directory called files got created.
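The php.ini edit amounts to removing the leading semicolon from the relevant extension lines. Here is a small Python sketch of that step, working on the file's text rather than on a real file. The enable_extensions name is invented for illustration, and php_gd2.dll and php_mbstring.dll are assumed to be the DLLs behind the GD and Unicode checks on a Windows PHP install of this vintage; check your own php.ini for the actual names.

```python
def enable_extensions(ini_text: str, wanted: set) -> str:
    """Uncomment ';extension=NAME' lines whose NAME is in `wanted` (hypothetical helper)."""
    out = []
    for line in ini_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(";"):
            # Drop the commenting semicolon(s) and see whether a setting hides behind them.
            candidate = stripped.lstrip(";").strip()
            key, sep, value = candidate.partition("=")
            if sep and key.strip() == "extension" and value.strip() in wanted:
                line = f"extension={value.strip()}"
        out.append(line)
    return "\n".join(out)


sample = (
    "; Windows extensions\n"
    ";extension=php_gd2.dll\n"
    ";extension=php_mbstring.dll\n"
    ";extension=php_curl.dll\n"
)
# Assumed DLL names; only the two requested lines lose their semicolon.
result = enable_extensions(sample, {"php_gd2.dll", "php_mbstring.dll"})
print(result)
```

Apache still needs restarting afterwards for the change to take effect, as described above.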

Since then, I have set about bending it to my will, not always an easy thing to do with software. The first thing to do was to give it a static home page. By default, Drupal places tasters for any nominated pages and stories on its home page and shows configuration instructions until you allow some content to filter through. However, adding the Front Page module allows you to override this behaviour and have something more static. It was an entry on Kehan’s Blog that set me heading in the right direction.

The next steps are to persuade the thing to allow external links to exist in menus and to carry on the theme editing until it ties in with the rest of my site. Though patches exist for the menu issue, I have yet to learn how to apply them other than by finding the nefarious piece of code and replacing it, a considerable challenge that makes me wonder if there is not a better way to do it: with a module, perhaps? Then, I’ll make the decision whether to replace my current workflow (Perl-powered pre-processing of XML into PHP/XHTML using XSLT and the Saxon parser followed by FTP upload to the web server) with this one. The automation of the current workflow argues in its favour. We’ll see how things pan out…

HennessyBlog theme update

12th February 2007

Over the weekend, I have been updating the theme on my other blog, HennessyBlog. It has been a task that projected me onto a learning curve with the WordPress 2.1 codebase. Thus, I have collected what I encountered here, so I know that it’s out there on the web for you (and me) to use and peruse. It took some digging to get to know some of what you find below. Since any function used to power WordPress takes some finding, I need to find one place on the web where the code for WordPress is more fully documented. The sites presenting tutorials on how to use WordPress are more often than not geared towards non-techies rather than code cutters like myself. Then again, they might be waiting for someone to do it for them…

The changes made are as follows:

Tweaks to the interface

These are subtle, with the addition of navigation controls to the sidebar and the change in location of the post metadata being the most obvious enhancements. The others are “decoration” with solid and dashed lines (using CSS border properties rather than hr elements and their deprecated presentational attributes) and the addition of standards compliance links.

Standards compliance

Adding standards compliance links does mean that you’d better check that all is in order; it was then that I discovered that there was work to be done. There is an issue with the WordPress wpautop function (it lives in the formatting.php file) in that it sometimes doesn’t add closing tags. Finding out that it was this function that was implicated took a trip to the WordPress.org website; while a good rummage in the wp-includes folder does a lot, it can’t achieve everything.

Like many things in the WordPress code, the wpautop function isn’t half buried. The the_content function (see template-functions-post.php) used to output blog entries calls the get_the_content function (also in template-functions-post.php) to extract the data from MySQL. The add_filter function (in plugin.php) attaches the wpautop function and others to the content that comes back, and it is this filtering that adds the p tags to the output.
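The filter arrangement is easier to picture with a toy model. The Python sketch below is not WordPress code: it borrows the add_filter, apply_filters and the_content names for familiarity, leaves out filter priorities altogether, and simple_autop is a made-up stand-in that is far cruder than the real wpautop.

```python
from collections import defaultdict

# Registry mapping a hook name to the callbacks attached to it.
_filters = defaultdict(list)


def add_filter(hook, callback):
    """Attach a callback to a named hook (toy version, no priorities)."""
    _filters[hook].append(callback)


def apply_filters(hook, value):
    """Pass a value through every callback registered for the hook, in order."""
    for callback in _filters[hook]:
        value = callback(value)
    return value


def simple_autop(text):
    """Hypothetical stand-in for wpautop: wrap blank-line-separated chunks in p tags."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "\n".join(f"<p>{p}</p>" for p in paragraphs)


add_filter("the_content", simple_autop)


def the_content(raw):
    """Return the stored post text after all registered filters have run."""
    return apply_filters("the_content", raw)


rendered = the_content("First paragraph.\n\nSecond paragraph.")
print(rendered)
```

The point of the indirection is that the output function never mentions the paragraph formatter by name, which is exactly why tracking it down takes some digging.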

To return to the non-ideal behaviour that caused me to start out on the above quest, an example is where you have an img tag enclosed by div tags. The required substitution involves the use of regular expressions that work most of the time but get confused here. So a hack to the wpautop function was needed to make sure that the p end tag got inserted. I’ll be keeping an eye out for any more scenarios like this that slip through the net and for any side effects. Otherwise, compliance is just a matter of making sure that all those img tags have their alt attributes completed.
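Keeping an eye out for stragglers can be partly automated by counting opening and closing p tags in the generated markup. The following Python sketch is only a crude check under that assumption, with an invented unclosed_paragraphs name; it is no substitute for running pages through a proper validator.

```python
import re

# <p> or <p class="..."> but not <pre> or <param>; closing tags may carry whitespace.
OPEN_P = re.compile(r"<p\b[^>]*>", re.IGNORECASE)
CLOSE_P = re.compile(r"</p\s*>", re.IGNORECASE)


def unclosed_paragraphs(html: str) -> int:
    """Return opening p tags minus closing p tags; zero suggests they balance."""
    return len(OPEN_P.findall(html)) - len(CLOSE_P.findall(html))


good = '<p>Intro</p>\n<div><img src="a.png" alt="A" /></div>'
bad = '<p>Intro</p>\n<p><div><img src="a.png" alt="A" /></div>'
print(unclosed_paragraphs(good), unclosed_paragraphs(bad))
```

A non-zero result only says that something is off somewhere in the page; finding where still takes a look at the source.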

Tweaks to navigation code

Most of my time has been spent on tweaking the PHP code supporting the navigation. Because different functions were being called in different places, I wanted to harmonise things. To accomplish this, I created new functions in the functions.php file for my theme and needed to resolve a number of issues along the way. Not least among these were the regular expressions used for subsetting with the preg_match function; to my eyes, these were not Perl-compatible, as the choice of function would imply. Even after I found that PCREs in PHP use a more pragmatic syntax, there still appeared to be issues with the expressions that were being used. These seemed to behave OK in their native environment but fell out of favour within the environs of my theme. Being acquainted with Perl, I went for a more familiar expression style and the issue has been resolved.

Along the way, I broke the RSS feed. This was on my off-line test blog so no one, apart from myself, that is, would have noticed. After a bit of searching, I realised that some stray white-space from the end of a PHP file (wp-config.php being a favourite culprit), after the PHP end tag in the script file as it happens, was finding its way into the feed and causing things to fall over. Feed readers don’t take too kindly to the idea of the XML declaration not making an appearance on the first line of the file. Some confusion was caused by the refusal of Firefox to refresh things as it should before I realised that a forced refresh of the feed display was needed. Sometimes, it takes a while for an addled brain to think of these kinds of things.
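A quick way to hunt for the culprit is to look at what follows the last PHP end tag in each file. The Python sketch below works on a file's text and uses a made-up trailing_output name; it relies on the fact that PHP itself swallows a single newline placed directly after the end tag, so only anything beyond that reaches the output.

```python
def trailing_output(php_source: str) -> str:
    """Return whatever would be sent to the output after the last '?>' tag."""
    end = php_source.rfind("?>")
    if end == -1:
        return ""  # no end tag at all, so nothing can trail it
    tail = php_source[end + 2:]
    # PHP discards one newline that immediately follows the end tag.
    for newline in ("\r\n", "\n"):
        if tail.startswith(newline):
            return tail[len(newline):]
    return tail


clean = "<?php $x = 1; ?>\n"
stray = "<?php $x = 1; ?>\n\n"
print(repr(trailing_output(clean)), repr(trailing_output(stray)))
```

Leaving the end tag off files that contain nothing but PHP avoids the problem entirely, since there is then nothing for white-space to follow.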
