Technology Tales

Adventures in consumer and enterprise technology

TOPIC: WORLD WIDE WEB

Running cron jobs using the www-data system account

22nd December 2018

When you set up your own web server or use a private server (virtual or physical), you will find that web servers run using the www-data account. That means that website files need to be accessible to that system account, if not owned by it. Ownership is mandatory if you want WordPress to be able to update itself without needing FTP details.

It also means that you probably need scheduled jobs to be executed using the privileges possessed by the www-data account. For instance, I use WP-CLI to automate spam removal and updates to plugins, themes and WordPress itself. Spam removal can be done without the www-data account, but the updates need file access and cannot be completed without this. Therefore, I got interested in setting up cron jobs to run under that account and the following command helps to address this:

sudo -u www-data crontab -e

For that to work, your own account needs to be listed in /etc/sudoers or be assigned to the sudo group in /etc/group. If either of those applies, then entering your own password will open the cron file for www-data, and it can be edited as for any other account. Saving the file and closing the editor will update cron with the new job details.
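As an illustration of what could go into that cron file, here is a minimal sketch that prints a few example entries. The install location of WP-CLI (/usr/local/bin/wp), the WordPress directory (/var/www/html) and the times are my assumptions for the sake of the example and will need adjusting for your own server.

```shell
#!/bin/sh
# Sketch only: prints example crontab entries for the www-data account.
# Assumed, not taken from the article: WP-CLI lives at /usr/local/bin/wp
# and WordPress is installed in /var/www/html.

sample_crontab() {
    cat <<'EOF'
# m  h  dom mon dow  command
30 2 * * * /usr/local/bin/wp core update --path=/var/www/html --quiet
40 2 * * * /usr/local/bin/wp plugin update --all --path=/var/www/html --quiet
50 2 * * * /usr/local/bin/wp theme update --all --path=/var/www/html --quiet
EOF
}
```

Calling sample_crontab prints the entries, which can then be pasted into the editor opened by sudo -u www-data crontab -e. Afterwards, sudo -u www-data crontab -l should list them. Using the full path to wp avoids depending on whatever PATH cron provides.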

In fact, the same approach can be taken for a variety of commands where files can only be accessed using www-data. This includes copying, pasting and deleting files as well as executing WP-CLI commands. WP-CLI issues a striking message if you run a command using the root account, a pervasive temptation given what it allows. Any alternative to running as root has to be better from a security standpoint.
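For one-off commands, a small shell function saves some typing. This is only a sketch: as_www is a name of my own invention, and the paths in the usage comments are assumed examples.

```shell
#!/bin/sh
# Sketch only: run any command with the privileges of www-data.
# "as_www" is a hypothetical helper name, not part of sudo or WP-CLI.

as_www() {
    sudo -u www-data "$@"
}

# Usage (paths are assumptions, adjust to your own installation):
#   as_www wp plugin list --path=/var/www/html
#   as_www cp /var/www/html/wp-config.php /var/www/html/wp-config.php.bak
```

Because "$@" is quoted, arguments containing spaces reach the command intact.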

More on mod_rewrite

25th June 2007

Today, I caught sight of an article on anti-plagiarism tools at The Blog Herald, and among the tricks was using mod_rewrite to stop people "borrowing" both your images and your bandwidth. The gist is that you set up one or more conditions that exclude websites from the application of a rule forbidding access to images; the logic is that if the website referencing an image is not one of the websites listed in the conditions, then it doesn't get to display any of your images.

RewriteCond %{HTTP_REFERER} !^http://(www\.)?awebsite\.com(/)?.*$ [NC]

RewriteRule .*\.(gif|jpe?g|png|bmp)$ - [F,NC]
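The decision made by that condition and rule can be tried out away from Apache. What follows is a rough sketch that mimics it with grep; mod_rewrite itself uses PCRE, but these patterns are simple enough that POSIX extended regular expressions should behave the same way. The function name is my own, and the dot in the domain name is escaped in this sketch so that it matches only a literal dot.

```shell
#!/bin/sh
# Sketch only: mimics the RewriteCond/RewriteRule pair using grep -E.
# "is_blocked" is a hypothetical name; nothing here talks to Apache.

allowed_referer='^http://(www\.)?awebsite\.com(/)?.*$'
image_path='\.(gif|jpe?g|png|bmp)$'

# Exit status 0 means the request would be refused (the F flag, 403).
is_blocked() {
    referer=$1
    path=$2
    # The rule only applies to image paths; -i stands in for the NC flag.
    printf '%s\n' "$path" | grep -Eiq "$image_path" || return 1
    # The "!" in the condition: the rule fires when the referrer
    # does NOT match the allowed site.
    if printf '%s\n' "$referer" | grep -Eiq "$allowed_referer"; then
        return 1
    fi
    return 0
}

# Usage:
#   is_blocked 'http://other.example/page' 'photos/hill.jpg' && echo refused
```

One thing that this shows up is that an empty referrer fails the condition too, so direct requests and browsers that withhold the header get refused as well. A further condition such as RewriteCond %{HTTP_REFERER} !^$ is commonly placed alongside the first to let those through.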

The wonders of mod_rewrite

24th June 2007

When I wrote about tidying dynamic URLs a little while back, I had no inkling that there would be a second part to the tale. That second part is my discovery of mod_rewrite, an Apache module that facilitates URL translation. The effect is that one URL is presented to the user in the browser address bar, and the very same URL is also seen by search engines, while another is passed to the server for processing. Though it might sound like subterfuge, it works very well once you manage to get it set up properly. While the web host for my hillwalking blog/photo gallery has everything configured such that it is ready to go, the same did not apply to the offline Apache 2.2.x server that I have going on my own Windows XP box. There were two parts to getting it working there:

  1. Activating mod_rewrite on the server: this is as easy as uncommenting a line in the httpd.conf file for the site (the line in question is: LoadModule rewrite_module modules/mod_rewrite.so).
  2. Ensuring that the .htaccess file in the root of the web server directory is active. You need to set the values of the AllowOverride directives for the server root and CGI directories to All so that .htaccess is active. Not doing it for the latter will result in an error beginning with the following: "Options FollowSymLinks or SymLinksIfOwnerMatch is off which implies that RewriteRule directive is forbidden". Having Allow from All set for the required directories is another option to consider when you see errors like that.

Once you have got the above sorted, add this line to .htaccess: RewriteEngine On. Preceding it with an Options directive to ensure that FollowSymLinks and SymLinksIfOwnerMatch are switched on does no harm at all and may even be needed to get things running. That done, you can set about putting mod_rewrite to work with lines like this:

RewriteRule ^pages/(.*)/?$ pages.php?query=$1

The effect of this is to take http://www.website.com/pages/input and convert it into a form for action by the server; in this case, that is http://www.website.com/pages.php?query=input. Anything enclosed in parentheses is captured and assigned to a numbered variable, known as a back-reference. If you have several parenthesised sections, they are assigned to sequentially numbered variables as follows: $1 for the first, $2 for the second and so on. It's all good stuff when you get it going, and not only does it make things look much neater, but it also possesses an advantage when it comes to future-proofing too. Web addresses can be kept constant over time, even if things change behind the scenes. It means that any returning visitors will find what they saw the last time that they visited, which surely must earn good karma in the eyes of those all-important search engines.
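If you want to see what a pattern captures before committing it to .htaccess, sed can stand in for mod_rewrite in a rough way. This is a sketch under assumptions: sed writes back-references as \1 and \2 where mod_rewrite uses $1 and $2, and the function names, along with archive.php and its year and month parameters, are invented for illustration.

```shell
#!/bin/sh
# Sketch only: sed -E standing in for RewriteRule to show what gets captured.
# All function names and the archive.php example are hypothetical.

# The rule from the text: ^pages/(.*)/?$  ->  pages.php?query=$1
rewrite_pages() {
    printf '%s\n' "$1" | sed -E 's#^pages/(.*)/?$#pages.php?query=\1#'
}

# A tighter variant: [^/]+ stops at a slash, so a trailing slash is not captured.
rewrite_pages_strict() {
    printf '%s\n' "$1" | sed -E 's#^pages/([^/]+)/?$#pages.php?query=\1#'
}

# Two captures: \1 and \2 here correspond to $1 and $2 in mod_rewrite.
# sed needs the ampersand escaped as \& whereas mod_rewrite does not.
rewrite_archive() {
    printf '%s\n' "$1" |
        sed -E 's#^archive/([0-9]+)/([0-9]+)/?$#archive.php?year=\1\&month=\2#'
}

# Usage:
#   rewrite_pages pages/input            # pages.php?query=input
#   rewrite_pages pages/input/           # pages.php?query=input/
#   rewrite_pages_strict pages/input/    # pages.php?query=input
#   rewrite_archive archive/2007/06      # archive.php?year=2007&month=06
```

The second usage line is the one to note: because (.*) is greedy, it swallows a trailing slash, so the stricter pattern may be the better choice if both forms of the address are in circulation.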
