Technology Tales

Adventures & experiences in contemporary technology

Installing Perl modules using CPAN on Linux Mint 19.2

28th September 2019

My online travel photo gallery is a self-coded set of PHP scripts that read data from tables in a MySQL database. These tables are built from input XML files using a Perl script that itself creates and executes an SQL script. The Perl script also does some image processing using GraphicsMagick commands to resize images and to add copyright information and image framing. Because this processed one image at a time sequentially, it was taking several minutes to complete and only partly used the capacity of the PC that I used.

This led me to look at adding parallel processing and that is what brought me to looking at the Parallel::ForkManager Perl module. An alternative approach might have been to add new images in such a way as not to need the full run involving hundreds of image files, but that will take more work and I fancied having a look at parallelising things anyway.

If it was not there already, the first act would have been to install build-essential to get access to the cpan command. The following command accomplishes this:

sudo apt-get install build-essential

Once that is there, the cpan command needs to be run and some questions answered to get things going. The first question to answer is whether you want setup to be as automated as possible and the default answer of yes worked for me. The next question to answer regards the approach that cpan takes when installing modules and I chose sudo here (local::lib is the default value and manual is another option). After this, cpan drops into its own command shell. Here, I issued two more commands to continue the basic setup by updating CPAN.pm to the latest version and adding Bundle::CPAN to optimise the module further:

make install
install Bundle::CPAN

Continuing the last of these may need extra intervention to confirmation the suggested default of exit at one point in its operation and that takes a little time to complete. It is after this that Parallel::ForkManager can be installed using the following command:

install Parallel::ForkManager

That completed quickly and the cpan shell was exited using its exit command. Then, the new module was available in scripting after that. The actual use of this module is something that hope to describe in another post so I am ending this one here and the same process is just as applicable to setting up cpan and adding any other Perl CPAN module.

Creating a data-driven informat in SAS

27th September 2019

Recently, I needed to create some example data with an extra numeric identifier variable that would be assigned according to the value of a character identifier variable. Not wanting to add another dataset merge or join to the code, I decided to create an informat from data. Initially, I looked into creating a format instead but it not accomplish what I wanted to do.

data patient;
keep fmtname start end label type;
set test.dm;
by subject;
fmtname="PATIENT";
start=subject;
end=start;
label=patient;
type="I";
run;

The input data needed a little processing as shown above. The format name was added in the variable FMTNAME and TYPE variable was assigned a value of I to make this a numeric informat; to make character equivalent, a value of J is assigned. The START and END variables declare the value range associated with the value of LABEL that would become the actual value of the numeric identifier variable. The variable names are fixed because the next step will not work with different ones.

proc format lib=work cntlin=patient;
run;
quit;

To create the actual informat, the dataset is read by a FORMAT procedure with the CNTLIN parameter specifying the name of the input dataset and LIB defining the library where the format catalog is stored. When this in complete, the informat is available for use with an input function as shown in the code excerpt below.

data ae1;
set ae;
patient=input(subject,patient.);
run;

Compressing an entire dataset library using SAS

26th September 2019

Turning dataset compression for SAS datasets can produce quite a reduction in size so it often is standard practice to do just this. It can be set globally with a single command and many working systems do this for you:

options compress=yes;

It also can be done on a dataset by dataset option by adding (compress=yes) beside the dataset name on the data line of a data step. The size reduction can be gained retrospectively too but only if you create additional copies of the datasets. The following code uses the DATASETS procedure to accomplish this end at the time of copy the datasets from one library to another:

proc datasets lib=adam nolist;
copy inlib=adam outlib=adamc noclone datecopy memtype=data;
run;
quit;

The NOCLONE option on the COPY statement allows the compression status to be changed while the DATECOPY one keeps the same date/time stamp on the new file at the operating system level. Lastly, the MEMTYPE=DATA setting ensures that only datasets are processed while the NOLIST option on the PROC DATASETS line suppresses listing of dataset information.

It may be possible to do something like this with PROC COPY too but I know from use that the option presented here works as expected. It also does not involve much coding effort, which helps a lot.

A display of brand loyalty

12th July 2013

Since 2007, my main camera has been a Pentax K10D DSLR and it has gone on many journeys with me. In fact, more than 15,000 images have been captured with it and I have classed it as an unfailing servant. The autofocus may not be the fastest but my subjects tend to be stationary: landscapes, architecture, flora and transport. Even any bus and train photos have included parked vehicles rather than moving ones so there never have been issues. The hint of underexposure in any photos always can be sorted because DNG files are what I create, with all the raw capture information that is possible to retain. In fact, it has been hard to justify buying another SLR because the K10D has done so well for me.

In recent months, I have looking at processed photos and asking myself if time has moved along for what is not far from being a six year old camera. At various times, I have been looking at higher members of the Pentax while wondering if an upgrade would be a good idea. First, there was the K7 and then the K5 before the K5 II got launched. Even though its predecessor is still to be found on sale, it was the newer model that became my choice.

Pentax K5-II

My move to Pentax in 2007 was a case of brand disloyalty since I had been a Canon user from when I acquired my first SLR, an EOS 300. Even now, I still have a Powershot G11 that finds itself slipped into a pocket on many a time. Nevertheless, I find that Canon images feel a little washed out prior to post processing and that hasn’t been the case with the K10D. In fact, I have been hearing good things about Nikon cameras delivering punchy results so one of them would be a contender were it not for how well the Pentax performed.

So, what has my new K5 II body gained me that I didn’t have before? For one thing, the autofocus is a major improvement on that in the K10D. It may not stop me persevering with manual focusing for most of the time but there are occasions the option of solid autofocus is good to have. Other advances include a 16.3 megapixel sensor with a much larger ISO range. The advances in sensor technology since when the K10D appeared may give me better quality photos and noise is something that my eyes may have begun to detect in K10D photos even at my usual ISO of 400.

There have been innovations that I don’t need too. Live View is something that I use heavily with the Powershot G11 because it has such a pitiful optical viewfinder. The K5 II has a very bright and sharp one so that function lays dormant, especially when I witnessed dodgy autofocus performance with it in use; manual focusing should be OK, I reckon. By default too, the screen stays on all the time and that’s a nuisance for an optical viewfinder user like me so I looked through the manual and the menus to switch off the thing. My brief flirtation with the image level display met an end for much the same reason though it’s good that it’s there. There is some horizon auto-correction available as a feature and this is left on to see what it offers since there have been a multitude of times when I needed to sort out crooked horizons caused by my handholding the camera.

The K5 II may have a 3″ screen on its back but it has done nothing to increase the size of the camera. If anything, it is smaller that the K10D and that usefully means that I am not on the lookout for a new camera holster. Not having a bigger body also means there is little change in how the much camera feels in the hand compared with the older one.

In many ways, the K5 II works very like the K10D once I took control over settings that didn’t suit me. Both have Shake Reduction in their camera bodies though the setting has been moved into the settings menu in the new camera when the older one had a separate switch on its body. Since I’d be inclined to leave it on all the time and prefer not to have it knocked off accidentally, this is not an issue.  Otherwise, many of the various switches are in the same places so it’s not that hard to find my way around them.

That’s not to say that there aren’t other changes like the addition of a lock to the mode dial but I have used Canon EOS camera bodies with that feature so I do not consider it a step backwards. The exposure compensation button has been moved to the top of the camera where I found it very easily and have been using perhaps more than on the K10D; it’s also something that I use on the G11 so the experimentation is being brought across to the K5 II now as well. Beside it, there’s a new ISO button so further experimentation can be attempted with that to see how it does.

If I have any criticism, it’s about the clutter of the menus on the K5 II. The long lists through you scrolled on the K10D have been replaced with a series of extra tabs so that on-screen scrolling is not needed as before. However, I reckon that this breaks up things too much and makes working through the settings look more foreboding to anyone who is not so technical in mindset. Nevertheless, settings such as the the type of file to capture are there and I continue to use RAW DNG files as is usual for me though JPEG and Pentax’s own RAW format also are there. For a while, I forgot to set the date, soon found out what I did and the situation was remedied. The same sort of thing applied to storing files in different folders according to the capture date. For my own reasons, I turned this off to put everything into a single PENTX directory to suit my own workflow. My latest discovery among the menus was the ability to add photographer and copyright holder information to the EXIF metadata attached to the image files created by the camera. With legislative proposals that dilute the automatic rights of copyright holders going through the U.K. parliament, this seems a very timely inclusion even if most would prefer that there was no change to copyright law.

Of course, the worth of any camera is in the images that it produces and I have been happy with what I have been getting so far. The bigger files mean less images fit on a memory card as before. Thankfully, SDHC card capacities have grown even if I don’t wish to machine gun my photography altogether. While out and about, I was surprised to apertures like F/14 and F/18 when I was more accustomed to a progression like F/11, F/13, F/16, F/19, F/22, etc. Most of those older values still are there though so there hasn’t been a complete break with convention. The same comment applies to shutter speeds where ones like 1/100 and 1/160 made there appearance where I might have expected just ones like 1/90, 1/125, 1/250 and so on. The extra possibilities, and that is what they are, do allow more flexibility I suppose and may even make it easier to make correct exposures though any judgement of correctness has to be in the eye of a photographer and not what a computer algorithm in a camera determines. For much of the time until now, I have stuck with an ISO of 400 apart from a little testing in a woodland area of an evening soon after the camera arrived.

Since the K5 II came my way a few months ago, I have been meaning to collect my thoughts on here and there has been a delay while I brought mu thinking to a sensible close.At one point, it felt like there was so much to say that the piece became larger in my mind that even what you have been reading now. After all, there are other things that I can adjust to see how the resulting images look and white balance is but one of these.The K10D isn’t beyond experimentation either, especially since I discovered that shake reduction was switched off and it has me asking if that lacking in quality that I mentioned earlier has another explanation. Of course, actually making use of my tripod would be another good suggestion so it’s safe to say that yet more photographic explorations await.

ERROR: This range is repeated, or values overlap: – .

15th September 2012

This is another posting in an occasional series on SAS error and warning messages that aren’t as clear as they’d need to be. What produced the message was my creation of a control data set that I then wished to use to create a data-driven (in)format. It was the PROC FORMAT step that issued the message and I got no (in)format created. However, there were no duplicate entries in the control data set as the message suggested to me so a little more investigation was needed.

What that revealed was that there might be one variable missing from the data set that I needed to have. The SAS documentation has FMTNAME, START and LABEL as compulsory variables with they containing the following: format name, initial value and displayed value. My intention was to create a numeric code variable for one containing character strings using my data-driven format with then numbers specified within a character variable as it should be. What was missing then was TYPE.

This variable can be one of the following values: C for character formats, I for numeric informats, J for character informats, N for numeric formats and P for picture formats. Due to it being a conversion from character values to numeric ones, I set the values of TYPE to I and used an input function to do the required operations. The code for successfully creating the informat is below:

proc sql noprint;
create table tpts as
select distinct "_vstpt" as fmtname,
"I" as type,
vstpt as start,
vstpt as end,
strip(put(vstptnum,best.)) as label
from test
where not missing(vstptnum);
quit;

proc format library=work cntlin=tpts;
run;
quit;

Though I didn’t need to do it, I added an END variable too for sake of completeness. In this case, the range is such that its start and end are the same and there are cases where that will not be the case though I am not dwelling on those.

Watch where you store your virtual machines when using VMware on Linux

12th July 2008

My experience is with Ubuntu on this one but I have found that you need to be careful as regards the file system used by the drive where you keep your virtual machines. If it is NTFS, VMware can fail to start a VM because it cannot create a virtual memory file while it presents as physical memory to a guest operating system. Use ext2 or ext3 and there should be no problem, even if that means formatting a drive to fulfill the need. That’s what I did and all was well thereafter.

SAS/Access and Oracle Timestamp format

25th April 2007

Until SAS 9.2, SAS/Access will not support the Oracle timestamp format. There still is no word on when 9.2 might appear so it’s over to the SQL pass-through facility…

  • All the views that you find expressed on here in postings and articles are mine alone and not those of any organisation with which I have any association, through work or otherwise. As regards editorial policy, whatever appears here is entirely of my own choice and not that of any other person or organisation.

  • Please note that everything you find here is copyrighted material. The content may be available to read without charge and without advertising but it is not to be reproduced without attribution. As it happens, a number of the images are sourced from stock libraries like iStockPhoto so they certainly are not for abstraction.

  • With regards to any comments left on the site, I expect them to be civil in tone of voice and reserve the right to reject any that are either inappropriate or irrelevant. Comment review is subject to automated processing as well as manual inspection but whatever is said is the sole responsibility of the individual contributor.