Technology Tales

Adventures & experiences in contemporary technology

Why all the commas?

4th December 2022

In recent times, I have been making use of Grammarly for proofreading what I write for online consumption. That has applied as much to what I write in Markdown form as it has to what is authored using content management systems like WordPress and Textpattern.

The free version does nag you to upgrade to a paid subscription, but that is not my main irritation. That would be its inflexibility, because you cannot turn off rules that you find intrusive, at least in the free version. This comment applies particularly to the unofficial plugin that you can install in Visual Studio Code; to me, the add-on for Firefox feels less scrupulous.

There are other options, though, and one that I have encountered is LanguageTool. This also offers a Firefox add-on, but there are others too, not only for other browsers but also for Microsoft Word. Recent versions of LibreOffice Writer can connect to a LanguageTool server using in-built functionality as well. There are also dedicated editors for iOS, macOS and Windows.

The one operating system that does not get specific add-on support is Linux, but there is another option there. That uses an embedded HTTP server that I installed using Homebrew and set to start automatically using cron. This really helps when using the LanguageTool Linter extension in Visual Studio Code, because it can connect to that instead of the public API, which bans your IP address if you overuse it. The extension is also configurable, with the ability to add exceptions (both grammatical and spelling), though I appear to have enabled smart formatting only to have it mess up quotes in a Markdown file, which then caused Hugo rendering to fail.
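For anyone wanting to replicate that setup, the idea is sketched below; it assumes that the Homebrew formula installs a languagetool-server launcher that accepts the standard --port option (8081 being the server's default), so check your own installation before relying on it:

brew install languagetool

Then a line like the following goes into the crontab (edited with crontab -e) so that the server starts automatically:

@reboot languagetool-server --port 8081 > /dev/null 2>&1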

Like Grammarly, LanguageTool has an online editor that offers more if you take out an annual subscription. That is cheaper than Grammarly's, which persuaded me to choose it instead to get rephrasing suggestions both in the online editor and through a browser add-on. It is better not to get nagged all the time…

The title may surprise you, but I have been using co-ordinating conjunctions without commas for as long as I can remember. Both Grammarly and LanguageTool pick up on these, so I had to do some investigation, which revealed a gap in my education, especially since LanguageTool is so good at finding them. What I also found is how repetitive my writing style can be, which means that rephrasing has been needed too. That, after all, is the point of a proofreading tool, and it can rankle if you have fixed opinions about grammar or enjoy creative writing.

Putting some off-copyright texts from other authors through these tools triggers all kinds of messages, but you just have to ignore those. Turning off checks needs care, even if turning them on again is easy to do. There is, however, the danger that artificial intelligence tools could make writing too uniform, since there is only so much that these technologies can do. They should make you look at your text more intently, though, which is never a bad thing, because computers still struggle with meaning.

A look at the Julia programming language

19th November 2022

Several open-source computing languages get mentioned when talking about working with data. Among these are R and Python, but there are others, and Julia is one of them. It took a while before I got to check out Julia, because I felt the need to get acquainted with R and Python beforehand. There are others like Lua to investigate too, but that can wait for now.

With the way that R is making an incursion into clinical data reporting and analysis after decades when SAS was predominant, my explorations of Julia are inspired by a certain contrariness on my part. Alongside some small personal projects, there has been some reading in (digital) book form and online. Concerning the latter of these, there are useful tutorials like Introduction to Data Science: Learn Julia Programming, Maths & Data Science from Scratch or Julia Programming: a Hands-on Tutorial. As happens with R, there are online versions of published books available free of charge, and they include Julia Data Science and Interactive Visualization and Plotting with Julia. Video learning can help too, and Jane Herriman has recorded and shared useful beginner's guides on YouTube that start with the basics before moving on to more advanced subjects like multiple dispatch, broadcasting and metaprogramming.

This learning has been built on simple, self-devised puzzles before moving on to anything more complex. That differs from my dalliance with R and Python, where I ventured into complexity first, not least because of testing them out with public COVID data. Eventually, I got around to doing that with Julia too, though my interest was beginning to wane by then, and Julia's abilities for creating multipage PDF files were limited enough that PDF Toolkit was needed to help with this. Along the way, I have made use of such packages as CSV.jl, DataFrames.jl, DataFramesMeta, Plots, Gadfly.jl, XLSX.jl and JSON3.jl, among others. After that, there is PrettyTables.jl to try out, and anyone can look at the Beautiful Makie website to see what Makie can do. There are plenty of other packages for creating graphs, such as SpatialGraphs.jl, PGFPlotsX and GRUtils.jl. For formatting numbers, options include Format.jl and Humanize.jl.
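To give an idea of what some of those packages look like in use, here is a small sketch that reads a CSV file and draws a line chart; the file name and column names are made up for the example:

using CSV, DataFrames, Plots

# Read a hypothetical data extract into a data frame
df = CSV.read("covid_cases.csv", DataFrame)

# Plot new cases by date and save the chart to a PNG file
plot(df.date, df.new_cases, xlabel = "Date", ylabel = "New cases", legend = false)
savefig("new_cases.png")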

So far, my primary usage has been with personal financial data together with automated processing and backup of photo files. The photo file processing has taken advantage of the ability to compile Julia scripts for added speed because just-in-time compilation always means there is a lag before the real work begins.
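For anyone curious about how that compiling can be done, PackageCompiler.jl is one route, and a minimal sketch of building a custom system image that a script can then load is shown below; the package list and file names are only examples rather than what I actually use:

using PackageCompiler

# Bake frequently used packages into a custom system image
create_sysimage(["CSV", "DataFrames"]; sysimage_path = "photo_tools.so")

# A script can then be run against that image to skip most of the start-up lag:
# julia --sysimage photo_tools.so process_photos.jl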

VS Code is my chosen editor for working with Julia scripts, since it has a plugin for the language. That adds the REPL, syntax highlighting, execution and data frame viewing capabilities that once were added to the now defunct Atom editor by its own plugin. While it would be nice to have a keyboard shortcut for script execution, the whole thing works well and is regularly updated.

Naturally, there have been plenty of queries as I have gone along, and the Julia Documentation has been consulted, as well as Julia Discourse and Stack Overflow. The latter pair have become regular landing spots on many a Google search. One example followed a glitch that I encountered after a Julia upgrade: I asked a question about it and was directed to the XLSX.jl Migration Guides, where I got the information that I needed to fix my code so that it ran properly.

There is more learning to do as I continue to use Julia for various things. Once compiled, it does run as fast as promised. The syntax paradigm is akin to R and Python, but there are Julia-specific features too. If you have used the others, the learning curve is lessened but not eliminated completely. This is not an object-oriented language as such, but its functional nature makes it familiar enough to get going with. In short, the project has come a long way since it started more than ten years ago. There is much here for the scientific programmer, but only time will tell if it usurps its older competitors. For now, I will remain interested in it.

A desktop Markdown editing environment

8th November 2022

Earlier this year, I changed over two websites from dynamic versions using content management systems to static ones by using Hugo to build them from Markdown files. That meant that I needed to look at the editing of Markdown, even if it is a fairly simple file format. For one thing, Grammarly can be incorporated into WordPress, so I did not want to lose something like that.

The latter point meant that I was steered away from plain text editors. Otherwise, there are online ones like StackEdit and Dillinger, but the Firefox Grammarly plugin only appears to work on the first of these, and even then only partially in my experience. Dillinger does offer connections to online file storage providers like Google, Dropbox and OneDrive, but I wanted to store files on my desktop for upload to a web server. It also works with GitHub, but I prefer to use another web hosting provider.

There are various specialised Markdown editors for desktop usage like Typora, ReText, Formiko or Ghostwriter, but I chose none of these. My actual choice may surprise many: it was Visual Studio Code. The availability of a Grammarly plug-in was what swayed it for me, even if it did need to be switched on for Markdown files. In some ways, it does not work as smoothly as elsewhere, because it gets fooled by links and other code-like pieces of text. Also, having the ability to add words to a custom dictionary would be ideal. Some rule overriding is available, but I am not sure that everything is covered, even if the list of options is lengthy. Some time is needed to inspect all of them before I proceed any further. Thus far, things are working well enough for me.

Removing a Julia package

5th October 2022

While I have been programming with SAS for a few decades, and it remains a lynchpin in the world of clinical development in the pharmaceutical industry, other technologies like R and Python are gaining a foothold. Two years ago, I started to look at those languages, with personal projects being a great way of facilitating this. In addition, I got to hear of Julia and tried that too. That journey continues, since I have put it into use for importing and backing up photos, and there are other possible uses too.

Recently, I updated Julia to version 1.8.2 but ran into a problem with the DataArrays package that I had installed, so I decided to remove it, since it had been added during experimentation. The Pkg package that is used for package management is documented, but I had not got to that, so some web searching ensued. It turns out that there are two ways of doing this. One uses the REPL: after pressing the ] key, the following command gets issued:

rm DataArrays

When all is done, pressing the delete or backspace key returns things to normal. This can also be done in a script as well as in the REPL, and the following line works in both instances:

using Pkg; Pkg.rm("DataArrays")

The semicolon is used to separate two commands issued on the same line, but they can be on different lines or issued separately just as well. Naturally, DataArrays is just an example here, so you replace that with the name of whatever other package you need to remove. Since we can get carried away when downloading packages, there are times when a clean-up is needed to remove redundant ones, so knowing how to get rid of any clutter is invaluable.
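For that sort of clean-up, Pkg can also report what is installed and delete files left behind by removed or superseded packages; this is a minimal sketch of both, and it works from a script or the REPL alike:

using Pkg
Pkg.status()   # list the packages in the active environment
Pkg.gc()       # remove package versions that nothing references any more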

Accessing Julia REPL command history

4th October 2022

In the BASH shell used on Linux and UNIX, the history command calls up a list of recent commands used and has many uses. There is a .bash_history file in the user's home folder that logs and provides all this information, so there are times when you need to exclude some commands from there, but that is another story.

The Julia REPL environment works similarly to many operating system command line interfaces, so I wondered if there was a way to recall or refer to the history of commands issued. So far, I have not come across an equivalent of the BASH history command for the REPL itself, but the command history is retained in a file just like .bash_history. The location varies on different operating systems, though. On Linux, it is ~/.julia/logs/repl_history.jl, while it is %USERPROFILE%\.julia\logs\repl_history.jl on Windows. While I tend to use scripts that I have written in VSCode rather than entering pieces of code in the REPL, the history retains its uses, and I am sharing it here for others. The location has changed in the past, but these are the ones used by Julia 1.8.2, the version that I have at the time of writing.
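Since the history is just a text file, a few lines of Julia will pull back the most recent entries; this is only a sketch and it assumes a default .julia location in the home folder:

history_file = joinpath(homedir(), ".julia", "logs", "repl_history.jl")

# Print the last 20 lines of the REPL history
for line in last(readlines(history_file), 20)
    println(line)
end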

Useful Python packages for working with data

14th October 2021

My response to changes in the technology stack used in clinical research is to develop some familiarity with programming and scripting platforms that complement and compete with SAS, a system with which I have been programming since 2000. One of these has been R, but Python is another that has taken up my attention, and I now have Julia in my sights as well. There may be others to assess in the fullness of time.

While I first started to explore the Data Science world in the autumn of 2017, it was in the autumn of 2019 that I began to complete LinkedIn training courses on the subject. Good though they were, I find that I need to actually use a tool in order to understand it better. At that time, I did get to hear about Python packages like Pandas, NumPy, SciPy, Scikit-learn, Matplotlib, Seaborn and Beautiful Soup, though it took until the spring of this year for me to start gaining some hands-on experience with using any of these.

During the summer of 2020, I attended a BCS webinar on the CodeGrades initiative, a programming mentoring scheme inspired by the way classical musicianship is assessed. In fact, one of the main progenitors is a trained classical musician and teacher of classical music who turned to Python programming when starting a family so as to have a more stable income. The approach is that a student selects a project and works their way through it with mentoring and periodic assessments carried out in a gentle and discursive manner. Of course, the project has to be engaging for the learner to stay the course, and that point came through in the webinar.

That is one lesson that resonates with me, with subjects as diverse as web server performance and the ongoing pandemic supplying data, and there are other sources of public data to examine as well before looking through my own personal archive gathered over the decades. Some subjects are uplifting, while others are more foreboding, but the key thing is that they sustain interest and offer opportunities for new learning. Without being able to dream up new things to try, my knowledge of R and Python would not be as extensive as it is, and I hope that it will help with learning Julia too.

In the main, my own learning has been a solo effort, with consultation of documentation along with web searches that have brought me to the likes of Real Python, Stack Abuse, Data Viz with Python and R and others for longer tutorials, as well as threads on Stack Overflow. Usually, the web searching begins when I need a steer on a particular topic or a way to resolve a particular error or warning message, but books always are worth reading, even if that is the slower route. Those from the Dummies series or from O'Reilly have proved most useful so far, but I do need to read them more completely than I already have; it is all too tempting to go with the "program and search for solutions as you go" approach instead.

To get going, many choose the Anaconda distribution to get Jupyter notebook functionality, but I prefer a more traditional editor, so Spyder has been my tool of choice for Python programming, and there are others like PyCharm as well. Spyder itself is written in Python, so it can be installed using pip from PyPI like other Python packages. It has other dependencies like Pylint for code management activities, but these get installed behind the scenes.
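For anyone who wants to go the same way, the installation is the usual pip command, assuming that pip points at the Python installation that you intend to use:

pip install spyder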

The packages that I first met in 2019 may be the mainstays for doing data science, but I have discovered others since then. It also seems that there is porosity between the worlds of R and Python, so you get some Python packages aping R packages, and R has the Reticulate package for executing Python code. There are Python counterparts to such Tidyverse staples as dplyr and ggplot2 in the form of Siuba and Plotnine, respectively. The syntax of these packages is not a direct copy of what is executed in R, but they are close enough for there to be added user friendliness compared to Pandas or Matplotlib. The interoperability does not stop there, for there is SQLAlchemy for connecting to MySQL and other databases (PyMySQL is needed as well), and there also is SASPy for interacting with SAS Viya.
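As a flavour of how those pieces fit together, here is a short sketch that reads a table from a MySQL database with SQLAlchemy (PyMySQL supplying the driver) and plots it with Plotnine; the connection details, table and column names are invented for the example:

import pandas as pd
from sqlalchemy import create_engine
from plotnine import ggplot, aes, geom_line

# Hypothetical connection details for a local MySQL database
engine = create_engine("mysql+pymysql://user:password@localhost/finance")

# Pull a table into a Pandas data frame
df = pd.read_sql("SELECT month, spend FROM outgoings", engine)

# ggplot2-like plotting with Plotnine
chart = ggplot(df, aes(x="month", y="spend")) + geom_line()
chart.save("outgoings.png")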

Python may not have the speed of Julia, but there are plenty of packages for working with larger workloads. Of these, Dask, Modin and RAPIDS all have their uses for dealing with data volumes that make Pandas code crawl. As if to prove that there are plenty of libraries for various forms of data analytics, data science, artificial intelligence and machine learning, there also are the likes of Keras, TensorFlow and NetworkX. These are just a selection of what is available, and there is no reason not to check out more. It may be tempting to stick with the most popular packages all the time, especially when they do so much, but it never hurts to keep an open mind either.
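To show what the bigger-data route looks like, this is a minimal Dask sketch; the file pattern and column name are placeholders, and no work actually happens until compute() is called:

import dask.dataframe as dd

# Lazily refer to a collection of CSV files that may not fit in memory at once
df = dd.read_csv("measurements-*.csv")

# Evaluation is deferred until compute() is called
counts = df.groupby("site").size().compute()
print(counts)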

Turning off push notifications in Firefox 46

7th May 2016

A new feature came with Firefox 44 that only recently came to my notice, with Yahoo Mail offering to set up browser notifications every time a new email arrives there. This is something that I did not need, and yet I did not get the option to switch it off permanently for that website, so I was being nagged every time I went to check on things for that email address, an unneeded irritation. Other websites offered to set up similar push notifications, but you could switch these off permanently, so it is a site-by-site function unless you take another approach.

That is to open a new browser tab and enter about:config in the address bar before hitting the return key. If you have not done this before, a warning message will appear, but you can dismiss this permanently. Once beyond that, you are presented with a searchable list of options, and the ones that you need are dom.webnotifications.enabled and dom.webnotifications.serviceworker.enabled. By default, the corresponding entries in the Value column will be true. Double-clicking on each one will set it to false, and you should not see any more offers of push notifications that allow you to get alerts from web services like Yahoo Mail, so your web browsing should suffer fewer of these intrusions.
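For anyone who would rather script the change, the same preferences can be set in a user.js file within the Firefox profile folder; this is only a sketch of that alternative, and the file needs creating if it does not already exist:

user_pref("dom.webnotifications.enabled", false);
user_pref("dom.webnotifications.serviceworker.enabled", false);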

Command Line Processing of EXIF Image Metadata

8th July 2013

There is a bill making its way through the U.K. parliament at the moment that could reduce the power of copyright when it comes to images placed on the web. The current situation is that anyone who creates an image automatically holds the copyright for it. However, the new legislation would remove that protection if it becomes law as it stands. As it happens, the Royal Photographic Society is doing what it can to avoid any changes to what we have now. There may be the barrier of due diligence, but how many of us take steps to mark our own intellectual property? For one, I have been less than attentive to this and now wonder if there is anything more that I should be doing. Others may copyleft their images, but I don't want to find myself unable to share my own photos because another party is claiming rights over them. There is the option of watermarking them, but I also want to add something to the image metadata too.

That got me wondering about adding metadata to any images that I post online to assert my status as the copyright holder. It may not be perfect, but any action is better than doing nothing at all. Given that I don't post photos where EXIF metadata is stripped as part of the uploading process, it should be there for anyone who bothers to check, though there may not be many who do.

Because I also wanted to batch process images, I looked for a command line tool to do the needful and found ExifTool. Being a Perl library, it is cross-platform so you can use it on Linux, Windows and even OS X. To install it on a Debian or Ubuntu based Linux distro, just use the following command:

sudo apt-get install libimage-exiftool-perl

The form of the command that I found useful for adding the actual copyright information is below:

exiftool -P "-copyright=(c) John …" -ext jpg -overwrite_original .

The -P switch preserves the timestamp of the image file, while the -overwrite_original one ensures that you don't end up with unwanted backup files; the trailing dot points the command at the current folder. The copyright message goes within the quotes along with the -copyright option. With a little shell scripting, you can traverse a directory structure and change the metadata for any image files contained in different sub-folders. If you wish to do more than this, there's always the user documentation to be consulted.
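In fact, ExifTool can handle the traversal itself with its -r switch, so a sketch of that alternative (with the same placeholder copyright text) looks like this:

exiftool -r -P "-copyright=(c) John …" -ext jpg -overwrite_original .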

All Change?

19th September 2011

Could 2011 be remembered as the year when the desktop computing interface got a major overhaul? One part of this, Windows 8, won't be with us until next year, but enough has happened so far this year to result in a lot of comment. With many if not all of the changes, it is possible to detect the influence of interfaces used on smartphones. After all, the carryover from Windows Phone 7 to the new Metro interface is unmistakeable.

Two developments in the Linux world have spawned a hell of a lot of comment: Canonical's decision to develop Unity for Ubuntu and the arrival of GNOME 3. While there have been many complaints about the changes made in both, there must be a fair few folk who are just getting on with using them without complaint. Maybe there are many who even quietly like the new interfaces. While I am not so sure about Unity, I surprised myself by taking to GNOME Shell so much that I installed it on Linux Mint. It remains a work in progress, as does Unity, but it'll be very interesting to see it mature. Perhaps a good number of the growing collection of GNOME Shell plugins could make it into the main codebase. If that were to happen, I could see it being welcomed by a good few folk.

There was little doubt that the changes in GNOME 3 looked daunting, so Ubuntu's taking a different approach is understandable until you come to realise how much change that involves anyway. With GNOME 3 working so well for me, I feel disinclined to dally very much with Unity at all. In fact, I am writing these words on a Toshiba laptop running UGR, effectively Ubuntu running GNOME 3, and that could become my main home computing operating system in time.

For those who find these changes not to their taste, there are alternatives. Some Linux distributions are sticking with GNOME 2 as long as they can and there apparently has been some mention of a fork to keep a GNOME 2 interface available indefinitely. However, there are other possibilities such as LXDE and XFCE out there too. In fact, until GNOME 3 won me over, LXDE was coming to mind as a place of safety until I learned that Linux Mint was retaining its desktop identity. As always, there’s KDE too but I have never warmed to that for some reason.

The latest version of OS X, Lion, also included some changes inspired by iOS, the operating system that powers both the iPhone and iPad. However, while the current edition of PC Pro highlights some disgruntlement in professional circles regarding Apple's direction, the changes do not seem to have aroused the kind of ire that has been abroad in the world of Linux. Is it because Linux users want to feel that they are in charge, while iMac and MacBook users are content to have decisions made for them so long as everything just works? Speaking for myself, the former description seems to fit me, though having choices means that I can reject decisions that I do not like so much.

At the time of writing, the release of a developer preview of the next version of Windows has been generating a lot of attention. It appears that changes are headed for Windows users too. However, I get the sense that a more conservative interface option will be retained, and that could be essential for avoiding the alienation of corporate users. After all, I cannot see the Metro interface gaining much favour in the working environment when so many of us have so much to do. Nevertheless, I plan to get my hands on the developer preview to have a look (the weekend proved too short for this). It will be very interesting to see how the next version of Windows develops, and I plan to keep an eye on it as it does so.

It now looks as if many will have their work cut out if they are to avoid where desktop computing interfaces are going. Established paradigms are being questioned, particularly as a result of touch interfaces on smartphones and tablets. Wii and Kinect have brought other ways of interacting with computers too, so there is a lot of mileage in rethinking how we work with them. So far, I have been able to deal with the changes in the world of Linux, but I am left wondering at the changes that Microsoft is making. After Vista, they need to be careful, and they know that. Maybe they'll be better at getting users through changes in computing interfaces than others, but it'll be very interesting to see what happens. Unlike open source community projects, they have the survival of a massive multinational at stake.

A multitude of operating systems

27th October 2009

Like buses, it seems that a whole horde of operating systems is descending upon us at once. OS X 10.6 came first, and it was the turn of Windows 7 last week, with all of the excitement that it generated in the computing and technology media. Next up will be Ubuntu, already a source of some embarrassment for the BBC's Rory Cellan-Jones when he got his facts muddled; to his credit, he later corrected himself, though I do wonder how up to speed his appreciation is that Ubuntu has its distinct flavours, with a netbook variant being different from the main offering that I use. Along with Ubuntu 9.10, Fedora 12 and openSUSE 11.2 are also in the wings. As if all these weren't enough, the latest issue of PC Plus gives an airing to less well-known operating systems like Haiku (the project that carries on BeOS). The inescapable conclusion is that, far from the impressions of mainstream computer users who know only Windows, we are swimming in a sea of operating system options in which you may drown if you decide to try sampling them all. That may explain why I stick with Ubuntu for home use for reasons of familiarity and reliability and leave much of the distro hopping to others. Of course, it shouldn't surprise anyone that Windows is the choice where I work, with 2000 being usurped by Vista in the next few weeks (IT managers always like to be behind the curve for the sake of safety).
