Technology Tales

Adventures & experiences in contemporary technology

Removing a Julia package

5th October 2022

While I have been programming with SAS for a few decades and it remains a lynchpin in the world of clinical development in the pharmaceutical industry, other technologies like R and Python are gaining a foothold. Two years ago, I started to look at those languages with personal projects being a great way of facilitating this. In addition, I got to hear of Julia and got to try that too. That journey continues since I have put it into use for importing and backing up photos, and there are other possible uses too.

Recently, I updated Julia to version 1.8.2 but ran into a problem with the DataArrays package that I had installed so I decided to remove it since it was added during experimentation. The Pkg package that is used for package management is documented but I had not gotten to that so some web searching ensued. It turns out that there are two ways of doing this. One uses the REPL: after pressing the ] key, the following command gets issued:

rm DataArrays

When all is done, pressing the delete or backspace keys returns things to normal. This also can be done in a script as well as the REPL and the following line works in both instances:

using Pkg; Pkg.rm("DataArrays")

The semicolon is used to separate two commands issued in the same line but they can be on different lines or issued separately just as well. Naturally, DataArrays is just an example here so you just replace that with the name of whatever other package you need to remove. Since we can get carried away when downloading packages, there are times when a clean-up is needed to remove redundant packages so knowing how to remove any clutter is invaluable.

Accessing Julia REPL command history

4th October 2022

In the BASH shell used on Linux and UNIX, the history command calls up a list of recent commands used and has many uses. There is a .bash_history file in the root of the user folder that logs and provides all this information so there are times when you need to exclude some commands from there but that is another story.

The Julia REPL environment works similarly to many operating system command line interfaces, so I wondered if there was a way to recall or refer to the history of commands issued. So far, I have not come across an equivalent to the BASH history command for the REPL itself but there the command history is retained in a file like .bash_history. The location varies on different operating systems though. On Linux, it is ~/.julia/logs/repl_history.jl while it is %USERPROFILE%\.julia\logs\repl_history.jl on Windows. While I tend to use scripts that I have written in VSCode rather than entering pieces of code in the REPL, the history retains its uses and I am sharing it here for others. In the past, the location changed but these are the ones with Julia 1.8.2, the version that I have at the time of writing.

Getting custom Python imports to work in Visual Studio Code

18th February 2022

While I continue to use Spyder as my preferred Python code editor, I also tried out Visual Studio Code. Handily, this Integrated Development Environment also has facilities for working with R and Julia code as well as MarkDown text editing and adding the required extensions is enough for these applications; it helps that there is an unofficial Grammarly extension for content creation.

My Python code development makes use of the Pylance extension and it works a little differently from Spyder when it comes to including files using import statements. Spyder will look into the folder where the base script is located but the default behaviour of Pylance is that it looks in the root path of your workspace. This meant that any code that ran successfully in Spyder failed in Visual Studio Code.

The way around this was to add the required location using the python.analysis.extraPaths setting for the workspace. That meant opening Settings by navigating to File > Preferences > Settings in the menu system and entering python.analysis.extraPaths into the search box. That took me to the section that I needed and I then clicked on Add Item before entering the required path and clicking on the OK button. That was enough to fix the problem and all worked as it should after that.

Useful Python packages for working with data

14th October 2021

My response to changes in the technology stack used in clinical research is to develop some familiarity with programming and scripting platforms that complement and compete with SAS, a system with which I have been programming since 2000. One of these has been R but Python is another that has taken up my attention and I now also have Julia in my sights as well. There may be others to assess in the fullness of time.

While I first started to explore the Data Science world in the autumn of 2017, it was in the autumn of 2019 that I began to complete LinkedIn training courses on the subject. Good though they were, I find that I need to actually use a tool in order to better understand it. At that time, I did get to hear about Python packages like Pandas, NumPy, SciPy, Scikit-learn, Matplotlib, Seaborn and Beautiful Soup  though it took until of spring of this year for me to start gaining some hands-on experience with using any of these.

During the summer of 2020, I attended a BCS webinar on the CodeGrades initiative, a programming mentoring scheme inspired by the way classical musicianship is assessed. In fact, one of the main progenitors is a trained classical musician and teacher of classical music who turned to Python programming when starting a family so as to have a more stable income. The approach is that a student selects a project and works their way through it with mentoring and periodic assessments carried out in a gentle and discursive manner. Of course, the project has to be engaging for the learning experience to stay the course and that point came through in the webinar.

That is one lesson that resonates with me with subjects as diverse as web server performance and the ongoing pandemic pandemic supplying data and there are other sources of public data to examine as well before looking through my own personal archive gathered over the decades. Some subjects are uplifting while others are more foreboding but the key thing is that they sustain interest and offer opportunities for new learning. Without being able to dream up new things to try, my knowledge of R and Python would not be as extensive as it is and I hope that it will help with learning Julia too.

In the main, my own learning has been a solo effort with consultation of documentation along with web searches that have brought me to the likes of Real Python, Stack Abuse, Data Viz with Python and R and others for longer tutorials as well as threads on Stack Overflow. Usually, the web searching begins when I need a steer on a particular or a way to resolve a particular error or warning message but books always are worth reading even if that is the slower route. Those from the Dummies series or from O’Reilly have proved must useful so far but I do need to read them more completely than I already have; it is all too tempting to go with the try the “programming and search for solutions as you go” approach instead.

To get going, many choose the Anaconda distribution to get Jupyter notebook functionality but I prefer a more traditional editor so Spyder has been my tool of choice for Python programming and there are others like PyCharm as well. Spyder itself is written in Python so it can be installed using pip from PyPi like other Python packages. It has other dependencies like Pylint for code management activities but these get installed behind the scenes.

The packages that I first met in 2019 may be the mainstays for doing data science but I have discovered others since then. It also seems that there is porosity between the worlds of R an Python so you get some Python packages aping R packages and R has the Reticulate package for executing Python code. There are Python counterparts to such Tidyverse stables as dply and ggplot2 in the form of Siuba and Plotnine, respectively. The syntax of these packages are not direct copies of what is executed in R but they are close enough for there to be enough familiarity for added user friendliness compared to Pandas or Matplotlib. The interoperability does not stop there for there is SQLAlchemy for connecting to MySQL and other databases (PyMySQL is needed as well) and there also is SASPy for interacting with SAS Viya.

Pyhton may not have the speed of Julia but there are plenty of packages for working with larger workloads. Of these, Dask, Modin and RAPIDS all have there uses for dealing with data volumes that make Pandas code crawl. As if to prove that there are plenty of libraries for various forms of data analytics, data science, artificial intelligence and machine learning, there also are the likes of Keras, TensorFlow and NetworkX. These are just a selection of what is available and there is no need not to check out more. It may be tempting to stick with the most popular packages all the time, especially when they do so much, but it never hurst to keep an open mind either.

Using multi-line commenting in Perl to inactivate blocks of code during testing

26th December 2019

Recently, I needed to inactivate blocks of code in a Perl script while doing some testing. This is something that I often do in other computing languages so I sought the same in Perl. To do that, I need to use the POD methodology. This meant enclosing the code as follows.

=start

<< Code to be inactivated by inclusion in a comment >>

=cut

The =start line could use any word after the equality sign but it seems that =cut is needed to close the multi-line comment. If this was actual programming documentation, then the comment block should include some meaningful text for use with perldoc but that was not a concern here since the commenting statements would be removed afterwards anyway and it is good practice not to leave commented code in a production script or program to avoid any later confusion.

In my case, this facility allowed me to isolate the code that I needed to alter and test before putting everything back as needed. It also saved time since I did not need to individually comment out every executable line because multiple lines could be inactivated at a time.

Searching file contents using PowerShell

25th October 2018

Having made plenty of use of grep on the Linux/UNIX command and findstr on the legacy Windows command line, I wondered if PowerShell could be used to search the contents of files for a text string. Usefully, this turns out to be the case but I found that the native functionality does not use what I have used before. The form of the command is given below:

Select-String -Path <filename search expression> -Pattern "<search expression>" > <output file>

While you can have the output appear on screen, it always seems easier to send it to a file for subsequent and that is what I am doing above. The input to the -Path switch can be a filename or a wildcard expression while that to the -Pattern can be a text string enclosed in quotes or a regular expression. It works well once you know what to do so here is an example:

Select-String -Path *.sas -Pattern "proc report" > c:\temp\search.txt

The search.txt file then includes both the file information and the text that has been found for sake of checking that you have what you want. What you do next is up to you.

Using PowerShell to reinstall Windows Apps

9th September 2016

Recently, I managed to use 10AppsManager to remove most of the in-built apps from a Windows 10 virtual machine that I have for testing development versions in case anything ugly were to appear in a production update. Curiosity is my excuse for letting the tool do what it did and some could do with restoration. Out of the lot, Windows Store is the main one that I have sorted so far.

The first step of the process was to start up PowerShell in administrator mode. On my system, this is as simple as clicking on the relevant item in the menu popped up by right clicking on the Start Menu button and clicking on the Yes button in the dialogue box that appears afterwards. In your case, it might be a case of right clicking on the appropriate Start Menu programs entry, selecting the administrator option and going from there.

With this PowerShell session open, the first command to issue is the following:

Get-Appxpackage -Allusers > c:\temp\appxpackage.txt

This creates a listing of Windows app information and pops it into a text file in your choice of directory. Opening the text file in Notepad allows you to search it more easily and there is an entry for Windows Store:

Name                   : Microsoft.WindowsStore
Publisher              : CN=Microsoft Corporation, O=Microsoft Corporation, L=Redmond, S=Washington, C=US
Architecture           : X64
ResourceId             :
Version                : 11607.1001.32.0
PackageFullName        : Microsoft.WindowsStore_11607.1001.32.0_x64__8wekyb3d8bbwe
InstallLocation        : C:\Program Files\WindowsApps\Microsoft.WindowsStore_11607.1001.32.0_x64__8wekyb3d8bbwe
IsFramework            : False
PackageFamilyName      : Microsoft.WindowsStore_8wekyb3d8bbwe
PublisherId            : 8wekyb3d8bbwe
PackageUserInformation : {S-1-5-21-3224249330-198124288-2558179248-1001
IsResourcePackage      : False
IsBundle               : False
IsDevelopmentMode      : False
Dependencies           : {Microsoft.VCLibs.140.00_14.0.24123.0_x64__8wekyb3d8bbwe,
Microsoft.NET.Native.Framework.1.3_1.3.24201.0_x64__8wekyb3d8bbwe,
Microsoft.NET.Native.Runtime.1.3_1.3.23901.0_x64__8wekyb3d8bbwe,
Microsoft.WindowsStore_11607.1001.32.0_neutral_split.scale-100_8wekyb3d8bbwe}

Using the information from the InstallLocation field, the following command can be built and executed (here, it has gone over several lines so you need to get your version onto a single one):

Add-AppxPackage -register “C:\Program Files\WindowsApps\Microsoft.WindowsStore_11607.1001.32.0_x64__8wekyb3d8bbwe\AppxManifest.xml” -DisableDevelopmentMode

Once the above has completed, the app was installed and ready to use again. As the mood took me, I installed other apps from the Windows Store as I saw fit.

Reloading .bashrc within a BASH terminal session

3rd July 2016

BASH is a command-line interpretor that is commonly used by Linux and UNIX operating systems. Chances are that you will find find yourself in a BASH session if you start up a terminal emulator in many of these though there are others like KSH and SSH too.

BASH comes with its own configuration files and one of these is located in your own home directory, .bashrc. Among other things, it can become a place to store command shortcuts or aliases. Here is an example:

alias us=’sudo apt-get update && sudo apt-get upgrade’

Such a definition needs there to be no spaces around the equals sign and the actual command to be declared in single quotes. Doing anything other than this will not work as I have found. Also, there are times when you want to update or add one of these and use it without shutting down a terminal emulator and restarting it.

To reload the .bashrc file to use the updates contained in there, one of the following commands can be issued:

source ~/.bashrc

. ~/.bashrc

Both will read the file and execute its contents so you get those updates made available so you can continue what you are doing. There appears to be a tendency for this kind of thing in the world of Linux and UNIX because it also applies to remounting drives after a change to /etc/fstab and restarting system services like Apache, MySQL or Nginx. The command for the former is below:

sudo mount -a

Often, the means for applying the sorts of in-situ changes that you make are simple ones too and anything that avoids system reboots has to be good since you have less work interruptions.

Changing file timestamps using Windows PowerShell

29th October 2014

Recently, a timestamp got changed on an otherwise unaltered file on me and I needed to change it back. Luckily, I found an answer on the web that used PowerShell to do what I needed and I am recording it here for future reference. The possible commands are below:

$(Get-Item temp.txt).creationtime=$(Get-Date "27/10/2014 04:20 pm")
$(Get-Item temp.txt).lastwritetime=$(Get-Date "27/10/2014 04:20 pm")
$(Get-Item temp.txt).lastaccesstime=$(Get-Date "27/10/2014 04:20 pm")

The first of these did not interest me since I wanted to leave the file creation date as it was. The last write and access times were another matter because these needed altering. The Get-Item commandlet brings up the file, so its properties can be set. Here, these include creationtime, lastwritetime and lastaccesstime. The Get-Date commandlet reads in the provided date and time for use in the timestamp assignment. While PowerShell itself is case-insensitive, I have opted to show the camel case that is produced when you are tabbing through command options for the sake of clarity.

The Get-Item and Get-Date have aliases of gi and gd, respectively and the Get-Alias commandlet will show you a full list while Get-Command (gcm) gives you a list of commandlets. Issuing the following gets you a formatted list that is sent to a text file:

gcm | Format-List > temp2.txt

There is some online help but it is not quite as helpful as it ought to be so I have popped over to Microsoft Learn whenever I needed extra enlightenment. Here is a command that pops the full thing into a text file:

Get-Help Format-List -full > temp3.txt

In fact, getting a book might be the best way to find your way around PowerShell because of all its commandlets and available objects.

For now, other commands that I have found useful include the following:

Get-Service | Format-List
New-Item -Name test.txt -ItemType "file"

The first of these gets you a list of services while the second creates a new blank text file for you and it can create new folders for you too. Other useful commandlets are below:

Get-Location (gl)
Set-Location (sl)
Copy-Item
Remove-Item
Move-Item
Rename-Item

The first of the above is like the cwd or pwd commands that you may have seen elsewhere in that the current directory location is given. Then, the second will change your directory location for you. After that, there are commandlets for copying, deleting, moving and renaming files. These also have aliases so users of the legacy Windows command line or a UNIX or Linux shell can use something that is familiar to them.

Little fixes like the one with which I started this piece are all very good to know but it is in scripting that PowerShell really is said to show its uses. Having seen the usefulness of such things in the world on Linux and UNIX, I cannot disagree with that and PowerShell has its own IDE too. That may be just as well given how much there is to learn. That especially is the case when you might need to issue the following command in a PowerShell session opened using the Run as Administrator option just to get the execution as you need it:

Set-ExecutionPolicy RemoteSigned

Issuing Get-ExecutionPolicy will show you if this is needed when the response is: Restricted. A response of RemoteSigned shows you that all is in order, though you need to check that any script you then run has no nasty payload in there, which is why execution is restrictive in the first place. This sort of thing is yet another lesson to be learnt with PowerShell.

Renaming multiple files in Linux

19th August 2012

The Linux and UNIX command mv has a number of limitations, such as not overwriting destination files and not renaming multiple files using wildcards. The only solution to the first that I can find is one that involves combining the cp and rm commands. For the second, there’s another command: rename. There is a command like it in Windows but this one is a little different in its syntax. Before saying more about that, here’s an example like what I used recently:

rename s/fedora/fedora2/ fedora.*

The first argument in the above command is a regular expression much like what Perl is famous for implementing; in fact, it is Perl-compatible ones (PCRE) that are used. The s before the first slash stands for substitute with fedora being the string that needs to be replaced and fedora being what replaces it. The third command is the file name glob that you want to use, fedora.* in this case. Therefore, all files in a directory named fedora will be renamed fedora2 regardless of the file type. The same sort of operation can be performed for all files with the same extension and it needing changing, htm to html, for instance. Of course, there are other uses but these are handy ones to know.

  • All the views that you find expressed on here in postings and articles are mine alone and not those of any organisation with which I have any association, through work or otherwise. As regards editorial policy, whatever appears here is entirely of my own choice and not that of any other person or organisation.

  • Please note that everything you find here is copyrighted material. The content may be available to read without charge and without advertising but it is not to be reproduced without attribution. As it happens, a number of the images are sourced from stock libraries like iStockPhoto so they certainly are not for abstraction.

  • With regards to any comments left on the site, I expect them to be civil in tone of voice and reserve the right to reject any that are either inappropriate or irrelevant. Comment review is subject to automated processing as well as manual inspection but whatever is said is the sole responsibility of the individual contributor.