Technology Tales

Notes drawn from experiences in consumer and enterprise technology

TOPIC: CROSS-PLATFORM SOFTWARE

What to do when the externally-managed environment error appears while using pip to install Python packages on Linux Mint 22

16th December 2024

After upgrading to Linux Mint 22, the following message appeared when attempting to install Python packages using the pip command:

error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
python3-xyz, where xyz is the package you are trying to
install.

If you wish to install a non-Debian-packaged Python package,
create a virtual environment using python3 -m venv path/to/venv.
Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
sure you have python3-full installed.

If you wish to install a non-Debian packaged Python application,
it may be easiest to use pipx install xyz, which will manage a
virtual environment for you. Make sure you have pipx installed.

See /usr/share/doc/python3.12/README.venv for more information.

note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.

This will frustrate anyone following how-tos on the web, so users will need to know about it. On something like Linux Mint, the repositories may not be as up-to-date as PyPI, so picking up the very latest version has its advantages. Thus, I initially used the unrecommended --break-system-packages switch to get things going as before, since doing never broke anything before. While the way of working feels like an overkill in some ways, using pipx probably is the way forward as long as things work as I want them to do.

There is wisdom in using virtual environments too, especially when AI models are involved. For most of what I get to do, that may be getting too elaborate. Then, deleting or renaming the message file in /usr/lib/python3.12/EXTERNALLY-MANAGED is tempting if that gets around things, as retrograde as that probably is. After all, I never broke anything before this message started to appear, possibly since my interests are data related.

Clearing the Julia REPL during code development

23rd September 2024

During development, there are times when you need to clear the Julia REPL. It can become so laden with content that it becomes hard to perform debugging of your code. One way to accomplish this is issuing the CTRL + L keyboard shortcut while focus is within the REPL; you need to click on it first. Another is to issue the following in the REPL itself:

print("\033c")

Here \033 is an escape code in octal format. It is often used in terminal control sequences. The c character is what resets the terminal to its initial state. Printing this sequence is what does the clearance, and variations can be used to clear other kinds of console screens too. That makes it a more generic solution.

Dropping to an underlying shell using the ; character is another possibility. Then, you can use the clear or cls commands as needed; the latter is for Windows systems.

One last option is to define a Julia function for doing this:

function clear_console()
    run(`clear`)  # or `cls` for Windows
end

Calling the clear_console function then clears the screen programmatically, allowing for greater automation. The run function is the one that sends that command in backticks to the underlying shell for execution. Even using that alone should work too.

Avoiding errors caused by missing Julia packages when running code on different computers

15th September 2024

As part of an ongoing move to multi-location working, I am sharing scripts and other artefacts via GitHub. This includes Julia programs that I have. That has led me to realise that a bit of added automation would help iron out any package dependencies that arise. Setting up things as projects could help, yet that feels a little too much effort for what I have. Thus, I have gone for adding extra code to check on and install any missing packages instead of having failures.

For adding those extra packages, I instate the Pkg package as follows:

import Pkg

While it is a bit hackish, I then declare a single array that lists the packages to be checked:

pkglits =["HTTP", "JSON3", "DataFrames", "Dates", "XLSX"]

After that, there is a function that uses a try catch construct to find whether a package exists or not, using the inbuilt eval macro to try a using declaration:

tryusing(pkgsym) = try
@eval using $pkgsym
return true
catch e
return false
end

The above function is called in a loop that both tests the existence of a package and, if missing, installs it:

for i in 1:length(pkglits)
rslt = tryusing(Symbol(pkglits[i]))
if rslt == false
Pkg.add(pkglits[i])
end
end

Once that has completed, using the following line to instate the packages required by later processing becomes error free, which is what I sought:

using HTTP, JSON3, DataFrames, Dates, XLSX

Version control of large files on GitHub

27th August 2024

When you try pushing large files to a GitHub repository, you may find that you breach its 100 MB limit. When you do, you either need to buy a data pack or exclude the file from being tracked. In my case, I decided that the monthly fee for 50 GB was not overly onerous, so I added that. Excluding such files using the .gitignore functionality makes a lot of sense, too.

If you decide to proceed as I did, you will need to install git-lfs. Since that may vary by operating system, I am leaving to you to look for those details on the website that I have linked to earlier. Activating it for your user account needs the following:

git lfs install

Following that, you need to flag the file or type of file using a command like the following:

git lfs track "[file path with name or search pattern]"

Executing the above adds the file path including the file name or the search pattern (normal operating system wildcards like * work here) to a file named .gitattributes in the root of the repository folder hierarchy. If that file no longer exists, it will get created the first time that this is done. It will also need to be added to the repository using git add like any other file. A general command like the following will also do it anyway, since it covers everything in the relevant folder:

git add .

After making a commit, the next step is to push the contents into GitHub. At this stage, the large file or files will be recognised and sent to large file storage with only a text link in the main area. Everything else will be handled as normal.

While on this subject, I need to add a few words of warning. Pushing a large file to GitHub without doing things up front will cause the operation to fail. That may make the transition over to large file storage all the more tricky, since things will be out of order. Moving everything to a temporary folder and again cloning the repository was how I got out of this impasse when it happened to me. Then, I could get the large file handling set up before getting going again. It is better to sort things like this out at the start of the process, rather than attempting to remedy things part way through the process.

Making the LanguageTool embedded HTTP Server work on Windows 11

11th August 2024

My choice of Markdown editor is VS Code or VSCodium, the latter being a fork of the former with Microsoft telemetry removed. In either case, I use the LanguageTool Linter extension for the required grammar and spelling checks. Pointing that to the remote web service offered by LanguageTool could get punitive, even if I am a subscriber. Thus, I use a locally installed equivalent instead.

In my usual Linux system, that is how I work. However, I have replicated the set-up on a Windows laptop for added flexibility. The needed the JRE, so that was downloaded from the Oracle website and then installed. The next step is to download the LanguageTool embedded HTTP Server zip file and decompress it to a chosen location. To run the server, the command like the following is issued from the Windows Terminal (the single line may break over two here):

java -cp "[Chosen Location]\LanguageTool-stable\LanguageTool-6.4\languagetool-server.jar" org.languagetool.server.HTTPServer --port 8081 --allow-origin

That is enough to get things going because it fulfils the default settings of the LanguageTool Linter extension in VS Code or VSCodium. The fastText application is unavailable for Windows, so I did without it. So far, things are operating acceptably, even if there is a way to address more memory should that be required.

AttributeError: module 'PIL' has no attribute 'Image'

11th March 2024

One of my websites has an online photo gallery. This has been a long-term activity that has taken several forms over the years. Once HTML and JavaScript based, it then was powered by Perl before PHP and MySQL came along to take things from there.

While that remains how it works, the publishing side of things has used its own selection of mechanisms over the same time span. Perl and XML were the backbone until Python and Markdown took over. There was a time when ImageMagick and GraphicsMagick handled image processing, but Python now does that as well.

That was when the error message gracing the title of this post came to my notice. Everything was working well when executed in Spyder, but the message appears when I tried running things using Python on the command line. PIL is the abbreviated name for the Python 3 pillow package; there was one called PIL in the Python 2 days.

For me, pillow loads, resizes and creates new images, which is handy for adding borders and copyright/source information to each image as well as creating thumbnails. All this happens in memory and that makes everything go quickly, much faster than disk-based tools like ImageMagick and GraphicsMagick.

Of course, nothing is going to happen if the package cannot be loaded, and that is what the error message is about. Linux is what I mainly use, so that is the context for this scenario. What I was doing was something like the following in the Python script:

import PIL

Then, I referred to PIL.Image when I needed it, and this could not be found when the script was run from the command line (BASH). The solution was to add something like the following:

from PIL import Image

That sorted it, and I must have run into trouble with PIL.ImageFilter too, since I now load it in the same manner. In both cases, I could just refer to Image or ImageFilter as I required and without the dot syntax. However, you need to make sure that there is no clash with anything in another loaded Python package when doing this.

Upgrading Julia packages

23rd January 2024

Whenever a new version of Julia is released, I have certain actions to perform. With Julia 1.10, installing and updating it has become more automated thanks to shell scripting or the use of WINGET, depending on your operating system. Because my environment predates this, I found that the manual route still works best for me, and I will continue to do that.

Returning to what needs doing after an update, this includes updating Julia packages. In the REPL, this involves dropping to the PKG subshell using the ] key if you want to avoid using longer commands or filling your history with what is less important for everyday usage.

Previously, I often ran code to find a package was missing after updating Julia, so the add command was needed to reinstate it. That may raise its head again, but there also is the up command for upgrading all packages that were installed. This could be a time saver when only a single command is needed for all packages and not one command for each package as otherwise might be the case.

Generating custom log messages in R

24th April 2023

While I have been exploring the use of R on a private basis during the last few years, a recent opportunity allowed me to use this exposure at work. This took the form of creating a utility script for use by others. To keep things lightweight, I did not go down the packaging route, but that may come later, possibly for something else.

However, anything used by others needs input checking and comprehensible feedback should anything go wrong. For me, that meant looking at the message, warning and stop functions. The last of these aborts script execution when there is a critical error, something that the other two do not do. The message function is for informative user input, while the warning function suggests things that may need their attention.

Each function takes string input and sends this to the terminal or log. They also can combine different pieces of text in the style of the paste0 function, and can take the text output of other functions as input. Used in combination with conditional logic or error handling, they can help a user track down what went wrong without their needing to ask a script developer. Anything that helps anyone else to help themselves has to be good.

Removing duplicate characters from strings using BASH scripting

30th March 2023

Recently, I wanted to extract some text from the Linux command by word number only for multiple spaces to make things less predictable. The solution was to remove the duplicate spaces. This can be done using sed, but you add the complexity of regular expressions if you opt for that solution. Instead, the tr command offers a neater approach. For removing duplicate spaces, the command takes the following form:

echo "test   test" | tr -s " "

Since I was piping some text to the command, that is what I have above. The tr command is intended to replace or delete characters, and the -s switch is a shorthand for --squeeze-repeats. The actual character to be deduplicated is passed in quotes at the end; here, it is a space, but it could be anything that is duplicated. The resulting text in this example becomes:

test test

After the processing, there is now only one space separating the two words, which is the solution that I sought. It certainly cut out any variability that I was encountering in my usage.

Moves to Hugo

30th November 2022

What amazes me is how things can become more complicated over time. As long as you knew HTML, CSS and JavaScript, building a website was not as onerous as long as web browsers played ball with it. Since then, things have got easier to use but more complex at the same time. One example is WordPress: in the early days, themes were much simpler than they are now. The web also has got more insecure over time, and that adds to complexity as well. It sometimes feels as if there is a choice to make between ease of use and simplicity.

It is against that background that I reassessed the technology that I was using on my public transport and Irish history websites. The former used WordPress, while the latter used Drupal. The irony was that the simpler website was using the more complex platform, so the act of going simpler probably was not before time. Alternatives to WordPress were being surveyed for the first of the pair, but none had quite the flexibility, pervasiveness and ease of use that WordPress offers.

There is another approach that has been gaining notice recently. One part of this is the use of Markdown for web publishing. This is a simple and distraction-free plain text format that can be transformed into something more readable. It sees usage in blogs hosted on GitHub, but also facilitates the generation of static websites. The clutter is absent for those who have no need of the Gutenberg Editor on WordPress.

With the content written in Markdown, it can be fed to a static website generator like Hugo. Using defined templates and fixed assets like CSS together with images and other static files, it can slot the content into HTML files very speedily since it is written in the Go programming language. Once you get acclimatised, there are no folder structures that cannot be used, so you get full flexibility in how you build out your website. Sitemaps and RSS feeds can be built at the same time, both using the same input as the HTML files.

In a nutshell, it automates what once needed manual effort used a code editor or a visual web page editor. The use of HTML snippets and layouts means that there is no necessity for hand-coding content, like there was at the start of the web. It also helps that Bootstrap can be built in using Node, so that gives a basis for any styling. Then, SCSS can take care of things, giving even more automation.

Given that there is no database involved in any of this, the required information has to be stored somewhere, and neither the Markdown content nor the layout files contain all that is needed. The main site configuration is defined in a single TOML file, and you can have a single one of these for every publishing destination; I have development and production servers, which makes this a very handy feature. Otherwise, every Markdown file needs a YAML header where titles, template references, publishing status and other similar information gets defined. The layouts then are linked to their components, and control logic and other advanced functionality can be added too.

Because static files are being created, it does mean that site searching and commenting, or contact pages cannot work like they would on a dynamic web platform. Often, external services are plugged in using JavaScript. One that I use for contact forms is Forminit. Then, Zapier has had its uses in using the RSS feed to tweet site updates on Twitter when new content gets added. Though I made different choices, Disqus can be used for comments and Algolia for site searching. Generally, though, you can find yourself needing to pay, particularly if you need to remove advertising or gain advanced features.

Some commenting service providers offer open source self-hosted options, but I found these difficult to set up and ended up not offering commenting at all. That was after I tried out Cactus Comments only to find that it was not discriminating between pages, so it showed the same comments everywhere. There are numerous alternatives like Remark42, Hyvor Talk, Commento, FastComments, Utterances, Isso, Mouthful, Muut and HyperComments but trying them all out was too time-consuming for what commenting was worth to me. It also explains why some static websites even send readers to Twitter if they have something to say, though I have not followed this way of working.

For searching, I added a JavaScript/JSON self-hosted component to the transport website, and it works well. However, it adds to the size of what a browser needs to download. That is not a major issue for desktop browsers, but the situation with mobile browsers is such that it has a sizeable effect. Testing with PageSpeed and Lighthouse highlighted this, even if I left things as they are. The solution works well in any case.

One thing that I have yet to work out is how to edit or add content while away from home. Editing files using an SSH connection is as much a possibility as setting up a Hugo publishing setup on a laptop. After that, there is the question of using a tablet or phone, since content management systems make everything web based. These are points that I have yet to explore.

As is natural with a code-based solution, there is a learning curve with Hugo. Reading a book provided some orientation, and looking on the web resolved many conundrums. There is good documentation on the project website, while forum discussions turn up on many a web search. Following any research, there was next to nothing that could not be done in some way.

Migration of content takes some forethought and took quite a bit of time, though there was an opportunity to carry some housekeeping as well. The history website was small, so copying and pasting sufficed. For the transport website, I used Python to convert what was on the database into Markdown files before refining the result. That provided some automation, but left a lot of work to be done afterwards.

The results were satisfactory, and I like the associated simplicity and efficiency. That Hugo works so fast means that it can handle large websites, so it is scalable. The new Markdown method for content production is not problematical so far apart from the need to make it more portable, and it helps that I found a setup that works for me. This also avoids any potential dealbreakers that continued development of publishing platforms like WordPress or Drupal could bring. For the former, I hope to remain with the Classic Editor indefinitely, but now have another option in case things go too far.

  • The content, images, and materials on this website are protected by copyright law and may not be reproduced, distributed, transmitted, displayed, or published in any form without the prior written permission of the copyright holder. All trademarks, logos, and brand names mentioned on this website are the property of their respective owners. Unauthorised use or duplication of these materials may violate copyright, trademark and other applicable laws, and could result in criminal or civil penalties.

  • All comments on this website are moderated and should contribute meaningfully to the discussion. We welcome diverse viewpoints expressed respectfully, but reserve the right to remove any comments containing hate speech, profanity, personal attacks, spam, promotional content or other inappropriate material without notice. Please note that comment moderation may take up to 24 hours, and that repeatedly violating these guidelines may result in being banned from future participation.

  • By submitting a comment, you grant us the right to publish and edit it as needed, whilst retaining your ownership of the content. Your email address will never be published or shared, though it is required for moderation purposes.