Cross-platform software | Technology Tales

Making the LanguageTool embedded HTTP Server work on Windows 11

11^th August 2024

My choice of Markdown editor is VS Code or VSCodium, the latter being a fork of the former with Microsoft telemetry removed. In either case, I use the LanguageTool Linter extension for the required grammar and spelling checks. Pointing that to the remote web service offered by LanguageTool could get punitive, even if I am a subscriber. Thus, I use a locally installed equivalent instead.

In my usual Linux system, that is how I work. However, I have replicated the set-up on a Windows laptop for added flexibility. The needed the JRE, so that was downloaded from the Oracle website and then installed. The next step is to download the LanguageTool embedded HTTP Server zip file and decompress it to a chosen location. To run the server, the command like the following is issued from the Windows Terminal (the single line may break over two here):

java -cp "[Chosen Location]\LanguageTool-stable\LanguageTool-6.4\languagetool-server.jar" org.languagetool.server.HTTPServer --port 8081 --allow-origin

That is enough to get things going because it fulfils the default settings of the LanguageTool Linter extension in VS Code or VSCodium. The fastText application is unavailable for Windows, so I did without it. So far, things are operating acceptably, even if there is a way to address more memory should that be required.

AttributeError: module 'PIL' has no attribute 'Image'

11^th March 2024

One of my websites has an online photo gallery. This has been a long-term activity that has taken several forms over the years. Once HTML and JavaScript based, it then was powered by Perl before PHP and MySQL came along to take things from there.

While that remains how it works, the publishing side of things has used its own selection of mechanisms over the same time span. Perl and XML were the backbone until Python and Markdown took over. There was a time when ImageMagick and GraphicsMagick handled image processing, but Python now does that as well.

That was when the error message gracing the title of this post came to my notice. Everything was working well when executed in Spyder, but the message appears when I tried running things using Python on the command line. PIL is the abbreviated name for the Python 3 pillow package; there was one called PIL in the Python 2 days.

For me, pillow loads, resizes and creates new images, which is handy for adding borders and copyright/source information to each image as well as creating thumbnails. All this happens in memory and that makes everything go quickly, much faster than disk-based tools like ImageMagick and GraphicsMagick.

Of course, nothing is going to happen if the package cannot be loaded, and that is what the error message is about. Linux is what I mainly use, so that is the context for this scenario. What I was doing was something like the following in the Python script:

import PIL

Then, I referred to PIL.Image when I needed it, and this could not be found when the script was run from the command line (BASH). The solution was to add something like the following:

from PIL import Image

That sorted it, and I must have run into trouble with PIL.ImageFilter too, since I now load it in the same manner. In both cases, I could just refer to Image or ImageFilter as I required and without the dot syntax. However, you need to make sure that there is no clash with anything in another loaded Python package when doing this.

Upgrading Julia packages

23^rd January 2024

Whenever a new version of Julia is released, I have certain actions to perform. With Julia 1.10, installing and updating it has become more automated thanks to shell scripting or the use of WINGET, depending on your operating system. Because my environment predates this, I found that the manual route still works best for me, and I will continue to do that.

Returning to what needs doing after an update, this includes updating Julia packages. In the REPL, this involves dropping to the PKG subshell using the ] key if you want to avoid using longer commands or filling your history with what is less important for everyday usage.

Previously, I often ran code to find a package was missing after updating Julia, so the add command was needed to reinstate it. That may raise its head again, but there also is the up command for upgrading all packages that were installed. This could be a time saver when only a single command is needed for all packages and not one command for each package as otherwise might be the case.

Generating custom log messages in R

24^th April 2023

While I have been exploring the use of R on a private basis during the last few years, a recent opportunity allowed me to use this exposure at work. This took the form of creating a utility script for use by others. To keep things lightweight, I did not go down the packaging route, but that may come later, possibly for something else.

However, anything used by others needs input checking and comprehensible feedback should anything go wrong. For me, that meant looking at the message, warning and stop functions. The last of these aborts script execution when there is a critical error, something that the other two do not do. The message function is for informative user input, while the warning function suggests things that may need their attention.

Each function takes string input and sends this to the terminal or log. They also can combine different pieces of text in the style of the paste0 function, and can take the text output of other functions as input. Used in combination with conditional logic or error handling, they can help a user track down what went wrong without their needing to ask a script developer. Anything that helps anyone else to help themselves has to be good.

Removing duplicate characters from strings using BASH scripting

30^th March 2023

Recently, I wanted to extract some text from the Linux command by word number only for multiple spaces to make things less predictable. The solution was to remove the duplicate spaces. This can be done using sed, but you add the complexity of regular expressions if you opt for that solution. Instead, the tr command offers a neater approach. For removing duplicate spaces, the command takes the following form:

echo "test test" | tr -s " "

Since I was piping some text to the command, that is what I have above. The tr command is intended to replace or delete characters, and the -s switch is a shorthand for --squeeze-repeats. The actual character to be deduplicated is passed in quotes at the end; here, it is a space, but it could be anything that is duplicated. The resulting text in this example becomes:

test test

After the processing, there is now only one space separating the two words, which is the solution that I sought. It certainly cut out any variability that I was encountering in my usage.

Moves to Hugo

30^th November 2022

What amazes me is how things can become more complicated over time. As long as you knew HTML, CSS and JavaScript, building a website was not as onerous as long as web browsers played ball with it. Since then, things have got easier to use but more complex at the same time. One example is WordPress: in the early days, themes were much simpler than they are now. The web also has got more insecure over time, and that adds to complexity as well. It sometimes feels as if there is a choice to make between ease of use and simplicity.

It is against that background that I reassessed the technology that I was using on my public transport and Irish history websites. The former used WordPress, while the latter used Drupal. The irony was that the simpler website was using the more complex platform, so the act of going simpler probably was not before time. Alternatives to WordPress were being surveyed for the first of the pair, but none had quite the flexibility, pervasiveness and ease of use that WordPress offers.

There is another approach that has been gaining notice recently. One part of this is the use of Markdown for web publishing. This is a simple and distraction-free plain text format that can be transformed into something more readable. It sees usage in blogs hosted on GitHub, but also facilitates the generation of static websites. The clutter is absent for those who have no need of the Gutenberg Editor on WordPress.

With the content written in Markdown, it can be fed to a static website generator like Hugo. Using defined templates and fixed assets like CSS together with images and other static files, it can slot the content into HTML files very speedily since it is written in the Go programming language. Once you get acclimatised, there are no folder structures that cannot be used, so you get full flexibility in how you build out your website. Sitemaps and RSS feeds can be built at the same time, both using the same input as the HTML files.

In a nutshell, it automates what once needed manual effort used a code editor or a visual web page editor. The use of HTML snippets and layouts means that there is no necessity for hand-coding content, like there was at the start of the web. It also helps that Bootstrap can be built in using Node, so that gives a basis for any styling. Then, SCSS can take care of things, giving even more automation.

Given that there is no database involved in any of this, the required information has to be stored somewhere, and neither the Markdown content nor the layout files contain all that is needed. The main site configuration is defined in a single TOML file, and you can have a single one of these for every publishing destination; I have development and production servers, which makes this a very handy feature. Otherwise, every Markdown file needs a YAML header where titles, template references, publishing status and other similar information gets defined. The layouts then are linked to their components, and control logic and other advanced functionality can be added too.

Because static files are being created, it does mean that site searching and commenting, or contact pages cannot work like they would on a dynamic web platform. Often, external services are plugged in using JavaScript. One that I use for contact forms is Getform.io. Then, Zapier has had its uses in using the RSS feed to tweet site updates on Twitter when new content gets added. Though I made different choices, Disqus can be used for comments and Algolia for site searching. Generally, though, you can find yourself needing to pay, particularly if you need to remove advertising or gain advanced features.

Some commenting service providers offer open source self-hosted options, but I found these difficult to set up and ended up not offering commenting at all. That was after I tried out Cactus Comments only to find that it was not discriminating between pages, so it showed the same comments everywhere. There are numerous alternatives like Remark42, Hyvor Talk, Commento, FastComments, Utterances, Isso, Mouthful, Muut and HyperComments but trying them all out was too time-consuming for what commenting was worth to me. It also explains why some static websites even send readers to Twitter if they have something to say, though I have not followed this way of working.

For searching, I added a JavaScript/JSON self-hosted component to the transport website, and it works well. However, it adds to the size of what a browser needs to download. That is not a major issue for desktop browsers, but the situation with mobile browsers is such that it has a sizeable effect. Testing with PageSpeed and Lighthouse highlighted this, even if I left things as they are. The solution works well in any case.

One thing that I have yet to work out is how to edit or add content while away from home. Editing files using an SSH connection is as much a possibility as setting up a Hugo publishing setup on a laptop. After that, there is the question of using a tablet or phone, since content management systems make everything web based. These are points that I have yet to explore.

As is natural with a code-based solution, there is a learning curve with Hugo. Reading a book provided some orientation, and looking on the web resolved many conundrums. There is good documentation on the project website, while forum discussions turn up on many a web search. Following any research, there was next to nothing that could not be done in some way.

Migration of content takes some forethought and took quite a bit of time, though there was an opportunity to carry some housekeeping as well. The history website was small, so copying and pasting sufficed. For the transport website, I used Python to convert what was on the database into Markdown files before refining the result. That provided some automation, but left a lot of work to be done afterwards.

The results were satisfactory, and I like the associated simplicity and efficiency. That Hugo works so fast means that it can handle large websites, so it is scalable. The new Markdown method for content production is not problematical so far apart from the need to make it more portable, and it helps that I found a setup that works for me. This also avoids any potential dealbreakers that continued development of publishing platforms like WordPress or Drupal could bring. For the former, I hope to remain with the Classic Editor indefinitely, but now have another option in case things go too far.

A look at the Julia programming language

19^th November 2022

Several open-source computing languages get mentioned when talking about working with data. Among these are R and Python, but there are others; Julia is another one of these. It took a while before I got to check out Julia because I felt the need to get acquainted with R and Python beforehand. There are others like Lua to investigate too, but that can wait for now.

With the way that R is making an incursion into clinical data reporting analysis following the passage of decades when SAS was predominant, my explorations of Julia are inspired by a certain contrariness on my part. Alongside some small personal projects, there has been some reading in (digital) book form and online. Concerning the latter of these, there are useful tutorials like Introduction to Data Science: Learn Julia Programming, Maths & Data Science from Scratch or Julia Programming: a Hands-on Tutorial. Like what happens with R, there are online versions of published books available free of charge, and they include Julia Data Science and Interactive Visualization and Plotting with Julia. Video learning can help too and Jane Herriman has recorded and shared useful beginner's guides on YouTube that start with the basics before heading onto more advanced subjects like multiple dispatch, broadcasting and metaprogramming.

This piece of learning has been made of simple self-inspired puzzles before moving on to anything more complex. That differs from my dalliance with R and Python, where I ventured into complexity first, not least because of testing them out with public COVID data. Eventually, I got around to doing that with Julia too, though my interest was beginning to wane by then, and Julia's abilities for creating multipage PDF files were such that the PDF Toolkit was needed to help with this. Along the way, I have made use of such packages as CSV.jl, DataFrames.jl, DataFramesMeta, Plots, Gadfly.jl, XLSX.jl and JSON3.jl, among others. After that, there is PrettyTables.jl to try out, and anyone can look at the Beautiful Makie website to see what Makie can do. There are plenty of other packages creating graphs, such as SpatialGraphs.jl, PGFPlotsX and GRUtils.jl. For formatting numbers, options include Format.jl and Humanize.jl.

So far, my primary usage has been with personal financial data together with automated processing and backup of photo files. The photo file processing has taken advantage of the ability to compile Julia scripts for added speed because just-in-time compilation always means there is a lag before the real work begins.

VS Code is my chosen editor for working with Julia scripts, since it has a plugin for the language. That adds the REPL, syntax highlighting, execution and data frame viewing capabilities that once were added to the now defunct Atom editor by its own plugin. While it would be nice to have a keyboard shortcut for script execution, the whole thing works well and is regularly updated.

Naturally, there have been a load of queries as I have gone along and the Julia Documentation has been consulted as well as Julia Discourse and Stack Overflow. The latter pair have become regular landing spots on many a Google search. One example followed a glitch that I encountered after a Julia upgrade when I asked a question about this and was directed to the XLSX.jl Migration Guides where I got the information that I needed to fix my code for it to run properly.

There is more learning to do as I continue to use Julia for various things. Once compiled, it does run fast like it has been promised. The syntax paradigm is akin to R and Python, but there are Julia-specific features too. If you have used the others, the learning curve is lessened but not eliminated completely. This is not an object-oriented language as such, but its functional nature makes it familiar enough for getting going with it. In short, the project has come a long way since it started more than ten years ago. There is much for the scientific programmer, but only time will tell if it usurped its older competitors. For now, I will remain interested in it.

Disabling the SSL connection requirement in MySQL Workbench

7^th November 2022

A while ago, I found that MySQL Workbench would only use SSL connections and that was stopping it from connecting to local databases. Thus, I looked for a way to override this: the cure was to go to Database > Manage Connections... in the menus for the application's home tab. In the dialogue box that appeared, I chose the connection of interest and went to the Advanced panel under Connection and removed the line useSSL=1 from the Others field. The screenshot below shows you what things looked like before the change was made. Naturally, the best practice would be to secure a remote database connection using SSL, so this approach is best reserved for remote non-production databases. While it may be that this does not happen now, I thought I would share this in case the problem persists for anyone.

Removing a Julia package

5^th October 2022

While I have been programming with SAS for a few decades, and it remains a linchpin in the world of clinical development in the pharmaceutical industry, other technologies like R and Python are gaining a foothold. Two years ago, I started to look at those languages with personal projects being a great way of facilitating this. In addition, I got to hear of Julia and got to try that too. That journey continues since I have put it into use for importing and backing up photos, and there are other possible uses too.

Recently, I updated Julia to version 1.8.2 but ran into a problem with the DataArrays package that I had installed, so I decided to remove it since it was added during experimentation. Though the Pkg package that is used for package management is documented, I had not got to that, which meant that some web searching ensued. It turns out that there are two ways of doing this. One uses the REPL: after pressing the ] key, the following command gets issued:

rm DataArrays

When all is done, pressing the delete or backspace keys returns things to normal. This also can be done in a script as well as the REPL, and the following line works in both instances:

using Pkg; Pkg.rm("DataArrays")

While the semicolon is used to separate two commands issued on the same line, they can be on different lines or issued separately just as well. Naturally, DataArrays is just an example here; you just replace that with the name of whatever other package you need to remove. Since we can get carried away when downloading packages, there are times when a clean-up is needed to remove redundant packages, so knowing how to remove any clutter is invaluable.

Accessing Julia REPL command history

4^th October 2022

In the BASH shell used on Linux and UNIX, the history command calls up a list of recent commands used and has many uses. There is a .bash_history file in the root of the user folder that logs and provides all this information, so there are times when you need to exclude some commands from there, but that is another story.

The Julia REPL environment works similarly to many operating system command line interfaces, so I wondered if there was a way to recall or refer to the history of commands issued. So far, I have not come across an equivalent to the BASH history command for the REPL itself, but there the command history is retained in a file like .bash_history. The location varies on different operating systems, though. On Linux, it is ~/.julia/logs/repl_history.jl while it is %USERPROFILE%\.julia\logs\repl_history.jl on Windows. While I tend to use scripts that I have written in VSCode rather than entering pieces of code in the REPL, the history retains its uses. Thus, I am sharing it here for others. In the past, the location changed, but these are the ones with Julia 1.8.2, the version that I have at the time of writing.

« Older Entries « » Newer Entries »