Technology Tales

Adventures in consumer and enterprise technology

TOPIC: PATH

Dealing with this Python error message on Windows: UnicodeEncodeError: 'charmap' codec can't encode characters in position 56-57: character maps to <undefined>

14th March 2025

Recently, I got caught out by the above message when summarising some text using Python and Open AI's API while working within VS Code. There was no problem on Linux or macOS, but it was triggered on the Windows command line from within VS Code. Unlike the Julia or R REPL's, everything in Python gets executed in the console like this:

& "C:/Program Files/Python313/python.exe" script.py

The Windows command line shell operated with cp1252 character encoding, and that was tripping up the code like the following:

with open("out.txt", "w") as file:
    file.write(new_text)

The cure was to specify the encoding of the output text as utf-8:

with open("out.txt", "w", encoding='utf-8') as file:
    file.write(new_text)

After that, all was well and text was written to a file like in the other operating systems. One other thing to note is that the use of backslashes in file paths is another gotcha. Adding an r before the quotes gets around this to escape the contents, like using double backslashes. Using forward slashes is another option.

with open(r"c:\temp\out.txt", "w", encoding='utf-8') as file:
    file.write(new_text)

Resolving a clash between Homebrew and Python

22nd November 2022

For reasons that I cannot recall now, I installed the Hugo static website generator on my Linux system and web servers using Homebrew. The only reason that I suggest is that it might have been a way to get the latest version at the time because Linux Mint only does major changes like that every two years, keeping it in line with long-term support editions of Ubuntu.

When Homebrew was installed, it changed the lookup path for command line executables by adding the following line to my .bashrc file:

eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"

This executed the following lines:

export HOMEBREW_PREFIX="/home/linuxbrew/.linuxbrew";
export HOMEBREW_CELLAR="/home/linuxbrew/.linuxbrew/Cellar";
export HOMEBREW_REPOSITORY="/home/linuxbrew/.linuxbrew/Homebrew";
export PATH="/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin${PATH+:$PATH}";
export MANPATH="/home/linuxbrew/.linuxbrew/share/man${MANPATH+:$MANPATH}:";
export INFOPATH="/home/linuxbrew/.linuxbrew/share/info:${INFOPATH:-}";

While the result suits Homebrew, it changed the setup of Python and its packages on my system. Eventually, this had undesirable consequences, like messing up how Spyder started, so I wanted to change this. There are other things that I have automated using Python and these were not working either.

One way that I have seen suggested is to execute the following command, but I cannot vouch for this:

brew unlink python

What I did was to comment out the offending line in .bashrc and replace it with the following:

export PATH="$PATH:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin"

export HOMEBREW_PREFIX="/home/linuxbrew/.linuxbrew";
export HOMEBREW_CELLAR="/home/linuxbrew/.linuxbrew/Cellar";
export HOMEBREW_REPOSITORY="/home/linuxbrew/.linuxbrew/Homebrew";

export MANPATH="/home/linuxbrew/.linuxbrew/share/man${MANPATH+:$MANPATH}:";
export INFOPATH="${INFOPATH:-}/home/linuxbrew/.linuxbrew/share/info";

The first command adds Homebrew paths to the end of the PATH variable rather than the beginning, which was the previous arrangement. This ensures system folders are searched for executable files before Homebrew folders. It also means Python packages load from my user area instead of the Homebrew location, which happened under Homebrew's default configuration. When working with Python packages, remember not to install one version at the system level and another in your user area, as this creates conflicts.

So far, the result of the Homebrew changes is not unsatisfactory, and I will watch for any rough edges that need addressing. If something comes up, then I will set things up in another way.

Searching file contents using PowerShell

25th October 2018

Having made plenty of use of grep on the Linux/UNIX command and findstr on the legacy Windows command line, I wondered if PowerShell could be used to search the contents of files for a text string. Usefully, this turns out to be the case, but I found that the native functionality does not use what I have used before. The form of the command is given below:

Select-String -Path <filename search expression> -Pattern "<search expression>" > <output file>

While you can have the output appear on the screen, it always seems easier to send it to a file for subsequent use, and that is what I am doing above. The input to the -Path switch can be a filename or a wildcard expression, while that to the -Pattern can be a text string enclosed in quotes or a regular expression. Given that it works well once you know what to do, here is an example:

Select-String -Path *.sas -Pattern "proc report" > c:\temp\search.txt

The search.txt file then includes both the file information and the text that has been found for the sake of checking that you have what you want. What you do next is up to you.

Copying a directory tree on a Windows system using XCOPY and ROBOCOPY

17th September 2016

My usual method for copying a directory tree without any of the files in there involves the use of the Windows command line tool XCOPY and the command takes the following form:

xcopy /t /e <source> <destination>

The /t switch tells XCOPY to copy only the directory structure, while the /e one tells it to include empty directories too. Substituting /s for /e would ensure that only non-empty directories are copied. <source> and <destination> are the directory paths that you want to use and need to be enclosed in quotes if you have a space in a directory name.

There is one drawback to this approach that I have discovered. When you have long directory paths, messages about there being insufficient memory are issued and the command fails. The limitation has nothing to do with the machine that you are using, but is a limitation of XCOPY itself.

After discovering that, I got to check if ROBOCOPY can do the same thing without the same file path length limitation because I did not have the liberty of shortening folder names to get the whole path within the length expected by XCOPY. The following is the form of the command that I found did what I needed:

robocopy <source1> <destination1> /e /xf *.* /r:0 /w:0 /fft

Here, <source1> and <destination1> are the directory paths that you want to use and need to be enclosed in quotes if you have a space in a directory name. The /e switch copies all subdirectories and not just non-empty ones. Then, the xf *.* portion excludes all files from the copying process. The remaining options are added to help with getting around access issues and to try to copy only those directories that do not exist in the destination location. The /ftt switch was added to address the latter by causing ROBOCOPY to assume FAT file times. To get around the folder permission delays, the /r:0 switch was added to stop any operation being retried, with /w:0 setting wait times to 0 seconds. All this was enough to achieve what I wanted, and I am keeping it on file for my future reference, as well as sharing it with you.

Smarter file renaming using PowerShell

14th November 2014

It appears that the Rename-Item commandlet in PowerShell is a very useful tool when it comes to smarter renaming of files. Even text substitution is a possibility, and what follows is an example that takes the output of the Dir command for listing the files in a directory and replaces hyphens with underscores in each one.

Dir | Rename-Item –NewName { $_.name –replace “-“,”_” }

The result is that something like the-file.txt becomes the_file.txt. This behaviour is reminiscent of the rename command found on Linux and UNIX systems, where regular expressions can be used, like in the following example that has the same result as the above:

rename 's/-/_/g' *

In both cases, you do need to be careful as to what files are in a directory for this, though the wildcard syntax on Linux or UNIX will be more familiar to anyone who has worked with files via almost any command line. Another thing to watch in the UNIX world is that * parses the whole directory structure, and that could be something that is not wanted for much of the time.

All of this is a far cry from the capabilities of the ren or rename command used in the days of MS-DOS and what has become the legacy Windows command line. Apart from simple renaming, any attempt at tweaking a filename through substitution ended up with the extra string getting appended to filenames when I tried it. Thus, the PowerShell option looks better in comparison.

Changing the working directory in a SAS session

12th August 2014

It appears that PROC SGPLOT along with other statistical graphics procedures creates image files, even if you are creating RTF or PDF files. By default, these are PNG files, but there are other possibilities. When working with PC SAS, I have seen them written to the current working directory and that could clutter up your folder structure, especially if they are unwanted.

Being unable to track down a setting that controls this behaviour, I resolved to find a way around it by sending the files to the SAS work directory so they are removed when a SAS session is ended. One option is to set the session's working directory to be the SAS work one, which can be done in SAS code without needing to use the user interface. As a result, you get some automation.

The method is implicit, though, in that you need to use an X statement to tell the operating system to change the folder for you. Here is the line of code that I have used:

x "cd %sysfunc(pathname(work))";

The X statement passes commands to an operating system's command line, and they are enclosed in quotes. %sysfunc then is a macro command that allows certain data step functions or call routines as well as some SCL functions to be executed. An example of the latter is pathname and this resolves library or file references, and it is interrogating the location of the SAS work library here so it can be passed to the operating systems cd (change directory) command for processing. Since this method works on Windows and UNIX, Linux should be covered too, offering a certain amount of automation since you don't have to specify the location of the SAS work library in every session due to the folder name changing all the while.

Of course, if someone were to tell me of another way to declare the location of the generated PNG files that works with RTF and PDF ODS destinations, then I would be all ears. Even direct output without image file creation would be even better. Until then, though, the above will do nicely.

  • The content, images, and materials on this website are protected by copyright law and may not be reproduced, distributed, transmitted, displayed, or published in any form without the prior written permission of the copyright holder. All trademarks, logos, and brand names mentioned on this website are the property of their respective owners. Unauthorised use or duplication of these materials may violate copyright, trademark and other applicable laws, and could result in criminal or civil penalties.

  • All comments on this website are moderated and should contribute meaningfully to the discussion. We welcome diverse viewpoints expressed respectfully, but reserve the right to remove any comments containing hate speech, profanity, personal attacks, spam, promotional content or other inappropriate material without notice. Please note that comment moderation may take up to 24 hours, and that repeatedly violating these guidelines may result in being banned from future participation.

  • By submitting a comment, you grant us the right to publish and edit it as needed, whilst retaining your ownership of the content. Your email address will never be published or shared, though it is required for moderation purposes.