Dealing with this Python error message on Windows: UnicodeEncodeError: ‘charmap’ codec can’t encode characters in position 56-57: character maps to <undefined>
Published on 14th March 2025. Estimated Reading Time: 2 minutes
Recently, I got caught out by the above message when summarising some text using Python and OpenAI's API while working within VS Code. There was no problem on Linux or macOS, but it was triggered on the Windows command line from within VS Code. Unlike the Julia or R REPLs, everything in Python gets executed in the console like this:
& "C:/Program Files/Python313/python.exe" script.py
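On Windows, the session was operating with cp1252 character encoding. When no encoding is given, Python's open() falls back on the locale's preferred encoding, so any character without a cp1252 equivalent was tripping up code like the following: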
with open("out.txt", "w") as file:
    file.write(new_text)
The cure was to specify the encoding of the output text as utf-8:
with open("out.txt", "w", encoding='utf-8') as file:
    file.write(new_text)
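To see why the explicit encoding matters, the failure can be reproduced on any operating system by naming cp1252 directly. What follows is only a minimal sketch: the sample text, the can_encode helper and the temporary file location are all assumptions made for illustration, not part of the original script.

```python
import locale
import os
import tempfile

# Hypothetical sample text: the arrow (U+2192) and the check mark (U+2713)
# have no equivalent in cp1252.
new_text = "Summary \u2192 done \u2713"


def can_encode(text, encoding):
    """Return True if every character in text can be encoded with encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The encoding that open() falls back on when none is given;
# the value depends on the machine and its settings.
print("Preferred encoding:", locale.getpreferredencoding(False))

print("cp1252 can encode the text:", can_encode(new_text, "cp1252"))
print("utf-8 can encode the text:", can_encode(new_text, "utf-8"))

# Writing and reading back with an explicit encoding behaves the same everywhere.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
with open(path, "w", encoding="utf-8") as file:
    file.write(new_text)
with open(path, encoding="utf-8") as file:
    round_trip = file.read()
```

Passing the same encoding when reading the file back avoids the mirror-image problem of text being decoded with the wrong codec.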
After that, all was well and the text was written to a file just as on the other operating systems. One other gotcha is the use of backslashes in Windows file paths, because Python treats a backslash in an ordinary string literal as the start of an escape sequence. Adding an r before the opening quote makes it a raw string, in which backslashes are taken literally; doubling each backslash has the same effect. Using forward slashes is another option.
with open(r"c:\temp\out.txt", "w", encoding='utf-8') as file:
    file.write(new_text)
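The different ways of writing a path can be compared without touching the file system. This is a rough sketch only: the variable names are made up for illustration, and PureWindowsPath from the standard pathlib module is used here merely to show that the forward slash form refers to the same location.

```python
from pathlib import PureWindowsPath

# In an ordinary string literal, \t is read as a tab character.
plain = "c:\temp"

# A raw string and doubled backslashes both keep the backslash as it is.
raw = r"c:\temp"
doubled = "c:\\temp"

print("Tab in plain string:", "\t" in plain)
print("Raw equals doubled:", raw == doubled)

# Forward slashes are accepted as separators in Windows paths too.
raw_path = PureWindowsPath(r"c:\temp\out.txt")
forward_path = PureWindowsPath("c:/temp/out.txt")
print("Same path:", raw_path == forward_path)
```

PureWindowsPath only manipulates the text of a path, so the comparison runs on any operating system.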