Technology Tales

Notes drawn from experiences in consumer and enterprise technology

TOPIC: UNICODE TRANSFORMATION FORMATS

Safe file copying methods for cross platform SAS programming

9th February 2026

Not so long ago, I needed to copy a file within a SAS program, only to find that an X command would not accomplish the operation. That cast my mind back to file operations using SAS in order to be platform-independent. Thus, I fell upon statements within a data step.

Before going that far, you need to define filename statements for the source and target locations like this:

filename src "<location of source file or files>" lrecl=32767 encoding="utf-8";
filename dst "<target location for copies>" lrecl=32767 encoding="utf-8";

Above, I also specified logical record length (LRECL) and UTF-8 encoding. The latter is safer in these globalised times, when the ASCII character set does not suffice for everyday needs.

Once you have defined filename statements, you can move to the data step itself, which does not output any data set because of the data _null_ statement:

data _null_;
    length rc msg $ 300;
    rc = fcopy('src','dst');
    if rc ne 0 then do;
        msg = sysmsg();
        putlog "ERROR: FCOPY failed rc=" rc " " msg;
    end;
    else putlog "NOTE: Copied OK.";
run;

The main engine here is the fcopy function, which outputs a return code (rc). Other statements like putlog are there to communication outcomes and error messages when the file copying operation fails. The text of the error message (msg) comes from the sysmsg function. After file copying has completed, it is time to unset the filename statements as follows:

filename src clear;
filename dst clear;

One thing that needs to be pointed out here is that this is an approach best reserved for text files like what I was copying when doing this. When I attempted the same kind of operation with an XLSX file, the copy would not open in Excel afterwards. Along the way, it had got scrambled. Once you realise that an XLSX file is essentially a zip archive of XML files, you might see how that could go awry.

Resolving Python UnicodeEncodeError messages issued while executing scripts using the Windows Command Line

14th March 2025

Recently, I got caught out by this message when summarising some text using Python and Open AI's API while working within VS Code:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 56-57: character maps to <undefined>

There was no problem on Linux or macOS, but it was triggered on the Windows command line from within VS Code. Unlike the Julia or R REPL's, everything in Python gets executed in the console like this:

& "C:/Program Files/Python313/python.exe" script.py

The Windows command line shell operated with cp1252 character encoding, and that was tripping up the code like the following:

with open("out.txt", "w") as file:
    file.write(new_text)

The cure was to specify the encoding of the output text as utf-8:

with open("out.txt", "w", encoding='utf-8') as file:
    file.write(new_text)

After that, all was well and text was written to a file like in the other operating systems. One other thing to note is that the use of backslashes in file paths is another gotcha. Adding an r before the quotes gets around this to escape the contents, like using double backslashes. Using forward slashes is another option.

with open(r"c:\temp\out.txt", "w", encoding='utf-8') as file:
    file.write(new_text)

  • The content, images, and materials on this website are protected by copyright law and may not be reproduced, distributed, transmitted, displayed, or published in any form without the prior written permission of the copyright holder. All trademarks, logos, and brand names mentioned on this website are the property of their respective owners. Unauthorised use or duplication of these materials may violate copyright, trademark and other applicable laws, and could result in criminal or civil penalties.

  • All comments on this website are moderated and should contribute meaningfully to the discussion. We welcome diverse viewpoints expressed respectfully, but reserve the right to remove any comments containing hate speech, profanity, personal attacks, spam, promotional content or other inappropriate material without notice. Please note that comment moderation may take up to 24 hours, and that repeatedly violating these guidelines may result in being banned from future participation.

  • By submitting a comment, you grant us the right to publish and edit it as needed, whilst retaining your ownership of the content. Your email address will never be published or shared, though it is required for moderation purposes.