Technology Tales

Adventures in consumer and enterprise technology

Welcome

  • We live in a world where computer software penetrates every life and is embedded in nearly every piece of hardware that we have. Everything is digital now; it goes beyond computers and cameras, which were the main points of encounter when this website was started.

  • As if that were not enough, we have AI inexorably encroaching on our daily lives. The possibilities are becoming so endless that one needs to be careful not to get lost in it all. Novelty beguiles us now as much as it did when personal computing became widely available decades ago, and then again when the internet arrived. Excitement has returned, at least for a while.

  • All this percolates through what is here, so just dive in to see what you can find!

Ansible automation for Linux Mint updates with repository failover handling

7th November 2025

Recently, I had a Microsoft PPA outage disrupt an Ansible playbook-mediated upgrade process for my main Linux workstation. Thus, I ended up creating a failover for this situation, and the first step in the playbook was to define the affected repo:

  vars:
    microsoft_repo_url: "https://packages.microsoft.com/repos/code/dists/stable/InRelease"

The next move was to start defining tasks, with the first testing the repo to pick up any lack of responsiveness and flag that for subsequent operations.

  tasks:
  - name: Check Microsoft repository availability
    uri:
      url: "{{ microsoft_repo_url }}"
      method: HEAD
      return_content: no
      timeout: 10
    register: microsoft_repo_check
    failed_when: false

  - name: Set flag to skip Microsoft updates if unreachable
    set_fact:
      skip_microsoft_repos: "{{ microsoft_repo_check.status is not defined or microsoft_repo_check.status != 200 }}"

In the event of a failure, the next task was to disable the repo to allow other processing to take place. This was accomplished by temporarily renaming the relevant files under /etc/apt/sources.list.d/.

  - name: Temporarily disable Microsoft repositories
    become: true
    shell: |
      for file in /etc/apt/sources.list.d/microsoft*.list; do
        [ -f "$file" ] && mv "$file" "${file}.disabled"
      done
      for file in /etc/apt/sources.list.d/vscode*.list; do
        [ -f "$file" ] && mv "$file" "${file}.disabled"
      done
    when: skip_microsoft_repos | default(false)
    changed_when: false

With that completed, the rest of the update actions could be performed near enough as usual.

  - name: Update APT cache (retry up to 5 times)
    apt:
      update_cache: yes
    register: apt_update_result
    retries: 5
    delay: 10
    until: apt_update_result is succeeded

  - name: Perform normal upgrade
    apt:
      upgrade: yes
    register: apt_upgrade_result
    retries: 3
    delay: 10
    until: apt_upgrade_result is succeeded

  - name: Perform dist-upgrade with autoremove and autoclean
    apt:
      upgrade: dist
      autoremove: yes
      autoclean: yes
    register: apt_dist_result
    retries: 3
    delay: 10
    until: apt_dist_result is succeeded

After those, another renaming operation restores the files to their earlier names.

  - name: Re-enable Microsoft repositories
    become: true
    shell: |
      for file in /etc/apt/sources.list.d/*.disabled; do
        base="$(basename "$file" .disabled)"
        if [[ "$base" == microsoft* || "$base" == vscode* || "$base" == edge* ]]; then
          mv "$file" "/etc/apt/sources.list.d/$base"
        fi
      done
    when: skip_microsoft_repos | default(false)
    changed_when: false

Needless to say, this disabling only happens in the event of a repository failure. Otherwise, the steps are skipped and everything else is completed as it should be. While there is some cause for extending the repository disabling actions to other third-party repos as well, that is something that I will leave aside for now. Even this shows just how much can be done using Ansible playbooks and how much automation can be achieved. As it happens, I even get Flatpaks updated in much the same way:

  - name: Ensure Flatpak is installed
    apt:
      name: flatpak
      state: present
      update_cache: yes
      cache_valid_time: 3600

  - name: Update Flatpak remotes
    command: flatpak update --appstream -y
    register: flatpak_appstream
    changed_when: "'Now at' in flatpak_appstream.stdout"
    failed_when: flatpak_appstream.rc != 0

  - name: Update all Flatpak applications
    command: flatpak update -y
    register: flatpak_result
    changed_when: "'Now at' in flatpak_result.stdout"
    failed_when: flatpak_result.rc != 0

  - name: Uninstall unused Flatpak applications
    command: flatpak uninstall --unused -y
    register: flatpak_cleanup
    changed_when: "'Nothing' not in flatpak_cleanup.stdout"
    failed_when: flatpak_cleanup.rc != 0

  - name: Repair Flatpak installations
    command: flatpak repair
    register: flatpak_repair
    changed_when: flatpak_repair.stdout is search('Repaired|Fixing')
    failed_when: flatpak_repair.rc != 0

The ability to call system commands as you see in the above sequence is an added bonus, though getting the response detection completely sorted remains an outstanding task. All this has only scratched the surface of what is possible.

Automating Positron and RStudio updates on Linux Mint 22

6th November 2025

Elsewhere, I have written about avoiding manual updates with VSCode and VSCodium. Here, I come to IDEs produced by Posit, formerly RStudio, for data science and analytics uses. The first is a more recent innovation that works with both R and Python code natively, while the second has been around for much longer and focusses on native R code alone, though there are R packages allowing an interface of sorts with Python. Neither is released via a PPA, necessitating either manual downloading or the scripted approach taken here for a Linux system. Each software tool will be discussed in turn.

Positron

Now, we work through a script that automates the upgrade process for Positron. This starts with a shebang line calling the bash executable before moving to a line that adds safety to how the script works using a set statement. Here, the -e switch triggers exiting whenever there is an error, halting the script before it carries on to perform any undesirable actions. That is followed by the -u switch, which causes errors when unset variables are referenced; normally these would expand to an empty value, which is not desirable in all cases. Lastly, the -o pipefail switch causes a pipeline (cmd1 | cmd2 | cmd3) to fail if any command in the pipeline produces an error, which can help debugging because the error is associated with the command that fails to complete.

#!/bin/bash
set -euo pipefail

The next step then is to determine the architecture of the system on which the script is running so that the correct download is selected.

ARCH=$(uname -m)
case "$ARCH" in
  x86_64) POSIT_ARCH="x64" ;;
  aarch64|arm64) POSIT_ARCH="arm64" ;;
  *) echo "Unsupported arch: $ARCH"; exit 1 ;;
esac

Once that completes, we define the address of the web page to be interrogated and the path to the temporary file that is to be downloaded.

RELEASES_URL="https://github.com/posit-dev/positron/releases"
TMPFILE="/tmp/positron-latest.deb"

Now, we scrape the page to find the address of the latest DEB file that has been released.

echo "Finding latest Positron .deb for $POSIT_ARCH..."
DEB_URL=$(curl -fsSL "$RELEASES_URL" \
  | grep -Eo "https://cdn\.posit\.co/[A-Za-z0-9/_\.-]+Positron-[0-9\.~-]+-${POSIT_ARCH}\.deb" \
  | head -n 1)

If that were to fail, we get an error message produced before the script is aborted.

if [ -z "${DEB_URL:-}" ]; then
  echo "Could not find a .deb link for ${POSIT_ARCH} on the releases page"
  exit 1
fi

Should all go well thus far, we download the latest DEB file using curl.

echo "Downloading: $DEB_URL"
curl -fL "$DEB_URL" -o "$TMPFILE"

When the download completes, we try installing the package using apt, much like we do with a repo, apart from specifying an actual file path on our system.

echo "Installing Positron..."
sudo apt install -y "$TMPFILE"

Following that, we delete the installation file and issue a message informing the user of the task's successful completion.

echo "Cleaning up..."
rm -f "$TMPFILE"

echo "Done."

When I do this, I tend to find that the Python REPL console does not open straight away, so I shut down Positron and leave things for a while before starting it again. There may be temporary files that need to be expunged, and that takes its own time. Someone else might have a better explanation, which I would happily adopt if it makes more sense than what I am suggesting. Otherwise, all works well.

RStudio

A lot of the same processing happens in the script that updates RStudio, so we will just cover the differences. The set -x statement ensures that every command is printed to the console for the debugging that was needed while this was being developed. Otherwise, much code, including architecture detection, is shared between the two apps.

#!/bin/bash
set -euo pipefail
set -x

# --- Detect architecture ---
ARCH=$(uname -m)
case "$ARCH" in
  x86_64) RSTUDIO_ARCH="amd64" ;;
  aarch64|arm64) RSTUDIO_ARCH="arm64" ;;
  *) echo "Unsupported architecture: $ARCH"; exit 1 ;;
esac

Figuring out the distro version and the web page to scrape was where additional effort was needed, and that is reflected in some of the code that follows. Otherwise, many of the ideas applied with Positron also have a place here.

# --- Detect Ubuntu base ---
DISTRO=$(grep -oP '(?<=UBUNTU_CODENAME=).*' /etc/os-release || true)
[ -z "$DISTRO" ] && DISTRO="noble"
# --- Define paths ---
TMPFILE="/tmp/rstudio-latest.deb"
LOGFILE="/var/log/rstudio_update.log"

echo "Detected Ubuntu base: ${DISTRO}"
echo "Fetching latest version number from Posit..."

# --- Get version from Posit's official RStudio Desktop page ---
VERSION=$(curl -s https://posit.co/download/rstudio-desktop/ \
  | grep -Eo 'rstudio-[0-9]+\.[0-9]+\.[0-9]+-[0-9]+' \
  | head -n 1 \
  | sed -E 's/rstudio-([0-9]+\.[0-9]+\.[0-9]+-[0-9]+)/\1/')

if [ -z "$VERSION" ]; then
  echo "Error: Could not extract the latest RStudio version number from Posit's site."
  exit 1
fi

echo "Latest RStudio version detected: ${VERSION}"

# --- Construct download URL (Jammy build for Noble until Noble builds exist) ---
BASE_DISTRO="jammy"
BASE_URL="https://download1.rstudio.org/electron/${BASE_DISTRO}/${RSTUDIO_ARCH}"
FULL_URL="${BASE_URL}/rstudio-${VERSION}-${RSTUDIO_ARCH}.deb"

echo "Downloading from:"
echo "  ${FULL_URL}"

# --- Validate URL before downloading ---
if ! curl --head --silent --fail "$FULL_URL" >/dev/null; then
  echo "Error: The expected RStudio package was not found at ${FULL_URL}"
  exit 1
fi

# --- Download and install ---
curl -L "$FULL_URL" -o "$TMPFILE"
echo "Installing RStudio..."
sudo apt install -y "$TMPFILE" | tee -a "$LOGFILE"

# --- Clean up ---
rm -f "$TMPFILE"
echo "RStudio update to version ${VERSION} completed successfully." | tee -a "$LOGFILE"

When all was done, RStudio worked without a hitch, leaving me to move on to other things. The next time that I am prompted to upgrade the environment, this is likely the way I will go.

A primer on Regular Expressions in PowerShell: Learning through SAS code analysis

5th November 2025

Many already realise that regular expressions are a powerful tool for finding patterns in text, though they can be complex in their own way. Thus, this primer uses a real-world example (scanning SAS code) to introduce the fundamental concepts you need to work with regex in PowerShell. Since the focus here is on understanding how patterns are built and applied, you do not need to know SAS to follow along.

Getting Started with Pattern Matching

Usefully, PowerShell gives you several ways to test whether text matches a pattern, and the simplest is the -match operator:

$text = "proc sort data=mydata;"
$text -match "proc sort"  # Returns True

When -match finds a match, it returns True. PowerShell's -match operator is case-insensitive by default, so "PROC SORT" would also match. If you need case-sensitive matching, use -cmatch instead.
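To see the difference in a console, a quick check along these lines works (the sample text is only illustrative):

$text = "PROC SORT data=mydata;"
$text -match "proc sort"    # True  (case-insensitive by default)
$text -cmatch "proc sort"   # False (case-sensitive)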

For more detailed results, Select-String is useful:

$text | Select-String -Pattern "proc sort"

This shows you where the match occurred and provides context around it.

Building Your First Patterns

Literal Matches

The simplest patterns match exactly what you type. If you want to find the text proc datasets, you write:

$pattern = "proc datasets"

This will match those exact words in sequence.

Special Characters and Escaping

Some characters have special meaning in regex. The dot (.) matches any single character, so if you wish to match a literal dot, you need to escape it with a backslash:

$pattern = "first\."  # Matches "first." literally

Without the backslash, first. would match "first" followed by any character (for example, "firstly", "first!", "first2").

Alternation

The pipe symbol (|) lets you match one thing or another:

$pattern = "(delete;$|output;$)"

This matches either delete; or output; at the end of a line. The dollar sign ($) is an anchor that means "end of line".

Anchors: Controlling Where Matches Occur

Anchors do not match characters. Instead, they specify positions in the text.

  • ^ matches the start of a line
  • $ matches the end of a line
  • \b matches a word boundary (the position between a word character and a non-word character)

Here is a pattern that finds proc sort only at the start of a line:

$pattern = "^\s*proc\s+sort"

Breaking this down:

  • ^ anchors to the start of the line
  • \s* matches zero or more whitespace characters
  • proc matches the literal text
  • \s+ matches one or more whitespace characters
  • sort matches the literal text

The \s is a shorthand for any whitespace character (spaces, tabs, newlines). The * means "zero or more" and + means "one or more".

Quantifiers: How Many Times Something Appears

Quantifiers control repetition:

    • * means zero or more
    • + means one or more
    • ? means zero or one (optional)
    • {n} means exactly n times
    • {n,} means n or more times
    • {n,m} means between n and m times

By default, quantifiers are greedy (they match as much as possible). Adding ? after a quantifier makes it reluctant (it matches as little as possible):

$pattern = "%(m|v).*?(?=[,(;\s])"

Here .*? matches any characters, but as few as possible before the next part of the pattern matches. This is useful when you want to stop at the first occurrence of something rather than continuing to the last.
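A quick comparison makes the greedy and reluctant behaviours visible; the sample string below is only for illustration:

$text = "%mvar, %vvar; done"
[regex]::Match($text, "%(m|v).*?(?=[,(;\s])").Value  # "%mvar" - stops at the first delimiter
[regex]::Match($text, "%(m|v).*(?=[,(;\s])").Value   # "%mvar, %vvar;" - runs on to the last one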

Groups and Capturing

Parentheses create groups, which serve two purposes: they let you apply quantifiers to multiple characters, and they capture the matched text for later use:

$pattern = "(first\.|last\.)"

This creates a group that matches either first. or last.. The group captures whichever one matches so you can extract it later.
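After a successful -match, PowerShell stores the captures in the automatic $matches variable, so the captured prefix can be pulled out directly:

$text = "first.flag and last.flag"
$text -match "(first\.|last\.)"   # True
$matches[1]                       # "first."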

Non-capturing groups, written as (?:...), group without capturing. This is useful when you need grouping for structure but do not need to extract the matched text:

$pattern = "(?:raw\.|sdtm\.|adam\.)"

Lookaheads: Matching Without Consuming

Lookaheads are powerful but can be confusing at first. They check what comes next without including it in the match.

A positive lookahead (?=...) succeeds if the pattern inside matches what comes next:

$pattern = "%(m|v).*?(?=[,(;\s])"

This matches %m or %v followed by any characters, stopping just before a comma, parenthesis, semicolon or whitespace. The delimiter is not included in the match.

A negative lookahead (?!...) succeeds if the pattern inside does not match what comes next:

$pattern = "(?!(?:r_options))\b(raw\.|sdtm\.|adam\.|r_)(?!options)"

This is more complex. It matches library prefixes like raw., sdtm., adam. or r_, but only if:

  • The text at the current position is not r_options
  • The matched prefix is not followed by options

This prevents false matches on text like r_options whilst still allowing r_something_else.

Inline Modifiers

You can change how patterns behave by placing modifiers at the start:

  • (?i) makes the pattern case-insensitive
  • (?-i) makes the pattern case-sensitive

$pattern = "(?i)^\s*proc\s+sort"

This matches proc sort, PROC SORT, Proc Sort or any other case variation.

A Complete Example: Finding PROC SORT Without DATA=

Let us build up a practical pattern step by step. The goal is to find PROC SORT statements that are missing a DATA= option.

Start with the basic match:

$pattern = "proc sort"

Add case-insensitivity and line-start flexibility:

$pattern = "(?i)^\s*proc\s+sort"

Add a word boundary to avoid matching proc sorting:

$pattern = "(?i)^\s*proc\s+sort\b"

Now add a negative lookahead to exclude lines that contain data=:

$pattern = "(?i)^\s*proc\s+sort\b(?![^;\n]*\bdata\s*=\s*)"

This negative lookahead checks the rest of the line (up to a semicolon or newline) and fails if it finds data followed by optional spaces, an equals sign and more optional spaces.

Finally, match the rest of the line:

$pattern = "(?i)^\s*proc\s+sort\b(?![^;\n]*\bdata\s*=\s*).*"

Working with Multiple Patterns

Real-world scanning often involves checking many patterns. PowerShell arrays make this straightforward:

$matchStrings = @(
    "%(m|v).*?(?=[,(;\s])",
    "(raw\.|sdtm\.|adam\.).*?(?=[,(;\s])",
    "(first\.|last\.)",
    "proc datasets"
)

$text = "Use %mvar in raw.dataset with first.flag"

foreach ($pattern in $matchStrings) {
    if ($text -match $pattern) {
        Write-Host "Match found: $pattern"
    }
}

Finding All Matches in a String

The -match operator only tells you whether a pattern matches. To find all occurrences, use [regex]::Matches:

$text = "first.x and last.y and first.z"
$pattern = "(first\.|last\.)"
$matches = [regex]::Matches($text, $pattern)

foreach ($match in $matches) {
    Write-Host "Found: $($match.Value) at position $($match.Index)"
}

This returns a collection of match objects, each containing details about what was found and where.

Replacing Text

The -replace operator applies a pattern and substitutes matching text:

$text = "proc datasets; run;"
$text -replace "proc datasets", "proc sql"
# Result: "proc sql; run;"

You can use captured groups in the replacement:

$text = "raw.demographics"
$text -replace "(raw\.|sdtm\.|adam\.)", "lib."
# Result: "lib.demographics"

Validating Patterns Before Use

Before running patterns against large files, validate that they are correct:

$matchStrings = @(
    "%(m|v).*?(?=[,(;\s])",
    "(raw\.|sdtm\.|adam\.).*?(?=[,(;\s])"
)

foreach ($pattern in $matchStrings) {
    try {
        [regex]::new($pattern) | Out-Null
        Write-Host "Valid: $pattern"
    }
    catch {
        Write-Host "Invalid: $pattern - $($_.Exception.Message)"
    }
}

This catches malformed patterns (for example, unmatched parentheses or invalid syntax) before they cause problems in your scanning code.

Scanning Files Line by Line

A typical workflow reads a file and checks each line against your patterns:

$matchStrings = @(
    "proc datasets",
    "(first\.|last\.)",
    "%(m|v).*?(?=[,(;\s])"
)

$code = Get-Content "script.sas"

foreach ($line in $code) {
    foreach ($pattern in $matchStrings) {
        if ($line -match $pattern) {
            Write-Warning "Pattern '$pattern' found in: $line"
        }
    }
}

Counting Pattern Occurrences

To understand which patterns appear most often:

$results = @{}

foreach ($pattern in $matchStrings) {
    $count = ($code | Select-String -Pattern $pattern).Count
    $results[$pattern] = $count
}

$results | Format-Table

This builds a table showing how many times each pattern matched across the entire file.

Practical Tips

Start simple. Build patterns incrementally. Test each addition to ensure it behaves as expected.

Use verbose mode for complex patterns. PowerShell supports (?x) which allows whitespace and comments inside patterns:

$pattern = @"
(?x)        # Enable verbose mode
^           # Start of line
\s*         # Optional whitespace
proc        # Literal "proc"
\s+         # Required whitespace
sort        # Literal "sort"
\b          # Word boundary
"@

Test against known examples. Create a small set of test strings that should and should not match:

$shouldMatch = @("proc sort;", "  PROC SORT data=x;")
$shouldNotMatch = @("proc sorting", "# proc sort")

foreach ($test in $shouldMatch) {
    if ($test -notmatch $pattern) {
        Write-Warning "Failed to match: $test"
    }
}

Document your patterns. Regular expressions can be cryptic. Add comments explaining what each pattern does and why it exists:

# Match macro variables starting with %m or %v, stopping at delimiters
$pattern1 = "%(m|v).*?(?=[,(;\s])"

# Match library prefixes (raw., sdtm., adam.) before delimiters
$pattern2 = "(raw\.|sdtm\.|adam\.).*?(?=[,(;\s])"

Two Approaches to the Same Problem

The SAS-scanning exercise that prompted this primer uses two arrays of patterns. One is extended with negative lookaheads to handle ambiguous cases. The other is simplified for cleaner codebases. Understanding why both exist teaches an important lesson: regex is not one-size-fits-all.

The extended version handles edge cases:

$pattern = "(?i)(?!(?:r_options))\b(raw\.|sdtm\.|adam\.|r_)(?!options)\w*?(?=[,(;\s])"

The simplified version assumes those edge cases do not occur:

$pattern = "(raw\.|sdtm\.|adam\.).*?(?=[,(;\s])"

Choose the approach that matches your data. If you know your text follows strict conventions, simpler patterns are easier to maintain. If you face ambiguity, add the precision you need (but no more).

Moving Forward

This primer has introduced the building blocks: literals, special characters, anchors, quantifiers, groups, lookaheads and modifiers. You have seen how to apply patterns in PowerShell using -match, Select-String and [regex]::Matches. You have also learnt to validate patterns, scan files and count occurrences.

The best way to learn is to experiment. Take a simple text file and try to match patterns in it. Build patterns incrementally. When something does not work as expected, break the pattern down into smaller pieces and test each part separately.

Regular expressions are not intuitive at first, but they become clear with practice. The examples here are drawn from SAS code analysis, yet the techniques apply broadly. Whether you are scanning logs, parsing configuration files or extracting data from reports, the principles remain the same: understand what you want to match, build the pattern step by step, test thoroughly and document your work.

Rendering Markdown in WordPress without plugins by using Parsedown

4th November 2025

Much of what GenAI generates as articles is output as Markdown, meaning that you need to convert the content when using it on a WordPress website. Naturally, this kind of thing should be done with care to ensure that you are the creator and that it is not all the work of a machine; orchestration is fine, regurgitation does not add that much. Fact checking is another need as well.

Writing plain Markdown has secured its own following as well, with WordPress plugins switching over the editor to facilitate such a mode of editing. When I tried Markup Markdown, I found it restrictive when it came to working with images within the text, and it needed a workaround for getting links to open in new browser tabs as well. Thus, I got rid of it, only to realise that it had not converted any Markdown as I expected; it merely rendered it at post or page display time. Rather than attempting to update the affected text, I decided to see if another solution could be found.

This took me to Parsedown, which proved to be handy for accomplishing what I needed once I had everything set in place. First, that meant cloning its GitHub repo onto the web server. Next, I created a directory called includes under my theme's directory and copied Parsedown.php into it. When all was done, I ensured that file and directory ownership were assigned to www-data to avoid execution issues.
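For anyone wanting the commands, this is a minimal sketch of that setup; the theme path is only an assumption, so adjust it to match your own installation:

cd /tmp
git clone https://github.com/erusev/parsedown.git

# Assumed theme location; change to suit your site
THEME_DIR=/var/www/html/wp-content/themes/mytheme
sudo mkdir -p "$THEME_DIR/includes"
sudo cp /tmp/parsedown/Parsedown.php "$THEME_DIR/includes/"

# Hand ownership to the web server user to avoid execution issues
sudo chown -R www-data:www-data "$THEME_DIR/includes"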

Then, I could set to updating the functions.php file. The first line to get added there included the parser file:

require_once get_template_directory() . '/includes/Parsedown.php';

After that, I found that I needed to disable the WordPress rendering machinery because that got in the way of Markdown rendering:

remove_filter('the_content', 'wpautop');
remove_filter('the_content', 'wptexturize');

The last step was to add a filter that parsed the Markdown and passed its output to WordPress rendering to do the rest as usual. This was a simple affair until I needed to deal with code snippets in pre and code tags. Hopefully, the included comments tell you much of what is happening. A possible exception is $pre_matches[0], which itself is an array of entire <pre>...</pre> blocks including the containing tags, with $i => $block doing a key => value lookup of the values in the array nesting (the key here is the array index, not the $key placeholder used later in the code, by the way).

add_filter('the_content', function($content) {
    // Prepare a store for placeholders
    $placeholders = [];

    // 1. Extract pre blocks (including nested code) and replace with safe placeholders
    preg_match_all('/<pre\b[^>]*>.*?<\/pre>/si', $content, $pre_matches);
    foreach ($pre_matches[0] as $i => $block) {
        $key = "§PREBLOCK{$i}§";
        $placeholders[$key] = $block;
        $content = str_replace($block, $key, $content);
    }

    // 2. Extract standalone code blocks (not inside pre)
    preg_match_all('/<code\b[^>]*>.*?<\/code>/si', $content, $code_matches);
    foreach ($code_matches[0] as $i => $block) {
        $key = "§CODEBLOCK{$i}§";
        $placeholders[$key] = $block;
        $content = str_replace($block, $key, $content);
    }

    // 3. Run Parsedown on the remaining content
    $Parsedown = new Parsedown();
    $content = $Parsedown->text($content);

    // 4. Restore both pre and code placeholders
    foreach ($placeholders as $key => $block) {
        $content = str_replace($key, $block, $content);
    }

    // 5. Apply paragraph formatting
    return wpautop($content);
}, 12);

All of this avoided dealing with extra plugins to produce the required result. Handily, I still use the Classic Editor, which makes this work a lot more easily. There still is a Markdown import plugin that I am tempted to remove as well to streamline things. That can wait, though. It is best not to add any more of them anyway, not least to avoid clashes between them and what is now in the theme.

Building an email summariser for Apple Mail using both OpenAI and Shortcuts

3rd November 2025

One thing that I am finding useful in Outlook is the ability to summarise emails using Copilot, especially for those that I do not need to read in full. While Apple Mail does have something similar, I find it to be very terse in comparison. Thus, I started to wonder about doing the same thing using the OpenAI API and the Apple Shortcuts app. All that follows applies to macOS Sequoia, though the Tahoe version is with us too.

Prerequisite

While you can have the required OpenAI API key declared within the Shortcut, that is a poor practice from a security point of view. Thus, you will need this to be stored in the macOS keychain, which can be accomplished within a Terminal session by issuing a command like the following:

security add-generic-password -a openai -s openai_api_key -w [API Key]

In the command above, you need to add the actual API key before executing it to ensure that it is available to the steps that follow. To check that all is in order, issue the following command to see the API key again:

security find-generic-password -a openai -s openai_api_key -w

This process also allows you to rotate credentials without editing the workflow, so API keys can be changed should that ever be needed.
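Should the key ever need replacing, one approach is to delete the stored entry and add the new value, along these lines:

security delete-generic-password -a openai -s openai_api_key
security add-generic-password -a openai -s openai_api_key -w [New API Key]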

Building the Shortcut

With the API safely stored, we can move onto the actual steps involved in setting up the Email Summarisation Shortcut that we need.

Step 1: Collect Selected Email Messages

First, open the Shortcuts app and create a new Shortcut. Then, add a Run AppleScript action that contains the following code:

tell application "Mail"
    set selectedMessages to selection
    set collectedText to ""
    repeat with msg in selectedMessages
        set msgSubject to subject of msg
        set msgBody to content of msg
        set collectedText to collectedText & "Subject: " & msgSubject & return & msgBody & return & return
    end repeat
end tell
return collectedText

This script loops through the selected Mail messages and combines their subjects and bodies into a single text block.

Step 2: Retrieve the API Key

Next, add a Run Shell Script action and paste this command:

security find-generic-password -a openai -s openai_api_key -w | tr -d '\n'

This reads the API key from the keychain and strips any trailing newline characters that could break the authentication header, the first of several gotchas that took me a while to sort.

Step 3: Send the Request to GPT-5

Then, add a Get Contents of URL action and configure it as follows:

URL: https://api.openai.com/v1/chat/completions

Method: POST

Headers:

  • Authorization: Bearer [Shell Script result]
  • Content-Type: application/json

Request Body (JSON):

{
  "model": "gpt-5",
  "temperature": 1,
  "messages": [
    {
      "role": "system",
      "content": "Summarise the following email(s) clearly and concisely."
    },
    {
      "role": "user",
      "content": "[AppleScript result]"
    }
  ]
}

When this step is executed, it replaces [Shell Script result] with the output from Step 2, and [AppleScript result] with the output from Step 1. Here, GPT-5 only accepts a temperature value of 1 (a lower value would limit the variability in the output if it could be used), unlike other OpenAI models and what you may see documented elsewhere.

Step 4: Extract the Summary from the Response

The API returns a JSON response that you need to parse, an operation that differs according to the API; Anthropic Claude has a different structure, for example. To accomplish this for OpenAI's gateway, add these actions in sequence to replicate what is achieved in Python by reading completion.choices[0].message.content (a rough Python equivalent is sketched after the list):

  1. Get Dictionary from Input (converts the response to a dictionary)
  2. Get Dictionary Value for key "choices"
  3. Get Item from List (select item 1)
  4. Get Dictionary Value for key "message"
  5. Get Dictionary Value for key "content"
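For comparison, this is roughly what those actions replicate when using the OpenAI Python client; the variable holding the collected email text is only a stand-in here:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

collected_text = "Subject: Example\nBody text gathered from the selected messages..."

completion = client.chat.completions.create(
    model="gpt-5",
    temperature=1,
    messages=[
        {"role": "system",
         "content": "Summarise the following email(s) clearly and concisely."},
        {"role": "user", "content": collected_text},
    ],
)

# The dictionary -> list -> dictionary -> dictionary walk in Shortcuts lands here:
summary = completion.choices[0].message.content
print(summary)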

Once all is done (and it took me a while to get that to happen because of the dictionary → list → dictionary → dictionary flow; figuring out that not everything in the nesting was a dictionary took some time), click the information button on this final action and rename it to Summary Text. This makes it easier to reference in later steps.

Step 5: Display the Summary

Add a Show action and select the Summary Text variable. This shows the generated summary in a window with Close and Share buttons. The latter allows you to send the output to applications like Notes or OneNote, but not to Pages or Word. In macOS Sequoia, the list is rather locked down, which means that you cannot extend it beyond the available options. In use or during setup testing, beware of losing the open summary window behind others if you move to another app, because it is tricky to get back to without using the CTRL + UP keyboard shortcut to display all open windows at once.

Step 6: Copy to Clipboard

Given the aforementioned restrictions, there is a lot to be said for adding a Copy to Clipboard action with the Summary Text variable as input. This allows you to paste the summary immediately into other apps beyond those available using the Share facility.

Step 7: Return Focus to Mail

After all these, add another Run AppleScript action with this single line:

tell application "Mail" to activate

This brings the Mail app back to the front, which is particularly useful when you trigger the Shortcut via a keyboard shortcut or if you move to another app window.

Step 8: Make the New Shortcut Available for Use

Lastly, click the information button at the top of your Shortcut screen. One useful option that can be activated is the Pin in Menu Bar one, which adds a menu to the top bar with an entry for the new Email Summary Shortcut in there. Ticking the box for the Use as Quick Action option allows you to set a keyboard shortcut. Until the menu bar option won me over, that did have its uses. You just have to ensure that what you select does not override any combination that is in use already. Handily, I also found icons for my Shortcuts in Launchpad as well, which means that they could be added to the Dock too, something that I also briefly did.

Using the Shortcut

After expending the effort needed to set it up, using the new email summariser is straightforward. In Apple Mail, select one or more messages that you want to summarise; there is no need to select and copy the contained textual content because the Shortcut does that for you. Using the previously assigned keyboard combination, menu or Launchpad icon then triggers the summarisation processing. Thus, a window appears moments later displaying the generated summary while the same text is copied to your clipboard, ready to paste anywhere you need it to go. When you dismiss the pop-up window, the Mail app then automatically comes back into focus again.

Comet and Atlas: Navigating the security risks of AI Browsers

2nd November 2025

The arrival of the ChatGPT Atlas browser from OpenAI on 21st October has lured me into some probing of its possibilities. While Perplexity may have launched its Comet browser first on 9th July, their tendency to put news under our noses in other places had turned me off them. It helps that the former is offered at no extra charge for ChatGPT users, while the latter comes with a free tier and an optional Plus subscription plan. My having a Mac means that I do not need to await Windows and mobile versions of Atlas, either.

Both aim to interpret pages, condense information and carry out small jobs that cut down the number of clicks. Atlas does so with a sidebar that can read multiple documents at once and an Agent Mode that can execute tasks in a semi-autonomous way, while Comet leans into shortcut commands that trigger compact workflows. However, both browsers are beset by security issues that give enough cause for concern that added wariness is in order.

In many ways, they appear to be solutions looking for problems to address. In Atlas, I found the Agent mode needed added guidance when checking the content of a personal website for gaps. Jobs can become too big for it, so they need everything broken down. Add in the security concerns mentioned below, and enthusiasm for seeing what they can do gets blunted. When you see Atlas adding threads to your main ChatGPT roster, that gives you a hint as to what is involved.

The Security Landscape

Both Comet and Atlas are susceptible to indirect prompt injection, where pages contain hidden instructions that the model follows without user awareness, and AI sidebar spoofing, where malicious sites create convincing copies of AI sidebars to direct users into compromising actions. Furthermore, demonstrations have included scenarios where attackers steal cryptocurrency and gain access to Gmail and Google Drive.

For instance, Brave's security team has described indirect prompt injection as a systemic challenge affecting the whole class of AI-augmented browsers. Similarly, Perplexity's security group has stated that the phenomenon demands rethinking security from the ground up. In a test involving 103 phishing attacks, Microsoft Edge blocked 53 percent and Google Chrome 47 percent, yet Comet blocked 7 percent and Atlas 5.8 percent.

Memory presents an additional attack surface because these tools retain information between sessions, and researchers have demonstrated that memory can be poisoned by carefully crafted content, with the taint persisting across sessions and devices if synchronisation is enabled. Shadow IT adoption has begun: within nine days of launch, 27.7 percent of enterprises had at least one Atlas download, with uptake in technology at 67 percent, pharmaceuticals at 50 percent and finance at 40 percent.

Mitigating the Risks

Sensibly, security practitioners recommend separating ordinary browsing from agentic browsing. Here, it helps that AI browsers are cut down items anyway, at least based on my experience of Atlas. Figuring out what you can do with them using public information in a read-only manner will be enough at this point. In any event, it is essential to keep them away from banking, health, personal accounts, credentials, payments and regulated data until security improves.

As one precaution, maintaining separate AI accounts could act as a boundary to contain potential compromises, though this does not address the underlying issue that prompt injection manipulates the agent's decision-making processes. With Atlas, disable Browser Memories and per-site visibility by default, with explicit opt-ins only on specific public sites. Additionally, use Agent Mode only when not logged into any accounts. Furthermore, do not import passwords or payment methods. With Comet, use narrowly scoped shortcuts that operate on public information and avoid workflows involving sign-ins, credentials or payments.

Small businesses can run limited pilots in non-sensitive areas with strict allow and deny lists, then reassess by mid-2026 as security hardens, while large enterprises should adopt a block-and-monitor stance while developing governance frameworks that anticipate safer releases in 2026 and 2027. In parallel, security teams should watch for circumvention attempts and prepare policies that separate public research from sensitive work, mandate safe defaults and prohibit connections to confidential systems. Finally, training is necessary because users need to understand the specific risks these browsers present.

How Competition Might Help

Established browser vendors are adding AI capabilities on top of existing security infrastructure. Chrome is integrating Gemini, and Edge is incorporating Copilot more tightly into the workflow. Meanwhile, Brave continues with a privacy-first stance through Leo, while Opera's Aria, Arc with Dia and SigmaOS reflect different approaches. Current projections suggest that major browsers will introduce safer AI features in the final quarter of 2025, that the first enterprise-ready capabilities will arrive in the first half of 2026 and that by 2027 AI-assisted browsing will be standard and broadly secure.

Competition from Chrome and Edge will drive AI assistance into more established security frameworks, while standalone AI browsers will work to address their security gaps. Mitigations for prompt injection and sidebar spoofing will likely involve layered approaches combining detection, containment and improved user interface signals. Until then, Comet and Atlas can provide productivity benefits in public-facing work and research, but their security posture is not suitable for sensitive tasks. Use the tools where the risk is acceptable, keep sensitive work in conventional browsers, and anticipate that safer versions will become standard over the next two years.

AI infrastructure under pressure: Outages, power demands and the race for resilience

1st November 2025

The past few weeks brought a clear message from across the AI landscape: adoption is racing ahead, while the underlying infrastructure is working hard to keep up. A pair of major cloud outages in October offered a stark stress test, exposing just how deeply AI has become woven into daily services.

At the same time, there were significant shifts in hardware strategy, a wave of new tools for developers and creators and a changing playbook for how information is found online. There is progress on resilience and efficiency, yet the system is still bending under demand. Understanding where it held, where it creaked and where it is being reinforced sets the scene for what comes next.

Infrastructure Stress and Outages

The outages dominated early discussion. An AWS incident that lasted around 15 hours and disrupted more than a thousand services was followed nine days later by a global Azure failure. Each cascaded across systems that depend on them, illustrating how AI now amplifies the consequences of platform problems.

This was less about a single point of failure and more about the growing blast radius when connected services falter. The effect on productivity was visible too: a separate 10-hour ChatGPT downtime showed how fast outages of core AI tools now translate into lost work time.

Power Demand and Grid Strain

Behind the headlines sits a larger story about electricity, grids and planning. Data centres accounted for roughly 4% of US electricity use in 2024, about 183 TWh, and the International Energy Agency projects around 945 TWh by 2030, with AI as a principal driver.

The averages conceal stark local effects. Wholesale prices near dense clusters have spiked by as much as 267% at times, household bills are rising by about $16–$18 per month in affected areas and capacity prices in the PJM market jumped from $28.92 per megawatt to $329.17. The US grid faces an upgrade bill of about $720 billion by 2030, yet permitting and build timelines are long, creating a bottleneck just as demand accelerates.

Technical Grid Issues

Technical realities on the grid add another layer of challenge. Fast load swings from AI clusters, harmonic distortions and degraded power quality are no longer theoretical concerns. A Virginia incident in which 60 data centres disconnected simultaneously did not trigger a collapse but did reveal the fragility introduced by concentrated high-performance compute.

Security and New Failure Modes

Security risks are evolving in parallel. Agentic systems that can plan, reason and call tools open new failure modes. AI-enabled spear phishing appears to be 350% more effective than traditional attempts and could be 50 times more profitable, a worrying backdrop when outages already have a clear link to lost productivity.

Security considerations now reach into the tools people use to access AI as well. New AI browsers attract attention, and with that comes scrutiny. OpenAI's Atlas and Perplexity's Comet launched with promising features, yet researchers flagged critical issues.

Comet is vulnerable to "CometJacking", a malicious URL hijack that enables data theft, while Atlas suffered a cross-site request forgery weakness that allowed persistent code injection into ChatGPT memory. Both products have been noted for assertive data collection.

Caution and good hygiene are prudent until the fixes and policies settle. It is a reminder that the convenience of integrating models directly into browsing comes with a new attack surface.

Efficiency and Mitigation Strategies

Industry responses are gathering pace. Efficiency remains the first lever. Hyperscalers now report power usage effectiveness around 1.08 to 1.09, compared with more typical figures of 1.5 to 1.6. Direct chip cooling can cut energy needs by up to 40%.

Grid-interactive operations and more work at the edge offer ways to smooth demand and reduce concentration risk, while new power partnerships hint at longer-term change. Microsoft's agreement with Constellation on nuclear power is one example of how compute providers are thinking beyond incremental efficiency gains.

An emerging pattern is becoming visible through these efforts. Proactive regional planning and rapid efficiency improvements could allow computational output to grow by an order of magnitude, while power use merely doubles. More distributed architectures are being explored to reduce the hazard of over-concentration.

A realistic outlook sets data centres at around 3% of global electricity use by 2030, which is notable but still smaller than anticipated growth from electric vehicles or air conditioning. If the $720 billion in grid investment materialises, it could add around 120 GW of capacity by 2030, as much as half of which would be absorbed by data centres. The resilience gap is real, but it appears to be narrowing, provided the sector moves quickly to apply lessons from each failure.

Regional and Policy Responses

Regional policies are starting to encourage resilience too. Oregon's POWER Act asks operators to contribute to grid robustness, Singapore's tight focus on efficiency has delivered around a 30% power reduction even as capacity expands and a moratorium in Dublin has pushed growth into more distributed build-outs. On the US federal government side, the Department of Homeland Security updated frameworks after a 2024 watchdog warning, with AI risk programmes now in place for 15 of the 16 critical infrastructure sectors.

Hardware Competition and Strategy

Competition is sharpening. Anthropic deepened its partnership with Google Cloud to train on TPUs, a move that challenges Nvidia's dominance and signals a broader rebalancing in AI hardware. Nvidia's chief executive has acknowledged TPUs as robust competition.

Another fresh entry came from Extropic, which unveiled thermodynamic sampling units, a probabilistic chip design that claims up to 10,000-fold lower energy use than GPUs for AI workloads. Development kits are shipping and a Z-1 chip is planned for next year, yet as with any radical architecture, proof at scale will take time.

Nvidia, meanwhile, presented an ambitious outlook, targeting $500 billion in chip revenue by 2026 through its Blackwell and Rubin lines. The US Department of Energy plans seven supercomputers comprising more than 100,000 Blackwell GPUs and the company announced partnerships spanning pharmaceuticals, industrials and consumer platforms.

A $1 billion investment in Nokia hints at the importance of AI-centric networks. New open-source models and datasets accompanied the announcements, and the company's share price surged to a record.

Corporate Restructuring

Corporate strategy and hardware choices also entered a new phase. OpenAI completed its restructuring into a public benefit corporation, with a rebranded OpenAI Foundation holding around $130 billion in equity and allocating $25 billion to health and AI resilience. Microsoft's stake now sits at about 27% and is worth roughly $135 billion, with technology rights retained through 2032. Both parties have scope to work with other partners. OpenAI committed around $250 billion to Azure yet retains the ability to use other compute providers. An independent panel will verify claims of artificial general intelligence, an unusual governance step that will be watched closely.

Search and Discovery Evolution

Away from infrastructure, the way audiences find and trust information is shifting. Search is moving from the old aim of ranking for clicks to answer engine optimisation, where the goal is to be quoted by systems such as ChatGPT, Claude or Perplexity.

The numbers explain why. Google handled more than five trillion queries in 2024, while generative platforms now process around 37.5 million prompt-like searches per day. Google's AI Overviews, which surface summary answers above organic results, have reshaped click behaviour.

Independent analyses report top-ranking pages seeing click-through rates fall by roughly a third where Overviews appear, with some keywords faring worse, and a Pew study finds overall clicks on such results dropping from 15% to 8%. Zero-click searches rose from around 56% to 69% between May 2024 and May 2025.

Chegg's non-subscriber traffic fell by 49% in this period, part of an ongoing dispute with Google. Google counters that total engagement in covered queries has risen by about 10%. Whichever way that one reads the data, the direction is clear: visibility is less about rank position and more about being cited by a summarising engine.

In practice, that means structuring content so that a model can parse, trust and attribute it. Clear Q&A-style sections with direct answers, followed by context and cited evidence, help models extract usable statements. Schema markup for FAQs and how-to content improves machine readability.

Measuring success also changes. Traditional analytics rarely show when an LLM quotes a source, so teams are turning to tools that track citations in AI outputs and tying those to conversion quality, branded search volume and more in-depth engagement with pricing or documentation. It is not a replacement for SEO so much as a layer that reinforces it in an AI-first environment.

Developer Tools and Agentic Workflows

On the tools front, developers saw an acceleration in agent-centred workflows. Cursor launched its first in-house coding model, Composer, which aims for near-frontier quality while generating code around four times faster, often in under 30 seconds.

The broader Cursor 2.0 update added multi-agent capabilities, with as many as eight assistants able to work in parallel, alongside browsing, a test browser and voice controls. The direction of travel is away from single-shot completions and towards orchestration and review. Tutorials are following suit, demonstrating how to scaffold tasks such as a Next.js to-do application using planning files, parallel agent tasks and quick integration, with voice prompts in the loop.

Open-source and enterprise ecosystems continue to expand. GitHub introduced Agent HQ for coordinating coding agents, Google released Pomelli to generate marketing campaigns and IBM's Granite 4.0 Nano models brought larger on-device options in the 350 million to 1.5 billion parameter range.

FlowithOS reported strong scores on agentic web tasks, while Mozilla announced an open speech dataset initiative, and Kilo Code, Hailuo 2.3 and other projects broadened choice across coding and video. Grammarly rebranded as Superhuman, adding "Superhuman Go" agents to speed up writing tasks.

Creative Tools and Partnerships

Creative workflows are evolving quickly, too. Adobe used its MAX event to add AI assistants to Photoshop and Express, previewed an agent called Project Moonlight, and upgraded Firefly with conversational "Prompt to Edit" controls, custom image models and new video features including soundtracks and voiceovers. Partnerships mean Gemini, Veo and Imagen will sit inside Adobe tools, and Premiere's editing capabilities now extend to YouTube Shorts.

Figma acquired Weavy and rebranded it as Figma Weave for richer creative collaboration, and Canva unveiled its own foundation "Design Model" alongside a Creative Operating System meant to produce fully editable, AI-generated designs. New Canva features take in a revised video suite, forms, data connectors, email design, a 3D generator and an ad creation and performance tool called Grow, while Affinity is relaunching as a free, integrated professional app. Other entrants are trying to blend model strengths: one agent was trailed with Sora 2 clip stitching, Veo 3.1 visuals and multimodel blending for faster design output.

Music rights and AI found a new footing. Universal Music Group settled a lawsuit with Udio, the AI music generator, and the two will form a joint venture to launch a licensed platform in 2026. Artists who opt in will be paid both for training models on their catalogues and for remixes. Udio disabled song downloads following the deal, which annoyed some users, and UMG also announced a "responsible AI" alliance with Stability AI to build tools for artists. These arrangements suggest a path towards sanctioned use of style and catalogue, with compensation built in from the start.

Research and Introspection

Research and science updates added depth. Anthropic reported that its Claude system shows limited introspection, detecting planted concepts only about 20% of the time, separating injected "thoughts" from text and modulating its internal focus. That highlights both the promise and limits of transparency techniques, and the potential for models to conceal or fail to surface certain internal states.

UC Berkeley researchers demonstrated an AI-driven load balancing algorithm with around 30% efficiency improvements, a result that could ripple through cloud performance. IBM ran quantum algorithms on AMD FPGAs, pointing to progress in hybrid quantum-classical systems.

OpenAI launched an AI-integrated web browser positioned as a challenger to incumbents, Perplexity released a natural-language patents search and OpenAI's Aardvark, a GPT-5-based security agent, entered private beta.

Anthropic opened a Tokyo office and signed a cooperation pact with Japan's AI Safety Institute. Tether released QVAC Genesis I, a large open STEM dataset of more than one million data points and a local workbench app aimed at making development more private and less dependent on big platforms.

Age Restrictions and Policy

Meanwhile, policy considerations are reaching consumer platforms. Character AI will restrict users under 18 from open-ended chatbot conversations from late November, replacing them with creative tools and adding behaviour-based age detection, a response to pressure and proposals such as the GUARD Act.

Takeaways

Put together, the picture is one of rapid interdependence and swift correction. The infrastructure is not breaking, but it is being stretched, and recent failures have usefully mapped the weak points. If the sector continues to learn quickly from its own missteps, the resilience gap will continue to narrow, and the next round of outages will be less disruptive than the last.

Investment is flowing into grids and cooling, policy is nudging towards resilience, and compute providers are hedging hardware bets by searching for efficiency and supply assurance. On the application layer, agents are becoming a primary interface for work, creative tools are converging around editability and control, and discovery is shifting towards being quoted by machines rather than clicked by humans.

Security lapses at the interface are a reminder that novelty often arrives before maturity. The most likely path from here is uneven but forward: data centre power may rise, yet efficiency and distribution can blunt the impact; answer engines may compress clicks, yet they can send higher intent visitors to clear, well-structured sources; hardware competition may fragment the stack, yet it can also reduce concentration risk.

WARNING: Unsupported device 'SVG' for RTF destination. Using default device 'EMF'.

31st October 2025

When running SAS programs to create some outputs, I met with the above message because I was sending output to an RTF file, and RTF does not support SVG graphics. This must be a system default because there was nothing in the programs that should trigger the warning. Getting rid of the message uses one of two approaches:

goptions device=emf;

ods graphics / outputfmt=emf;

The first applies to legacy code generating graphics with SAS/GRAPH procedures, while the second is for the more modern ODS Graphics procedures like SGPLOT. Though the first one may work in all cases, that cannot be assumed without further testing.
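As a sketch of where the second option sits in a typical program (the filename and plotting code here are only illustrative):

ods graphics / outputfmt=emf;

ods rtf file="example_output.rtf";

proc sgplot data=sashelp.class;
  scatter x=height y=weight;
run;

ods rtf close;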

There was another curiosity in this: the system setup should have taken care of things to stop the warning from appearing in the log. However, these were programs created from SAS logs to capture all code generated by any SAS macros that were called for regulatory submission purposes. Thus, they were self-contained and missed off the environment setup code, which is how I came to see what I did.

Remote access between Mac and Linux, Part 3: SSH, RDP and TigerVNC

30th October 2025

This is Part 3 of a three-part series on connecting a Mac to a Linux Mint desktop. Part 1 introduced the available options, whilst Part 2 covered x11vnc for sharing physical desktops.

Whilst x11vnc excels at sharing an existing desktop, many scenarios call for terminal access or a fresh graphical session. This article examines three alternatives: SSH for command-line work, RDP for responsive remote desktops with Xfce, and TigerVNC for virtual Cinnamon sessions.

Terminal Access via SSH

For many administrative tasks, a secure shell session is enough. On the Linux machine, the OpenSSH server needs to be installed and running. On Debian or Ubuntu-based systems, including Linux Mint, the required packages are available with standard tools.

Installing with sudo apt install openssh-server followed by enabling the service with sudo systemctl enable ssh and starting it with sudo systemctl start ssh is all that is needed. The machine's address on the local network can be identified with ip addr show, and it is the entry under inet for the active interface that will be used.

From the Mac, a terminal session to that address is opened with a command of the form ssh username@192.168.1.xxx, which yields a full shell on the Linux machine without further configuration. On a home network, there is no need for router changes, and SSH requires no extra client software on macOS.

SSH forms the foundation for secure operations beyond terminal access. It enables file transfer via scp and rsync, and can be used to create encrypted tunnels for other protocols when access from outside the local network is required.
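As an illustration of the tunnelling idea, a VNC session could be wrapped in SSH with something along these lines; the port numbers assume the TigerVNC setup described later and are only an example:

ssh -L 5901:localhost:5901 username@192.168.1.xxx

With that in place, pointing a VNC viewer at localhost:5901 on the Mac sends the traffic through the encrypted tunnel rather than directly across the network.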

RDP for New Desktop Sessions

Remote Desktop Protocol creates a new login session on the Linux machine and tends to feel smoother over imperfect links. On Linux Mint with Cinnamon, RDP is often the more responsive choice on a Mac, but Cinnamon's reliance on 3D compositing means xrdp does not work with it reliably. The usual workaround is to keep Cinnamon for local use and install a lightweight desktop specifically for remote sessions. Xfce works well in this role.

Setting Up xrdp with Xfce

After updating the package list, install xrdp with sudo apt install xrdp, set it to start automatically with sudo systemctl enable xrdp, and start it with sudo systemctl start xrdp. If a lightweight environment is not already available, install Xfce with sudo apt install xfce4, then tell xrdp to use it by creating a simple session file for the user account with echo "startxfce4" > ~/.xsession. Restarting the service with sudo systemctl restart xrdp completes the server side.
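In command form, the sequence amounts to the following sketch; the .xsession file applies to the account that will be connecting:

sudo apt update
sudo apt install xrdp xfce4
sudo systemctl enable xrdp
sudo systemctl start xrdp
echo "startxfce4" > ~/.xsession
sudo systemctl restart xrdp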

The Linux machine's IP address can be checked again so it can be entered into Microsoft Remote Desktop, which is a free download from the Mac App Store. Adding a new connection with the Linux IP and the user's credentials often suffices, and the first connection may present a certificate prompt that can be accepted.

RDP uses port 3389 by default, which needs no router configuration on the same network. It creates a new session rather than attaching to the one already shown on the Linux monitor, so it is not a means to view the live Cinnamon desktop, but performance is typically smooth and latency is well handled.

Why RDP with Xfce?

It is common for xrdp on Ubuntu-based distributions to select a simpler session type unless the user instructs it otherwise, which is why the small .xsession file pointing to Xfce helps. The combination of RDP's protocol efficiency and Xfce's lightweight nature delivers the most responsive experience for new sessions. The protocol translates keyboard and mouse input in a way that many clients have optimised for years, making it the most forgiving route when precise input behaviour matters. The trade-off is that what is shown is a separate desktop session, which can be a benefit or a drawback depending on the task.

TigerVNC for New Cinnamon Sessions

Those who want to keep Cinnamon for remote use can do so with a VNC server that creates a new virtual desktop. TigerVNC is a common choice on Linux Mint. Installing tigervnc-standalone-server, setting a password with vncpasswd and creating an xstartup file under ~/.vnc that launches Cinnamon will provide a new session for each connection.
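The server-side preparation, in command form, looks broadly like this:

sudo apt install tigervnc-standalone-server
vncpasswd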

Configuring TigerVNC

A minimal xstartup for Cinnamon sets the environment to X11, establishes the correct session variables and starts cinnamon-session. Making this file executable and then launching vncserver :1 starts a VNC server on port 5901. The server can be stopped later with vncserver -kill :1.

The xstartup script determines what desktop environment a virtual session launches, and setting the environment variables to Cinnamon then starting cinnamon-session is enough to present the expected desktop. Marking that startup file as executable is easy to miss, and it is required for TigerVNC to run it.
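A minimal xstartup along those lines might look as follows; the exact environment variables that Cinnamon expects can differ between releases, so treat this as a sketch rather than a definitive file:

#!/bin/sh
# ~/.vnc/xstartup: launch a Cinnamon session inside the virtual display
export XDG_SESSION_TYPE=x11
export XDG_CURRENT_DESKTOP=X-Cinnamon
exec cinnamon-session

Marking it executable with chmod +x ~/.vnc/xstartup and then running vncserver :1 starts the session on port 5901.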

From the Mac, the built-in Screen Sharing app can be used from Finder's Connect to Server entry by supplying vnc://192.168.1.xxx:5901, or a third-party viewer such as RealVNC Viewer can connect to the same address and port. This approach provides the Cinnamon look and feel, though it can be less responsive than RDP when the network is not ideal, and it also creates a new desktop session rather than sharing the one already in use on the Linux screen.

Clipboard Support in TigerVNC

For TigerVNC, clipboard support typically requires the vncconfig helper application to be running on the server. Starting vncconfig -nowin & in the background, often by adding it to the ~/.vnc/xstartup file, enables clipboard synchronisation between the VNC client and server for plain text.

File Transfer

File transfer between the machines is best handled using the command-line tools that accompany SSH. On macOS, scp file.txt username@192.168.1.xxx:/home/username/ sends a file to Linux and scp username@192.168.1.xxx:/home/username/file.txt ~/Desktop/ retrieves one, whilst rsync with -avz flags can be used for larger or incremental transfers.
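For instance, an rsync command of the following shape (the directory names are purely illustrative) copies only what has changed since the last run:

rsync -avz ~/Projects/ username@192.168.1.xxx:/home/username/Projects/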

These tools work reliably regardless of which remote access method is being used for interactive sessions. File copy-paste is not supported by the VNC setups described here, making scp and rsync the dependable choice for moving files between machines.

Operational Considerations

Port Management

Understanding port mappings helps avoid connection issues. VNC display numbers map directly to TCP ports, so :0 means 5900, :1 means 5901 and so on. RDP uses port 3389 by default. When connecting with viewers, supplying the address alone will use the default port for that protocol. If a specific port must be stated, use a single colon with the actual TCP port number.

First Connection Issues

If a connection fails unexpectedly, checking whether a server is listening with netstat can save time. On first-time connections to an RDP server, the client may display a certificate warning that can be accepted for home use.
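A check along these lines, using the default ports mentioned above, confirms whether anything is listening; netstat comes from the net-tools package, and ss -tlnp is the modern equivalent if it is absent:

sudo netstat -tlnp | grep -E '3389|590[0-9]'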

Making Services Persistent

For regular use, enabling services at boot removes the need for manual intervention. Both xrdp and TigerVNC can be configured to start automatically, ensuring that remote access is available whenever the Linux machine is running. The systemd service approach described for x11vnc in Part 2 can be adapted for TigerVNC if automatic startup of virtual sessions is desired.

Security and Convenience

Security considerations in a home setting are straightforward. When both machines are on the same local network, there is no need to adjust router settings for any of these methods. If remote access from outside the home is required, port forwarding and additional protections would be needed.

SSH can be exposed with careful key-based authentication, RDP should be placed behind a VPN or an SSH tunnel, and VNC should not be left open to the internet without an encrypted wrapper. For purely local use, enabling the necessary services at boot or keeping a simple set of commands to hand often suffices.

xrdp can be enabled once and left to run in the background, so the Mac's Microsoft Remote Desktop app can connect whenever needed. This provides a consistent way to access a fresh Xfce session without affecting what is displayed on the Linux machine's monitor.

Summary and Recommendations

The choice between these methods ultimately comes down to the specific use case. SSH provides everything necessary for administrative work and forms the foundation for secure file transfer. RDP into an Xfce session is a sensible choice when responsiveness and clean input handling are the priorities and a separate desktop is acceptable. TigerVNC can launch a full Cinnamon session for those who value continuity with the local environment and do not mind the slight loss of responsiveness that can accompany VNC.

For file transfer, the command-line tools that accompany SSH remain the most reliable route. Clipboard synchronisation for plain text is available in each approach, though TigerVNC typically needs vncconfig running on the server to enable it.

Having these options at hand allows a Mac and a Linux Mint desktop to work together smoothly on a home network. The setup is not onerous, and once a choice is made and the few necessary commands are learned, the connection can become an ordinary part of using the machines. After that, the day-to-day experience can be as simple as opening a single app on the Mac, clicking a saved connection and carrying on from where the Linux machine last left off.

The Complete Picture

Across this three-part series, we have examined the full range of remote access options between Mac and Linux:

  • Part 1 provided the decision framework for choosing between terminal access, new desktop sessions and sharing physical displays.
  • Part 2 explored x11vnc in detail, including performance tuning, input handling with KVM switches, clipboard troubleshooting and systemd service configuration.
  • Part 3 covered SSH for terminal access, RDP with Xfce for responsive remote sessions, TigerVNC for virtual Cinnamon desktops, and file transfer considerations.

Each approach has its place, and understanding the trade-offs allows the right tool to be selected for the task at hand.

Remote access between Mac and Linux, Part 2: x11vnc for sharing physical desktops

29th October 2025

This is Part 2 of a three-part series on connecting a Mac to a Linux Mint desktop. Part 1 introduced the available options, whilst Part 3 covers SSH, RDP and TigerVNC.

Sharing the existing physical desktop is the point at which x11vnc enters the picture. Unlike a virtual VNC server, x11vnc mirrors the live X display, so what appears on the Linux monitor is exactly what is seen remotely. This is often preferred when a process or window must be observed without starting another session, making it particularly useful for monitoring ongoing work or guiding someone seated at the machine.

Basic Setup

Installing x11vnc and creating a password with x11vnc -storepasswd sets up the basics. The command x11vnc -usepw -display :0 -forever shares the current display on port 5900. The -display :0 flag is the important part, as :0 corresponds to the active physical display.
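In command form, the basic setup amounts to:

sudo apt install x11vnc
x11vnc -storepasswd
x11vnc -usepw -display :0 -forever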

On the Mac side, RealVNC Viewer can connect by supplying either the IP address alone or the address with :5900 appended. If an "Invalid endpoint: port not correctly specified" error appears, it almost always stems from an incorrect address format. The correct forms are 192.168.1.50 or 192.168.1.50:5900, whereas entries such as 192.168.1.50::5900 or 192.168.1.50:0 will fail.

If there is any uncertainty about the port in use, running netstat -tlnp | grep x11vnc on the Linux machine should show a listener on 0.0.0.0:5900 when sharing display :0.

VNC display numbers map directly to TCP ports, so :0 means 5900, :1 means 5901 and so on. When connecting with RealVNC Viewer, supplying the address alone will use the default VNC port, which suits x11vnc on display :0. If a port must be stated, use a single colon with the actual TCP port, not the X display number.

Performance Tuning

Performance tuning is possible with a few additional options. x11vnc can enable a client-side pixel cache with -ncache 10, which asks the viewer to keep a set of off-screen image buffers for faster redraws, and -ncache_cr improves visible smoothness when windows are moved by encouraging copyrect updates.

When scrolling seems to stutter, -scrollcopyrect can help by detecting scroll operations and drawing them efficiently. -grabptr can improve pointer synchronisation between client and server.

On composited desktops, the X DAMAGE extension is used by default to hint at changed regions, but some compositing window managers interfere with this. In such cases, adding -noxdamage can avoid issues where updates are missed, at the cost of more conservative screen polling. x11vnc prints guidance about these behaviours on startup, which can be useful if any anomalies are encountered.
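Combined, an invocation of the following shape is a reasonable starting point; which options are worth keeping depends on the desktop, the compositor and the network, so treat it as a sketch, with -scrollcopyrect added per its own syntax if scrolling stutters:

x11vnc -usepw -display :0 -forever \
  -ncache 10 -ncache_cr -grabptr -noxdamage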

Input Handling and Mouse Button Issues

Input behaviour over VNC often differs from what is experienced when using the Linux machine directly. x11vnc does not carry the Mac's system preferences for the mouse or trackpad across the network; rather, it synthesises standard events that the X server interprets. As a result, wheel scrolling may feel different and the mapping of mouse buttons can be surprising.

When everything works perfectly whilst sitting at the Linux desktop but changes when connecting remotely, it is often because the events are being translated in a generic fashion over VNC. Trying a different viewer on the Mac sometimes improves matters because each has its own handling of input. The macOS Screen Sharing app, RealVNC Viewer and the TigerVNC viewer are all viable choices, and they do not feel identical in use.

Working with KVM Switches

It is also worth noting that hardware in the path can alter what the Linux system receives. If a mouse is connected through a KVM switch, the KVM may re-encode USB Human Interface Device signals, so the computer sees a very generic device. This can remap buttons in unexpected ways, and that mapping can differ between direct local use and a VNC session that injects events at another layer.

One way to correct misassigned buttons is to ask x11vnc to translate them with -buttonmap, which swaps the first three buttons so that the injected events match what the desktop expects. In simple cases where, for example, middle and right are reversed, -buttonmap 132 can restore the standard order of left, middle and right.

However, when a device presents right-click as a higher-numbered button such as 8, x11vnc's simple mapping is not enough because it only translates the primary trio. In that situation, the correction needs to happen before x11vnc sees the events by using the X input system.

The xinput list command identifies the device and xinput get-button-map shows how its buttons are currently enumerated. Applying xinput set-button-map with the desired sequence, for instance xinput set-button-map <ID> 1 2 3 4 5 6 7 8 9, forces the right codes at the operating system level and resolves the issue for both local and remote sessions.

These changes are not persistent across reboots unless added to a startup script or represented with an Xorg configuration snippet. Alternatively, some KVM models offer a HID pass-through or transparent mode that preserves the device's native signalling, which avoids the need for remapping. Bypassing the KVM for the mouse entirely is another way to ensure that the Linux machine sees the expected button layout, though this obviously changes how inputs are switched between computers.
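One way to persist the remapping is a small script run from Startup Applications at login; the device name below is purely hypothetical and should be replaced with whatever xinput list reports:

#!/bin/sh
# Reapply the button order at login; replace the device name with the real one
ID=$(xinput list --id-only 'pointer:Generic USB Mouse' 2>/dev/null)
[ -n "$ID" ] && xinput set-button-map "$ID" 1 2 3 4 5 6 7 8 9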

Even after addressing button mapping, some scrolling oddities can remain. Combining the caching options with -scrollcopyrect and -grabptr, then reconnecting with a different viewer if needed, often gets closer to the feel of direct use.

Clipboard Support

Clipboard support is an area that can require attention depending on the tools in use. With x11vnc, clipboard synchronisation for plain text is supported, though the exact behaviour and configuration requirements can vary. Version 0.9.16 includes clipboard functionality, but in practice users often report difficulties with bidirectional clipboard operation and may need to use helper utilities such as autocutsel to achieve reliable two-way clipboard sharing.
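If autocutsel does turn out to be needed, one approach is to install it and leave it running alongside the desktop session; this is a sketch rather than a guaranteed fix:

sudo apt install autocutsel
autocutsel -fork                        # keep the CLIPBOARD selection in sync
autocutsel -selection PRIMARY -fork     # keep the PRIMARY selection in sync too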

When properly configured, copying text on one side and pasting on the other should work in both directions, with Command-C and Command-V on macOS and the usual Ctrl shortcuts on Linux. Rich text and images are not transferred, and file copy-paste is not supported by x11vnc at all.

Troubleshooting Clipboard Synchronisation

If clipboard synchronisation is not working as expected, x11vnc can be run with verbose logging to observe clipboard events in real time. This is particularly useful for diagnosing whether clipboard data are being detected and transferred between the client and server.

To enable detailed logging, first stop any running x11vnc service:

sudo systemctl stop x11vnc.service

Then start x11vnc manually in a terminal with increased verbosity:

x11vnc -usepw -display :0 -forever -verbose -o ~/x11vnc_debug.log

Connect from the Mac via RealVNC Viewer and attempt to copy and paste text in both directions. In another terminal, monitor the log file:

tail -f ~/x11vnc_debug.log

Lines mentioning clipboard activity, such as "Clipboard: received new client data" or "Clipboard: sending text selection to X server", indicate that x11vnc is detecting and transferring clipboard changes. If no clipboard-related messages appear, the VNC viewer may have clipboard synchronisation disabled in its settings.

In RealVNC Viewer, check Preferences → Inputs to ensure that clipboard synchronisation is enabled. On the macOS Screen Sharing app, clipboard support should work by default when connecting to x11vnc.

After testing, stop the manual run with Ctrl+C and restart the service:

sudo systemctl start x11vnc.service

To make verbose logging permanent, the -verbose flag and log output option can be added to the systemd service file's ExecStart line, allowing clipboard activity to be monitored in /var/log/x11vnc.log at any time.

Running x11vnc as a System Service

If x11vnc is to be used regularly and convenience matters, it can be launched at startup as a service so that a remote viewer can connect without a preparatory shell session on the Linux side. Setting up x11vnc as a systemd service ensures that the VNC server starts automatically whenever the Linux Mint machine boots, even before logging in.

To create the service, run:

sudo nano /etc/systemd/system/x11vnc.service

Then paste the following configuration, adjusting the username as needed:

[Unit]
Description=Start x11vnc at startup for user johnh
After=display-manager.service
Requires=display-manager.service

[Service]
Type=simple
ExecStart=/usr/bin/x11vnc -auth guess -forever -loop -noxdamage \
  -display :0 -rfbauth /home/johnh/.vnc/passwd \
  -buttonmap 138456729 -ncache 10 -ncache_cr \
  -o /var/log/x11vnc.log
User=johnh
Group=johnh
Restart=on-failure

[Install]
WantedBy=multi-user.target

The -auth guess option allows x11vnc to attach to the X session automatically, whilst logging is directed to /var/log/x11vnc.log. The button mapping and caching options shown here incorporate the performance tuning and input corrections discussed earlier. Adjust /home/johnh and the username to match the account in use.

After creating the service file, reload systemd and enable the service:

sudo systemctl daemon-reload
sudo systemctl enable x11vnc.service
sudo systemctl start x11vnc.service

Check that the service is running:

systemctl status x11vnc.service

The output should show Active: active (running). Once this is in place, the Mac can connect to vnc://<linux-ip>:5900 at any time, even immediately after the Linux machine boots.

If x11vnc does not reconnect cleanly after logging out of Cinnamon, adding ExecStopPost=/bin/sleep 2 to the [Service] section will give systemd a brief pause before restarting the service.

Security Considerations

Security considerations in a home setting are straightforward. When both machines are on the same local network, there is no need to adjust router settings. If remote access from outside the home is required, port forwarding and additional protections would be needed.

VNC should not be left open to the internet without an encrypted wrapper. For purely local use, enabling the service at boot as described above often suffices, but for external access an SSH tunnel or VPN is essential.

Conclusion

x11vnc provides a reliable way to share the physical desktop of a Linux machine with a Mac on the same network. The setup is not onerous, and once the service is configured and the few necessary commands are learned, the connection can become an ordinary part of using the machines. The pitfalls that appear repeatedly, such as port format in viewer applications, clipboard configuration, or the way KVM switches can alter mouse behaviour, are easily overcome with a little attention to detail.

In Part 3, we examine the alternatives: SSH for terminal access, RDP with Xfce for responsive remote desktop sessions, and TigerVNC for virtual Cinnamon desktops.
