Technology Tales

Notes drawn from experiences in consumer and enterprise technology

Preventing authentication credentials from entering Git repositories

6th February 2026

Keeping credentials out of version control is a fundamental security practice that prevents numerous problems before they occur. Once secrets enter Git history, remediation becomes complex and cannot guarantee complete removal. Understanding how to structure projects, configure tooling, and establish workflows that prevent credential commits is essential for maintaining secure development practices.

Understanding What Needs Protection

Credentials come in many forms across different technology stacks. Database passwords, API keys, authentication tokens, encryption keys, and service account credentials all represent sensitive data that should never be committed. Configuration files containing these secrets vary by platform but share common characteristics: they hold information that would allow unauthorised access to systems, data, or services.

# This should never be in Git (YAML configuration)
database:
  password: secretpassword123
api:
  key: sk_live_abc123def456

# Neither should this (.env file)
DB_PASSWORD=secretpassword123
API_KEY=sk_live_abc123def456
JWT_SECRET=mysecretkey

Even hashed passwords require careful consideration. Whilst bcrypt hashes with an appropriate cost factor (10 or higher) provide protection against immediate exploitation, they still represent sensitive data. The time per computation depends on hardware, typically tens of milliseconds at these settings, and doubles with each increment of the cost factor. The hash string includes a version identifier, cost factor, salt, and digest, which together give an attacker everything needed to mount offline guessing attacks if the hash is exposed. Plaintext credentials offer no protection whatsoever and represent immediate critical exposure.
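
To make the hash format concrete, here is a minimal Python sketch that splits a bcrypt string into the components named above. The function name parse_bcrypt and the sample hash are illustrative assumptions, not values from any real system, and the regular expression covers only the common $2a$/$2b$/$2x$/$2y$ layouts.

```python
import re

# Layout: $<version>$<two-digit cost>$<22-character salt><31-character digest>
BCRYPT_RE = re.compile(
    r"^\$(2[abxy]?)\$(\d{2})\$([./A-Za-z0-9]{22})([./A-Za-z0-9]{31})$"
)


def parse_bcrypt(hash_string: str) -> dict:
    """Split a bcrypt hash string into version, cost, salt and digest."""
    match = BCRYPT_RE.match(hash_string)
    if match is None:
        raise ValueError("not a recognisable bcrypt hash")
    version, cost, salt, digest = match.groups()
    return {
        "version": version,
        "cost": int(cost),
        "rounds": 2 ** int(cost),  # the cost factor is a base-2 exponent
        "salt": salt,
        "digest": digest,
    }


# Illustrative string only (hypothetical, not from a real system).
EXAMPLE_HASH = "$2a$12$R9h/cIPz0gi.URNNX3kh2OPST9/PgBkqquzi.Ss7KIUgO2t0jWMUW"

if __name__ == "__main__":
    for name, value in parse_bcrypt(EXAMPLE_HASH).items():
        print(f"{name}: {value}")
```

Because the salt and cost travel with the digest, anyone holding the string can start guessing passwords offline, which is why a leaked hash still warrants a password reset.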

Establishing Git Ignore Rules from the Start

The foundation of credential protection is proper .gitignore configuration established before any sensitive files are created. This proactive approach prevents problems rather than requiring remediation after discovery. Begin every project by identifying which files will contain secrets and excluding them immediately.

# Credentials and secrets
.env
.env.*
!.env.example
config/secrets.yml
config/database.yml
config/credentials/

# Application-specific sensitive files
wp-config.php
settings.php
configuration.php
appsettings.Production.json
application-production.properties

# User data and session storage
/storage/credentials/
/var/sessions/
/data/users/

# Keys and certificates
*.key
*.pem
*.p12
*.pfx
!public.key

# Cache and logs that might leak data
/cache/
/logs/
/tmp/
*.log

The negation pattern !.env.example demonstrates an important technique: excluding all .env files whilst explicitly including example files that show structure without containing secrets. Order matters here, because Git applies the last matching pattern, so the negation has to come after the rules it overrides. This pattern ensures that developers understand what configuration is needed without exposing actual credentials.

Notice the broad exclusions for entire categories rather than specific files. Excluding *.key prevents any private key files from being committed, whilst !public.key re-includes a file of exactly that name, which is safe to share. Broad patterns like these catch variations and edge cases that specific file exclusions might miss.
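
The "last matching pattern wins" behaviour behind these negations can be sketched in a few lines of Python. This is a deliberately simplified model, not Git's own matcher: it handles only slash-free patterns compared against a bare file name, and the names PATTERNS and is_ignored are illustrative.

```python
from fnmatch import fnmatchcase

# Basename-only subset of the rules shown above.
PATTERNS = [".env", ".env.*", "!.env.example", "*.key", "*.pem", "!public.key"]


def is_ignored(filename: str, patterns: list = PATTERNS) -> bool:
    """Return True if the last pattern that matches the name ignores it.

    Simplification: real gitignore rules also handle directories,
    leading-slash anchoring and ** wildcards, none of which are modelled.
    """
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(filename, glob):
            ignored = not negated  # a later match overrides an earlier one
    return ignored


if __name__ == "__main__":
    for name in [".env", ".env.local", ".env.example", "server.key", "public.key"]:
        print(name, "ignored" if is_ignored(name) else "tracked")
```

In a real repository, git check-ignore -v reports which rule decided the outcome for a given path and is the authoritative check.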

Separating Examples from Actual Configuration

Version control should contain example configurations that demonstrate structure without exposing secrets. Create .example or .sample files that show developers what configuration is required, whilst keeping actual credentials out of Git entirely.

# config/secrets.example.yml
database:
  host: localhost
  username: app_user
  password: REPLACE_WITH_DATABASE_PASSWORD

api:
  endpoint: https://api.example.com
  key: REPLACE_WITH_API_KEY
  secret: REPLACE_WITH_API_SECRET

encryption:
  key: REPLACE_WITH_32_BYTE_ENCRYPTION_KEY

Documentation should explain where developers obtain the actual values. For local development, this might involve running setup scripts that generate credentials. For production, it involves deployment processes that inject secrets from secure storage. The example file serves as a template and checklist, ensuring nothing is forgotten whilst preventing accidental commits of real secrets.
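
One way to use the example file as a checklist is to verify that no placeholder survives in a filled-in copy. The sketch below assumes the REPLACE_WITH_ prefix used in the example above; the function name unfilled_placeholders and the sample text are hypothetical.

```python
import re

# Matches the placeholder convention used in the example file.
PLACEHOLDER_RE = re.compile(r"REPLACE_WITH_[A-Z0-9_]+")


def unfilled_placeholders(config_text: str) -> list:
    """Return the distinct placeholders still present, sorted by name."""
    return sorted(set(PLACEHOLDER_RE.findall(config_text)))


# Hypothetical local copy in which one value was forgotten.
SAMPLE = """\
database:
  host: localhost
  username: app_user
  password: REPLACE_WITH_DATABASE_PASSWORD
api:
  endpoint: https://api.example.com
  key: local-development-value
"""

if __name__ == "__main__":
    for placeholder in unfilled_placeholders(SAMPLE):
        print("still unset:", placeholder)
```

A setup script could run a check like this after copying the example file and refuse to start the application until the list is empty.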

Using Environment Variables

Environment variables provide a standard mechanism for separating configuration from code. Applications read credentials from the environment rather than from files tracked in Git. This pattern works across virtually all platforms and languages.

// PHP: instead of hardcoding
$db_password = 'secretpassword123';

// PHP: read from the environment
$db_password = getenv('DB_PASSWORD');

// JavaScript (Node.js): instead of hardcoding
const apiKey = 'sk_live_abc123def456';

// JavaScript (Node.js): read from the environment
const apiKey = process.env.API_KEY;
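
The same idea can be sketched in Python with one addition: failing fast when a variable is missing. In the snippets above, getenv returns false and process.env.API_KEY is undefined when the variable is unset, so a silent misconfiguration is possible. The helper name require_env is an illustrative assumption.

```python
import os


def require_env(name: str) -> str:
    """Return the named environment variable or raise a clear error.

    An empty string is treated as missing, on the assumption that an
    empty credential is never intended.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value


if __name__ == "__main__":
    try:
        db_password = require_env("DB_PASSWORD")
        print("DB_PASSWORD is set")  # never print the value itself
    except RuntimeError as error:
        print(error)
```

Raising at start-up surfaces a missing secret immediately, rather than as an authentication failure deep inside a request.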

Environment files (.env) provide convenience for local development, but must be excluded from Git. The pattern of .env for actual secrets and .env.example for structure becomes standard across many frameworks. Developers copy the example to create their local configuration, filling in actual values that never leave their machine.
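
A local .env file drifts out of date as the example gains new keys. The following Python sketch compares the two and lists the keys that are missing locally. It assumes a simple KEY=value format with # comments and an optional export prefix; the names env_keys and missing_keys are illustrative.

```python
def env_keys(text: str) -> set:
    """Collect variable names from .env-style text (KEY=value per line)."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and malformed lines
        key = line.split("=", 1)[0].strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        keys.add(key)
    return keys


def missing_keys(example_text: str, local_text: str) -> list:
    """Return keys defined in the example but absent from the local file."""
    return sorted(env_keys(example_text) - env_keys(local_text))


if __name__ == "__main__":
    example = "DB_PASSWORD=\nAPI_KEY=\nJWT_SECRET=\n"
    local = "# local overrides\nDB_PASSWORD=dev-only\nAPI_KEY=dev-only\n"
    print(missing_keys(example, local))
```

Only key names are compared, so the check can run in continuous integration without ever reading a real secret value.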

Implementing Pre-Commit Hooks

Pre-commit hooks provide automated checking before changes enter the repository. These hooks scan staged files for patterns that match secrets and block commits when suspicious content is detected. This automated enforcement prevents mistakes that manual review might miss.

The pre-commit framework manages hooks across multiple repositories and languages. Installation is straightforward, and configuration defines which checks run before each commit.

pip install pre-commit

Create a configuration file defining which hooks to run:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: check-added-large-files
      - id: check-json
      - id: check-yaml
      - id: detect-private-key
      - id: end-of-file-fixer
      - id: trailing-whitespace

  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']

Install the hooks in your repository:

pre-commit install

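Now every commit triggers these checks automatically. The detect-private-key hook catches SSH keys and other private key formats. The detect-secrets hook uses entropy analysis and pattern matching to identify potential credentials. Because the configuration above passes --baseline .secrets.baseline, that file must exist before the hook can run; generate it with detect-secrets scan > .secrets.baseline and commit it alongside the configuration. When suspicious content is detected, the commit is blocked and the developer is alerted to review the flagged content.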
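
Entropy analysis rests on a simple observation: randomly generated tokens use many different characters roughly equally often, whereas ordinary words repeat a few. The Python sketch below computes Shannon entropy per character and applies a threshold. The threshold of 3.5 bits and the minimum length of 16 are arbitrary assumptions for illustration, not the defaults of detect-secrets or any other tool.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of the string in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(text).values()
    )


def looks_random(token: str, threshold: float = 3.5, min_length: int = 16) -> bool:
    """Flag tokens that are both long and high in entropy (assumed limits)."""
    return len(token) >= min_length and shannon_entropy(token) > threshold


if __name__ == "__main__":
    for candidate in ["passwordpassword", "sk_live_abc123def456"]:
        print(candidate, round(shannon_entropy(candidate), 2), looks_random(candidate))
```

Entropy alone produces false positives on hashes, UUIDs and minified assets, which is why scanners combine it with pattern matching and a baseline of known, reviewed findings.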

Configuring Git-Secrets

The git-secrets tool from AWS specifically targets secret detection. It scans commits, commit messages, and merges to prevent credentials from entering repositories. Installation and configuration establish patterns that identify secrets.

# Install via Homebrew
brew install git-secrets

# Install hooks in a repository
cd /path/to/repo
git secrets --install

# Register AWS patterns
git secrets --register-aws

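# Add custom patterns (extended regular expressions, so POSIX classes are used
# instead of \s inside brackets and instead of octal escapes such as \047)
git secrets --add "password[[:space:]]*=[[:space:]]*[\"'][^[:space:]]+"
git secrets --add "api[_-]?key[[:space:]]*=[[:space:]]*[\"'][^[:space:]]+"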

The tool maintains a list of prohibited patterns and scans all content before allowing commits. Custom patterns can be added to match organisation-specific secret formats. The --register-aws command adds patterns for AWS access keys and secret keys, whilst custom patterns catch application-specific credential formats.
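
The prohibited-pattern approach can be illustrated with a small Python scanner. This is a sketch of the general technique, not git-secrets itself: the AKIA pattern is a simplified stand-in for the patterns that --register-aws installs, the other two expressions mirror the custom patterns above, and find_violations is a hypothetical name. The sample key is the placeholder that appears in AWS documentation.

```python
import re

PROHIBITED = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # simplified AWS access key ID shape
    re.compile(r"password\s*=\s*[\"'][^\s\"']+", re.IGNORECASE),
    re.compile(r"api[_-]?key\s*=\s*[\"'][^\s\"']+", re.IGNORECASE),
]


def find_violations(text: str) -> list:
    """Return (line number, line) pairs that match a prohibited pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in PROHIBITED):
            hits.append((number, line.strip()))
    return hits


SAMPLE = '''\
region = "eu-west-1"
password = "hunter2"
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
'''

if __name__ == "__main__":
    for number, line in find_violations(SAMPLE):
        print(f"line {number}: {line}")
```

A pre-commit hook built on this idea would exit with a non-zero status when the list is non-empty, which is what causes Git to abort the commit.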

For teams, establishing git-secrets across all repositories ensures consistent protection. Template directories provide a mechanism for automatic installation in new repositories:

# Create a template with git-secrets installed
git secrets --install ~/.git-templates/git-secrets

# Use the template for all new repositories
git config --global init.templateDir ~/.git-templates/git-secrets

Now, every git init automatically includes secret scanning hooks.

Enabling GitHub Secret Scanning

GitHub Secret Scanning provides server-side protection that cannot be bypassed by local configuration. GitHub automatically scans repositories for known secret patterns and alerts repository administrators when matches are detected. This works for both new commits and historical content.

For public repositories, secret scanning is enabled by default. For private repositories, it requires GitHub Advanced Security. Enable it through repository settings under Security & Analysis. GitHub maintains partnerships with service providers to detect their specific secret formats, and when a partner pattern is found, both you and the service provider are notified.

The scanning covers not just code but also issues, pull requests, discussions, and wiki content. This comprehensive approach catches secrets that might be accidentally pasted into comments or documentation. The detection happens continuously, so even old content gets scanned when new patterns are added.

Custom patterns extend detection to organisation-specific secret formats. Define regular expressions that match your internal API key formats, authentication tokens, or other proprietary credentials. These custom patterns apply across all repositories in your organisation, providing consistent protection.

Structuring Projects for Credential Isolation

Project structure itself can prevent credentials from accidentally entering Git. Establish clear separation between code that belongs in version control and configuration that remains environment-specific. Create dedicated directories for credentials and ensure they are excluded from tracking.

project/
├── src/                    # Code - belongs in Git
├── tests/                  # Tests - belongs in Git
├── config/
│   ├── app.yml            # General config - belongs in Git
│   ├── secrets.example.yml # Example - belongs in Git
│   └── secrets.yml        # Actual secrets - excluded from Git
├── credentials/           # Entire directory excluded
│   ├── database.yml
│   └── api-keys.json
├── .env.example           # Example - belongs in Git
├── .env                   # Actual secrets - excluded from Git
└── .gitignore             # Defines exclusions

This structure makes it obvious which files contain secrets. The credentials/ directory is clearly separated from source code, and its exclusion from Git is explicit. Developers can see at a glance that this directory requires different handling.

Documentation should explain the structure and the reasoning behind it. New team members need to understand why certain directories are empty in their fresh clones and where to obtain the configuration files that populate them. Clear documentation prevents confusion and ensures everyone follows the same patterns.

Managing Development Credentials

Development environments require credentials but should never use production secrets. Generate separate development credentials that provide access to development resources only. These credentials can be less stringently protected whilst still not being committed to Git.

Development credential management varies by organisation size and infrastructure. For small teams, shared development credentials stored in a team password manager might suffice. For larger organisations, each developer receives individual credentials for development resources, with access controlled through identity management systems.

Some teams commit development credentials intentionally, arguing that development databases contain no sensitive data and convenience outweighs risk. This approach is controversial and depends on your security model. If development credentials can access any production resources or if development data has any sensitivity, they must be protected. Even purely synthetic development data might reveal business logic or system architecture worth protecting.

The safer approach maintains the same credential handling patterns across all environments. This ensures that developers build habits that prevent production credential exposure. When development and production follow identical patterns, muscle memory built during development prevents mistakes in production.

Provisioning Production Credentials

Production credentials should never touch developer machines or version control. Deployment processes inject credentials at runtime through environment variables, secret management services, or deployment-time configuration.

Continuous deployment pipelines read credentials from secret stores and make them available to applications without exposing them to humans. GitHub Actions, GitLab CI, Jenkins, and other CI/CD systems provide secure variable storage that is injected during builds and deployments.

# .github/workflows/deploy.yml
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to production
        env:
          DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
          API_KEY: ${{ secrets.API_KEY }}
        run: |
          ./deploy.sh

The secrets.DB_PASSWORD syntax references encrypted secrets stored in GitHub's secure storage. GitHub masks these values in workflow logs, although masking only recognises the exact stored string, so the deployment script should avoid printing or transforming them. The script receives the secrets as environment variables and can configure the application appropriately.

Secret management services like HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, or Google Cloud Secret Manager provide centralised credential storage with access controls, audit logging, and automatic rotation. Applications authenticate to these services and retrieve credentials at runtime, ensuring that secrets are never stored on disk or in environment files.

Rotating Credentials Regularly

Regular credential rotation limits exposure duration if secrets are compromised. Establish rotation schedules based on credential sensitivity and access patterns. Database passwords might rotate quarterly, API keys monthly, and authentication tokens weekly or daily. Automated rotation reduces operational burden and ensures consistency.

Rotation requires coordination between secret generation, distribution, and application updates. Secret management services can automate much of this process, generating new credentials, updating secure storage, and triggering application reloads. Manual rotation involves generating new credentials, updating all systems that use them, and verifying functionality before disabling old credentials.

The rotation schedule balances security against operational complexity. More frequent rotation provides better security but increases the risk of service disruption if processes fail. Less frequent rotation simplifies operations but extends exposure windows. Find the balance that matches your risk tolerance and operational capabilities.
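
A rotation schedule is easier to enforce when it is checked mechanically. The Python sketch below flags credentials whose age exceeds an interval per credential type. The intervals mirror the examples given earlier in this section (quarterly, monthly, weekly), and the credential names and dates are invented for illustration.

```python
from datetime import date, timedelta

# Assumed policy, taken from the examples in the text; adjust to your own.
ROTATION_INTERVALS = {
    "database_password": timedelta(days=90),
    "api_key": timedelta(days=30),
    "auth_token": timedelta(days=7),
}


def overdue(credentials: list, today: date) -> list:
    """Return names of credentials older than their rotation interval.

    Each credential is a (name, kind, last_rotated) tuple.
    """
    late = []
    for name, kind, last_rotated in credentials:
        if today - last_rotated > ROTATION_INTERVALS[kind]:
            late.append(name)
    return late


if __name__ == "__main__":
    inventory = [
        ("orders-db", "database_password", date(2025, 12, 1)),
        ("billing-api", "api_key", date(2025, 12, 20)),
        ("session-signing", "auth_token", date(2026, 2, 3)),
    ]
    print(overdue(inventory, today=date(2026, 2, 6)))
```

Running such a check on a schedule turns rotation compliance into a report rather than something remembered only after an incident.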

Training and Culture

Technical controls provide necessary guardrails, but security ultimately depends on people understanding why credentials matter and how to protect them. Training should cover the business impact of credential exposure, the techniques for keeping secrets out of Git, and the procedures for responding if mistakes occur.

New developer onboarding should include credential management as a core topic. Before developers commit their first code, they should understand what constitutes a secret, why it must stay out of Git, and how to configure their local environment properly. This prevents problems rather than correcting them after they occur.

Regular security reminders reinforce good practices. When new secret types are introduced or new tools are adopted, communicate the changes and update documentation. Security reviews should examine credential handling practices as a whole, not only looking for exposed secrets but also verifying that proper patterns are followed.

Creating a culture where admitting mistakes is safe encourages early reporting. If a developer accidentally commits a credential, they should feel comfortable immediately alerting the security team, rather than hoping no one notices. Early detection enables faster response and reduces damage.

Responding to Detection

Despite best efforts, secrets sometimes enter repositories. Rapid response limits damage. Rotate the credential immediately, on the assumption that it has already been compromised, because rotation is what actually prevents exploitation. Removing the file from future commits whilst leaving it in history provides no security benefit, as the exposure has already occurred.

Tools like BFG Repo-Cleaner can remove secrets from Git history, but this is complex and cannot guarantee complete removal. Anyone who cloned the repository before clean-up retains the compromised credentials in their local copy. Forks, clones on other systems, and backup copies may all contain the secrets.

The most reliable response is assuming the credential is compromised and rotating it immediately. History clean-up can follow as a secondary measure to reduce ongoing exposure, but it should never be the primary response. Treat any secret that entered Git as if it were publicly posted, because once it is in Git history it may as well have been.

Continuous Improvement

Credential management practices should evolve with your infrastructure and team. Regular reviews identify gaps and opportunities for improvement. When new credential types are introduced, update .gitignore patterns, secret scanning rules, and documentation. When new developers join, gather feedback on clarity and completeness of onboarding materials.

Metrics help track effectiveness. Monitor secret scanning alerts, track rotation compliance, and measure time-to-rotation when credentials are exposed. These metrics identify areas needing improvement and demonstrate progress over time.

Summary

Preventing credentials from entering Git repositories requires multiple complementary approaches. Establish comprehensive .gitignore configurations before creating any credential files. Separate example configurations from actual secrets, keeping only examples in version control. Use environment variables to inject credentials at runtime rather than storing them in configuration files. Implement pre-commit hooks and server-side scanning to catch mistakes before they enter history. Structure projects to clearly separate code from credentials, making it obvious what belongs in Git and what does not.

Train developers on credential management and create a culture where security is everyone's responsibility. Provision production credentials through deployment processes and secret management services, ensuring they never touch developer machines or version control. Rotate credentials regularly to limit exposure windows. Respond rapidly when secrets are detected, assuming compromise and rotating immediately.

Security is not a one-time configuration but an ongoing practice. Regular reviews, continuous improvement, and adaptation to new threats and technologies keep credential management effective. The investment in prevention is far less than the cost of responding to exposed credentials, making it essential to get right from the beginning.

When Operations and Machine Learning meet

5th February 2026

Here's a scenario you'll recognise: your SRE team drowns in 1,000 alerts daily. 95% are false positives. Meanwhile, your data scientists built five ML models last quarter, and none have reached production. These problems are colliding, and solving each other. Machine learning is moving out of research labs and into the operations that keep your systems running. At the same time, DevOps practices are being adapted to get ML models into production reliably. Since this convergence has created three new disciplines (AIOps, MLOps and LLM observability), here is what you need to know.

Why Traditional Operations Can't Keep Up

Modern systems generate unprecedented volumes of operational data. Logs, metrics, traces, events and user interaction signals create a continuous stream that's too large and too fast for manual analysis.

Your monitoring system might send thousands of alerts per day, but most are noise. A CPU spike in one microservice cascades into downstream latency warnings, database connection errors and end-user timeouts, generating dozens or hundreds of alerts from a single root cause. Without intelligent correlation, engineers waste hours manually connecting the dots.

Meanwhile, machine learning models that could solve real business problems sit in notebooks, never making it to production. The gap between data science and operations is costly. Data scientists lack the infrastructure to deploy models reliably. Operations teams lack the tooling to monitor models that do make it live.

The complexity of cloud-native architectures, microservices and distributed systems has outpaced traditional approaches. Manual processes that worked for simpler systems simply cannot scale.

Three Emerging Practices Changing the Game

Three distinct but related practices have emerged to address these challenges. Each solves a specific problem whilst contributing to a broader transformation in how organisations build and run digital services.

AIOps: Intelligence for Your Operations

AIOps (Artificial Intelligence for IT Operations) applies machine learning to the work of IT operations. The term was coined by Gartner. AIOps platforms collect data from across your environment, analyse it in real time and surface patterns, anomalies or likely incidents.

The key capability is event correlation. Instead of presenting 1,000 raw alerts, AIOps systems analyse metadata, timing, topological dependencies and historical patterns to collapse related events into a single coherent incident. What was 1,000 alerts becomes one actionable event with a causal chain attached.
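
The core of event correlation can be sketched with timing and topology alone. The Python example below attributes each alert to the furthest upstream dependency that also alerted within a time window, then groups alerts by that root. The dependency map, service names and 300-second window are invented for illustration; production platforms use far richer signals, including the historical patterns mentioned above.

```python
from collections import defaultdict

# Hypothetical topology: each service maps to the one it depends on.
# Assumed to be acyclic; a cycle would make the walk below loop forever.
DEPENDS_ON = {"checkout": "orders", "orders": "database", "search": "index"}


def correlate(alerts: list, window: int = 300) -> dict:
    """Group (timestamp_seconds, service) alerts by probable root service."""
    ordered = sorted(alerts)
    incidents = defaultdict(list)
    for timestamp, service in ordered:
        root = service
        while True:
            upstream = DEPENDS_ON.get(root)
            if upstream is None:
                break
            upstream_alerted = any(
                other == upstream and abs(t - timestamp) <= window
                for t, other in ordered
            )
            if not upstream_alerted:
                break
            root = upstream  # blame the dependency that is also failing
        incidents[root].append((timestamp, service))
    return dict(incidents)


if __name__ == "__main__":
    raw = [(0, "database"), (20, "orders"), (45, "checkout"),
           (50, "checkout"), (4000, "search")]
    for root, grouped in correlate(raw).items():
        print(root, "<-", len(grouped), "alerts")
```

Here five raw alerts collapse into two incidents, with the database identified as the probable root of four of them.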

Beyond detection, AIOps platforms can trigger automated responses to common problems, reducing time to remediation. Because they learn from historical data, they can offer predictive insights that shift operations away from constant firefighting.

Teams implementing AIOps report measurable improvements: 60-80% reduction in alert volume, 50-70% faster incident response and significant reductions in operational toil. The technology is maturing rapidly, with Gartner predicting that 60% of large enterprises will have adopted AIOps platforms by 2026.

MLOps: Getting Models into Production

Whilst AIOps uses ML to improve operations, MLOps (Machine Learning Operations) is about operationalising machine learning itself. Building a model is only a small part of making it useful. Models change, data changes, and performance degrades over time if the system isn't maintained.

MLOps is an engineering culture and practice that unifies ML development and ML operations. It extends DevOps by treating machine learning models and data assets as first-class citizens within the delivery lifecycle.

In practice, this means continuous integration and continuous delivery for machine learning. Changes to models and pipelines are tested and deployed in a controlled way. Model versioning tracks not just the model artefact, but also the datasets and hyperparameters that produced it. Monitoring in production watches for performance drift and decides when to retrain or roll back.

The MLOps market was valued at $2.2 billion in 2024 and is projected to reach $16.6 billion by 2030, reflecting rapid adoption across industries. Organisations that successfully implement MLOps report that up to 88% of ML initiatives that previously failed to reach production are now being deployed successfully.

A typical MLOps implementation looks like this: data scientists work in their preferred tools, but when they're ready to deploy, the model goes through automated testing, gets versioned alongside its training data and deploys with built-in monitoring for performance drift. If the model degrades, it can automatically retrain or roll back.
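
The retrain-or-roll-back decision can be reduced to a comparison against a baseline. The Python sketch below uses accuracy as the monitored metric and two thresholds, both of which are assumptions chosen for illustration; real systems often monitor input distributions as well as outcome metrics.

```python
from statistics import mean


def drift_action(
    baseline_accuracy: float,
    recent_accuracies: list,
    retrain_drop: float = 0.03,
    rollback_drop: float = 0.10,
) -> str:
    """Return 'keep', 'retrain' or 'rollback' based on the accuracy drop.

    The thresholds are illustrative assumptions, not recommended values.
    """
    drop = baseline_accuracy - mean(recent_accuracies)
    if drop >= rollback_drop:
        return "rollback"
    if drop >= retrain_drop:
        return "retrain"
    return "keep"


if __name__ == "__main__":
    print(drift_action(0.92, [0.91, 0.92, 0.93]))
    print(drift_action(0.92, [0.88, 0.87, 0.86]))
    print(drift_action(0.92, [0.80, 0.78, 0.79]))
```

Keeping the decision rule this explicit makes it easy to review and to test, which matters when the action it triggers is automated.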

The SRE Automation Opportunity

Site Reliability Engineering (SRE), which originated at Google, applies software engineering principles to operations problems. It encompasses availability, latency, performance, efficiency, change management, monitoring, emergency response and capacity planning. Rather than SRE and AIOps displacing one another, the likely outcome is convergence: analytics, automation and reliability engineering become mutually reinforcing, with organisations adopting integrated approaches that combine intelligent monitoring, automated operations and proactive reliability practices.

What This Looks Like in the Real World

The difference between traditional operations and ML-powered operations shows up in everyday scenarios.

Before: An application starts responding slowly. Monitoring systems fire hundreds of alerts across different tools. An engineer spends two hours correlating logs, metrics and traces to identify that a database connection pool is exhausted. They manually scale the service, update documentation and hope to remember the fix next time.

After: The same slowdown triggers anomaly detection. The AIOps platform correlates signals across the stack, identifies the connection pool issue and surfaces it as a single incident with context. Either an automated remediation kicks in (scaling the pool based on learned patterns) or the engineer receives a notification with diagnosis complete and remediation steps suggested. Resolution time drops from hours to minutes.

Before: A data science team builds a pricing optimisation model. After three months of development, they hand a trained model to engineering. Engineering spends another month building deployment infrastructure, writing monitoring code and figuring out how to version the model. By the time it reaches production, the model is stale and performs poorly.

After: The same team works within an MLOps platform. Development happens in standard environments with experiment tracking. When ready, the data scientist triggers deployment through a single interface. The platform handles testing, versioning, deployment and monitoring. The model reaches production in days instead of months, and automatic retraining keeps it current.

These patterns extend across industries. Financial services firms use MLOps for fraud detection models that need continuous updating. E-commerce platforms use AIOps to manage complex microservices architectures. Healthcare organisations use both to ensure critical systems remain available whilst deploying diagnostic models safely.

The Tech Behind the Transformation (Optional Deep Dive)

If you want to understand why this convergence is happening now, it helps to know about transformers and vector embeddings. If you're more interested in implementation, skip to the next section.

The breakthrough that enabled modern AI came in 2017 with a paper titled "Attention Is All You Need". Ashish Vaswani and colleagues at Google introduced the transformer architecture, a neural network design that processes sequential data (like sentences) by computing relationships across the entire sequence at once, rather than step by step.

The key innovation is self-attention. Earlier models struggled with long sequences because they processed data sequentially and lost context. Self-attention allows a model to examine all parts of an input simultaneously, computing relationships between each token and every other token. This parallel processing is a major reason transformers scale well and perform strongly on large datasets.
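
A stripped-down version of self-attention fits in a few lines of plain Python. The sketch below computes scaled dot-product attention weights between every pair of tokens and uses them to mix the token vectors. As a simplification, the queries, keys and values are the input vectors themselves; a real transformer applies learned projection matrices first and runs several attention heads in parallel.

```python
import math


def softmax(row: list) -> list:
    """Turn raw scores into positive weights that sum to one."""
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [value / total for value in exps]


def self_attention(tokens: list) -> tuple:
    """Return (weights, outputs) for a list of equal-length token vectors."""
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    weights = []
    for query in tokens:
        # Score this token against every token, including itself.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale for key in tokens
        ]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[j] for w, value in zip(row, tokens)) for j in range(dim)]
        for row in weights
    ]
    return weights, outputs


if __name__ == "__main__":
    attention, mixed = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    for row in attention:
        print([round(weight, 3) for weight in row])
```

Each row of weights depends only on the inputs, not on the other rows, which is what allows all positions to be computed in parallel.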

Transformers underpin models like GPT and BERT. They enable applications from chatbots to content generation, code assistance to semantic search. For operations teams, transformer-based models power the natural language interfaces that let engineers query complex systems in plain English and the embedding models that enable semantic search across logs and documentation.

Vector embeddings represent concepts as dense vectors in high-dimensional space. Similar concepts have embeddings that are close together, whilst unrelated concepts are far apart. This lets models quantify meaning in a way that supports both understanding and generation.

In operations contexts, embeddings enable semantic search. Instead of searching logs for exact keyword matches, you can search for concepts. Query "authentication failures" and retrieve related events like "login rejected", "invalid credentials" or "session timeout", even if they don't contain your exact search terms.
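
Semantic search over embeddings comes down to ranking by vector similarity. The Python sketch below uses cosine similarity over hand-made three-dimensional vectors. These vectors are invented stand-ins, since real embeddings come from a trained model and have hundreds or thousands of dimensions.

```python
import math

# Hypothetical embeddings: made-up numbers, not output from any model.
EMBEDDINGS = {
    "login rejected": [0.9, 0.1, 0.0],
    "invalid credentials": [0.8, 0.2, 0.1],
    "session timeout": [0.6, 0.4, 0.1],
    "disk usage at 91%": [0.0, 0.1, 0.9],
}


def cosine(a: list, b: list) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def search(query_vector: list, top_k: int = 3) -> list:
    """Return the top_k stored messages most similar to the query vector."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda text: cosine(query_vector, EMBEDDINGS[text]),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    # Assumed embedding of the query "authentication failures".
    print(search([0.85, 0.15, 0.05]))
```

None of the returned messages shares a keyword with the query; they rank highly only because their vectors point in a similar direction.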

Retrieval-Augmented Generation (RAG) combines these capabilities to make AI systems more accurate and current. A RAG system pairs a language model with a retrieval mechanism that fetches external information at query time. The model generates responses using both its internal knowledge and retrieved context.

This approach is particularly valuable for operations. A RAG-powered assistant can pull current runbook procedures, recent incident reports and configuration documentation to answer questions like "how do we handle database failover in the production environment?" with accurate, up-to-date information.

The technical stack supporting RAG implementations typically includes vector databases for similarity search. As of 2025, commonly deployed options include Pinecone, Milvus, Chroma, Faiss, Qdrant, Weaviate and several others, reflecting a fast-moving landscape that's becoming standard infrastructure for many AI implementations.

Where to Begin

Starting with ML-powered operations doesn't require a complete transformation. Begin with targeted improvements that address your most pressing problems.

If you're struggling with alert fatigue...

Start with event correlation. Many AIOps platforms offer this as an entry point without requiring full platform adoption. Look for solutions that integrate with your existing monitoring tools and can demonstrate noise reduction in a proof of concept.

Focus on one high-volume service or team first. Success here provides both immediate relief and a template for broader rollout. Track metrics like alerts per day, time to acknowledge and time to resolution to demonstrate impact.

Tools worth considering include established platforms like Datadog, Dynatrace and ServiceNow, alongside newer entrants like PagerDuty AIOps and specialised incident response platforms like incident.io.

If you have ML models stuck in development...

Begin with MLOps fundamentals before investing in comprehensive platforms. Focus on model versioning first (track which code, data and hyperparameters produced each model). This single practice dramatically improves reproducibility and makes collaboration easier.
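
Model versioning can start without any platform at all. The Python sketch below derives a short identifier from the three inputs named above: the code revision, the training data and the hyperparameters. The function name model_version, the 12-character length and the sample values are illustrative assumptions.

```python
import hashlib
import json


def model_version(code_revision: str, dataset: bytes, hyperparameters: dict) -> str:
    """Derive a reproducible identifier from code, data and hyperparameters."""
    digest = hashlib.sha256()
    digest.update(code_revision.encode("utf-8"))
    digest.update(hashlib.sha256(dataset).digest())
    # Sorting keys makes the identifier independent of dictionary order.
    canonical = json.dumps(hyperparameters, sort_keys=True, separators=(",", ":"))
    digest.update(canonical.encode("utf-8"))
    return digest.hexdigest()[:12]


if __name__ == "__main__":
    version = model_version(
        code_revision="3f2a9c1",  # hypothetical commit hash
        dataset=b"price,demand\n9.99,120\n",
        hyperparameters={"learning_rate": 0.01, "epochs": 20},
    )
    print(version)
```

If any of the three inputs changes, the identifier changes, so two models with the same identifier were produced from the same ingredients.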

Next, automate deployment for one model. Choose a model that's already proven valuable but requires manual intervention to update. Build a pipeline that handles testing, deployment and basic monitoring. Use this as a template for other models.

Popular MLOps platforms include MLflow (open source), cloud provider offerings like AWS SageMaker, Google Vertex AI and Azure Machine Learning, and specialised platforms like Databricks and Weights & Biases.

If you're building with LLMs...

Implement observability from day one. LLM applications are different from traditional software. They're probabilistic, can be expensive to run, and their behaviour varies with prompts and context. You need to monitor performance (response times, throughput), quality (output consistency, appropriateness), bias, cost (token usage) and explainability.
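
The performance and cost dimensions are the simplest to instrument. The Python sketch below wraps each model call to record latency, token counts and estimated cost. The flat price per thousand tokens is a made-up figure, and fake_llm is a stand-in for a real client that simply counts words as tokens.

```python
import time
from dataclasses import dataclass, field


@dataclass
class LLMMetrics:
    price_per_1k_tokens: float  # hypothetical flat rate
    calls: list = field(default_factory=list)

    def record(self, prompt_tokens: int, completion_tokens: int, latency_s: float) -> None:
        tokens = prompt_tokens + completion_tokens
        cost = tokens / 1000 * self.price_per_1k_tokens
        self.calls.append({"tokens": tokens, "latency_s": latency_s, "cost": cost})

    def summary(self) -> dict:
        count = len(self.calls)
        return {
            "calls": count,
            "total_tokens": sum(call["tokens"] for call in self.calls),
            "total_cost": sum(call["cost"] for call in self.calls),
            "mean_latency_s": (
                sum(call["latency_s"] for call in self.calls) / count if count else 0.0
            ),
        }


def timed_call(metrics: LLMMetrics, llm, prompt: str) -> str:
    """Call the model, then record latency and token usage."""
    start = time.perf_counter()
    reply, prompt_tokens, completion_tokens = llm(prompt)
    metrics.record(prompt_tokens, completion_tokens, time.perf_counter() - start)
    return reply


def fake_llm(prompt: str) -> tuple:
    """Stand-in for a real client; treats whitespace-separated words as tokens."""
    reply = "ok: " + prompt
    return reply, len(prompt.split()), len(reply.split())


if __name__ == "__main__":
    metrics = LLMMetrics(price_per_1k_tokens=0.5)
    timed_call(metrics, fake_llm, "restart the api gateway")
    print(metrics.summary())
```

Quality, bias and explainability need evaluation data rather than counters, so they usually require sampling outputs for review on top of this kind of instrumentation.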

Common pitfalls include underestimating costs, failing to implement proper prompt versioning, neglecting to monitor for model drift and not planning for the debugging challenges that come with non-deterministic systems.

The LLM observability space is evolving rapidly, with platforms like LangSmith, Arize AI, Honeycomb and others offering specialised tooling for monitoring generative AI applications in production.

Why This Matters Beyond the Tech

The convergence of ML and operations isn't just a technical shift. It requires cultural change, new skills and rethinking of traditional roles.

Teams need to understand not only deployment automation and infrastructure as code, but also concepts like attention mechanisms, vector embeddings and retrieval systems because these directly influence how AI-enabled services behave in production. They also need operational practices that can handle both deterministic systems and probabilistic ones, whilst maintaining reliability, compliance and cost control.

Data scientists are increasingly expected to understand production concerns like latency budgets, deployment strategies and operational monitoring. Operations engineers are expected to understand model behaviour, data drift and the basics of ML pipelines. The gap between these roles is narrowing.

Security and governance cannot be afterthoughts. As AI becomes embedded in tooling and operations become more automated, organisations need to integrate security testing throughout the development cycle, implement proper access controls and audit trails, and ensure models and automated systems operate within appropriate guardrails.

The organisations succeeding with these practices treat them as both a technical programme and an organisational transformation. They invest in training, establish cross-functional teams, create clear ownership and accountability, and build platforms that reduce cognitive load whilst enabling self-service.

Moving Forward

The convergence of machine learning and operations isn't a future trend; it's happening now. AIOps platforms are reducing alert noise and accelerating incident response. MLOps practices are getting models into production faster and keeping them performing well. The economic case for SRE automation is driving investment and innovation.

The organisations treating this as transformation rather than tooling adoption are seeing results: fewer outages, faster deployments, models that actually deliver value. They're not waiting for perfect solutions. They're starting with focused improvements, learning from what works and scaling gradually.

The question isn't whether to adopt these practices. It's whether you'll shape the change or scramble to catch up. Start with the problem that hurts most (alert fatigue, models stuck in development, reliability concerns) and build from there. The convergence of ML and operations offers practical solutions to real problems. The hard part is committing to the cultural and organisational changes that make the technology work.

Adding a dropdown calendar to the macOS desktop with Itsycal

26th January 2026

In Linux Mint, there is a dropdown calendar that can be used for some advance planning. On Windows, there is a pop-up one on the taskbar that is as useful. Neither of these possibilities is there on a default macOS desktop, and I missed the functionality. Thus, a search began.

That ended with my finding Itsycal, which does exactly what I need. Handily, it also integrates with the macOS Calendar app, though I use other places for my appointments. In some ways, that is more than I need. The dropdown pane with the ability to go back and forth through time suffices for me.

While it would be ideal if I could go year by year as well as month by month, which is the case on Linux Mint, I can manage with just the latter. Anything is better than having nothing at all. Sometimes, using more than one operating system broadens a mind.

Switching from uBlock Origin to AdGuard and Stylus

15th January 2026

A while back, uBlock Origin broke this website when I visited it. There was a long AI conversation that left me with the impression that the mix of macOS, Firefox and WordPress presented an edge case that could not be resolved. Thus, I went looking for alternatives because I may not be able to convince anyone else to look into it, especially when the issue could be so niche.

One thing that uBlock Origin makes very easy is the custom blocking of web page elements, so that was one thing that I needed to replace. A partial solution comes in the form of the Stylus extension. Though the CSS rules may need to be defined manually after interrogating a web page structure, the same effects can be achieved. In truth, it is not as slick as using a GUI element selector, but I have learned to get past that.

For automatic ad blocking, I have turned to AdGuard AdBlocker. Thus far, it is doing what I need it to do. One thing to note is that it does nothing to stop your registering in website visitor analytics, not that it bothers me at all. That is something that uBlock Origin does out of the box, while my new ad blocker sticks more narrowly to its chosen task, and that suffices for now.

In summary, I have altered my tooling for controlling what websites show me. It is all too easy for otherwise solid tools to register false positives and cause other obstructions. That is why I find myself swapping between them every so often; after all, website code can be far too variable.

Maybe it highlights how challenging it is to make ad blocking and other similar software when your test cases cannot be as extensive as they need to be. Add in something of an arms race between advertisers and ad blockers, and the ante is upped even more. It does not help when we want the things free of charge too.

Finding a better way to uninstall Mac applications

14th January 2026

If you were to consult an AI about uninstalling software under macOS, you would be given a list of commands to run in the Terminal. That feels far less slick than either Linux or Windows. Thus, I set to looking for a cleaner solution. It came in the form of AppCleaner from FreeMacSoft.

This finds the files to remove once you have supplied the name of the app that you wish to uninstall. Once you have reviewed those, you can set it to remove them to the recycling bin, after which they can be expunged from there. Handily, this automates the manual graft that otherwise would be needed.

It amazes me that such an operation is not handled within macOS itself, instead of leaving it to the software providers themselves, or third-party tools like this one. Otherwise, a Mac could get very messy, though Homebrew offers ways of managing software installations for certain cases. Surprisingly, the situation is more free-form than on iOS, too.

Locking your computer screen faster using keyboard shortcuts

11th January 2026

When you are doing paid work on a computer, locking your screen is a healthy practice for ensuring privacy and confidentiality while you are away from your desk for a short while. For years, I have been doing this on Windows using the WIN (Windows key) + L keyboard combination. It is possible on a Mac too, albeit using a different set of keys: CTRL (Control) + CMD (Command) + Q. While the Lock Screen item on the Apple menu will accomplish the same result, a simple keyboard shortcut works much, much faster. On Linux, things are a lot more varied with different desktop environments working in their own way, even making terminal commands a way to go if you can use a heavily abbreviated alias.
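
As a sketch of the abbreviated alias idea on Linux, the line below could go into a shell start-up file. The alias name lk is hypothetical, and it assumes a systemd-based system whose desktop environment responds to loginctl lock-session; other desktops may need a different command.

```shell
# Hypothetical two-letter alias for locking the current session.
# Assumes systemd's loginctl is present and the desktop honours its lock request.
alias lk='loginctl lock-session'
```

Typing lk in a terminal would then be enough to lock the screen on systems where that assumption holds.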

Streamlining text case conversion across multiple apps with a reusable macOS terminal command

10th January 2026

Changing text from mixed case to lower case is something that I often do. Much of the time until recently, this has been accomplished manually, but I started to wonder if a quicker way could be found. Thus, here is one that I use when working in macOS. It involves using a command that works in the terminal, and I have added a short alias for it to my .zshrc file. Here is the full pipeline:

pbpaste | tr '[:upper:]' '[:lower:]' | pbcopy

In the above, pbpaste reads from the paste buffer (or clipboard) while pbcopy writes the final output to the clipboard, replacing what was there before. In between those, tr '[:upper:]' '[:lower:]' changes any upper case letters to lower case ones.
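
For illustration, an alias of this kind might look like the following in a .zshrc file. The names lowercase and lc are hypothetical and not necessarily what I use; splitting the conversion into its own function is merely one way to make it reusable outside the clipboard.

```shell
# Reusable filter: lower-cases whatever arrives on standard input.
lowercase() {
    tr '[:upper:]' '[:lower:]'
}

# Hypothetical alias name; pbpaste and pbcopy are macOS-only commands.
alias lc='pbpaste | lowercase | pbcopy'
```

With that in place, typing lc in a new terminal session runs the whole pipeline.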

With that, the process becomes this: copy the text into the paste buffer, run the command, paste output where it is wanted. While there may be a few steps, it is quicker than doing everything manually or opening another app to do the job. This suffices for now, and I may get to look at something similar for Linux in time.

How to persist R packages across remote Windows server sessions

9th January 2026

Recently, I was using R to automate some code changes that needed implementation when porting code from a vendor to client systems. While I was doing so, I noticed that packages needed to be reinstalled every time that I logged into their system. This was because they were going into a temporary area by default. The solution was to define another location where the packages could be persisted.

That meant creating a .Renviron file, with Windows Explorer making that manoeuvre an awkward one that could not be completed. Using PowerShell was the solution for this. There, I could use the following command to do what I needed:

New-Item -ItemType File "$env:USERPROFILE\Documents\.Renviron" -Force

That gave me an empty .Renviron file, to which I could add the following text for where the packages should be kept (the path may differ on your system):

R_LIBS_USER=C:/R/packages

Here, the paths are only examples and do not always represent what the real ones were, and that is by design for reasons of client confidentiality. Restarting RStudio to give me a fresh R session meant that I now could install packages using commands like this one:

install.packages("tidyverse")

Version constraints meant compilation from source in my case, making for a long wait time for completion. Once that was done, though, there was no need for a repeat operation.

One final remark is that file creation and population could be done in the same command in PowerShell:

'R_LIBS_USER=C:/R/packages' | Out-File -Encoding ascii "$env:USERPROFILE\Documents\.Renviron"

It places the text into a new file or completely overwrites an existing one, meaning that you really want to do this only once, should you decide to add any more setting details to .Renviron later on.

Not so fast: When tasks using AI may take more time and attention than you expect

29th November 2025

If you believed all the hype that surrounds AI, you might believe that all of us would be out of work before we knew it. The truth is that the new technology is not that miraculous, especially when based on some experiences that I have been having. Firstly, there are deficiencies and then there will be new things that need doing as well as becoming possible for the first time.

PowerShell Scripting

One pertained to spinning up PowerShell scripts for doing code reviews of SAS programs submitted by a vendor to a client of mine. While all worked well for simple cases, I found that more complex tasks like finding the datasets used in code and comparing them against what is listed in the program headers became too complicated and probably needed a week of my time to get things in order, which was time that I did not have.

Picking out macro calls from code and comparing them against lists in the headers was more successful because the code situations were less variable. Other tasks were really handy, though, even if I would benefit from AI teaching me how to write PowerShell scripts by myself. That would give me more scope to critique the code that was being produced. Starting simple and progressing one step at a time would ensure sounder embedding of PowerShell commands in my memory.

Article Writing

It is all too tempting to get AI to write articles on subjects of your choosing for website content production. That which sounds like a labour-saving way to go can command a higher amount of attention than some realise. Sometimes, writing it all by yourself might be a better approach, one that I am using for this piece.

My workflow often involves these steps when AI is involved: assembly of the source material, conversion of source material into an article by one AI, fact checking of the same text by another AI and restructuring by that second AI with added links for those wanting to find out more. While human content production is reduced, the need for human oversight, along with fact and link checking, means that time is used in other ways.

In short, it is best not to rush this, as I found when assembling two articles on Canadian rail travel. You also need to watch how much content is being processed because that can both overwhelm human bandwidth and undermine human engagement. This is more than proofreading of what is produced; you need to know something about a given subject yourself too.

Image Production

While AI can do well with producing some images, there are ones where it will struggle because of lack of training. An example is when I asked for an image with cyclists placing bicycles on a bus before boarding it. None of the generated images worked, meaning that a trip to a stock library was in order.

While some can specify everything in a prompt at one sitting, I work more iteratively, which probably adds to any task, especially with image generation. It proves that there still is a place for stock libraries and having your own personal library as well. We need to remain as orchestrators in all of this, and lack of personal talent can remain a limitation.

System Administration

While this may not be something that I do professionally, my keeping an eye on the worlds of DevOps and DevSecOps means that I am seeing that the presence of AI is adding work of its own. This has no sign of lessening, proving that work is changing dramatically instead of reducing, especially when you bring Agentic AI into the equation.

It feels much like the advent of personal computing, which produced a similar seismic shift in the workplace in more innocent times. This time around, nefarious actors are misusing AI, a not unexpected if ominous trend, adding to the security woes that have beset computing for a few decades now.

A Human in the Loop?

At a recent conference, much was being made of keeping humanity in the loop when it came to using AI. There is a catch, though: how do we have engaged humans in the loop? After all, creating computer code allows one to get into flow and remain engaged, possibly overriding any feelings of fatigue. This is what needs replicating, hardly an experience reported with automation in other professions.

The use of AI is a developing field, bringing new challenges as well as solving old problems. That also means upskilling on a grand scale, something that happened over time with personal and business computing. While it looks as if the process could be faster this time around, it is too early to know enough about where this revolution is going to take us. That may be enough to keep us engaged.

Launching SAS Analytics Pro on Viya with automated Docker image clean-up

28th November 2025

For my freelancing, I have a licensed version of SAS Analytics Pro running in a Docker container on my main Linux workstation. Every time there is a new release, a new Docker image is made available, which means that a few of them could accumulate on your system. Aside from taking up disk space that could have other uses, it also makes it tricky to automate the startup of the associated Docker container. Avoiding this means pruning the Docker images available on the system, something that also needs automation.

To make things clearer, let me work through the launch script that I use; this is called by the script that checks for and then downloads any new image that is available, should that be needed. First up is the shebang, and this uses the -e switch to exit the script in the event of there being an error. That puts a stop to any potentially destructive outcomes from later commands being executed afterwards and without having the input that they need.

#!/bin/bash -e
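
One caveat is that a switch on the shebang line only takes effect when the script is executed directly; invoking it as bash script.sh skips the shebang altogether. The sketch below, using a hypothetical helper called demo_errexit, shows the behaviour that -e provides, while the closing comment gives the in-script equivalent.

```shell
# Runs a two-command script under "bash -e": the echo is never reached
# because the failing "false" stops execution first.
demo_errexit() {
    bash -e -c 'false; echo "this line is not reached"'
}

# Inside a script, the equivalent of the shebang switch is a line reading:
#   set -e
```

Calling demo_errexit prints nothing and returns a non-zero status, which is the protective stop described above.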

Next comes the command to shut down the existing container. Should a new image get instated, this would lock up the old one, preventing its removal. Also, doing the rest of the steps with an already running container will result in errors anyway.

if docker container ls -a --format '{{.Names}}' | grep -q '^sas-analytics-pro$'; then
    docker container stop sas-analytics-pro
fi

After that, the step to find the latest image is performed. Once, I did this by looping through the ages by days, weeks and months, hardly an elegant or robust approach. What follows is something all the more effective.

# Find latest SAS Analytics Pro image
IMAGE=$(docker image ls --format '{{.Repository}}:{{.Tag}} {{.CreatedAt}}' \
    | grep 'sas-analytics-pro' \
    | sort -k2,3r \
    | head -n 1 \
    | awk '{print $1}')

echo "Chosen image: $IMAGE"

Since there is quite a lot happening above, let us unpack the actions. The first part lists all Docker images, formatting each line to show the image name (repository:tag) followed by its creation timestamp: docker image ls --format '{{.Repository}}:{{.Tag}} {{.CreatedAt}}'. The next piece picks out all the images that are for SAS Analytics Pro: grep 'sas-analytics-pro'. The crucial step, sort -k2,3r, comes next and this sorts the results by the second and third fields (the creation date and time) in reverse order, so the newest images appear first. With that done, it is time to pick out the most recent image using head -n 1. To pick out the image name, you need awk '{print $1}'. This is all wrapped within IMAGE=$(...) to assign the result to a variable that is printed to the console using an echo statement.
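
To see the selection logic working without Docker, the same pipeline can be fed some made-up lines. Everything in sample_listing is hypothetical (the example.com registry, the tags and the timestamps), and the function names pick_latest and sample_listing are mine for illustration only; the lines simply mimic the name, date and time layout that the format string produces.

```shell
# Pick the newest sas-analytics-pro image from a listing on standard input.
# Each line is expected to look like: repository:tag date time zone...
pick_latest() {
    grep 'sas-analytics-pro' \
        | sort -k2,3r \
        | head -n 1 \
        | awk '{print $1}'
}

# Hypothetical listing standing in for the docker image ls output.
sample_listing() {
    printf '%s\n' \
        'example.com/sas-analytics-pro:2025.09 2025-09-18 10:15:00 +0100 IST' \
        'example.com/other-tool:1.0 2025-12-01 08:00:00 +0000 GMT' \
        'example.com/sas-analytics-pro:2025.11 2025-11-20 09:30:00 +0000 GMT' \
        'example.com/sas-analytics-pro:2025.10 2025-10-16 14:45:00 +0100 IST'
}

sample_listing | pick_latest
```

With this sample data, the output is example.com/sas-analytics-pro:2025.11: the unrelated image is filtered out by grep, and the date and time fields decide the order among the rest.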

With the image selected, you can then spin up the container once you specify the other parameters to use and allow some sleep time afterwards before proceeding to the clean-up steps:

run_args="
-e SASLOCKDOWN=0
--name=sas-analytics-pro
--rm
--detach
--hostname sas-analytics-pro
--env RUN_MODE=developer
--env SASLICENSEFILE=[Path to SAS licence file]
--publish 8080:80
--volume ${PWD}/sasinside:/sasinside
--volume ${PWD}/sasdemo:/data2
--volume [location of SAS files on the system]:/data
--cap-add AUDIT_WRITE
--cap-add SYS_ADMIN
--publish 8222:22
"

if ! docker run -u root ${run_args} "$IMAGE" "$@" > /dev/null 2>&1; then
    echo "Failed to run the image."
    exit 1
fi

sleep 5

With the new container in action, the subsequent step is to find the older images and delete those. Again, the docker image command is invoked, with its output fed to a selection command for SAS Analytics Pro images. Once the current image has been removed from the listing by the grep -v command, the list of images to be deleted is assigned to the IMAGES_TO_REMOVE variable.

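# The trailing "|| true" stops the -e switch from ending the script when no older images exist
IMAGES_TO_REMOVE=$(docker image ls --format '{{.Repository}}:{{.Tag}}' \
    | grep 'sas-analytics-pro' \
    | grep -v "^$IMAGE$" || true)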

echo "Will remove older images:"
echo "$IMAGES_TO_REMOVE"

After that has happened, iterating through the list of images using a for loop will remove them one at a time using the docker image rm command:

for OLD in $IMAGES_TO_REMOVE; do
    echo "Removing $OLD"
    docker image rm "$OLD" || echo "Could not remove $OLD"
done
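
A possible variation on the filtering step is sketched below, under two assumptions: that image names may contain characters such as dots that grep would otherwise read as regular expression syntax, and that there may be no older images at all, in which case grep exits with a non-zero status. The function name older_images is hypothetical and it reads the listing from standard input so that it can be tried without Docker.

```shell
# List every sas-analytics-pro image except the one passed as $1.
# -F treats the name as a fixed string and -x demands a whole-line match;
# "|| true" keeps an empty result from counting as a failure.
older_images() {
    grep 'sas-analytics-pro' | grep -vFx "$1" || true
}
```

In the launch script, the output of docker image ls --format '{{.Repository}}:{{.Tag}}' would be piped into older_images "$IMAGE" in place of the two grep commands.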

All this concludes the operation of spinning up a new SAS Analytics Pro Docker container while also removing any superseded Docker images. One last step is to capture the password to use for logging into the SAS Studio interface that is available at localhost:8080 or whatever address and port is being used to serve the application:

docker logs sas-analytics-pro 2>&1 | grep "Password=" > pw.txt

Folding updating and housekeeping into the same activity as spinning up the Docker container means that I need not think of doing anything else. The time taken by the other activities repays the effort by always having the latest version running in a tidy environment. That saves having to remember to do all of this, which is what would be needed without automation.
