AI code agents | Technology Tales

Not so fast: When tasks using AI may take more time and attention than you expect

29^th November 2025

If you believed all the hype that surrounds AI, you might believe that all of us would out of work before we knew it. The truth is that the new technology is not that miraculous, especially when based on some experiences that I have been having. Firstly, there are deficiencies and then there will be new things that need doing as well as becoming possible for the first time.

PowerShell Scripting

One pertained to spinning up PowerShell scripts for doing code reviews of SAS programs submitted by a vendor to a client of mine. While all worked well for simple cases, I found that more complex tasks like finding the datasets using in code and comparing them against what is listed in the program headers became too complicated and probably needed a week of my time to get things in order, which was the amount of time that I did not have.

Picking out macro calls from code and comparing them against lists in the headers was more successful because the code situations were less variable. Other tasks were really handy, though, even if I would benefit from AI teaching me how to write PowerShell scripts by myself. That would give me more scope to critique the code that was being produced. Starting simple and progressing one step at a time would ensure sounder embedding of PowerShell commands in my memory.

Article Writing

It is all too tempting to get AI to write articles on subjects of your choosing for website content production. That which sounds like a labour-saving way to go can command a higher amount of attention than some realise. Sometimes, writing it all by yourself might be a better approach, one that I am using for this piece.

My workflow often involves these steps when AI is involved: assembly of the source material, conversion of source material into an article by one AI, fact checking of the same text by another AI and restructuring by that second AI with added links for those wanting to find out more. While human content production is reduced, the need for human oversight, along with fact and link checking, means that time is used in other ways.

In short, it is best not to rush this, as I found when assembling two articles on Canadian rail travel. You also need to watch how much content is being processed because that can both overwhelm human bandwidth and undermine human engagement. This is more than proofreading of what is produced; you need to know something about a given subject yourself too.

Image Production

While AI can do well with producing some images, there are ones where it will struggle because of lack of training. An example is when I asked for an image with cyclists placing bicycles on a bus before boarding it. None of the generated images worked, meaning that a trip to a stock library was in order.

While some can specify everything in a prompt at one sitting, I work more iteratively, which probably adds to any task, especially with image generation. It proves that still is a place for stock libraries and having your own personal library as well. We need to remain as orchestrators in all of this, and lack of personal talent can remain a limitation.

System Administration

While this may not be something that I do professionally, my keeping an eye on the worlds of DevOps and DevSecOps means that I am seeing that the presence of AI is adding work of its own. This has no sign of lessening, proving that work is changing dramatically instead of reducing, especially you bring Agentic AI into the equation.

It feels much like the advent of personal computing and that produced a similar seismic shift in the workplace in more innocent times. This time around, nefarious actors are misusing AI, a not unexpected if ominous trend, adding to the security woes that have beset computing for a few decades now.

A Human in the Loop?

At a recent conference, much was being made of keeping humanity in the loop when it came to using AI. There is a catch, though: how do we have engaged humans in the loop? After all, creating computer code allows one to get into flow and remain engaged, possibly overriding any feelings of fatigue. This is what needs replicating, hardly an experience reported with automation in other professions.

The use of AI is a developing field, bringing new challenges as well as solving old problems. That also means upskilling on a grand scale, something happened over time with personal and business computing. While it looks as if the process could be faster this time around, it is too early to know enough about where this revolution is going to take us. That may be enough to keep us engaged.

Mixing local and cloud capabilities in an AI toolkit

9^th September 2025

The landscape of AI development is shifting towards systems that prioritise local control, privacy and efficient resource management whilst maintaining the flexibility to integrate with external services when needed. This guide explores how to build a comprehensive AI toolkit that balances these concerns through seven key principles: local-first architecture, privacy preservation, standardised tool integration, workflow automation, autonomous agent development, efficient resource management and multi-modal knowledge handling.

Local-First Architecture and Control

The foundation of a robust AI toolkit begins with maintaining direct control over core components. Rather than relying entirely on cloud services, a local-first approach provides predictable costs, enhanced privacy and improved reliability whilst still allowing selective use of external resources.

Llama-Swap exemplifies this philosophy as a lightweight proxy that manages multiple language models on a single machine. This tool listens for OpenAI-style API calls, inspects the model field in each request, and ensures that the correct backend handles that call. The proxy intelligently starts or stops local LLM servers so only the required model runs at any given time, making efficient use of limited hardware resources.

Setting up this system requires minimal infrastructure: Python 3, Homebrew on macOS for package management, llama.cpp for hosting GGUF models locally and the Hugging Face CLI for model downloads. The proxy itself is a single binary that can be configured through a simple YAML file, specifying model paths and commands. This approach transforms model switching from a manual process of stopping and starting different servers into a seamless experience where clients can request different models through a single port.

The local-first principle extends beyond model hosting. Obsidian demonstrates this with its markdown-based knowledge management system that stores everything locally whilst providing rich linking capabilities and plugin extensibility. This gives users complete control over their data, whilst maintaining the ability to sync across devices when desired.

Privacy and Data Sovereignty

Privacy considerations permeate every aspect of AI toolkit design. Local processing inherently reduces exposure of sensitive data to external services, but even when cloud services are necessary, careful evaluation of data handling practices becomes crucial.

Voice processing illustrates these concerns clearly. ElevenLabs offers high-quality text-to-speech and voice cloning capabilities but requires careful assessment of consent and security policies when handling voice data. Similarly, services like NoteGPT that process documents and videos must be evaluated against regional regulations such as GDPR, particularly when handling sensitive information.

The principle of data minimisation suggests using local processing wherever feasible and cloud services only when their capabilities significantly outweigh privacy concerns. This might mean running smaller language models locally for routine tasks, whilst reserving larger cloud models for complex reasoning that exceeds local capacity.

Tool Integration and Standardisation

As AI systems become more sophisticated, the ability to integrate diverse tools through standardised protocols becomes essential. The Model Context Protocol (MCP) addresses this need by defining how lightweight servers present databases, file systems and web services to AI models in a secure, auditable manner.

MCP servers act as bridges between AI models and real systems, whilst MCP clients are applications that discover and utilise these servers. This standardisation enables a rich ecosystem of tools that can be mixed and matched according to specific needs.

Several clients demonstrate different approaches to MCP integration. Claude Desktop auto-starts configured servers on launch, making tools immediately available. Cursor AI and Windsurf integrate MCP servers directly into coding environments, allowing function calls to route to custom servers automatically. Continue provides open-source alternatives for VS Code and JetBrains, whilst LibreChat offers a flexible chat interface that can connect to various model providers and MCP servers.

The standardisation extends to development workflows through tools like Claude Code, which integrates with GitHub repositories to automate routine tasks. By creating a Claude GitHub App, developers can use natural language comments to trigger actions like generating Docker configurations, reviewing code or updating documentation.

Workflow Automation and Productivity

Effective AI toolkits streamline repetitive tasks and augment human decision-making, rather than replacing it entirely. This automation spans from simple content generation to complex research workflows that combine multiple tools and services.

A practical research workflow demonstrates this integration. Beginning with a focused question, Perplexity AI can generate citation-backed reports using its deep research capability. These reports, exported as PDFs, can then be uploaded to NotebookLM for interactive exploration. NotebookLM transforms static content into searchable material, generates audio overviews that render complex topics as podcast-style conversations, and builds mind maps to reveal relationships between concepts.

This multi-stage process turns surface reading into grounded understanding by enabling different modes of engagement with the same material. The automation handles the mechanical aspects of research synthesis, whilst preserving human judgement about relevance and interpretation.

Repository management represents another automation frontier. GitHub integrations can handle issue triage, code review, documentation updates and refactoring through natural language instructions. This reduces cognitive overhead for routine maintenance whilst maintaining developer control over significant decisions.

Agentic AI and Autonomous Systems

The evolution from reactive prompt-response systems to goal-oriented agents represents a fundamental shift in AI system design. Agentic systems can plan across multiple steps, initiate actions when conditions warrant, and pursue long-running objectives with minimal supervision.

These systems typically combine several architectural components: a reasoning engine (usually an LLM with structured prompting), memory layers for preserving context, knowledge bases accessible through vector search and tool interfaces that standardise how agents discover and use external capabilities.

Patterns like ReAct interleave reasoning steps with tool calls, creating observe-think-act loops that enable continuous adaptation. Modern AI systems employ planning-first agents that formulate strategies before execution and adapt dynamically, alongside multi-agent architectures that coordinate specialist roles through hierarchical or peer-to-peer protocols.

Practical applications illustrate these concepts clearly. An autonomous research agent might formulate queries, rank sources, synthesise material and draft reports, demonstrating how complex goals can be decomposed into manageable subtasks. A personal productivity assistant could manage calendars, emails and tasks, showing how agents can integrate with external APIs whilst learning user preferences.

Safety and alignment remain paramount concerns. Constraints, approval gates and override mechanisms guard against harmful behaviour, whilst feedback mechanisms help maintain alignment with human intent. The goal is augmentation rather than replacement, with human oversight remaining essential for significant decisions.

Resource Management and Efficiency

Efficient resource utilisation becomes critical when running multiple AI models and services on limited hardware. This involves both technical optimisation and strategic choices about when to use local versus cloud resources.

Llama-Swap's selective concurrency feature exemplifies intelligent resource management. Whilst the default behaviour runs only one model at a time to conserve resources, groups can be configured to allow several smaller models to remain active together whilst maintaining swapping for larger models. This provides predictable resource usage without sacrificing functionality.

Model quantisation represents another efficiency strategy. GGUF variants of models like SmolLM2-135M-Instruct and Qwen2.5-0.5B-Instruct can run effectively on modest hardware whilst still providing distinct capabilities for different tasks. The trade-off between model size and capability can be optimised for specific use cases.

Cloud services complement local resources by handling computationally intensive tasks that exceed local capacity. The key is making these transitions seamless, so users can benefit from both approaches without managing complexity manually.

Multi-Modal Knowledge Management

Modern AI toolkits must handle diverse content types and enable fluid transitions between different modes of interaction. These span text processing, audio generation, visual content analysis and format conversion.

NotebookLM demonstrates sophisticated multi-modal capabilities by accepting various input formats (PDFs, images, tables) and generating different output modes (summaries, audio overviews, mind maps, study guides). This flexibility enables users to engage with information in ways that match their learning preferences and situational constraints.

NoteGPT extends this concept to video and presentation processing, extracting transcripts, segmenting content and producing summaries with translation capabilities. The challenge lies in preserving nuance during automated processing whilst making content more accessible.

Integration between different knowledge management approaches creates additional value. Notion's workspace approach combines notes, tasks, wikis and databases with recent additions like email integration and calendar synchronisation. Evernote focuses on mixed media capture and web clipping with cross-platform synchronisation.

The goal is creating systems that can capture information in its natural format, process it intelligently, and present it in ways that facilitate understanding and action.

Conclusion

Building an effective AI toolkit requires balancing multiple concerns: maintaining control over sensitive data whilst leveraging powerful cloud services, automating routine tasks whilst preserving human judgement, and optimising resource usage whilst maintaining system flexibility. The market demand for these skills is growing rapidly, with companies actively seeking professionals who can implement RAG systems, build reliable agents and manage hybrid AI architectures.

The local-first approach provides a foundation for this balance, giving users control over their data and computational resources whilst enabling selective integration with external services. RAG has evolved from a technical necessity for small context windows to a strategic choice for cost reduction and reliability improvement. Standardised protocols like MCP make it practical to combine diverse tools without vendor lock-in. Workflow automation reduces cognitive overhead for routine tasks, and agentic capabilities enable more sophisticated goal-oriented behaviour.

Success depends on thoughtful integration rather than simply accumulating tools. The most effective systems combine local processing for privacy-sensitive tasks, cloud services for capabilities that exceed local resources, and standardised interfaces that enable experimentation and adaptation as needs evolve. Whether the goal is reducing API costs through efficient RAG implementation or building agents that prevent hallucinations through grounded retrieval, the principles remain consistent: maintain control, optimise resources and preserve human oversight.

This approach creates AI toolkits that are not only adaptable, secure and efficient but also commercially viable and career-relevant in a rapidly evolving landscape where the ability to build reliable, cost-effective AI systems has become a competitive necessity.