Model Context Protocol | Technology Tales

Latest developments in the AI landscape: Consolidation, implementation and governance

22^nd November 2025

Artificial intelligence is moving through another moment of consolidation and capability gain. New ways to connect models to everyday tools now sit alongside aggressive platform plays from the largest providers, a steady cadence of model upgrades, and a more defined conversation about risk and regulation. For companies trying to turn all this into practical value, the story is becoming less about chasing the latest benchmark and more about choosing a platform, building the right connective tissue, and governing data use with care. The coming year looks set to reward those who simplify the user experience, embed AI directly into work and adopt proportionate controls rather than blanket bans.

I. Market Structure and Competitive Dynamics

Platform Consolidation and Lock-In

Enterprise AI appears to be settling into a two-platform market. Analysts describe a landscape defined more by integration and distribution than raw model capability, evoking the cloud computing wars. On one side sit Microsoft and OpenAI, on the other Google and Gemini. Recent signals include the pricing of Gemini 3 Pro at around two dollars per million tokens, which undercuts much of the market, Alphabet's share price strength, and large enterprise deals for Gemini integrated with Google's wider software suite. Google is also promoting Antigravity, an agent-first development environment with browser control, asynchronous execution and multi-agent support, an attempt to replicate the pull of VS Code within an AI-native toolchain.

The implication for buyers is higher switching costs over time. Few expect true multi-cloud parity for AI, and regional splits will remain. Guidance from industry commentators is to prioritise integration across the existing estate rather than incremental model wins, since platform choices now look like decade-long commitments. Events lined up for next year are already pointing to that platform view.

Enterprise Infrastructure Alignment

A wider shift in software development is also taking shape. Forecasts for 2026 emphasise parallel, multi-agent systems where a planning agent orchestrates a set of execution agents, and harnesses tune themselves as they learn from context. There is growing adoption of a mix-of-models approach in which expensive frontier models handle planning, and cheaper models do the bulk of execution, bringing near-frontier quality for less money and with lower latency. Team structures are changing as a result, with more value placed on people who combine product sense with engineering craft and less on narrow specialisms.

ServiceNow and Microsoft have announced a partnership to coordinate AI agents across organisations with tighter oversight and governance, an attempt to avoid the sprawl that plagued earlier automation waves. Nvidia has previewed Apollo, a set of open AI physics models intended to bring real-time fidelity to simulations used in science and industry. Albania has appointed an AI minister, which has kicked off debate about how governments should manage and oversee their own AI use. CIOs are being urged to lead on agentic AI as systems become capable of automating end-to-end workflows rather than single steps.

New companies and partnerships signal where capital and talent are heading. Jeff Bezos has returned to co-lead Project Prometheus, a start-up with $6.2 billion raised and a team of about one hundred hires from major labs, focused on AI for engineering and manufacturing in the physical world, an aim that aligns with Blue Origin interests. Vik Bajaj is named as co-CEO.

Deals underline platform consolidation. Microsoft and Nvidia are investing up to $5 billion and $10 billion respectively (totalling $15 billion) in Anthropic, whilst Anthropic has committed $30 billion in Azure capacity purchases with plans to co-design chips with Nvidia.

Commercial Model Evolution

Events and product launches continue at pace. xAI has released Grok 4.1 with an emphasis on creativity and emotional intelligence while cutting hallucinations. On the tooling front, tutorials explain how ChatGPT's desktop app can record meetings for later summarisation. In a separate interview, DeepMind's Demis Hassabis set out how Gemini 3 edges out competitors in many reasoning and multimodal benchmarks, slightly trails Claude Sonnet 4.5 in coding, and is being positioned for foundations in healthcare and education though not as a medical-grade system. Google is encouraging developers towards Antigravity for agentic workflows.

Industry leaders are also sketching commercial models that assume more agentic behaviour, with Microsoft's Satya Nadella promising a "positive-sum" vision for AI while hinting at per-agent pricing and wider access to OpenAI IP under Microsoft's arrangements.

II. Technical Implementation and Capability

Practical Connectivity Over Capability

A growing number of organisations are starting with connectors that allow a model to read and write across systems such as Gmail, Notion, calendars, CRMs, and Slack. Delivered via the Model Context Protocol, these links pull the relevant context into a single chat, so users spend less time switching windows and more time deciding what to do. Typical gains are in hours saved each week, lower error rates, and quicker responses. With a few prompts, an assistant can draft executive email summaries, populate a Notion database with leads from scattered sources, or propose CRM follow-ups while showing its working.

The cleanest path is phased: enable one connector using OAuth, trial it in read-only mode, then add simple routines for briefs, meeting preparation or weekly reports before switching on write access with a "show changes before saving" step. Enterprise controls matter here. Connectors inherit user permissions via OAuth 2.0, process data in memory, and vendors point to SOC 2, GDPR and CCPA compliance alongside allow and block lists, policy management, and audit logs. Many governance teams prefer to begin read-only and require approvals for writes.

There are limits to note, including API rate caps, sync delays, context window constraints and timeouts for long workflows. They are poor fits for classified data, considerable bulk operations or transactions that cannot tolerate latency. Some industry observers regard Claude's current MCP implementation, particularly on desktop, as the most capable of the group. Playbooks for a 30-day rollout are beginning to circulate, as are practitioner workshops introducing go-to-market teams to these patterns.

Agentic Orchestration Entering Production

Practical comparisons suggest the surrounding tooling can matter more than the raw model for building production-ready software. One report set a 15-point specification across several environments and found that Claude Code produced all features end-to-end. The same spec built with Gemini 3 inside Antigravity delivered two thirds of the features, while Sonnet 4.5 in Antigravity delivered a little more than half, with omissions around batching, progress indicators and robust error handling.

Security remains a live issue. One newsletter reports that Anthropic said state-backed Chinese hackers misused Claude to autonomously support a large cyberattack, which has intensified calls for governance. The background hum continues, from a jump in voice AI adoption to a German ruling on lyric copyright involving OpenAI, new video guidance steps in Gemini, and an experimental "world model" called Marble. Tools such as Yorph are receiving attention for building agentic data pipelines as teams look to productionise these patterns.

Tooling Maturity Defining Outcomes

In engineering practice, Google's Code Wiki brings code-aware documentation that stays in sync with repositories using Gemini, supported by diagrams and interactive chat. GitLab's latest survey suggests AI increases code creation but also pushes up demand for skilled engineers alongside compliance and human oversight. In operations, Chronosphere has added AI remediation guidance to cut observability noise and speed root-cause analysis while performance testing is shifting towards predictive, continuous assurance rather than episodic tests.

Vertical Capability Gains

While the platform picture firms up, model and product updates continue at pace. Google has drawn attention with a striking upgrade to image generation, based on Gemini 3. The system produces 4K outputs with crisp text across multiple languages and fonts, can use up to 14 reference images, preserves identity, and taps Google Search to ground data for accurate infographics.

Separately, OpenAI has broadened ChatGPT Group Chats to as many as 20 people across all pricing tiers, with privacy protections that keep group content out of a user's personal memory. Consumer advocates have used the moment to call out the risks of AI toys, citing safety, privacy and developmental concerns, even as news continues to flow from research and product teams, from the release of OLMo 3 to mobile features from Perplexity and a partnership between Stability and Warner Music Group.

Anthropic has answered with Claude Opus 4.5, which it says is the first model to break the 80 percent mark on SWE-Bench Verified while improving tool use and reasoning. Opus 4.5 is designed to orchestrate its smaller Haiku models and arrives with a price cut of roughly two thirds compared to the 4.1 release. Product changes include unlimited chat length, a Claude Code desktop app, and integrations that reach across Chrome and Excel.

OpenAI's additions have a more consumer flavour, with a Shopping Research feature in ChatGPT that produces personalised product guidance using a GPT-5 mini variant and plans for an Instant Checkout flow. In government, a new US executive order has launched the "Genesis Mission" under the Department of Energy, aiming to fuse AI capabilities across 17 national labs for advances in fields such as biotechnology and energy.

Coding tools are evolving too. OpenAI has previewed GPT-5.1-Codex-Max, which supports long-running sessions by compacting conversational history to preserve context while reducing overhead. The company reports 30 percent fewer tokens and faster performance over sessions that can run for more than a day. The tool is already available in the Codex CLI and IDE, with an API promised.

Infrastructure news out of the Middle East points to large-scale investment, with Saudi HUMAIN announcing data centre plans including xAI's first international facility alongside chips from Nvidia and AWS, and a nationwide rollout of Grok. In computer vision, Meta has released SAM 3 and SAM 3D as open-source projects, extending segmentation and enabling single-photo 3D reconstruction, while other product rollouts continue from GPT-5.1 Pro availability to fresh funding for audio generation and a marketing tie-up between Adobe and Semrush.

On the image side, observers have noted syntax-aware code and text generation alongside moderation that appears looser than some rivals. A playful "refrigerator magnet" prompt reportedly revealed a portion of the system prompt, a reminder that prompt injection is not just a developer concern.

Video is another area where capabilities are translating into business impact. Sora 2 can generate cinematic, multi-shot videos with consistent characters from text or images, which lets teams accelerate marketing content, broaden A/B testing and cut the need for studios on many projects. Access paths now span web, mobile, desktop apps and an API, and the market has already produced third-party platforms that promise exports without watermarks.

Teams experimenting with Sora are being advised to measure success by outcomes such as conversion rates, lower support loads or improved lead quality rather than just aesthetic fidelity. Implementation advice favours clear intent, structured prompts and iterative variation, with more advanced workflows assembling multi-shot storyboards, using match cuts to maintain rhythm, controlling lighting for continuity and anchoring character consistency across scenes.

III. Governance, Risk and Regulation

Governance as a Product Requirement

Amid all this activity, data risk has become a central theme for AI leaders. One governance specialist has consolidated common problem patterns into the PROTECT framework, which offers a way to map and mitigate the most material risks.

The first concern is the use of public AI tools for work content, which raises the chance of leakage or unwanted training on proprietary data. The recommended answer combines user guidance, approved internal alternatives, and technical or legal controls such as data scanning and blocking.

A second pressure point is rogue internal projects that bypass review, create compliance blind spots and build up technical debt. Proportionate oversight is key, calibrated to data sensitivity and paired with streamlined governance, so teams are not incentivised to route around it.

Third-party vendors can be opportunistic with data, so due diligence and contractual clauses need to prevent cross-customer training and make expectations clear with templates and guidance.

Technical attacks are another strand, from prompt injection to data exfiltration or the misuse of agents. Layered defences help here, including input validation, prompt sanitisation, output filtering, monitoring, red-teaming, and strict limits on access and privilege.

Embedded assistants and meeting bots come with permission risks when they operate over shared drives and channels, and agentic systems can amplify exposure if left unchecked, so the advice is to enforce least-privilege access, start on low-risk data, and keep robust audit trails.

Compliance risks span privacy laws such as GDPR with their demands for a lawful basis, IP and copyright constraints, contractual obligations, and the AI Act's emphasis on data quality. Legal and compliance checks need to be embedded at data sourcing, model training and deployment, backed by targeted training.

Finally, cross-border restrictions matter. Transfers should be mapped across systems and sub-processors, with checks for Data Privacy Framework certification, standard contractual clauses where needed, and transfer impact assessments that take account of both GDPR and newer rules such as the US Bulk Data Transfer Rule.

Regulatory Pragmatism

Regulators are not standing still, either. In the European Commission has proposed amendments to the AI Act through a Digital Omnibus package as the trilogue process rolls on. Six changes are in focus:

High-risk timelines would be tied to the approval of standards, with a backstop of December 2027 for Annex III systems and August 2028 for Annex I products if delays continue, though the original August 2026 date still holds otherwise.
Transparency rules on AI-detectable outputs under Article 50(2) would be delayed to February 2027 for systems placed on the market before August 2026, with no delay for newer systems.
The plan removes the need to register Annex III systems in the public database where providers have documented under Article 6(3) that a system is not high risk.
AI literacy would shift from a mandatory organisation-wide requirement to encouragement, except where oversight of high-risk systems demands it.
There is also a move to centralise supervision by the AI Office for systems built on general-purpose models by the same provider, and for huge online platforms and search engines, which is intended to reduce fragmentation across member states.
Finally, proportionality measures would define Small Mid-Cap companies and extend simplified obligations and penalty caps that currently apply to SMEs.

If adopted, the package would grant more time and reduce administrative load in some areas, at the expense of certainty and public transparency.

IV. Strategic Implications

The picture that emerges is one of pragmatic integration. Connectors make it feasible to keep work inside a single chat while drawing on the systems people already use. Platform choices are converging, so it makes sense to optimise for the suite that fits the current stack and to plan for switching costs that accumulate over time.

Agentic orchestration is moving from slides to code, but teams will get further by focusing on reliable tooling, clear governance and value measures that match business goals. Regulation is edging towards more flexible timelines and centralised oversight in places, which may lower administrative load without removing the need for discipline.

The sensible posture is measured experimentation: start with read-only access to lower-risk data, design routines that remove drudgery, introduce write operations with approvals, and monitor what is actually changing. The tools are improving quickly, yet the organisations that benefit most will be those that match innovation with proportionate controls and make thoughtful choices now that will hold their shape for the decade ahead.

Mixing local and cloud capabilities in an AI toolkit

9^th September 2025

The landscape of AI development is shifting towards systems that prioritise local control, privacy and efficient resource management whilst maintaining the flexibility to integrate with external services when needed. This guide explores how to build a comprehensive AI toolkit that balances these concerns through seven key principles: local-first architecture, privacy preservation, standardised tool integration, workflow automation, autonomous agent development, efficient resource management and multi-modal knowledge handling.

Local-First Architecture and Control

The foundation of a robust AI toolkit begins with maintaining direct control over core components. Rather than relying entirely on cloud services, a local-first approach provides predictable costs, enhanced privacy and improved reliability whilst still allowing selective use of external resources.

Llama-Swap exemplifies this philosophy as a lightweight proxy that manages multiple language models on a single machine. This tool listens for OpenAI-style API calls, inspects the model field in each request, and ensures that the correct backend handles that call. The proxy intelligently starts or stops local LLM servers so only the required model runs at any given time, making efficient use of limited hardware resources.

Setting up this system requires minimal infrastructure: Python 3, Homebrew on macOS for package management, llama.cpp for hosting GGUF models locally and the Hugging Face CLI for model downloads. The proxy itself is a single binary that can be configured through a simple YAML file, specifying model paths and commands. This approach transforms model switching from a manual process of stopping and starting different servers into a seamless experience where clients can request different models through a single port.

The local-first principle extends beyond model hosting. Obsidian demonstrates this with its markdown-based knowledge management system that stores everything locally whilst providing rich linking capabilities and plugin extensibility. This gives users complete control over their data, whilst maintaining the ability to sync across devices when desired.

Privacy and Data Sovereignty

Privacy considerations permeate every aspect of AI toolkit design. Local processing inherently reduces exposure of sensitive data to external services, but even when cloud services are necessary, careful evaluation of data handling practices becomes crucial.

Voice processing illustrates these concerns clearly. ElevenLabs offers high-quality text-to-speech and voice cloning capabilities but requires careful assessment of consent and security policies when handling voice data. Similarly, services like NoteGPT that process documents and videos must be evaluated against regional regulations such as GDPR, particularly when handling sensitive information.

The principle of data minimisation suggests using local processing wherever feasible and cloud services only when their capabilities significantly outweigh privacy concerns. This might mean running smaller language models locally for routine tasks, whilst reserving larger cloud models for complex reasoning that exceeds local capacity.

Tool Integration and Standardisation

As AI systems become more sophisticated, the ability to integrate diverse tools through standardised protocols becomes essential. The Model Context Protocol (MCP) addresses this need by defining how lightweight servers present databases, file systems and web services to AI models in a secure, auditable manner.

MCP servers act as bridges between AI models and real systems, whilst MCP clients are applications that discover and utilise these servers. This standardisation enables a rich ecosystem of tools that can be mixed and matched according to specific needs.

Several clients demonstrate different approaches to MCP integration. Claude Desktop auto-starts configured servers on launch, making tools immediately available. Cursor AI and Windsurf integrate MCP servers directly into coding environments, allowing function calls to route to custom servers automatically. Continue provides open-source alternatives for VS Code and JetBrains, whilst LibreChat offers a flexible chat interface that can connect to various model providers and MCP servers.

The standardisation extends to development workflows through tools like Claude Code, which integrates with GitHub repositories to automate routine tasks. By creating a Claude GitHub App, developers can use natural language comments to trigger actions like generating Docker configurations, reviewing code or updating documentation.

Workflow Automation and Productivity

Effective AI toolkits streamline repetitive tasks and augment human decision-making, rather than replacing it entirely. This automation spans from simple content generation to complex research workflows that combine multiple tools and services.

A practical research workflow demonstrates this integration. Beginning with a focused question, Perplexity AI can generate citation-backed reports using its deep research capability. These reports, exported as PDFs, can then be uploaded to NotebookLM for interactive exploration. NotebookLM transforms static content into searchable material, generates audio overviews that render complex topics as podcast-style conversations, and builds mind maps to reveal relationships between concepts.

This multi-stage process turns surface reading into grounded understanding by enabling different modes of engagement with the same material. The automation handles the mechanical aspects of research synthesis, whilst preserving human judgement about relevance and interpretation.

Repository management represents another automation frontier. GitHub integrations can handle issue triage, code review, documentation updates and refactoring through natural language instructions. This reduces cognitive overhead for routine maintenance whilst maintaining developer control over significant decisions.

Agentic AI and Autonomous Systems

The evolution from reactive prompt-response systems to goal-oriented agents represents a fundamental shift in AI system design. Agentic systems can plan across multiple steps, initiate actions when conditions warrant, and pursue long-running objectives with minimal supervision.

These systems typically combine several architectural components: a reasoning engine (usually an LLM with structured prompting), memory layers for preserving context, knowledge bases accessible through vector search and tool interfaces that standardise how agents discover and use external capabilities.

Patterns like ReAct interleave reasoning steps with tool calls, creating observe-think-act loops that enable continuous adaptation. Modern AI systems employ planning-first agents that formulate strategies before execution and adapt dynamically, alongside multi-agent architectures that coordinate specialist roles through hierarchical or peer-to-peer protocols.

Practical applications illustrate these concepts clearly. An autonomous research agent might formulate queries, rank sources, synthesise material and draft reports, demonstrating how complex goals can be decomposed into manageable subtasks. A personal productivity assistant could manage calendars, emails and tasks, showing how agents can integrate with external APIs whilst learning user preferences.

Safety and alignment remain paramount concerns. Constraints, approval gates and override mechanisms guard against harmful behaviour, whilst feedback mechanisms help maintain alignment with human intent. The goal is augmentation rather than replacement, with human oversight remaining essential for significant decisions.

Resource Management and Efficiency

Efficient resource utilisation becomes critical when running multiple AI models and services on limited hardware. This involves both technical optimisation and strategic choices about when to use local versus cloud resources.

Llama-Swap's selective concurrency feature exemplifies intelligent resource management. Whilst the default behaviour runs only one model at a time to conserve resources, groups can be configured to allow several smaller models to remain active together whilst maintaining swapping for larger models. This provides predictable resource usage without sacrificing functionality.

Model quantisation represents another efficiency strategy. GGUF variants of models like SmolLM2-135M-Instruct and Qwen2.5-0.5B-Instruct can run effectively on modest hardware whilst still providing distinct capabilities for different tasks. The trade-off between model size and capability can be optimised for specific use cases.

Cloud services complement local resources by handling computationally intensive tasks that exceed local capacity. The key is making these transitions seamless, so users can benefit from both approaches without managing complexity manually.

Multi-Modal Knowledge Management

Modern AI toolkits must handle diverse content types and enable fluid transitions between different modes of interaction. These span text processing, audio generation, visual content analysis and format conversion.

NotebookLM demonstrates sophisticated multi-modal capabilities by accepting various input formats (PDFs, images, tables) and generating different output modes (summaries, audio overviews, mind maps, study guides). This flexibility enables users to engage with information in ways that match their learning preferences and situational constraints.

NoteGPT extends this concept to video and presentation processing, extracting transcripts, segmenting content and producing summaries with translation capabilities. The challenge lies in preserving nuance during automated processing whilst making content more accessible.

Integration between different knowledge management approaches creates additional value. Notion's workspace approach combines notes, tasks, wikis and databases with recent additions like email integration and calendar synchronisation. Evernote focuses on mixed media capture and web clipping with cross-platform synchronisation.

The goal is creating systems that can capture information in its natural format, process it intelligently, and present it in ways that facilitate understanding and action.

Conclusion

Building an effective AI toolkit requires balancing multiple concerns: maintaining control over sensitive data whilst leveraging powerful cloud services, automating routine tasks whilst preserving human judgement, and optimising resource usage whilst maintaining system flexibility. The market demand for these skills is growing rapidly, with companies actively seeking professionals who can implement RAG systems, build reliable agents and manage hybrid AI architectures.

The local-first approach provides a foundation for this balance, giving users control over their data and computational resources whilst enabling selective integration with external services. RAG has evolved from a technical necessity for small context windows to a strategic choice for cost reduction and reliability improvement. Standardised protocols like MCP make it practical to combine diverse tools without vendor lock-in. Workflow automation reduces cognitive overhead for routine tasks, and agentic capabilities enable more sophisticated goal-oriented behaviour.

Success depends on thoughtful integration rather than simply accumulating tools. The most effective systems combine local processing for privacy-sensitive tasks, cloud services for capabilities that exceed local resources, and standardised interfaces that enable experimentation and adaptation as needs evolve. Whether the goal is reducing API costs through efficient RAG implementation or building agents that prevent hallucinations through grounded retrieval, the principles remain consistent: maintain control, optimise resources and preserve human oversight.

This approach creates AI toolkits that are not only adaptable, secure and efficient but also commercially viable and career-relevant in a rapidly evolving landscape where the ability to build reliable, cost-effective AI systems has become a competitive necessity.

An Overview of MCP Servers in Visual Studio Code

29^th August 2025

Agent mode in Visual Studio Code now supports an expanding ecosystem of Model Context Protocol servers that equip the editor’s built-in assistant with practical tools. By installing these servers, an agent can connect to databases, invoke APIs and perform automated or specialised operations without leaving the development environment. The result is a more capable workspace where routine tasks are streamlined, and complex ones are broken into more manageable steps. The catalogue spans developer tooling, productivity services, data and analytics, business platforms, and cloud or infrastructure management. If something you rely on is not yet present, there is a route to suggest further additions. Guidance on using MCP tools in agent mode is available in the documentation, and the Command Palette, opened with Ctrl+Shift+P, remains the entry point for many workflows.

The servers in the developer tools category concentrate on everyday software tasks. GitHub integration brings repositories, issues and pull requests into reach through a secure API, so that code review and project coordination can continue without switching context. For teams who use design files as a source of truth, Figma support extracts UI content and can generate code from designs, with the note that using the latest desktop app version is required for full functionality. Browser automation is covered by Playwright from Microsoft, which drives tests and data collection using accessibility trees to interact with the page, a technique that often results in more resilient scripts. The attention to quality and reliability continues with Sentry, where an agent can retrieve and analyse application errors or performance issues directly from Sentry projects to speed up triage and resolution.

The breadth of developer capability extends to machine learning and code understanding. Hugging Face integration provides access to models, datasets and Spaces on the Hugging Face Hub, which is useful for prototyping, evaluation or integrating inference into tools. For source exploration beyond a single repository, DeepWiki by Kevin Kern offers querying and information extraction from GitHub repositories indexed on that service. Converting documents is handled by MarkItDown from Microsoft, which takes common files like PDF, Word, Excel, images or audio and outputs Markdown, unifying content for notes, documentation or review. Finding accurate technical guidance is eased by Microsoft Docs, a Microsoft-provided server that searches Microsoft Learn, Azure documentation and other official technical resources. Complementing this is Context7 from Upstash, which returns up-to-date, version-specific documentation and code examples from any library or framework, an approach that addresses the common problem of answers drifting out of date as software evolves.

Visual assets and code health have their own role. ImageSorcery by Sunrise Apps performs local image processing tasks, including object detection, OCR, editing and other transformations, a capability that supports anything from quick asset tweaks to automated checks in a content pipeline. Codacy completes the developer picture with comprehensive code quality and security analysis. It covers static application security testing, secrets detection, dependency scanning, infrastructure as code security and automated code review, which helps teams maintain standards while moving quickly.

Productivity services focus on planning, tracking and knowledge capture. Notion’s server allows viewing, searching, creating and updating pages and databases, meaning an agent can assemble notes or checklists as it progresses. Linear integration brings the ability to create, update and track issues in Linear’s project management platform, reflecting a growing preference for lightweight, developer-centred planning. Asana support provides task and project management together with comments, allowing multi-team coordination. Atlassian’s server connects to Jira and Confluence for issue tracking and documentation, which suits organisations that rely on established workflows for governance and audit trails. Monday.com adds another project management option, with management of boards, items, users, teams and workspace operations. These capabilities sit alongside automation from Zapier, which can create workflows and execute tasks across more than 30,000 connected apps to remove repetitive steps and bind systems together when native integrations are limited.

Two Model Context Protocol utilities add cognitive structure to how the agent works. Sequential Thinking helps break down complex tasks into manageable steps with transparent tracking, so progress is visible and revisable. Memory provides long-lived context across sessions, allowing an agent to store and retrieve relevant information rather than relying on a single interaction. Together, they address the practicalities of working on multi-stage tasks where recalling decisions, constraints or partial results is as important as executing the next action. Used with the productivity servers, these tools underpin a systematic approach to projects that span hours or days.

The data and analytics group is comprehensive, stretching from lightweight local analysis to cloud-scale services. DuckDB by Kentaro Tanaka enables querying and analysis of DuckDB databases both locally and in the cloud, which suits ad hoc exploration as well as embedded analytics in applications. Neon by neondatabase labs provides access to Postgres with the notable addition of natural language operations for managing and querying databases, which lowers the barrier to occasional administrative tasks. Prisma Postgres from Prisma brings schema management, query execution, migrations and data modelling to the agent, supporting teams who already use Prisma’s ORM in their applications. MongoDB integration supports database operations and management, with the ability to execute queries, manage collections, build aggregation pipelines and perform document operations, allowing front-end and back-end tasks to be coordinated through a single interface.

Observability and product insight are also represented. PostHog offers analytics access for creating annotations and retrieving product usage insights so that changes can be correlated with user behaviour. Microsoft Clarity provides analytics data including heatmaps, session recordings and other user behaviour insights that complement quantitative metrics and highlight usability issues. Web data collection has two strong options. Apify connects the agent with Apify’s Actor ecosystem to extract data from websites and automate broader workflows built on that platform. Firecrawl by Mendable focuses on extracting data from websites using web scraping, crawling and search with structured data extraction, a combination that suits building datasets or feeding search indexes. These tools bridge real-world usage and the development cycle, keeping decision-making grounded in how software is experienced.

The business services category addresses payments, customer engagement and web presence. Stripe integration allows the creation of customers, management of subscriptions and generation of payment links through Stripe APIs, which is often enough to pilot monetisation or administer accounts. PayPal provides the ability to create invoices, process payments and access transaction data, ensuring another widely used channel can be managed without bespoke scripts. Square rounds out payment options with facilities to process payments and manage customers across its API ecosystem. Intercom support brings access to customer conversations and support tickets for data analysis, allowing an agent to summarise themes, surface follow-ups or route issues to the right place. For building and running sites, Wix integration helps with creating and managing sites that include e-commerce, bookings and payment features, while Webflow enables creating and managing websites, collections and content through Webflow’s APIs. Together, these options cover a spectrum of online business needs, from storefronts to content-led marketing.

Cloud and infrastructure operations are often the backbone of modern projects, and the MCP catalogue reflects this. Convex provides access to backend databases and functions for real-time data operations, making it possible to work with stateful server logic directly from agent mode. Azure integration supports management of Azure resources, database queries and access to Azure services so that provisioning, configuration and diagnostics can be performed in context. Azure DevOps extends this to project and release processes with management of projects, work items, repositories, builds, releases and test plans, providing an end-to-end view for teams invested in Microsoft’s tooling. Terraform from HashiCorp introduces infrastructure as code management, including plan, apply and destroy operations, state management and resource inspection. This combination makes it feasible to review and adjust infrastructure, coordinate deployments and correlate changes with code or issue history without switching tools.

These servers are designed to be installed like other VS Code components, visible from the MCP section and accessible in agent mode once configured. Many entries provide a direct route to installation, so setup friction is limited. Some include specific requirements, such as Figma’s need for the latest desktop application, and all operate within the Model Context Protocol so that the agent can call tools predictably. The documentation explains usage patterns for each category, from parameterising database queries to invoking external APIs, and clarifies how capabilities appear inside agent conversations. This is useful for understanding the scope of what an agent can do, as well as for setting boundaries in shared environments.

In day-to-day use, the value comes from combining servers to match a workflow. A developer investigating a production incident might consult Sentry for errors, query Microsoft Docs for guidance, pull related issues from GitHub and draft changes to documentation with MarkItDown after analysing logs held in DuckDB. A product manager could retrieve usage insights from PostHog, review session recordings in Microsoft Clarity, create follow-up tasks in Linear and brief customer support by summarising Intercom conversations, all while keeping a running Memory of key decisions. A data practitioner might gather inputs from Firecrawl or Apify, store intermediates in MongoDB, perform local analysis in DuckDB and publish a report to Notion, building a repeatable chain with Zapier where steps can be automated. In infrastructure scenarios, Terraform changes can be planned and applied while Azure resources are inspected, with release coordination handled through Azure DevOps and updates documented in Confluence via the Atlassian server.

Security and quality concerns are woven through these flows. Codacy can evaluate code for vulnerabilities or antipatterns as changes are proposed, surfacing SAST findings, secrets detection problems or dependency risks before they progress. Stripe, PayPal and Square centralise payment operations to a few well-audited APIs rather than bespoke integrations, which reduces surface area and simplifies auditing. For content and data ingestion, ImageSorcery ensures that image transformations occur locally and MarkItDown produces traceable Markdown outputs from disparate file types, keeping artefacts consistent for reviews or archives. Sequential Thinking helps structure longer tasks, and Memory preserves context so that actions are explainable after the fact, which is helpful for compliance as well as everyday collaboration.

Discoverability and learning resources sit close to the tools themselves. The Visual Studio Code website’s navigation surfaces areas such as Docs, Updates, Blog, API, Extensions, MCP, FAQ and Dev Days, while the Download path remains clear for new installations. The MCP area groups servers by capability and links to documentation that explains how agent mode calls each tool. Outside the product, the project’s presence on GitHub provides a route to raise issues or follow changes. Community activity continues on channels including X, LinkedIn, Bluesky and Reddit, and there are broadcast updates through the VS Code Insiders Podcast, TikTok and YouTube. These outlets provide context for new server additions, changes to the protocol and examples of how teams are putting the pieces together, which can be as useful as the tools themselves when establishing good practices.

It is worth noting that the catalogue is curated but open to expansion. If there is an MCP server that you expect to see, there is a path to suggest it, so gaps can be addressed over time. This flows from the protocol’s design, which encourages clean interfaces to external systems, and from the way agent mode surfaces capabilities. The cumulative effect is that the assistant inside VS Code becomes a practical co-worker that can search documentation, change infrastructure, file issues, analyse data, process payments or summarise customer conversations, all using the same set of controls and the same context. The common protocol keeps these interactions predictable, so adding a new server feels familiar even when the underlying service is new.

As the ecosystem grows, the connection between development work and operations becomes tighter, and the assistant’s job is less about answering questions in isolation than orchestrating tools on the developer’s behalf. The MCP servers outlined here provide a foundation for that shift. They encapsulate the services that many teams already rely on and present them inside agent mode so that work can continue where the code lives. For those getting started, the documentation explains how to enable the tools, the Command Palette offers quick access, and the community channels provide a steady stream of examples and updates. The result is a VS Code experience that is better equipped for modern workflows, with MCP servers supplying the functionality that turns agent mode into a practical extension of everyday work.