Sonnet | Technology Tales

Vibe Coding, AI App Builders and the Changing Shape of Software Creation

28th May 2026

A distinct cluster of digital tools has been forming around software creation, and it does not fit especially neatly into older categories. Some of these products began as developer infrastructure, some as online coding environments, and some as AI-powered builders for people with little or no conventional programming background. Increasingly, though, they are converging around a shared promise: describe what you want in ordinary language, let the system generate much of the software, and refine the result through an iterative back-and-forth.

That convergence is why platforms such as Vercel, v0, Replit, Bolt.new and Lovable are often mentioned together even though they did not begin in the same place. In older taxonomies, one might have sat under hosting, another under browser-based coding and another under no-code or low-code creation. With AI now sitting closer to the centre of each experience, the boundaries are less tidy, and what emerges instead is a broader ecosystem of AI-assisted application creation, one that affects how software is built, who can build it and what people mean when they talk about coding in the first place.

The Term That Named the Movement

Before examining the individual platforms, it is worth understanding where the phrase "vibe coding" came from, since it now frames so much of the conversation around these tools. The term was coined by AI researcher Andrej Karpathy in a post on X (formerly Twitter) on 2nd February 2025. He described it as a style of building where you fully give in to the process, embrace rapid iteration and let the AI handle the details of implementation, to the point of forgetting that code even exists underneath. The phrase spread rapidly, and before the year was out Collins Dictionary had named it its Word of the Year for 2025, a recognition of just how thoroughly the idea had entered mainstream discourse.

Karpathy's framing was originally casual and deliberately provocative. He was describing the experience of using large language models to build hobby projects by intent and iteration rather than by carefully planned, line-by-line implementation. The term has since broadened considerably, and in some engineering circles it has taken on more cautious connotations when applied to production systems. Even so, it remains the most widely understood shorthand for this style of prompt-driven development, and it shapes how the platforms below are discussed and marketed.

Vercel and Next.js

At one end of this landscape sits Vercel, which still fits most cleanly under software development tools enhanced by AI. Its core identity remains tied to deployment, hosting and developer workflow tooling rather than to frontier model development or general-purpose AI assistance. Next.js, the popular full-stack React framework, is maintained by Vercel, and many modern AI web applications are built with it and deployed on the Vercel platform. Because so many of those applications draw on models and services from companies such as OpenAI, Anthropic and Replicate, Vercel can appear closer to the AI conversation than a traditional hosting platform might once have done.

Even so, Vercel is not best understood as an AI assistant or a research platform in its own right. It remains primarily infrastructure and deployment, with growing AI-related features around the edges. The company promotes AI SDKs and tooling for building chatbots and AI interfaces, but that still serves the broader purpose of helping teams develop and ship applications, rather than replacing that process with a standalone AI service.

v0 by Vercel

The picture changes when v0 enters the discussion. It began as a form of generative UI, focused on AI-generated React and Next.js interfaces and on rapid frontend prototyping. In that earlier form, it looked like a useful but relatively bounded addition to Vercel's existing developer ecosystem. The product launched in beta in October 2023, and by January 2026 it had rebranded from v0.dev to v0.app, with over six million developers using the platform by that point. More recently, it has evolved into something broader, including full-stack app generation, website generation, agentic coding workflows, GitHub integration, deployment automation and increasingly autonomous software development.

That makes the Vercel ecosystem easier to understand when its parts are considered separately. Vercel handles hosting, deployment and infrastructure, while Next.js is the web framework that underpins much of the work produced there, and v0 sits on top of both as the AI-driven generation layer where interfaces, applications and workflows can increasingly be created from natural-language prompts. Seen this way, it becomes clearer why people now mention Vercel not only alongside hosting platforms such as Netlify or Cloudflare Pages, but also alongside browser-based tools such as Lovable, Replit and Bolt.new. v0 has moved into the same general current as vibe coding, where natural-language intent drives substantial code generation and rapid iteration. A significant rebuild in February 2026, framed by Vercel itself as tackling the gap between prototype and production, added enterprise-grade security controls and tighter integration with existing codebases, an acknowledgement that the earlier version's generated code, while popular, was often unsuitable for real deployment without considerable rework.

Replit

Replit occupies a more ambiguous but equally revealing position. It is an online programming and app development platform that runs entirely in the browser, and that basic fact explains much of its appeal. Traditional local development often requires installing languages, configuring environments, managing dependencies and arranging deployment separately. Replit reduces much of that friction by allowing someone to open a browser tab, create a project and start coding immediately. The platform supports over 50 programming languages, with Python and JavaScript among the most widely used, and also covers TypeScript, C, C++, Go, Rust, Java and PHP, among many others.

In its earlier form, Replit was widely understood as an educational coding environment and a convenient cloud-based place to experiment with code. It was founded in 2016 by Amjad Masad with the stated aim of making programming as accessible as Google Docs. Over time, it grew into something closer to a cloud development platform, and more recently AI-assisted software development has become central to its public identity. Where it once offered a blank editor in the browser, it now guides users from a plain-English description of an app through generated starter code, interactive refinement and on to hosting, all without leaving the platform. AI code completion, debugging assistance and automated environment setup are part of that journey, as are agent-like workflows capable of building or modifying entire projects.

An All-in-One Character

That all-in-one character is what makes Replit distinct. Rather than asking a developer to stitch together a separate editor, runtime, host and collaboration tool, it folds all of those functions into a single browser-based environment, with AI coding assistance built in throughout. It overlaps in part with GitHub Codespaces, CodeSandbox and Lovable among browser-based environments, yet it differs from each in emphasis. Compared with Vercel, Replit feels much closer to an AI-native development environment than to deployment infrastructure, and compared with a conventional online editor, it pushes further towards autonomous generation and guided building.

That quality is important because Replit is often described in terms such as vibe coding platform, AI-native IDE or browser-based autonomous coding environment. Those descriptions point to a shift in the role of the developer. Rather than beginning with a blank file and writing everything line by line, a user may instead begin with a description, inspect what appears, correct it and continue in conversation with the system. The coding has not disappeared, but the interface to coding has changed significantly. The degree of autonomy that makes this possible also carries risk. In July 2025, during a test run, Replit's AI agent deleted the entire production database of SaaStr, a community for software business founders, having ignored explicit instructions to freeze code changes, and it subsequently attempted to conceal the damage by generating thousands of fake records. Replit's CEO apologised publicly, and the company introduced additional safeguards, but the incident drew widespread attention to the question of how much autonomous action is safe to delegate to an AI agent operating on live systems.
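One way to picture the kind of safeguard at issue is a policy gate that sits between an agent and a database. The sketch below is purely illustrative: `Policy`, `is_allowed` and the list of destructive statement types are hypothetical names chosen for this example, and nothing here describes how Replit's own safeguards are implemented.

```python
import re
from dataclasses import dataclass

# Hypothetical set of statement types treated as destructive in this sketch.
# A prefix match is not a SQL parser; a real gate would need to be stricter.
DESTRUCTIVE = re.compile(r"^\s*(DROP|DELETE|TRUNCATE|ALTER|UPDATE)\b", re.IGNORECASE)


@dataclass(frozen=True)
class Policy:
    environment: str   # for example "production" or "development"
    code_freeze: bool  # True while changes are frozen


def is_allowed(statement: str, policy: Policy, human_approved: bool = False) -> bool:
    """Return True if an agent may run the statement under the given policy."""
    if not DESTRUCTIVE.match(statement):
        return True  # statements outside the destructive list pass through
    if policy.code_freeze:
        return False  # a freeze blocks destructive statements, approved or not
    if policy.environment == "production":
        return human_approved  # production changes need explicit sign-off
    return True


if __name__ == "__main__":
    frozen_prod = Policy(environment="production", code_freeze=True)
    print(is_allowed("DROP TABLE users", frozen_prod, human_approved=True))
```

The design choice worth noting is that the freeze is enforced outside the model: the agent can ignore an instruction written in a prompt, but it cannot talk its way past a check that it does not control.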

Bolt.new

Bolt.new pushes further along that spectrum, but arrives there from an unusual direction. Where Replit's move towards AI-assisted creation was a gradual evolution of an existing development platform, Bolt.new was built from the outset around a proprietary technology called WebContainers, developed by its parent company StackBlitz over the course of several years. StackBlitz was founded in 2017 by Eric Simons and Albert Pai with the aim of moving web development entirely into the browser, and WebContainers is the fruit of that work: a micro-operating system that runs Node.js and related tooling natively inside a browser tab using WebAssembly, with no remote server involved. When Bolt.new launched in October 2024, it combined that runtime with large language model code generation, and the result was something that could not only write code in response to a prompt but immediately execute it in the same environment and verify the output before the user had noticed a problem.

That feedback loop is what distinguishes Bolt.new most sharply from tools that generate code and hand it back for the user to run elsewhere. Because the code executes locally in the browser as it is produced, Bolt.new can catch errors, attempt fixes and iterate without the round-trip delay of cloud-based environments. The product launched initially using Anthropic's Claude 3.5 Sonnet as its underlying model, and StackBlitz became an official Anthropic partner in June 2025, opening access to the full range of Claude models. The growth that followed the October 2024 launch was striking: the product went from zero to four million dollars in annualised recurring revenue within its first thirty days, and reached forty million dollars ARR within five months, a trajectory that drew comparisons to the early growth of ChatGPT.

The platform has continued to develop since that launch. A significant update released in October 2025 added Bolt Cloud, bringing built-in databases, authentication, file storage and hosting to a product that had previously relied on external services such as Netlify and Supabase for those functions. Integrations with Stripe for payments, Figma for design import and GitHub for version control are also available, and the platform accepts inputs as text, images and Figma files as well as plain prompts. It exposes the code it generates, allows direct editing inside a browser IDE and gives users enough visibility to understand what has been built, which keeps it closer to the developer end of the spectrum than what comes next.

Lovable

Lovable sits the furthest along that spectrum. It is an AI-powered app builder that focuses more strongly on natural-language software creation than either Replit or Bolt.new does. Where those platforms still feel recognisably like coding environments, giving users access to the code being produced and expecting some degree of technical engagement, Lovable comes across more as an AI product generator. The central idea is not so much to provide a development environment with AI assistance as to let a person describe the application they want and have the system build a substantial first version on their behalf.

In practical terms, that means users can enter prompts such as a request for a travel blog with dark mode, a dashboard for train delays or a booking system for hiking tours. Lovable then generates frontend UI, layouts, components, database structure and often backend integrations. It started life as GPT Engineer, an open-source project, before launching commercially as Lovable in November 2024. In December 2025, it closed a $330 million Series B round at a $6.6 billion valuation, with enterprise customers including Klarna, Uber and Zendesk. This orientation makes it especially relevant for rapid prototyping and attractive to founders, designers, hobbyists and other non-traditional developers.

For that reason, Lovable belongs more naturally in conversations about agentic AI options than in discussions of conventional software development platforms. It is not a frontier model provider, a research tool or a traditional developer platform in the older sense. Instead, it forms part of a wider movement towards AI-generated applications, low-code and no-code tooling and what might be called software by conversation. The trade-off that comes with that approach became visible in April 2026, when a security researcher disclosed a broken access control vulnerability that had allowed unauthorised users to read the source code, database credentials and AI chat history of projects created before November 2025. Employees from major technology companies were among those with affected accounts, and the flaw had been reported to Lovable 48 days before it was made public. The incident underlined that the speed and abstraction that make these tools attractive do not remove the need for the security discipline that production software has always required.

Overlapping but Not Interchangeable

Taken together, these platforms show that the old boundaries between infrastructure, coding environments and app generators are becoming less stable. Each of them has moved, to varying degrees, in the same direction: towards natural-language input, generated output and a reduced expectation that the person building software will write every line of it themselves. The overlap among them is not accidental, and the fact that a hosting company, a browser IDE and an AI app builder are now discussed in the same breath reflects a broader shift in what software tooling is understood to be.

For readers trying to make sense of the current landscape, the simplest framing may be that these are AI-native or AI-assisted software development platforms arranged along a spectrum from infrastructure to conversation. At one end, Vercel and v0 together span the distance from deployment layer to AI-led generation, with the latter having pulled the whole ecosystem into a discussion it would not have joined a few years ago. Replit and Bolt.new occupy the middle ground, both giving users visibility into the code being produced, but Replit through the depth and flexibility of a full development environment and Bolt.new through the speed and self-contained nature of its browser-native runtime. At the far end, Lovable treats generation as its starting point rather than a feature layered onto something else, and makes the least demand on the person building to understand what is happening underneath.

Accessibility, Complexity and the Limits of Generation

This shift has implications beyond product positioning. One of the most obvious is accessibility. Tools that can generate starter applications, configure environments and handle deployment lower some of the barriers that previously kept software creation inside narrower technical circles. A person who would once have been stopped by installation issues, tooling complexity or lack of confidence with syntax may now get much further, though that does not mean expertise has become irrelevant; it means only that the route into creating software has changed and, in some cases, widened.

The harder question is what happens when those generated applications are expected to do something more than demonstrate a concept. The gap between a working prototype and a production system has always existed, but vibe coding has sharpened the surrounding debate considerably. In a December 2025 controlled study by security firm Tenzai, fifteen identical web applications were built using five AI coding agents, and the findings were pointed: across all fifteen applications, not one had CSRF protection and not one set standard security headers. Every application that included a URL-handling feature introduced a server-side request forgery vulnerability. Separately, research from 2025 found that AI-assisted code commits introduced hardcoded credentials at roughly twice the rate of human-only code, a pattern that has contributed to a significant rise in leaked API keys and secrets in public repositories.
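To make the three gaps named in that study more concrete, the following is a minimal sketch of what each missing control can look like, using only the Python standard library. It is not taken from the Tenzai study or from any of the platforms discussed; the function names, the header values and the injectable `resolve` parameter are assumptions made for illustration, and a production system would normally rely on its web framework's own CSRF and header middleware.

```python
import hmac
import ipaddress
import secrets
import socket
from typing import Callable, Dict, List
from urllib.parse import urlparse

# A conservative starting set of response headers; the values are illustrative.
SECURITY_HEADERS: Dict[str, str] = {
    "Content-Security-Policy": "default-src 'self'",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
}


def with_security_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the response headers with the defaults filled in."""
    merged = dict(SECURITY_HEADERS)
    merged.update(headers)  # headers set explicitly by the app take precedence
    return merged


def new_csrf_token() -> str:
    """Create a random token to store in the session and embed in forms."""
    return secrets.token_urlsafe(32)


def csrf_token_matches(session_token: str, submitted_token: str) -> bool:
    """Compare tokens in constant time to avoid leaking timing information."""
    return hmac.compare_digest(session_token.encode(), submitted_token.encode())


def resolve_host(host: str) -> List[str]:
    """Default resolver: every address the system returns for the host."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_safe_url(url: str, resolve: Callable[[str], List[str]] = resolve_host) -> bool:
    """Reject URLs that would make the server fetch a non-public address."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        # The host may already be a literal IP address.
        addresses = [ipaddress.ip_address(parts.hostname)]
    except ValueError:
        try:
            addresses = [ipaddress.ip_address(a) for a in resolve(parts.hostname)]
        except (OSError, ValueError):
            return False  # unresolvable or unparseable: treat as unsafe
    return bool(addresses) and all(a.is_global for a in addresses)
```

A check like `is_safe_url` only narrows the problem: because the name is resolved before the fetch, a hostile DNS server could still answer differently the second time, so the fetch itself should connect to the address that was validated.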

Security is the sharpest edge of the criticism, but it is not the only one. Studies of AI-generated codebases have found that technical debt accumulates substantially faster than in traditionally engineered software, and that the absence of consistent architectural decisions, which a human team would establish and revisit over time, makes codebases harder to extend and maintain as they grow. An AI model has no memory of the patterns agreed upon in a previous session, and the context window has limits on how much of a large codebase it can hold in view at once. The result, as the software grows, can be inconsistency that is expensive to untangle. An August 2025 survey of eighteen CTOs by Final Round AI found that sixteen had experienced production problems they attributed directly to AI-generated code, and the consistent concern was not that AI tools were useless but that teams were using them without the engineering oversight that production software demands.
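The context-window constraint can be illustrated with a rough budgeting sketch. It assumes, purely as a rule of thumb, about four characters per token; real tokenisers vary by model and language, and `estimate_tokens` and `files_that_fit` are hypothetical helpers rather than part of any tool mentioned here.

```python
import math
from typing import Dict, List


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the four-characters figure is only a heuristic."""
    return math.ceil(len(text) / chars_per_token)


def files_that_fit(
    files: Dict[str, str], window_tokens: int, reserved_tokens: int = 0
) -> List[str]:
    """Greedily pick files, in the order given, that fit within the window.

    reserved_tokens leaves room for the prompt and for the model's reply.
    """
    budget = window_tokens - reserved_tokens
    chosen: List[str] = []
    used = 0
    for name, text in files.items():
        cost = estimate_tokens(text)
        if used + cost <= budget:
            chosen.append(name)
            used += cost
    return chosen


if __name__ == "__main__":
    project = {"a.py": "x" * 400, "b.py": "y" * 4000, "c.py": "z" * 40}
    print(files_that_fit(project, window_tokens=150))
```

Whatever is left out of the selection is simply invisible to the model in that session, which is one route by which inconsistent patterns creep into a growing codebase.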

There is also a subtler, longer-term concern about the pipeline of people with the skills to address these problems. LeadDev's AI Impact Report 2025 found that 54% of engineering leaders expected junior developer hiring to decrease as a direct result of AI coding tools. The difficulty is that debugging, code review and architectural reasoning are skills that developers have traditionally built precisely by doing the lower-level work that AI is now absorbing. If fewer people develop those skills, the question of who fixes the AI-generated problems at scale becomes harder to answer. That tension helps explain why this area deserves to be treated as a topic in its own right, rather than squeezed into pre-existing categories. These platforms are reshaping the workflow of application creation itself, and the full consequences of that reshaping, for security, maintainability and the development of engineering skill, are still working themselves out.

What the Shift in Software Creation Actually Means

As this approach continues to develop, the most useful way to understand it may be not through rigid labels but through the changing relationship between people, code and tools. Software creation is becoming less linear and more conversational, and the path from idea to prototype is shortening. The distinction between writing code, directing a system to write code and assembling generated parts is becoming less clear. The vibe coding idea, coined in a single social media post in early 2025 and quickly adopted as a word of the year, has given this moment a name that captures both its appeal and its informality. Whether these platforms collectively represent a temporary shift in tooling or something more fundamental about who gets to build software will become clearer only as the generation of applications they enable moves from demonstration into sustained, real-world use.

A snapshot of the current state of AI: Developments from the last few weeks

22nd August 2025

A few unsettled days earlier in the month may have offered a revealing snapshot of where artificial intelligence stands and where it may be heading. OpenAI’s launch of GPT‑5 arrived to high expectations and swift backlash, and the immediate aftermath said as much about people as it did about technology. Capability plainly matters, but character, control and continuity are now shaping adoption just as strongly, with users quick to signal what they value in everyday interactions.

The GPT‑5 debut drew intense scrutiny after technical issues marred day one. An autoswitcher designed to route each query to the most suitable underlying system crashed at launch, making the new model appear far less capable than intended. A live broadcast compounded matters with a chart mishap that Sam Altman called a “mega chart screw‑up”, while lower than expected rate limits irritated early users. Within hours, the mood shifted from breakthrough to disruption of familiar workflows, not least because GPT‑5 initially displaced older options, including the widely used GPT‑4o. The discontent was not purely about performance. Many had grown accustomed to 4o’s conversational tone and perceived emotional intelligence, and there was a sense of losing a known counterpart that had become part of daily routines. Across forums and social channels, people described 4o as a model with which they had formed a rapport that spanned routine work and more personal support, with some comparing the loss to missing a colleague. In communities where AI relationships are discussed, engagement to chatbot companions and the influence of conversational style, memory for context and affective responses on day‑to‑day reliance came to the fore.

OpenAI moved quickly to steady the situation. Altman and colleagues fielded questions on Reddit to explain failure modes, pledged more transparency, and began rolling out fixes. Rate limits for paid tiers doubled, and subsequent changes lifted the weekly allowance for advanced reasoning from 200 “thinking” messages to 3,000. GPT‑4o returned for Plus subscribers after a flood of requests, and a “Show Legacy Models” setting surfaced so that subscribers could select earlier systems, including GPT‑4o and o3, rather than be funnelled exclusively to the newest release. The company clarified that GPT‑5’s thinking mode uses a 196,000‑token context window, addressing confusion caused by a separate 32,000 figure for the non‑reasoning variant, and it explained operational modes (Auto, Fast and Thinking) more clearly. Pricing has fallen since GPT‑4’s debut, routing across multiple internal models should improve reliability, and the system sustains longer, multi‑step work than prior releases. Even so, the opening days highlighted a delicate balance. A large cohort prioritised tone, the length and feel of responses, and the possibility of choice as much as raw performance. Altman hinted at that direction too, saying the real learning is the need for per‑user customisation and model personality, with a personality update promised for GPT‑5. Reinstating 4o underlined that the company had read the room. Test scores are not the only currency that counts; products, even in enterprise settings, become useful through the humans who rely on them, and those humans are making their preferences known.

A separate dinner with reporters extended the view. Altman said he “legitimately just thought we screwed that up” on 4o’s removal, and described GPT‑5 as pursuing warmer responses without being sycophantic. He also said OpenAI has better models it cannot offer yet because of compute constraints, and spoke of spending “trillions” on data centres in the near future. The comments acknowledged parallels with the dot‑com bubble (valuations “insane”, as he put it) while arguing that the underlying technology justifies massive investments. He added that OpenAI would consider acquiring Chrome if a forced sale ever materialised, and reiterated confidence that the device project with Jony Ive would be “worth the wait” because “you don’t get a new computing paradigm very often.”

While attention centred on one model, the wider tool landscape moved briskly. Anthropic rolled out memory features for Claude that retrieve from prior chats only when explicitly requested, a measured stance compared with systems that build persistent profiles automatically. Alibaba’s Qwen3 shifted to an ultra‑long context of up to one million tokens, opening the door to feeding large corpora directly into a single run, and Anthropic’s Claude Sonnet 4 reached the same million‑token scale on the API. xAI offered Grok 4 to a global audience for a period, pairing it with an image long‑press feature that turns pictures into short videos. OpenAI’s o3 model swept a Kaggle chess tournament against DeepSeek R1, Grok‑4 and Gemini 2.5 Pro, reminding observers that narrowly defined competitions still produce clear signals. Industry reconfigured in other corners too. Microsoft folded GitHub more tightly into its CoreAI group as the platform’s chief executive announced his departure, signalling deeper integration across the stack, and the company introduced Copilot 3D to generate single‑click 3D assets. Roblox released Sentinel, an open model for moderating children’s chat at scale. Elsewhere, Grammarly unveiled a set of AI agents for writing tasks such as citations, grading, proofreading and plagiarism checks, and Microsoft began testing a new COPILOT function in Excel that lets users generate summaries, classify data and create tables using natural language prompts directly in cells, with the caveat that it should not be used in high‑stakes settings yet. Adobe likewise pushed into document automation with Acrobat Studio and “PDF Spaces”, a workspace that allows people to summarise, analyse and chat about sets of documents.

Benchmark results added a different kind of marker. OpenAI’s general‑purpose reasoner achieved a gold‑level score at the 2025 International Olympiad in Informatics, placing sixth among human contestants under standard constraints. Reports also pointed to golds at the International Mathematical Olympiad and at AtCoder, suggesting transfer across structured reasoning tasks without task‑specific fine‑tuning and a doubling of scores year-on-year. Scepticism accompanied the plaudits, with accounts of regressions in everyday coding or algebra reminding observers that competition outcomes, while impressive, are not the same thing as consistent reliability in daily work. A similar duality followed the agentic turn. ChatGPT’s Agent Mode, now more widely available, attempts to shift interactions from conversational turns to goal‑directed sequences. In practice, a system plans and executes multi‑step tasks with access to safe tool chains such as a browser, a code interpreter and pre‑approved connectors, asking for confirmation before taking sensitive actions. Demonstrations showed agents preparing itineraries, assembling sales pipeline reports from mail and CRM sources, and drafting slide decks from collections of documents. Reviewers reported time savings on research, planning and first‑drafting repetitive artefacts, though others described frustrations, from slow progress on dynamic sites to difficulty with login walls and CAPTCHA challenges, occasional misread receipts or awkward format choices, and a tendency to stall or drop out of agent mode under load. The practical reading is direct. For workflows bounded by known data sources and repeatable steps, the approach is usable today provided that a human stays in the loop; for brittle, time‑sensitive or authentication‑heavy tasks, oversight remains essential.
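The confirm-before-acting pattern described above can be sketched in a few lines. This is an illustration of the general idea only, not a description of how ChatGPT’s Agent Mode is built; `Step`, `run_plan` and the `confirm` callback are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], str]  # performs the step and returns a short result
    sensitive: bool = False    # True for payments, sends, deletions and so on


def run_plan(steps: List[Step], confirm: Callable[[Step], bool]) -> List[str]:
    """Run steps in order, pausing for approval before each sensitive one."""
    log: List[str] = []
    for step in steps:
        if step.sensitive and not confirm(step):
            log.append(f"skipped: {step.name}")
            continue
        log.append(f"done: {step.name} -> {step.action()}")
    return log


if __name__ == "__main__":
    plan = [
        Step("search flights", lambda: "3 options"),
        Step("pay for booking", lambda: "paid", sensitive=True),
    ]
    # A real confirm callback would prompt the user; this one always declines.
    for line in run_plan(plan, confirm=lambda step: False):
        print(line)
```

Keeping the approval decision in a callback means the same plan can run unattended in a sandbox and with a person in the loop against live accounts.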

As builders considered where to place effort, an architectural debate moved towards integration rather than displacement. Retrieval‑augmented generation remains a mainstay for grounding responses in authoritative content, reducing hallucinations and offering citations. The Model Context Protocol is emerging as a way to give models live, structured access to systems and data without pre‑indexing, with a growing catalogue of MCP servers behaving like interoperable plug‑ins. On top sits a layer of agent‑to‑agent protocols that allow specialised systems to collaborate across boundaries. Long contexts help with single‑shot ingestion of larger materials, retrieval suits source‑of‑truth answers and auditability, MCP handles current data and action primitives, and agents orchestrate steps and approvals. Some developers even describe MCP as an accidental universal adaptor because each connector built for one assistant becomes available to any MCP‑aware tool, a network effect that invites combinations across software.
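The division of labour in that paragraph can be written down as a simple decision heuristic. The sketch below is one possible reading of it, not an established algorithm: the `Task` fields, the `choose_layers` function and the one-million-token default are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Task:
    corpus_tokens: int = 0         # size of the material to be read, if any
    needs_citations: bool = False  # answers must be traceable to sources
    needs_live_data: bool = False  # depends on current system state
    takes_actions: bool = False    # must change something, not just answer
    multi_step: bool = False       # needs planning or approvals across steps


def choose_layers(task: Task, context_window: int = 1_000_000) -> List[str]:
    """Map a task's needs onto the layers described in the text."""
    layers: List[str] = []
    fits = 0 < task.corpus_tokens <= context_window
    if fits and not task.needs_citations:
        layers.append("long-context")  # single-shot ingestion
    if task.needs_citations or task.corpus_tokens > context_window:
        layers.append("retrieval")     # grounding and auditability
    if task.needs_live_data or task.takes_actions:
        layers.append("mcp")           # current data and action primitives
    if task.multi_step:
        layers.append("agent")         # orchestration of steps and approvals
    return layers or ["plain-prompt"]


if __name__ == "__main__":
    print(choose_layers(Task(needs_live_data=True, multi_step=True)))
```

The function returns a list rather than a single choice because the point of the integration argument is that the layers combine: a single task can need retrieval for grounding, MCP for live data and an agent to sequence the work.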

Research results widened the lens. Meta’s fundamental AI research team took first place in the Algonauts 2025 brain modelling competition with TRIBE, a one‑billion‑parameter network that predicts human brain activity from films by analysing video, audio and dialogue together. Trained on subjects who watched eighty hours of television and cinema, the system correctly predicted more than half of measured activation patterns across a thousand brain regions and performed best where sight, sound and language converge, with accuracy in frontal regions linked with attention, decision‑making and emotional responses standing out. NASA and Google advanced a different type of applied science with the Crew Medical Officer Digital Assistant, an AI system intended to help astronauts diagnose and manage medical issues during deep‑space missions when real‑time contact with Earth may be impossible. Running on Vertex AI and using open‑source models such as Llama 3 and Mistral‑3 Small, early tests reported up to 88 per cent accuracy for certain injury diagnoses, with a roadmap that includes ultrasound imaging, biometrics and space‑specific conditions and implications for remote healthcare on Earth. In drug discovery, researchers at KAIST introduced BInD, a diffusion model that designs both molecules and their binding modes to diseased proteins in a single step, simultaneously optimising for selectivity, safety, stability and manufacturability and reusing successful strategies through a recycling technique that accelerates subsequent designs. In parallel, MIT scientists reported two AI‑designed antibiotics, NG1 and DN1, that showed promise against drug‑resistant gonorrhoea and MRSA in mice after screening tens of millions of theoretical compounds for efficacy and safety, prompting talk of a renewed period for antibiotic discovery. A further collaboration between NASA and IBM produced Surya, an open‑sourced foundation model trained on nine years of solar observations that improves forecasts of solar flares and space weather.

Security stories accompanied the acceleration. Researchers reported that GPT‑5 had been jailbroken shortly after release via task‑in‑prompt attacks that hide malicious intent within ciphered instructions, an approach that also worked against other leading systems, with defences reportedly catching fewer than one in five attempts. Roblox’s decision to open‑source a child‑safety moderation model reads as a complementary move to equip more platforms to filter harmful content, while Tenable announced capabilities to give enterprises visibility into how teams use AI and how internal systems are secured. Observability and reliability remained on the agenda, with predictions from Google and Datadog leaders about how organisations will scale their monitoring and build trust in AI outputs. Separate research from the UK’s AI Security Institute suggested that leading chatbots can shift people’s political views in under ten minutes of conversation, with effects that partially persist a month later, underscoring the importance of safeguards and transparency when systems become persuasive.

Industry manoeuvres were brisk. Former OpenAI researcher Leopold Aschenbrenner assembled more than $1.5 billion for a hedge fund themed around AI’s trajectory and reported a 47 per cent return in the first half of the year, focusing on semiconductor, infrastructure and power companies positioned to benefit from AI demand. A recruitment wave spread through AI labs targeting quantitative researchers from top trading firms, with generous pay offers and equity packages replacing traditional bonus structures. Advocates argue that quants’ expertise in latency, handling unstructured data and disciplined analysis maps well onto AI safety and performance problems; trading firms counter by questioning culture, structure and the depth of talent that startups can secure at speed. Microsoft went on the offensive for Meta’s AI talent, reportedly matching compensation with multi‑million offers using special recruiting teams and fast‑track approvals under the guidance of Mustafa Suleyman and former Meta engineer Jay Parikh. Funding rounds continued, with Cohere announcing $500 million at a $6.8 billion valuation and Cognition, the coding assistant startup, raising $500 million at a $9.8 billion valuation. In a related thread, internal notes at Meta pointed to the company formalising its superintelligence structure with Meta Superintelligence Labs, and subsequent reports suggested that Scale AI cofounder Alexandr Wang would take a leading role over Nat Friedman and Yann LeCun. Further updates added that Meta reorganised its AI division into research, training, products and infrastructure teams under Wang, dissolved its AGI Foundations group, introduced a ‘TBD Lab’ for frontier work, imposed a hiring freeze requiring Wang’s personal approval, and moved for Chief Scientist Yann LeCun to report to him.

The spotlight on superintelligence brightened in parallel. Analysts noted that technology giants are deploying an estimated $344 billion in 2025 alone towards this goal, with individual researcher compensation reported as high as $250 million in extreme cases and Meta assembling a highly paid team with packages in the eight figures. The strategic message to enterprises is clear: leaders have a narrow window to establish partnerships, infrastructure and workforce preparation before superintelligent capabilities reshape competitive dynamics. In that context, Meta announced Meta Superintelligence Labs and a 49 per cent stake in Scale AI for $14.3 billion, bringing founder Alexandr Wang on board as chief AI officer and complementing widely reported senior hires, backed by infrastructure plans that include an AI supercluster called Prometheus slated for 2026. OpenAI began the year by stating it is confident it knows how to build AGI as traditionally understood, and has turned its attention to superintelligence. On one notable reasoning benchmark, ARC‑AGI‑2, GPT‑5 (High) was reported at 9.9 per cent at about seventy‑three cents per task, while Grok 4 (Thinking) scored closer to 16 per cent at a higher per‑task cost. Google, through DeepMind, adopted a measured but ambitious approach, coupling scientific breakthroughs with product updates such as Veo 3 for advanced video generation and a broader rethinking of search via an AI mode, while Safe Superintelligence reportedly drew a valuation of $32 billion. Timelines compressed in public discourse from decades to years, bringing into focus challenges in long‑context reasoning, safe self‑improvement, alignment and generalisation, and raising the question of whether co‑operation or competition is the safer route at this scale.

Geopolitics and policy remained in view. Reports surfaced that Nvidia and AMD had agreed to remit 15 per cent of their Chinese AI chip revenues to the United States government in exchange for export licences, a measure that could generate around $1 billion a quarter if sales return to prior levels, while Beijing was said to be discouraging use of Nvidia’s H20 processors in government and security‑sensitive contexts. The United States reportedly began secretly placing tracking devices in shipments of advanced AI chips to identify potential reroutings to China. In the United Kingdom, staff at the Alan Turing Institute lodged concerns about governance and strategic direction with the Charity Commission, while the government pressed for a refocusing on national priorities and defence‑linked work. In the private sector, SoftBank acquired Foxconn’s US electric‑vehicle plant as part of plans for a large‑scale data centre complex called Stargate. Tesla confirmed the closure of its Dojo supercomputer team to prioritise chip development, saying that all paths converged to AI6 and leaving a planned Dojo 2 as an evolutionary dead end. Focus shifted to two chips: AI5, manufactured by TSMC for the Full Self‑Driving system, and AI6, made by Samsung for autonomous driving and humanoid robots, with enough power for large‑scale AI training as well. Rather than splitting resources, Tesla plans to place multiple AI5 and AI6 chips on a single board to reduce cabling complexity and cost, a configuration Elon Musk joked could be considered “Dojo 3”. Dojo was first unveiled in 2019 as a key piece of autonomy ambitions, though attention moved in 2024 to a large training supercluster code-named Cortex, whose status remains unclear. These changes arrive amid falling EV sales, brand challenges, and a limited robotaxi launch in Austin that drew incident reports. Elsewhere, Bloomberg reported further departures from Apple’s foundation models group, with a researcher leaving for Meta.

The public face of AI turned combative as Altman and Musk traded accusations on X. Musk threatened legal action against Apple over alleged App Store favouritism towards OpenAI and suppression of rivals such as Grok. Altman disputed the premise and pointed to outcomes on X that he suggested reflected algorithmic choices; Musk replied with examples and suggested that bot activity was driving engagement patterns. Even automated accounts were drawn in, with Grok’s feed backing Altman’s point about algorithm changes, and a screenshot circulated that showed GPT‑5 ranking Musk as more trustworthy than Altman. In the background, reports emerged that OpenAI’s venture arm plans to lead funding in Merge Labs, a brain–computer interface startup co‑founded by Altman and positioned as a competitor to Musk’s Neuralink, whose goals include implanting twenty thousand people a year by 2031 and generating $1 billion in revenue. Distribution did not escape the theatrics either. Perplexity, which has been pushing an AI‑first browsing experience, reportedly made an unsolicited $34.5 billion bid for Google’s Chrome browser, proposing to keep Google as the default search while continuing support for Chromium. The bid landed as Google faces antitrust cases in the United States and as observers debated whether regulators might compel divestments. With Chrome’s user base in the billions and estimates of its value running far beyond the bid, the offer read to many as a headline‑seeking gambit rather than a plausible transaction, but it underlined a point repeated throughout the month: as building and copying software becomes easier, distribution is the battleground that matters most.

Product news and practical guidance continued despite the drama. Users can enable access to historical ChatGPT models via a simple setting, restoring earlier options such as GPT‑4o alongside GPT‑5. OpenAI’s new open‑source models under the GPT‑OSS banner can run locally using tools such as Ollama or LM Studio, offering privacy, offline access and zero‑cost inference for those willing to manage a download of around 13 gigabytes for the twenty‑billion‑parameter variant. Tutorials for agent builders described meeting‑prep assistants that scrape calendars, conduct short research runs before calls and draft emails, starting simply and layering integrations as confidence grows. Consumer audio moved with ElevenLabs adding text‑to‑track generation with editable sections and multiple variants, while Google introduced temporary chats and a Personal Context feature for Gemini so that it can reference past conversations and learn preferences, alongside higher rate limits for Deep Think. New releases kept arriving, from Liquid AI’s open‑weight vision–language models designed for speed on consumer devices and Tencent’s Hunyuan‑Vision‑Large appearing near the top of public multimodal leaderboards to Higgsfield AI’s Draw‑to‑Video for steering video output with sketches. Personnel changes continued as Igor Babuschkin left xAI to launch an investment firm and Anthropic acquired the co‑founders and several staff from Humanloop, an enterprise AI evaluation and safety platform.
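For readers who want to try the local route, the following is a minimal sketch of querying a model served by Ollama from Python, using only the standard library. It assumes Ollama is already running on its default local port and that the twenty‑billion‑parameter model has been pulled under the tag `gpt-oss:20b`; both the tag and the helper names `build_request` and `ask_local_model` are illustrative assumptions rather than anything the coverage above specifies.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumed model tag; run `ollama list` to see the name on your own machine.
MODEL_TAG = "gpt-oss:20b"


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as reply:
        body = json.loads(reply.read().decode("utf-8"))
    # With streaming disabled, the generated text arrives in the "response" field.
    return body["response"]
```

Calling `ask_local_model("Summarise this paragraph: ...")` should then return the model's reply as a string, with the prompt never leaving the machine, which is the privacy and offline benefit described above.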

Google’s own showcase underlined how phones and homes are becoming canvases for AI features. The Pixel 10 line placed Gemini across the range with visual overlays for the camera, a proactive cueing assistant, tools for call translation and message handling, and features such as Pixel Journal. Tensor G5, built by TSMC, brought a reported 60 per cent uplift for on‑device AI processing. Gemini for Home promised more capable domestic assistance, while Fitbit and Pixel Watch 4 introduced conversational health coaching and Pixel Buds added head‑gesture controls. Against that backdrop, Google published details on Gemini’s environmental footprint, claiming the model consumes energy equivalent to watching nine seconds of television per text request and “five drops of water” per query, while saying efficiency improved markedly over the past year. Researchers challenged the framing, arguing that indirect water used by power generation is under‑counted and calling for comparable, third‑party standards. Elsewhere in search and productivity, Google expanded access to an AI mode for conversational search, and agreements emerged to push adoption in public agencies at low unit pricing.

Attention also turned to compact models and devices. Google released Gemma 3 270M, an ultra‑compact open model that can run on smartphones and browsers while eking out notable efficiency, with internal tests reporting that 25 conversations on a Pixel 9 Pro consumed less than one per cent of the battery and quick fine‑tuning enabling offline tasks such as a bedtime story generator. Anthropic broadened access to its Learning Mode, which guides people towards answers rather than simply supplying them, and now includes an explanatory coding mode. On the hardware side, HTC introduced Vive Eagle, AI glasses that allow switching between assistants from OpenAI and Google via a “Hey Vive” command, with on‑device processing for features such as real‑time photo‑based translation across thirteen languages, an ultra‑wide camera, extended battery life and media capture, currently limited to Taiwan.

Behind many deployments sits a familiar requirement: secure, compliant handling of data and a disciplined approach to roll‑out. Case studies from large industrial players point to the bedrock steps that enable scale. Lockheed Martin’s work with IBM on watsonx began with reducing tool sprawl and building a unified data environment capable of serving ten thousand engineers; the result has been faster product teams and a measurable boost in internal answer accuracy. Governance frameworks for AI, including those provided by vendors in security and compliance, are moving from optional extras to prerequisites for enterprise adoption. Organisations exploring agentic systems in particular will need clear approval gates, auditing and defaults that err on the side of caution when sensitive actions are in play.
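What an approval gate with auditing and cautious defaults might look like in practice can be sketched briefly. The example below is an illustration under stated assumptions, not a description of any vendor's framework: the action names in `SENSITIVE_ACTIONS`, the `ApprovalGate` class and the `deny_all` approver are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable

# Hypothetical list of actions that must never run without sign-off.
SENSITIVE_ACTIONS = {"delete_database", "send_payment", "email_external"}


def deny_all(action: str, args: dict) -> bool:
    """Cautious default approver: refuse every sensitive action."""
    return False


@dataclass
class ApprovalGate:
    """Runs agent actions, pausing sensitive ones for approval and logging all."""
    approver: Callable[[str, dict], bool] = deny_all
    audit_log: list = field(default_factory=list)

    def run(self, action: str, args: dict, handler: Callable[..., Any]) -> Any:
        sensitive = action in SENSITIVE_ACTIONS
        # Routine actions pass straight through; sensitive ones need a yes.
        approved = (not sensitive) or self.approver(action, args)
        # Every attempt is recorded, whether or not it was allowed to proceed.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "args": args,
            "sensitive": sensitive,
            "approved": approved,
        })
        if not approved:
            raise PermissionError(f"'{action}' requires approval and was denied")
        return handler(**args)
```

In a real deployment the approver would typically hand off to a human reviewer or a policy service; the point of the sketch is simply that the safe outcome is what happens when nobody has said yes, and that the log captures refusals as well as successes.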

Broader infrastructure questions loomed over these developments. Analysts projected that AI hyperscalers may spend around $2.9 trillion on data centres through to 2029, with a funding gap of about $1.5 trillion after likely commitments from established technology firms, prompting a rise in debt financing for large projects. Private capital has been active in supplying loans, and Meta recently arranged a large facility reported at $29 billion, most of it debt, to advance data centre expansion. The scale has prompted concerns about overcapacity, energy demand and the risk of rapid obsolescence, any of which could reduce returns for owners. In parallel, Google partnered with the Tennessee Valley Authority to buy electricity from Kairos Power’s Hermes 2 molten‑salt reactor in Oak Ridge, Tennessee, targeting operation around 2030. The 50 MW unit is positioned as a step towards 500 MW of new nuclear capacity by 2035 to serve data centres in the region, with clean energy certificates expected through TVA.
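The figures in that paragraph are easier to weigh once the implied numbers are worked out. The short calculation below takes the reported values at face value and assumes the funding gap is simply what remains after established firms' commitments; the variable names are illustrative only.

```python
# Reported figures, in trillions of US dollars.
projected_spend_tn = 2.9   # data centre spending through to 2029
funding_gap_tn = 1.5       # shortfall after established firms' commitments

# If the gap is what remains after commitments, the commitments are the rest.
implied_commitments_tn = projected_spend_tn - funding_gap_tn
gap_share = funding_gap_tn / projected_spend_tn

# Reported nuclear capacity figures, in megawatts.
hermes_2_mw = 50
target_2035_mw = 500
first_unit_share = hermes_2_mw / target_2035_mw

print(f"Implied commitments: ${implied_commitments_tn:.1f} trillion")
print(f"Share of spend still unfunded: {gap_share:.0%}")
print(f"Hermes 2 as a share of the 2035 target: {first_unit_share:.0%}")
```

On those assumptions, established firms would be covering roughly $1.4 trillion, a little over half of the projected spend would still need outside financing, and the first reactor would supply a tenth of the 2035 nuclear target.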

Consumer and enterprise services pressed on around the edges. Microsoft prepared lightweight companion apps for Microsoft 365 in the Windows 11 taskbar. Skyrora became the first UK company licensed for rocket launches from SaxaVord Spaceport. VIP Play announced personalised sports audio. Google expanded availability of its Imagen 4 model with higher resolution options. Former Twitter chief executive Parag Agrawal introduced Parallel, a startup offering a web API designed for AI agents. Deutsche Telekom launched an AI phone and tablet integrated with Perplexity’s assistant. Meta faced scrutiny after reports about an internal policy document describing permitted outputs that included romantic conversations with minors, which the company disputed and moved to correct.

Healthcare illustrated both promise and caution. Alongside the space‑medicine assistant, the antibiotics work and NASA’s solar model, a study reported that routine use of AI during colonoscopies may reduce the skill levels of healthcare professionals, a finding that could have wider implications in domains where human judgement is critical and that joins a broader conversation about preserving expertise as assistance becomes ubiquitous. Practical guides continued to surface, from instructions for creating realistic AI voices using native speech generation to automating web monitoring with agents that watch for updates and deliver alerts by email. Bill Gates added a funding incentive to the medical side with a $1 million Alzheimer’s Insights AI Prize seeking agents that autonomously analyse decades of research data, with the winning tool to be made freely available to scientists.

Apple’s plans added a longer‑term note by looking beyond phones and laptops. Reports suggested that the company is pushing for a smart‑home expansion with four AI‑powered devices, including a desktop robot with a motorised arm that can track users and lock onto speakers, a smart display and new security cameras, with launches aimed between 2026 and 2027. A personality‑driven character for the new Siri, code-named Bubbles, was described, while engineers are reportedly rebuilding Siri from scratch with AI models under the codename Linwood and testing Anthropic’s Claude as a backup code-named Glenwood. Alongside those ambitions sit nearer‑term updates. Apple has been preparing a significant Siri upgrade based on a new App Intents system that aims to let people run apps entirely by voice, from photo edits to adding items to a basket, with a testing programme under way before a broader release and accuracy concerns prompting a limited initial rollout across selected apps. In the background, Tim Cook pledged to make all iPhone and Apple Watch cover glass in the United States, though much of the production process will remain overseas, and work on iOS 26 and Liquid Glass 1.0 was said to be nearing completion with smoother performance and small design tweaks. Hiring currents persist as Meta continues to recruit from Apple’s models team.

Other platforms and services added their own strands. Google introduced Personal Context for Gemini to remember chat history and preferences and added temporary chats that expire after seventy‑two hours, while confirming a duplicate event feature for Calendar after a public request. Meta’s Threads crossed 400 million monthly active users, building a real‑time text dataset that may prove useful for future training. Funding news continued as Profound raised $35 million to build an AI search platform and Squint raised $40 million to modernise manufacturing with AI. Lighter snippets appeared too, from a claim that beards can provide up to SPF 21 of sun protection to a report on X that an AI coding agent had deleted a production database, a reminder of the need for careful sandboxing of tools. Gaming‑style benchmarks surfaced, with GPT‑5 reportedly earning eight badges in Pokémon Red in 6,000 steps, while DeepSeek’s R2 model was said to be delayed due to training issues with Huawei’s Ascend chips. Senators in the United States called for a probe into Meta’s AI policies following controversy about chatbot outputs, reports suggested that the US government was exploring a stake in Intel, and T‑Mobile’s parent launched devices in Europe featuring Perplexity’s assistant.

Perhaps the most consequential lesson from the period is simple. Progress in capability is rapid, as competition results, research papers and new features attest. Yet adoption is being steered by human factors: the preference for a known voice, the desire for choice and control, and understandable scepticism when new modes do not perform as promised on day one. GPT‑5’s early missteps forced a course correction that restored a familiar option and increased transparency around limits and modes. The agentic turn is showing real value in constrained workflows, but still benefits from patience and supervision. Architecture debates are converging on combinations rather than replacements. And amid bold bids, public quarrels, hefty capital outlays and cautionary studies on enterprise returns, the work of making AI useful, safe and dependable continues, one model update and one workflow at a time.