Google Search Goes Agentic: The Death of the Web Referral

ELPA Analysis Editorial Deep Dive

Key Takeaways
  • The End of the Referral Economy: Google's evolution from a directory of external links to an autonomous agentic engine threatens to reduce outbound publisher click-through rates (CTR) to as low as 5%, starving creators of ad impressions and affiliate revenue.
  • Gemini 3.5 Flash at the Core: The integration of the low-latency Gemini 3.5 Flash model enables persistent, real-time background information agents that run continuously, monitoring the web and sending notifications while users sleep.
  • Disintermediation of Local Services: The expansion of autonomous Booking Agents (built on the remnants of Duplex) intercepts consumer-business relations by placing phone calls, obtaining price quotes, and securing reservations directly inside the search interface.
  • Antigravity & 'Vibe Coding': Google's real-time code-generation engine dynamically creates bespoke interactive widgets (in Astro/React-like structures) for complex requests, making standalone SaaS planning and simulation tools increasingly obsolete.
  • Survival Strategies for Creators: As search-referred traffic dries up, web publishers and developers must pivot toward walled-garden platforms, direct-to-consumer newsletter networks, and high-affinity offline communities.

The Unraveling of the Web's Core Social Contract

For nearly three decades, the global internet has flourished under a quiet, foundational agreement. Content creators, publishers, journalists, and developers invested resources to write, design, and host information. In return, search engines indexed their pages and directed user traffic to those sites, creating a symbiotic loop. The publisher monetized the traffic through ads, subscriptions, or affiliate products, while the search engine monetized the initial intent of the user. This delicate, multi-billion-dollar relationship fueled the rise of the modern, open web. However, the announcements at the recent Google I/O conference signal that this contract is being unilaterally rewritten. Google is shifting its core Search product from a directory of links to an autonomous, closed-loop agentic engine. The implications are clear: Google's AI intends to do the searching, browsing, analyzing, and execution for you. It no longer needs you to visit the open web.

The driving force behind this paradigm shift is Liz Reid, the Head of Google Search. Under her leadership, Google is moving away from transactional inputs, where a user types a query, gets ten blue links, and navigates away, toward a persistent digital ecosystem. Reid’s strategy revolves around embedding multi-agent frameworks directly within the search box. Users will no longer simply query information; they will deploy highly personalized, autonomous agents tasked with continuous background execution. This structural pivot represents a transition from a pull-based search methodology to a push-based agentic architecture. By shifting the burden of information gathering and task execution onto background bots, Google is fundamentally altering user behavior, training a generation of consumers to view search as a terminal destination rather than a gateway.

You will be able to create, customize, and manage multiple AI agents for your many tasks, right in Search.

Liz Reid, VP, Head of Google Search

Persistent Bots and the Architectural Shift to Gemini 3.5 Flash

This new wave of search agents relies on substantial underlying model improvements, most notably the integration of the Gemini 3.5 Flash model. Engineered specifically for high-speed reasoning, low-latency API orchestration, and massive context windows, Gemini 3.5 Flash serves as the orchestration layer for Google's agentic search environment. Unlike legacy large language models that generate static textual answers, Gemini 3.5 Flash operates as a stateful coordinator. It maintains user context over extended periods, schedules recurring web scraping runs, and structures output data dynamically. The transition to Flash represents a calculated engineering decision to prioritize sub-second inference times and API-calling efficiency over sheer parameter count and the compute cost that comes with it. This optimization is crucial for allowing hundreds of millions of persistent background agents to run concurrently without causing severe infrastructure bottlenecks.

TECHNICAL SPOTLIGHT: Gemini 3.5 Flash Architecture

The operational backbone of Google’s background search agents is the Gemini 3.5 Flash architecture. Designed as a lean, distillation-optimized model, Flash leverages a hybrid speculative decoding mechanism and an advanced multi-query attention (MQA) pattern. Because MQA shares a single key/value head across all query heads, it shrinks the key-value (KV) cache by roughly a factor equal to the number of attention heads, allowing the model to hold massive user context histories and long-running agent states in far less memory than standard multi-head attention would need. Furthermore, its specialized mixture-of-experts (MoE) routing ensures that execution tasks (such as scraping, summarization, or local API schema parsing) are handled by highly targeted subnetworks. This architecture enables the sub-100ms time-to-first-token latency required to coordinate hundreds of parallel background threads, ensuring that personalized information tracking is both computationally sustainable and responsive in real time.
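
To put the KV-cache point in numbers, here is a back-of-the-envelope sketch comparing cache size under standard multi-head attention (one key/value head per query head) and MQA (one shared key/value head). Every dimension below is a hypothetical placeholder chosen for illustration, not a published Gemini figure.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # The leading factor of 2 covers the separate key and value tensors.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model dimensions (assumptions, not real Gemini specs).
LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN = 32, 32, 128, 100_000

# Multi-head attention: every query head has its own key/value head.
mha_bytes = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Multi-query attention: all query heads share a single key/value head.
mqa_bytes = kv_cache_bytes(LAYERS, 1, HEAD_DIM, SEQ_LEN)

reduction = mha_bytes // mqa_bytes
print(f"MHA: {mha_bytes / 2**30:.1f} GiB")
print(f"MQA: {mqa_bytes / 2**30:.2f} GiB ({reduction}x smaller)")
```

The saving scales with the number of query heads, which is why the technique matters most for long-lived contexts such as a persistent agent's history.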

In practice, these info agents function as personalized web monitors. Instead of performing daily manual searches for stock movements, real-time product discounts, or industry trends, users instruct Search to monitor specific variables. Once deployed, the agents run autonomously, scraping relevant endpoints, parsing content updates, and using semantic evaluation to determine whether pre-set conditions have been met. For instance, a user might request real-time tracking of specialized athletic gear releases. The agent compiles a list of retail and news sources, periodically scans them, evaluates the matching criteria, and triggers push alerts containing structured purchasing metadata. This proactive flow shifts the cognitive load of finding and filtering information from the human to the machine.
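
The scan, evaluate, and alert cycle described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Google's implementation: `InfoAgent`, `Listing`, and `fake_fetch` are hypothetical names, the fetcher returns canned data instead of scraping, and a keyword-and-price check stands in for the semantic evaluation step.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass(frozen=True)
class Listing:
    source: str
    title: str
    price: float


@dataclass
class InfoAgent:
    """Hypothetical monitor that scans sources and alerts on new matches."""
    keyword: str
    max_price: float
    fetch: Callable[[str], Iterable[Listing]]  # injected; a real agent would scrape
    _alerted: set = field(default_factory=set)

    def matches(self, item: Listing) -> bool:
        # Stand-in for semantic evaluation: plain keyword and price threshold.
        return (self.keyword.lower() in item.title.lower()
                and item.price <= self.max_price)

    def run_once(self, sources: list) -> list:
        """Scan every source once and return only alerts not sent before."""
        alerts = []
        for source in sources:
            for item in self.fetch(source):
                key = (item.source, item.title)
                if self.matches(item) and key not in self._alerted:
                    self._alerted.add(key)  # suppress repeats on later scans
                    alerts.append(item)
        return alerts


def fake_fetch(source: str) -> list:
    # Canned catalog so the sketch runs offline.
    catalog = {
        "shop-a": [Listing("shop-a", "Trail Runner X2", 129.0)],
        "shop-b": [Listing("shop-b", "Trail Runner X2", 149.0),
                   Listing("shop-b", "Road Racer", 99.0)],
    }
    return catalog.get(source, [])


agent = InfoAgent(keyword="trail runner", max_price=135.0, fetch=fake_fetch)
first_scan = agent.run_once(["shop-a", "shop-b"])   # one matching listing
second_scan = agent.run_once(["shop-a", "shop-b"])  # nothing new to report
```

A production agent would call `run_once` on a schedule and hand the returned listings to a notification service; tracking what has already been alerted keeps repeated scans from spamming the user.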

Figure: Google Search AI Info Agent dashboard. Liz Reid's vision of Info Agents working inside the Google Search UI to monitor and alert users of specific web events.

This continuous background presence is a significant departure from search as we have historically understood it. The active, intentional user browsing journey is replaced by automated summarization. Robby Stein, Google’s Vice President of Product for Search, emphasized this transition during his I/O presentations. Stein highlighted that the ultimate goal is to let search work for the user in the background, continuously processing the open web while the user is completely offline. The implications for consumer habits are profound: users are no longer required to actively navigate websites or even interact with Google's search bar to stay informed.

Ask Google to just keep you updated on anything, and now our agents can do work for you even if you're not using Google. So, you could be asleep, and it's still helping you.

Robby Stein, VP of Product, Google Search

Disintermediation of Local Commerce via Booking Agents

The capabilities of Google’s new agents extend far beyond passive data indexing and alert dispatching. Google is actively deploying booking agents, an advanced, practical evolution of the Duplex voice-agent technology that debuted to mixed reviews in 2018. While Duplex focused primarily on outbound voice synthesis to schedule restaurant tables, the modern booking agents are deeply integrated into Google Search's broader knowledge graph and local merchant APIs. These agents are fully autonomous, capable of identifying information gaps on the web and initiating real-world tasks to fill them. If a local shop, such as an independent barber or a boutique service provider, does not publish its pricing details or appointment availability online, Google's agents will bridge this gap. They will dial the merchant, conduct a natural spoken conversation with the staff, extract the required pricing quotes, and deliver the structured data directly to the user's dashboard.

Figure: Autonomous booking agents bypass traditional web pages by directly calling local merchants to extract pricing and schedule services.

While this capability is marketed as a consumer convenience, it represents a deep disintermediation of the local service economy. Historically, local businesses attracted customers through localized SEO, digital storefronts, and direct customer interactions. By positioning an autonomous booking agent between the customer and the business, Google intercepts the entire discovery and booking funnel. The local business is no longer a destination that a user evaluates based on website design, customer reviews, or personal touchpoints. Instead, the business is reduced to a set of raw data parameters—price, availability, and location—parsed by a machine and presented in a comparison widget. This commoditizes local services, forcing merchants to compete on raw metrics rather than brand identity, while centralizing transaction control within the Google App.
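
What "a set of raw data parameters" looks like in practice can be illustrated with a small sketch. The `Quote` record and `rank` function are hypothetical, and the sort order (price, then wait time, then distance) is an assumption made for illustration. Note what the record has no field for: brand, reviews, or the look of the storefront.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """Hypothetical normalized record an agent might extract from a call."""
    business: str
    price_usd: float
    next_slot_hours: float  # hours until the earliest open appointment
    distance_km: float


def rank(quotes: list) -> list:
    # Compare on raw metrics only: cheapest first, then soonest, then nearest.
    return sorted(quotes, key=lambda q: (q.price_usd, q.next_slot_hours, q.distance_km))


quotes = [
    Quote("Barber A", 35.0, 4.0, 1.2),
    Quote("Barber B", 28.0, 26.0, 3.5),
    Quote("Barber C", 28.0, 2.0, 5.0),
]
ranked_names = [q.business for q in rank(quotes)]
print(ranked_names)
```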

Quantifying the Efficiency Divide

To visualize the impact of this automation on user behavior, we can analyze the extreme disparity in time and effort required for common online tasks. For example, monitoring sneaker drops, researching logistics, and scheduling appointments manually demands hours of active, repetitive web browsing. Users must open dozens of browser tabs, compare options across multiple vendor sites, fill out forms, and monitor updates over days or weeks. When outsourced to Google's persistent agents, this manual effort collapses to near zero. A user needs only a few seconds to configure the agent's parameters at the start, leaving the execution to background processes. This massive efficiency dividend creates a compelling user incentive: the convenience of agentic search makes traditional web browsing seem intolerably slow and inefficient.

Figure: The Automation Efficiency Divide. Chart of manual effort versus agent execution time, illustrating the dramatic reduction in user time when web tasks are outsourced to AI agents.

The Publisher's Dilemma: Zero-Click Search and the Evaporation of Referrals

For web publishers, indie developers, and content creators, Google’s agentic transition represents an existential crisis. Search engines have historically served as aggregators that redirected attention. They did not own the content; they indexed it and drove millions of visitors to independent sites daily. In the agentic era, however, search engines function as the final destination. When an AI agent browses a dozen websites, extracts their primary findings, compiles them into a clean overview, and presents it in a customized app layout, the source websites receive zero benefits. There are no ad impressions, no affiliate link clicks, and no direct user engagement. The referral traffic that once sustained independent journalism, niche tech blogs, and specialized communities is evaporating.

Early data points to a massive collapse in outbound referral volume. Under the classic "10 Blue Links" model, every result on the page pointed to an external website, so 100% of a query's click value flowed to the sites users chose to visit. The rollout of AI Overviews in 2024 and conversational AI Mode in 2025 significantly reduced this flow, capturing user attention within Google's own interface. As Google expands its "Super Widgets" and "Mini Apps" (interactive layouts generated directly on the search engine results page), outbound traffic is expected to shrink to a mere fraction of its historical levels. Publishers are effectively being turned into free training data for models that cannibalize their audience, while Google retains and monetizes the user's attention.

Figure: Line chart of publisher CTR declining from 100% to 5%, showing the collapse of web referral traffic across traditional, AI Overview, and agentic search paradigms.

This data highlights the bleak economic reality facing the open web. Niche publishers, product review sites, and informational blogs, all of which rely on programmatic advertising and affiliate networks, will see their business models collapse as search-referred traffic falls toward single-digit percentages of its former volume. If a user can receive a comprehensive, structured comparison of the best running shoes without ever leaving the Google search results page, the motivation to click through to an independent testing site disappears. This creates an economic feedback loop: as traffic dries up, publishers lose the revenue needed to fund original research, testing, and writing, leading to a decline in the quality of the very content that AI agents rely on to generate answers.
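
The scale of that revenue loss follows from simple arithmetic. The sketch below is illustrative only: the visit count, pages per visit, and RPM (revenue per 1,000 page views) are hypothetical placeholders, and only the 100% and 5% endpoints are taken from the chart above.

```python
def monthly_ad_revenue(baseline_visits: int, referral_share: float,
                       pages_per_visit: float, rpm_usd: float) -> float:
    """Estimate monthly ad revenue from search referrals.

    referral_share is the fraction of baseline referral traffic that still
    arrives; rpm_usd is revenue per 1,000 page views.
    """
    visits = baseline_visits * referral_share
    return visits * pages_per_visit * rpm_usd / 1000


# Hypothetical publisher (assumed figures, not data from any real site).
BASELINE_VISITS = 1_000_000
PAGES_PER_VISIT = 1.5
RPM_USD = 20.0

before = monthly_ad_revenue(BASELINE_VISITS, 1.00, PAGES_PER_VISIT, RPM_USD)
after = monthly_ad_revenue(BASELINE_VISITS, 0.05, PAGES_PER_VISIT, RPM_USD)
loss_pct = (1 - after / before) * 100
print(f"before=${before:,.0f}  after=${after:,.0f}  loss={loss_pct:.0f}%")
```

Because ad revenue is linear in visits, the percentage loss equals the percentage drop in referrals regardless of the placeholder inputs; fixed costs such as staff and hosting do not shrink with it.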

Antigravity and the Rise of Dynamic 'Vibe Coding' Interfaces

One of the most technically advanced aspects of Google’s search upgrade is the integration of the Antigravity engine directly into the search delivery pipeline. Originally launched as an agentic coding tool competing with the likes of GitHub Copilot and Anthropic's Claude Code, Antigravity has been repurposed to build frontend interfaces on the fly. When a user inputs a query that is best served by an interactive tool rather than text, Antigravity dynamically generates a bespoke web application. For example, if a user searches for a visual representation of the gravitational lensing around a black hole, or requests a tool to manage the logistics of a multi-state move, the system does not return static text or links to calculators. Instead, it compiles and renders a custom interactive mini-app in real time.

Figure: Google's Antigravity engine generating bespoke, interactive mini-apps directly within search result pages.

TECHNICAL SPOTLIGHT: The Antigravity Vibe-Coding Engine

The Antigravity engine represents a radical departure from static frontend deployment. When a query requires interactive visualization or calculation, Antigravity’s code-generation layer streams declarative UI definitions, similar to Astro components or React server components, directly into a sandboxed rendering container. The engine uses a fine-tuned, low-latency code model that translates user intents into highly optimized component code on the fly. This interface generation is guided by strict constraint schemas that enforce accessible styling, prevent script injection, and ensure performance compatibility. By generating dynamic, interactive user interfaces in less than a second, Antigravity sidesteps the traditional software delivery model: Google can construct bespoke application frontends on demand, removing the need for standalone SaaS platforms.
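
A constraint schema of this kind can be pictured as an allowlist applied to the generated UI tree before it reaches the renderer. The sketch below is an assumption about how such a check might work, not Antigravity's actual schema: the component names, the dict-based tree format, and the `validate` function are all hypothetical.

```python
# Hypothetical allowlist: component type -> permitted prop names.
ALLOWED = {
    "Card": {"title"},
    "Slider": {"label", "min", "max", "value"},
    "Text": {"content"},
}


def validate(node: dict, path: str = "root") -> list:
    """Return a list of violations; an empty list means the tree may render."""
    kind = node.get("type")
    if kind not in ALLOWED:
        # Unknown components are rejected outright, children included.
        return [f"{path}: component {kind!r} is not allowed"]
    errors = []
    for name, value in node.get("props", {}).items():
        if name not in ALLOWED[kind]:
            # Blocks event handlers such as onClick that could carry script.
            errors.append(f"{path}: prop {name!r} is not allowed on {kind}")
        elif isinstance(value, str) and "<script" in value.lower():
            errors.append(f"{path}: prop {name!r} contains script markup")
    for i, child in enumerate(node.get("children", [])):
        errors.extend(validate(child, f"{path}.children[{i}]"))
    return errors


safe_tree = {
    "type": "Card",
    "props": {"title": "Move planner"},
    "children": [{"type": "Slider",
                  "props": {"label": "Boxes", "min": 0, "max": 200, "value": 40}}],
}
unsafe_tree = {
    "type": "Card",
    "props": {"title": "Planner", "onClick": "steal()"},
    "children": [{"type": "Script", "props": {}}],
}

safe_errors = validate(safe_tree)
unsafe_errors = validate(unsafe_tree)
print(safe_errors, unsafe_errors, sep="\n")
```

Validating a declarative tree rather than raw generated code keeps the check simple: anything the allowlist does not name never reaches the sandbox.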

This dynamic frontend capability will first roll out to Google AI Pro and Ultra subscribers in the United States. By allowing users to track diets, manage logistics, and run simulations directly inside Search, Google is transforming its interface into a universal, personalized operating system. The web pages that once hosted these tools and widgets are rendered obsolete. Why navigate to a standalone calorie tracker or moving planner when Google can synthesize a specialized widget for you in under a second?

This real-time generation presents a direct threat to the traditional Software-as-a-Service (SaaS) industry. For years, developers built simple utility tools—calculators, converters, planner templates, and visual simulators—and monetized them through ads or subscription tiers. Antigravity automates this entire tier of software development, generating bespoke applications tailored exactly to the user's specific context. The implications are clear: single-utility SaaS applications are being replaced by an on-demand generative interface, positioning Google as the primary layer of software consumption.

The Paradox of the Agentic Web: How Creators Can Adapt

Despite these sweeping changes, Google publicly maintains that it is not trying to kill the web. In official communications, the company stresses that it still includes links within AI Overviews and agentic summaries. Yet early studies on user behavior show that very few people click these links when their intent is fully satisfied by the AI. The incentives for writing high-quality content are breaking down. If the reward for creating detailed guides is having a Google bot scrape your words and serve them to a user without a single pageview, why write them in the first place? This is the fundamental paradox of the agentic web: by starving creators of traffic, AI engines risk drying up the very content they rely on to generate answers.

As the agentic web approaches mainstream adoption, developers and creators must adapt to a new set of rules. The old playbook of optimizing for keyword density and backlink profiles is losing its power. In a world where AI agents do the browsing, content must be structured not just for human readers, but for machine synthesis. Creators will need to establish direct relationships with their audiences through newsletters, offline communities, and walled platforms. The era of the search-referred open web is drawing to a close, and a new, more fragmented digital landscape is emerging in its wake.
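
One established way to structure content for machines is schema.org markup embedded as JSON-LD. The sketch below builds a minimal `Article` object with Python's standard `json` module. Whether Google's agents will weight such markup is an assumption, the `article_jsonld` helper is hypothetical, and the field values are placeholders.

```python
import json


def article_jsonld(headline: str, author: str,
                   date_published: str, summary: str) -> str:
    """Serialize basic article metadata as a schema.org JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,  # ISO 8601 date
        "description": summary,
    }
    return json.dumps(data, indent=2)


# Placeholder values for illustration only.
markup = article_jsonld(
    headline="Example Guide to Trail Running Shoes",
    author="Example Author",
    date_published="2025-01-01",
    summary="Hands-on comparison of five trail shoes.",
)
print(markup)
```

The resulting string is typically placed in a `<script type="application/ld+json">` tag in the page head, giving a crawler the key facts without having to parse the layout.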

Consumers will enjoy unprecedented convenience, but the decentralized diversity of the open web will likely give way to consolidated platforms and walled content gardens. To survive, content creators must focus on brand loyalty, specialized expertise, and channels that cannot be easily intercepted by automated scraping bots. The web is not dying, but its relationship with search is fundamentally broken.