Chrome’s AI Security Line: Hallucinations Are Feedback, Prompt Injection Is a Breach

Google quietly revised Chrome’s public security FAQ last week, inserting a new “AI Features” section that finally tells bug hunters and enterprises exactly how the browser team will triage reports involving generative AI. The upshot: a chatbot spouting a weird hallucination or unsafe advice is not a security vulnerability – that’s a content‑safety problem you report with a thumbs‑down. But if a malicious webpage tricks an embedded AI assistant into leaking private data, triggering a harmful action, or smuggling secrets to an attacker, Chrome now treats that as a full‑blown security bug and demands a detailed, reproducible proof‑of‑concept.

The edit is barely a handful of paragraphs, yet it marks the first time Chromium’s security documentation has drawn a bright line between model misbehavior and bona fide browser exploitation in an AI context. It lands as Google accelerates the rollout of Gemini‑powered features inside Chrome and tests aggressive onboarding flows that pin the browser to the Windows taskbar with a single click.

The AI features already swimming in Chrome

Chrome isn’t dipping a toe into AI – it has been wading deeper for months. The “Help me write” panel uses generative models to draft emails and reviews straight from the browser. Android and desktop builds now house on‑device scam detection powered by Gemini Nano, which flags suspicious pages in near‑real‑time. Safe Browsing’s “Enhanced protection” mode leans on cloud‑side AI to spot harmful downloads and phishing lures faster than signature‑based methods ever could. Google’s own September 2024 safety blog confirmed that Safety Check now runs proactively in the background, revoking permissions from forgotten sites and offering one‑tap unsubscribe from spammy notifications – all features that increasingly lean on machine‑learning models to decide what is dangerous.

Every one of those integrations creates a new attack surface. A model that reads page content to summarize it can also be fed invisible instructions. A helper that automates form‑filling can be coaxed into pasting attacker‑controlled data. The security community has been sounding the alarm about indirect prompt injection for over a year, and real‑world demonstrations have shown what happens when an agentic model blindly follows orders hidden in a webpage.

What the FAQ actually says

Previously, Chrome’s security FAQ dealt with classic browser weaknesses: memory corruption, cross‑site scripting, sandbox escapes. The new “AI Features” section adds plain‑language rules for the GenAI era:

“Odd or inappropriate” model outputs – offensive language, hallucinated facts, misaligned suggestions – do not constitute a security vulnerability. Users should send feedback via the in‑product thumbs‑up/thumbs‑down tool.
When an AI feature’s output leaks a Google backend secret or accesses an internal service, the reporter is directed to Google’s Vulnerability Reward Programs (VRP) or the Abuse VRP, not the Chrome security tracker.
Page content is expected to influence the model – that’s how the feature works – so controlling the output is not automatically a security issue.
Invisible content, zero‑size text, or hidden URL fragments that steer the output are also considered expected behavior; failing to scrub every hidden instruction is not a vulnerability by itself.
However, if a webpage carries out an “indirect prompt injection” that causes the AI feature to perform an unauthorized action or exfiltrate information, Chrome treats the report as a high‑priority security bug. The triage team will ask for a screen recording of a fresh session, all files used in the demonstration, and – if the attack used a Gemini session – an export of that session plus the model version.

This is operational guidance, not academic hair‑splitting. It tells a researcher with a working exploit: here is the bar you must clear for us to open a fix ticket, and here is the evidence you must supply.
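
Read as a decision procedure, the rules above might look like the following Python sketch. It is an illustration of the article's summary, not Chrome's actual triage tooling: the Report fields, the Route names, and the order in which the checks run are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    # Hypothetical labels paraphrasing the four outcomes described above.
    PRODUCT_FEEDBACK = "in-product thumbs-up/thumbs-down feedback"
    GOOGLE_VRP = "Google VRP or Abuse VRP"
    EXPECTED_BEHAVIOR = "expected behavior, not a vulnerability by itself"
    CHROME_SECURITY_BUG = "Chrome security bug, proof-of-concept required"


@dataclass
class Report:
    # Hypothetical fields; a real report is free-form text plus attachments.
    targets_google_backend: bool = False          # backend secret or internal service
    page_triggered_unauthorized_action: bool = False
    page_exfiltrated_information: bool = False
    output_odd_or_inappropriate: bool = False     # hallucination, offensive text


def triage(report: Report) -> Route:
    # Assumed precedence: backend abuse goes to the VRPs before anything else.
    if report.targets_google_backend:
        return Route.GOOGLE_VRP
    # Indirect prompt injection with real impact is a Chrome security bug.
    if report.page_triggered_unauthorized_action or report.page_exfiltrated_information:
        return Route.CHROME_SECURITY_BUG
    # Bad content without impact is a safety issue, reported as feedback.
    if report.output_odd_or_inappropriate:
        return Route.PRODUCT_FEEDBACK
    # A page merely steering the output (even via hidden text) is expected.
    return Route.EXPECTED_BEHAVIOR
```

The point of the ordering is that impact, not the injection technique, decides the route: hidden text that changes a summary falls through to the last branch, while the same hidden text that leaks data does not.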

Why triage boundaries matter

The distinction between safety and security is not bureaucratic pedantry. Chrome’s security team reviews hundreds of reports a month. If every AI‑generated nonsense answer were filed as a “vulnerability,” the queue would drown in noise while genuine sandbox escapes or remote code execution bugs sat untriaged. The FAQ preserves signal for the flaws that actually compromise the browser or its users’ data.

Additionally, AI‑driven bugs are brittle. They depend on model version, session history, tool access, and even the phrasing of a prompt. Demanding a video, session files, and model metadata elevates a report from a vague “the AI did something weird” to an actionable incident the engineering team can reproduce and fix. That’s the same discipline applied to complex multi‑component bugs in other domains.

Finally, the boundary avoids category confusion. Content‑moderation failures – an assistant that parrots hate speech or dispenses bad medical advice – are safety problems. They hurt users, but they don’t represent a compromise of the browser’s confidentiality, integrity, or availability. Indirect prompt injection that extracts a user’s stored password, on the other hand, crosses into security territory and should be escalated with the same urgency as a memory corruption crash.

What qualifies as a security‑grade prompt injection

The FAQ doesn’t offer a dry definition; it gives a scenario: a webpage that causes an AI feature to “perform an action or exfiltrate information.” Examples from independent research illustrate the point. Tenable published a detailed advisory in 2025 showing that Gemini’s browsing tool could be tricked into leaking saved information and location data by placing hidden instructions inside a seemingly innocent webpage. The attack chain required no user interaction beyond visiting the page. Tenable documented every step, disclosed to Google, and tracked the remediation through multiple rounds – exactly the kind of reproducible proof‑of‑concept the FAQ now demands.

Google DeepMind’s own security team has detailed the defense‑in‑depth needed to stop such attacks: classifier layers that detect adversarial prompts, output sanitization, and user‑confirmation flows that require a human to approve any sensitive action. An arXiv paper from the team lays out a multi‑layered strategy that acknowledges the inherent difficulty of perfect scrubbing – attackers can hide instructions in CSS comments, alt text, or even image metadata – and argues for model‑level hardening combined with system‑level guardrails.
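
The user‑confirmation layer is the easiest of these defenses to picture in code. The sketch below is a generic illustration, not Google's implementation: the action names, the SENSITIVE_ACTIONS set, and the run_action helper are hypothetical.

```python
from typing import Callable

# Hypothetical set of actions an assistant must never take silently.
SENSITIVE_ACTIONS = {"submit_form", "send_email", "read_saved_password"}


def run_action(
    name: str,
    execute: Callable[[], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run an assistant action, asking a human first if it is sensitive.

    `confirm` stands in for a UI prompt; it receives the action name and
    returns True only if the user explicitly approved it.
    """
    if name in SENSITIVE_ACTIONS and not confirm(name):
        return "blocked"
    return execute()
```

A gate like this does not stop an injected instruction from reaching the model; it only ensures the instruction cannot turn into a sensitive action without a person approving it, which is why the paper treats it as one layer among several.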

Together, these signals tell the security community that indirect prompt injection is a real, high‑stakes vulnerability class, not a theoretical edge case. The FAQ codifies that seriousness into Chrome’s own reporting pipeline.

How Google wants you to report an AI‑driven exploit

If you discover a working attack, the FAQ asks you to package the following before opening a ticket:

Reproduce the issue on a clean session and record the entire interaction (screen capture is acceptable).
Save every file used in the demo – HTML pages, images, scripts – and attach them to the report.
If the exploit relies on a Gemini session, export the session data from the “My Activity” page and note the exact model version (e.g., gemini-1.5-pro).
Submit through Chrome’s security tracker or, if the abuse targets a Google backend, the appropriate VRP channel.

This checklist is not optional. Because model behavior can shift between versions, the triage team needs a snapshot of the exact conditions that produced the malicious outcome. Without it, a fix might chase a phantom.
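
One way a researcher might keep that snapshot honest is to build a small manifest alongside the attachments. The following Python sketch is an assumption about what such a helper could look like; the FAQ, as described here, does not prescribe a manifest format, and build_manifest and its field names are invented for illustration.

```python
import hashlib
from typing import Dict, Optional


def build_manifest(
    files: Dict[str, bytes],
    recording: str,
    session_export: Optional[str] = None,
    model_version: Optional[str] = None,
) -> dict:
    """Describe an evidence bundle: file name -> content, plus metadata.

    Raises ValueError if a required piece of evidence is missing.
    """
    if recording not in files:
        raise ValueError(f"screen recording {recording!r} is not attached")
    if session_export is not None:
        if session_export not in files:
            raise ValueError(f"session export {session_export!r} is not attached")
        if not model_version:
            raise ValueError("a session export needs the exact model version")
    return {
        "recording": recording,
        "session_export": session_export,
        "model_version": model_version,
        # Hashes let the triager confirm the files were not altered in transit.
        "sha256": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(files.items())
        },
    }
```

Serializing the result with json.dumps and attaching it to the report gives the triage team a single record of which files, which recording, and which model version belong together.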

Wider product moves that intersect with AI security

While the documentation change was quiet, Google has been busy weaving AI into Chrome’s everyday UI:

Enhanced Protection + Gemini Nano: On‑device models now score pages for scam likelihood without sending browsing data to the cloud, reducing both latency and privacy risk. However, placing security decisions in the hands of a client‑side model raises its own questions about adversarial examples crafted to fool the classifier.
One‑click default + taskbar pinning: Chromium code changes, guarded behind a feature flag, let Windows users make Chrome the default browser and pin it to the taskbar in a single action. The flow, spotted in test builds, signals Google’s intention to streamline onboarding and increase Chrome’s desktop footprint. While this flow is separate from AI security, any mechanism that changes default behaviors warrants scrutiny from enterprise administrators.
AI Mode and Lens expansion: Google Search’s “AI Mode” now accepts image and PDF uploads, and Lens integrations are deeper than ever. More AI‑powered touchpoints across Google’s ecosystem mean more places where an indirect prompt injection could be attempted. Cross‑product coordination will be critical.

What enterprises and power users should do now

The FAQ doesn’t change the technical reality of prompt injection; it changes how Google will respond to reports. That still leaves proactive steps for organizations:

Audit AI surfaces: Identify every browser‑embedded AI feature your employees might encounter – “Help me write,” AI summaries, scam detection – and evaluate whether they interact with sensitive corporate data. In high‑security environments, consider disabling Gemini‑powered features via group policy until mitigations are validated.
Update vulnerability‑intake forms: If your security team runs a bug‑bounty program or handles external reports, add fields for session exports, video evidence, and AI‑model versions. Aligning your intake with Chrome’s expectations will shorten resolution time when a researcher submits an AI‑flavored finding.
Sanitize content endpoints: If your organization publishes content that might be consumed by an in‑browser AI (knowledge bases, documentation, email templates), ensure it doesn’t contain hidden instructions that could be interpreted as commands. OWASP’s guidance on RAG data hygiene is a good starting point.
Enforce user confirmations: Avoid allowing browser assistants to act agentically on behalf of privileged accounts – banking portals, SSO flows, admin consoles – without an explicit confirmation step. Google’s own user‑confirmation frameworks are designed to block automated misuse; lean on them.
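
For the content‑sanitization step, a publisher could start with a heuristic scan of its own HTML for text that readers cannot see but a model might read. The sketch below uses only Python's standard html.parser module; the HIDDEN_STYLE pattern and the find_hidden_text helper are illustrative assumptions, and the check covers only inline styles and the hidden attribute, not external stylesheets, alt text, or metadata.

```python
import re
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Inline styles that commonly hide text (heuristic, far from exhaustive).
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|font-size\s*:\s*0(?![\d.])",
    re.IGNORECASE,
)


class HiddenTextFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self._stack = []       # one bool per open element: hidden or not
        self.hidden_text = []  # text runs found inside hidden elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr_map = dict(attrs)
        hidden_here = "hidden" in attr_map or bool(
            HIDDEN_STYLE.search(attr_map.get("style") or "")
        )
        inherited = bool(self._stack) and self._stack[-1]
        self._stack.append(hidden_here or inherited)

    def handle_endtag(self, tag):
        # Assumes well-formed markup with explicit closing tags.
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if self._stack and self._stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html: str) -> list:
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

Anything the scan returns deserves a human look before publication. An empty result proves little, since, as the FAQ itself concedes, no scrubber catches every hidden instruction.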

Strengths and open gaps

Google’s move is net‑positive. It removes ambiguity, saves security‑team cycles, and signals to the researcher community that indirect prompt injection is taken seriously. The demand for reproducible evidence aligns with best practices for complex bugs and is likely to shorten mean time to fix for the highest‑impact reports.

But the FAQ also surfaces hard truths. Detection limits are real: no sanitizer can catch every hidden instruction, especially as attackers invent new vectors. A proof‑of‑concept that works on one model version may fail on the next, making reproducibility an ongoing battle even with proper session exports. And for downstream Chromium embedders – Electron apps, alternative browsers – a functional issue in Chrome may become a security issue if compile‑time flags or link‑time options alter the AI feature’s behavior. The FAQ warns about this, but the ecosystem still lacks a formal mechanism to propagate AI‑specific mitigations to all derivative browsers.

Organizational boundaries add friction, too. When a prompt injection abuses a Google backend rather than the local browser, the reporter must navigate the VRPs, and responsibility can ping‑pong between browser, model, and cloud teams. The FAQ acknowledges this, but until there’s a unified triage workflow for cross‑component AI bugs, some fixes will be delayed.

The start, not the finish

Chrome’s public security FAQ is now the first from a major browser to carve out a formal AI‑vulnerability classification. It is a small, pragmatic edit that will save weeks of confusion for researchers and internal triage alike. But it is not a shield. The real work of keeping AI‑powered browsing safe remains an engineering challenge: continuous model hardening, rigorous output sanitization, user confirmation at sensitive moments, and coordinated vulnerability management across every layer of the stack.

The FAQ has drawn the line. Now the industry must hold it.