Windows 11 Fluid Dictation Hands-On: On-Device AI Makes Speech-to-Text Smarter, but Hardware-Locked

Microsoft shipped Windows 11 Insider Preview builds 26220.5790 (Dev) and 26120.5790 (Beta) under KB5065779, and the headline feature is Fluid Dictation – an on-device AI engine inside Voice Access that adds punctuation, filters out filler words, and lightly corrects grammar in real time. The catch? It only works on Copilot+ PCs, and for now, only in English.

Voice Access has been the strategic replacement for the old Windows Speech Recognition since 2022, evolving from a basic command-and-dictate tool into an AI-powered accessibility surface. Fluid Dictation is the next leap: a small language model (SLM) running locally on the neural processing unit (NPU) that turns spoken words into near-ready prose, not a raw transcript.

What Fluid Dictation actually does

For years, dictation on Windows meant speaking clearly and then cleaning up by hand: adding every missing comma, capitalizing the first letter of each sentence, and deleting every “um” and “uh.” Fluid Dictation changes that workflow from “dictate, then edit” to “speak and get polished text.”

Automatic punctuation – The model inserts periods, commas, question marks, and exclamation points based on natural speech rhythm. Say “I think we should meet tomorrow what time works for you” and Fluid Dictation produces: “I think we should meet tomorrow. What time works for you?”

Filler word removal – Common hesitations like “um,” “uh,” “like,” and repeated stopwords are stripped out. If you dictate “I um think we should like maybe uh schedule a call,” the output becomes “I think we should schedule a call.” The feature doesn’t just delete – it also applies light grammatical corrections to ensure the sentence remains coherent.

Context-aware smoothing – The SLM makes minor adjustments to subject-verb agreement and word order without altering meaning. This is not a full-fledged LLM rewriting your sentences; it’s a targeted normalization designed to clean up spoken language’s typical rough edges.

The goal is clear: reduce the post-dictation cleanup burden so you can draft emails, notes, and documents hands-free with less manual editing.
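
To make the “dictate, then edit” burden concrete, here is a minimal rule-based sketch of the cleanup users previously did by hand. It is not how Fluid Dictation works: the feature uses an SLM, and the filler list and the function name strip_fillers below are assumptions invented for illustration.

```python
# Toy illustration only: a fixed word list, not Microsoft's model.
# FILLERS and strip_fillers are hypothetical names for this sketch.
FILLERS = {"um", "uh", "like"}


def strip_fillers(raw: str) -> str:
    """Drop listed filler tokens, capitalize the first letter, add a final period."""
    words = [w for w in raw.split() if w.lower() not in FILLERS]
    if not words:
        return ""
    text = " ".join(words)
    text = text[0].upper() + text[1:]
    if text[-1] not in ".?!":
        text += "."
    return text


print(strip_fillers("I um think we should like maybe uh schedule a call"))
# prints: I think we should maybe schedule a call.
```

Unlike the output quoted above, the sketch keeps “maybe” because it removes only the listed tokens. It also shows why a word list is not enough: strip_fillers("I like tea") returns “I tea.”, since the rule cannot tell a filler “like” from a meaningful one. Telling those cases apart is the kind of context-dependent judgment a language model is better placed to make.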

On-device AI: privacy and speed

Fluid Dictation runs entirely on device using a compact SLM optimized for the NPUs baked into Copilot+ PCs. That architecture delivers two immediate benefits:

Low latency – No round trips to the cloud. Speech is processed locally in near real time, so the corrected text appears almost instantly as you dictate.
Enhanced privacy – Audio snippets and intermediate representations stay on the machine. Microsoft does not upload your dictation to servers for punctuation or grammar cleanup.

The trade-off is capability: these SLMs are task-specific, not a replacement for a full language model like GPT-4o. They won’t rewrite a passage in a different tone or generate novel content. But for the job of cleaning up dictation, they’re fast and efficient.

Where Fluid Dictation works – and where it won’t

Fluid Dictation operates system-wide, in any text field: Notepad, Word, Outlook, web forms, even chat windows. But Microsoft has built in important guardrails:

Secure fields are off-limits – The feature automatically disables itself in password boxes, PIN fields, and other protected inputs to prevent accidental leakage.
Language lock – The initial rollout is English-only, across all supported locales. Multilingual users or those dictating in other languages will have to wait.
Hardware exclusivity – Fluid Dictation requires a Copilot+ PC with a qualifying NPU. Laptops without one, including older Windows 11 machines, won’t get it, no matter how powerful their CPU or GPU.

This hardware gating is perhaps the most significant limitation. While the SLM could theoretically run on any modern processor, Microsoft is using Fluid Dictation as a differentiator for its Copilot+ brand, tying AI features to the presence of an NPU.

How to enable Fluid Dictation

If you’re on a supported Copilot+ PC running the Dev (26220.5790) or Beta (26120.5790) build, Fluid Dictation is enabled by default. To verify or toggle it:

Launch Voice Access from Settings > Accessibility > Speech or from the Start menu.
Complete the first-run microphone setup if prompted.
Click the settings gear on the Voice Access bar, or say “voice access settings.”
Look for the Fluid Dictation toggle and ensure it’s on.
Alternatively, use the voice commands: “turn on fluid dictation” or “turn off fluid dictation.”

Once enabled, start speaking in any editable field. The output appears immediately with smarter punctuation and filler words removed. To report bugs, use the Feedback Hub under Accessibility > Voice access.
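
Before hunting for the toggle, it can help to confirm which build is actually installed. The sketch below compares a build string against the two flights named above. The helper names (SUPPORTED_BUILDS, is_flight_or_newer, read_local_build) are hypothetical, and treating later revisions on the same branch as supported is an assumption, not something Microsoft has stated. The registry values CurrentBuildNumber and UBR are standard Windows values. A matching build says nothing about whether the PC is Copilot+ hardware.

```python
import sys
from typing import Optional, Tuple

# Build numbers come from the article; the "same branch, same or later
# revision" rule below is an assumption made for illustration.
SUPPORTED_BUILDS = {"26220.5790", "26120.5790"}


def parse_build(build: str) -> Tuple[int, int]:
    """Split '26220.5790' into (26220, 5790)."""
    major, revision = build.split(".")
    return int(major), int(revision)


def is_flight_or_newer(build: str) -> bool:
    """True if build is on a listed branch at the listed revision or later."""
    major, revision = parse_build(build)
    for known in SUPPORTED_BUILDS:
        known_major, known_revision = parse_build(known)
        if major == known_major and revision >= known_revision:
            return True
    return False


def read_local_build() -> Optional[str]:
    """Read the installed build from the registry; returns None off Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    key_path = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        build, _ = winreg.QueryValueEx(key, "CurrentBuildNumber")
        ubr, _ = winreg.QueryValueEx(key, "UBR")
    return f"{build}.{ubr}"


if __name__ == "__main__":
    local = read_local_build()
    if local is None:
        print("Not running on Windows; nothing to check.")
    else:
        print(local, "matches a listed flight:", is_flight_or_newer(local))
```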

Windows Studio Effects gets a camera boost

Alongside Fluid Dictation, the same Insider builds expand Windows Studio Effects – the AI-powered camera enhancements like background blur, automatic framing, eye contact correction, and lighting adjustments. Previously, these effects worked only with the built-in laptop camera. Now they can extend to one additional camera, such as a USB webcam.

Go to Settings > Bluetooth & devices > Cameras, select the secondary camera, and enable Use Windows Studio Effects under advanced options.
The feature rolls out first to Intel-powered Copilot+ PCs, with AMD and Snapdragon support following in subsequent weeks.
This is a boon for hybrid workers, streamers, and anyone using a dual-camera setup who wants consistent AI-enhanced video across all feeds.

Known issues: proceed with caution

Preview builds are test beds, and these releases come with significant stability risks that Microsoft has flagged:

Hibernation bug – Systems may bugcheck (green screen) when resuming from hibernation. The issue is intermittent but severe enough that Microsoft advises Insiders to avoid hibernation until a fix arrives.
Audio device errors – Some devices show yellow exclamation marks in Device Manager, including components such as “ACPI Audio Compositor.” This can lead to complete audio loss, requiring driver rollbacks or even a system restore.

For a daily driver PC, these aren’t trivial. Insiders who rely on stable audio or frequently hibernate should hold off on installing these flights.

Privacy and security considerations

Fluid Dictation’s on-device processing is a win for privacy, but it’s not a silver bullet:

Local artifacts – The SLM and any temporary audio buffers stored on disk create a new attack surface. An attacker with physical or privileged access could potentially exfiltrate these files.
Sensitive content – While the feature avoids password fields, dictating confidential information in a Word doc or email still means that text passes through the local model. IT administrators must assess whether that’s acceptable under data governance policies.
Enterprise governance – Organizations need to manage model updates alongside drivers and firmware, and ensure that endpoint protection tools don’t interfere with Voice Access components.

Technical analysis: strengths and trade-offs

Strengths

Immediate, fluid output – On-device inference eliminates cloud latency, making dictation feel responsive.
Cleaner transcripts – Automatic punctuation and filler removal should cut editing time for most users.
Privacy-first design – Audio never leaves the device, which can make it easier to meet strict data-handling requirements.
Accessibility foundation – Fluid Dictation builds on Voice Access, keeping everything in one app that replaces the deprecated Speech Recognition.

Trade-offs and limitations

Copilot+ exclusivity – The majority of Windows users are locked out, creating a two-tier experience.
Model constraints – SLMs are fast but limited. They won’t rephrase or stylize text; they just clean up.
Update complexity – On-device models require secure distribution and update pipelines that enterprises must manage.
Edge cases – Rapid, mumbled speech can confuse punctuation heuristics. Filler removal might strip words that carry rhetorical weight. Domain-specific jargon may be incorrectly “corrected.”

Potential failure modes

Over-punctuation – Fast talkers may get comma splices, or periods that split a long sentence that should have stayed whole.
Meaning alteration – Stripping “like” when quoting someone’s speech (“she was like, ‘no way’”) changes the flavor. The model may be too aggressive in formalizing informal style.
Model drift – Without regular model updates, accuracy on niche vocabularies (medical, legal, coding) may lag as terminology evolves.
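
One way to watch for these failure modes while testing is to diff the raw transcript against the cleaned output and list which words were dropped or introduced. The sketch below uses Python's standard difflib. The function name dropped_and_added is hypothetical, and the sample strings are the article's own example, not captured output.

```python
import difflib
from typing import List, Tuple

PUNCTUATION = ".,?!\"'"


def dropped_and_added(raw: str, cleaned: str) -> Tuple[List[str], List[str]]:
    """Word-level diff: words only in raw (dropped) and only in cleaned (added)."""
    # Lowercase and trim edge punctuation so only word changes are reported.
    raw_words = [w.strip(PUNCTUATION) for w in raw.lower().split()]
    cleaned_words = [w.strip(PUNCTUATION) for w in cleaned.lower().split()]

    dropped: List[str] = []
    added: List[str] = []
    matcher = difflib.SequenceMatcher(a=raw_words, b=cleaned_words)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            dropped.extend(raw_words[i1:i2])
        if tag in ("insert", "replace"):
            added.extend(cleaned_words[j1:j2])
    return dropped, added


dropped, added = dropped_and_added(
    "I um think we should like maybe uh schedule a call",
    "I think we should schedule a call.",
)
print("dropped:", dropped)  # ['um', 'like', 'maybe', 'uh']
print("added:", added)      # []
```

Anything in the dropped list that is not an obvious hesitation, such as “maybe” here, is worth a second look and a Feedback Hub report.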

Practical tips for Insiders

If you’re diving in now:

Confirm your PC is a true Copilot+ model, which means an NPU rated at 40+ TOPS. Not all “AI PCs” qualify.
After setup, test Fluid Dictation in multiple apps. Compare raw transcription (turn it off) vs. Fluid Dictation output to see the difference.
If you encounter the hibernation bug, disable hibernation by running powercfg /h off from an elevated command prompt until a patch lands.
For audio issues, check Device Manager for devices with yellow triangles; try rolling back drivers or updating to the latest from your OEM’s support page.
Use the Feedback Hub liberally – Microsoft’s engineering team relies on Insider data to refine the SLM.
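
The hibernation and audio checks above can be scripted. The sketch below wraps powercfg /h off and uses PowerShell's Get-PnpDevice cmdlet to list devices reporting a problem. The article only mentions Device Manager, so using Get-PnpDevice is an assumption about a convenient equivalent. The run helper and its dry-run default are hypothetical, and the real commands need an elevated session on Windows.

```python
import subprocess
import sys
from typing import List

# Disables hibernation; re-enable later with: powercfg /h on
HIBERNATE_OFF = ["powercfg", "/h", "off"]

# Lists present devices whose status is not OK (assumed stand-in for
# scanning Device Manager for yellow warning icons).
LIST_PROBLEM_DEVICES = [
    "powershell",
    "-NoProfile",
    "-Command",
    "Get-PnpDevice -PresentOnly -Status ERROR,DEGRADED,UNKNOWN",
]


def run(cmd: List[str], dry_run: bool = True) -> str:
    """Run cmd on Windows; otherwise (or when dry_run) just describe it."""
    if dry_run or sys.platform != "win32":
        return "DRY RUN: " + " ".join(cmd)
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout or result.stderr


if __name__ == "__main__":
    # Pass dry_run=False from an elevated prompt to actually apply the change.
    print(run(HIBERNATE_OFF))
    print(run(LIST_PROBLEM_DEVICES))
```

Defaulting to a dry run keeps the script safe to paste and inspect before it changes any power settings.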

What Fluid Dictation means for Windows and the PC industry

Fluid Dictation is the latest example of Microsoft embedding small, task-specific AI models across the OS. This “distributed AI” strategy – running SLMs on NPUs for latency-sensitive, privacy-critical features – is likely to proliferate. We’ll see more Copilot+ exclusive features that chip away at everyday friction: live translations, smart background noise removal, contextual suggestions.

For the industry, this accelerates the NPU arms race. OEMs must include capable NPUs to unlock these features, deepening the differentiation between budget and premium Windows laptops. It also raises the stakes for on-device model management and security, areas that will demand new IT frameworks.

Verdict: a meaningful step, with practical limits

Fluid Dictation delivers on a long-standing promise: dictation that feels natural, requires minimal cleanup, and respects your privacy. For Copilot+ PC users, it’s a tangible upgrade that makes Voice Access a viable tool for serious writing, not just quick messages. The simultaneous expansion of Windows Studio Effects shows Microsoft is committed to making on-device AI multi-modal.

But the rollout’s narrow scope – locked to new hardware and one language – means its impact will be felt mostly by early adopters. The stability bugs in these preview builds are another reminder that this is still experimental ground.

What to watch next

Multilingual expansion – Once Fluid Dictation supports more languages, it becomes a global accessibility tool.
Broader Copilot+ adoption – As more laptops ship with NPUs, the installed base will grow, and features like this become table stakes.
Stability fixes – All eyes on Microsoft to patch the hibernation and audio bugs quickly.
Third-party APIs – If Microsoft exposes the SLM runtime to developers, expect a wave of voice-enabled productivity apps.
Enterprise controls – Group Policies and management tools for on-device AI models are essential before widespread business deployment.

Fluid Dictation isn’t a revolution, but it’s a decisive architectural move. By keeping AI local, Microsoft is betting that the future of Windows productivity is not in the cloud, but right on your NPU.