Microsoft Foundry Local Brings Free, Private AI Chat to Any Windows 10 or 11 PC

Microsoft has quietly released a command-line tool that lets you run large language models directly on your Windows PC: no cloud subscription, no API keys, and, once a model has been downloaded, no internet connection required. Announced at the company’s Build 2025 developer conference, Foundry Local is already available for download via winget, and it works on everything from aging laptops with 8GB of RAM to brand-new Copilot+ PCs with dedicated NPUs.

For Windows enthusiasts who have watched AI stay locked inside cloud data centers, this marks a significant shift. Instead of sending your prompts to a remote server, Foundry Local pulls publicly available models like Phi-3.5-mini onto your machine and runs them entirely on your hardware. The result is sub-second latency, complete data privacy, and a frictionless setup that mirrors the simplicity of installing apps via winget.

How to Install Foundry Local in Under Two Minutes

Getting Foundry Local running is almost absurdly simple. Open Windows Terminal, PowerShell, or Command Prompt—the tool works in any of them—and type:

winget install Microsoft.FoundryLocal

Wait a minute while the package downloads and installs. Once it finishes, you can fire up a model immediately:

foundry model run phi-3.5-mini

That’s it. No wrestling with Python environments, CUDA drivers, or Docker containers. The tool automatically fetches the model, selects the optimal variant for your hardware, and drops you into an interactive chat session. If you want to see what else is available, foundry model list prints a table of supported models, which currently includes several sizes of Phi-3 and other open-weight LLMs.
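The same two steps can be driven from a script. What follows is a minimal sketch, not an official wrapper: it assumes the winget install leaves an executable named foundry on your PATH, and the helper names (foundry_available, list_command, run_command, show_catalog) are hypothetical. It uses only the two subcommands mentioned above.

```python
import shutil
import subprocess

FOUNDRY_EXE = "foundry"  # assumption: the winget install puts this on PATH


def foundry_available() -> bool:
    """Return True if an executable named `foundry` is found on PATH."""
    return shutil.which(FOUNDRY_EXE) is not None


def list_command() -> list[str]:
    """Argument list for printing the model catalog."""
    return [FOUNDRY_EXE, "model", "list"]


def run_command(model: str) -> list[str]:
    """Argument list for starting a chat session with `model`."""
    name = model.strip()
    if not name:
        raise ValueError("model name must not be empty")
    return [FOUNDRY_EXE, "model", "run", name]


def show_catalog() -> int:
    """Run `foundry model list`; return its exit code, or 1 if not installed."""
    if not foundry_available():
        print("foundry not found; try: winget install Microsoft.FoundryLocal")
        return 1
    return subprocess.run(list_command(), check=False).returncode
```

Calling show_catalog() prints the catalog if the tool is installed and a hint otherwise. Passing run_command("phi-3.5-mini") to subprocess.run starts the same session as the command shown earlier.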

This tight integration with winget, Windows’ command-line package manager, is a deliberate strategy. Just as winget eliminated the hassle of hunting for EXE installers, Foundry Local removes the technical barriers that have kept local AI confined to developers and tinkerers. Anyone comfortable typing a couple of commands can now experiment with a state-of-the-art language model in a couple of minutes.

What You Can Actually Do With It—And What You Can’t

Right now, Foundry Local is a text-only chatbot. You type a prompt, and the model responds. There’s no image generation, no file upload, no vision processing, and no retrieval-augmented generation out of the box. For quick question‑and‑answer sessions, draft writing, code explanations, or brainstorming, it’s remarkably capable—especially given that all inference happens locally.

In testing, the Phi-3.5-mini model returned responses almost instantaneously, even on a mid-range laptop without a discrete GPU. On machines equipped with an NPU (like Copilot+ PCs running Snapdragon X Elite) or a recent Nvidia RTX or AMD Radeon GPU, Foundry Local automatically loads a model variant built for that accelerator, cutting response times further and handling longer contexts with ease.

Microsoft describes Foundry Local as a developer tool, but the interface is agnostic enough for power users and curious beginners. You can pipe prompts from scripts, chain commands, or experiment with different models without leaving the terminal. For enterprise developers, it opens the door to building local agents, testing fine-tuned models, and prototyping AI features without sending proprietary data to a third-party cloud.
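The scripting use described above might look something like the following. Treat it as a sketch under a loud assumption: that the chat session started by foundry model run reads a prompt from standard input and exits when input ends. That behavior is not confirmed here, and the function name ask_local_model and its parameters are hypothetical.

```python
import subprocess
from typing import Callable


def ask_local_model(
    prompt: str,
    model: str = "phi-3.5-mini",
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    timeout_s: float = 300.0,
) -> str:
    """Send one prompt to `foundry model run <model>` on stdin; return stdout.

    Assumption: the session accepts piped input and exits at end of input.
    Adjust if the CLI behaves differently on your machine.
    """
    result = runner(
        ["foundry", "model", "run", model],
        input=prompt + "\n",   # the prompt is written to the process's stdin
        capture_output=True,   # collect stdout and stderr instead of printing
        text=True,             # work with str rather than bytes
        timeout=timeout_s,
        check=False,           # the exit code is inspected manually below
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"foundry exited with code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout.strip()
```

The runner parameter exists so the function can be exercised without the tool installed, by passing a stand-in that returns a subprocess.CompletedProcess. In normal use you would call ask_local_model("Explain what an NPU is in one sentence.") and leave the default in place.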

Yet it’s important to set expectations. Foundry Local does not match the breadth of cloud services like ChatGPT or Copilot. It won’t browse the web, analyze images, or tap into real-time data. Its training data has a fixed cutoff, and it lacks the polished guardrails and multimodal capabilities that define commercial chat platforms. For many everyday queries, though, local speed and privacy more than compensate for these limitations.

Hardware Requirements: Almost Any Modern PC Will Work

The official minimum specs are modest: Windows 10 or 11, 8GB of RAM, and 3GB of free storage. Microsoft recommends 16GB of RAM and 15GB of free disk space for a smoother experience, particularly when installing larger models. A Copilot+ PC (Snapdragon X Elite) or a system with an Nvidia RTX 2000 series, AMD Radeon 6000 series, or newer GPU is optional but will significantly boost performance.
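A quick way to see where a machine falls against these figures is sketched below. The thresholds are the numbers quoted above; the function names are hypothetical, and installed RAM is passed in by hand because Python’s standard library has no portable call for reading it.

```python
import shutil

# Figures quoted in the article, in gigabytes.
MIN_RAM_GB, MIN_DISK_GB = 8, 3
REC_RAM_GB, REC_DISK_GB = 16, 15


def classify_machine(ram_gb: float, free_disk_gb: float) -> str:
    """Return 'below minimum', 'minimum', or 'recommended' for the given specs."""
    if ram_gb < MIN_RAM_GB or free_disk_gb < MIN_DISK_GB:
        return "below minimum"
    if ram_gb >= REC_RAM_GB and free_disk_gb >= REC_DISK_GB:
        return "recommended"
    return "minimum"


def free_disk_gb(path: str = ".") -> float:
    """Free space, in GiB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3
```

For example, classify_machine(ram_gb=16, free_disk_gb=free_disk_gb()) reports whether a 16GB machine also has enough free storage to meet the recommended tier. The check says nothing about GPU or NPU support, which the article treats as optional.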

On older hardware—say, a 2018 ultrabook with integrated graphics—the Phi-3.5-mini model runs, but response generation is noticeably slower and may stutter with complex prompts. Still, the fact that Foundry Local works at all on such machines underscores its design philosophy: accessibility before peak optimization. Users with legacy devices can at least get a taste of local LLMs, while those with modern silicon unlock the full potential.

Privacy and Offline Reality: What “Local” Actually Means

The strongest argument for Foundry Local is privacy. When you run a model locally, your prompts, responses, and any sensitive data never leave your device. There’s no telemetry to a cloud provider (unless you’ve opted into Windows diagnostic data collection, which is separate), and no account is required. This makes it ideal for testing confidential code, drafting proprietary documents, or simply avoiding the data-hungry habits of cloud AI services.

During testing, however, one curiosity emerged. The Phi-3.5-mini model, when asked about its connectivity, claimed it was communicating with Microsoft to process information. To verify, testers put the PC into airplane mode, and the model continued to work flawlessly, which shows the claim was false and most likely a hallucination or a benign pre-training artifact. This serves as a reminder that even “local” models can generate incorrect claims about their own architecture, and users should not rely on an LLM’s self-description. As a best practice, treat any model’s output with healthy skepticism, especially regarding its own capabilities.

From a security standpoint, the tool itself is safe so long as you stick to models fetched through Microsoft’s official channels. The real risk lies in importing third-party or community-converted models, which could theoretically carry malicious payloads or introduce supply-chain vulnerabilities. For now, sticking with the foundry model list catalog is the safest route.
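If you do import a model from elsewhere, one basic precaution is to compare the downloaded file against a checksum published by its author. The sketch below is a generic integrity check, not a Foundry Local feature. It assumes the publisher provides a SHA-256 digest, and the function names are hypothetical. A matching digest only shows the file arrived unmodified; it does not prove the model itself is trustworthy.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA-256 with a digest published by the model's author."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Chunked reading matters here because model files often run to several gigabytes.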

Foundry Local vs. Cloud Giants: Where It Wins (and Loses)

Feature	Foundry Local	Cloud LLMs (ChatGPT, Gemini, Copilot)
Latency	Sub-second locally	Variable, depends on network
Privacy	Full local processing	Data processed in cloud
Offline capability	Fully functional offline	Requires internet
Model updates	Manual, periodic pulls	Continuous, automatic
Hardware impact	Uses local CPU/GPU/NPU	Offloaded to cloud
Feature breadth	Text chat only	Vision, code interpreter, web search, etc.
Integration	CLI, scripting	Rich APIs, plugins

For quick, private interactions—drafting an email, debugging a script, or explaining a concept—Foundry Local often feels snappier and more secure than cloud alternatives. There’s no waiting for a server response, no worrying about rate limits, and no data leaving your machine. On the flip side, if you need the latest news, image analysis, or specialized knowledge, you’ll still reach for Copilot or ChatGPT.

Enterprise Implications and the Roadmap

Foundry Local isn’t just a toy for hobbyists. In enterprise environments, it promises a new class of confidential AI workflows. Imagine a legal team using a local model to summarize case files without uploading them; an accounting department analyzing spreadsheets with natural-language queries that stay on-prem; or a developer building a local copilot for a proprietary codebase. Because everything runs on the local machine, compliance with data residency regulations becomes far simpler.

Microsoft’s roadmap hints at deeper integration with Windows itself. Demo videos have shown “agents” that can leverage Windows features like the Snipping Tool’s text extractor—suggesting that future versions might let Foundry Local perform real actions on your desktop. There’s also talk of allowing custom data uploads for on-device fine-tuning, which would let businesses create specialized models trained on their own documents without ever exposing that data to the cloud.

Expanded model support is almost certain. While the current catalog centers on Phi-3 variants, Microsoft’s open approach means third-party and community models could flood in once a formal conversion pipeline is released. Vision models, text-to-speech, and multimodal LLMs are logical next steps, though no timeline has been announced.

Intel AI Playground and the Competitive Landscape

Foundry Local enters a space where Intel’s AI Playground already offers a consumer-friendly local AI experience—but with a catch. Intel AI Playground restricts its most powerful features to a subset of Intel processors, effectively locking out AMD and Qualcomm users. Foundry Local, by contrast, runs on any x64 or ARM64 Windows machine, regardless of CPU brand. That openness, combined with winget’s frictionless delivery, gives Microsoft a broader potential user base from day one.

Other local AI tools—Ollama, LM Studio, GPT4All—offer more features and model choices, but they require more technical know‑how. Foundry Local’s pitch is simplicity: two commands and you’re chatting. For the average Windows user curious about AI, that low barrier is transformative.

Critical Limitations and What to Watch Out For

Despite the promise, Foundry Local is an early-stage product with clear gaps:

Limited to text chat: No vision, art, or voice capabilities—yet.
Model catalog still small: Only a handful of models are officially supported. Expect this to grow, but for now, tinkerers may outgrow it quickly.
Hardware-dependent performance: With less than 16GB of RAM or without a recent GPU, the experience can be lackluster, especially with larger models.
No easy GUI: The command-line interface, while simple, will intimidate novices accustomed to polished chat apps.
Potential for fragmentation: As with any open model ecosystem, quality and compatibility can vary, and Microsoft hasn’t detailed a governance model.
Security when importing models: Bringing your own model is possible but risky if you download from untrusted sources.
Telemetry myths: While the tool itself doesn’t phone home, Windows’ built-in diagnostic data collection may log some usage data unless you’ve disabled it. And some models might hallucinate about connectivity, as the Phi-3.5-mini test showed.

These aren’t deal‑breakers, but they underscore that Foundry Local is a foundation, not a full house. For now, it’s best suited for early adopters, developers, and privacy hawks willing to accept trade-offs for speed and data control.

A Stealthy Bet on the Future of Windows AI

Foundry Local’s quiet rollout speaks to a larger strategy. Instead of mounting a flashy marketing campaign, Microsoft shipped a deceptively simple tool through winget, arguably the most underappreciated app delivery channel on Windows. This move mirrors the company’s historical pattern: first enable developers, then let the magic trickle down to consumers. With Windows 11 already baking AI into Search, Copilot, and native apps, Foundry Local could become the engine for a new generation of offline-first AI experiences.

Picture Windows where a local LLM powers context-aware clipboard suggestions, on-device document summaries, or intelligent automation routines that never require an internet connection. That vision isn’t science fiction—it’s exactly the kind of plumbing Foundry Local provides today in its nascent form.

If you’re a Windows power user, a developer, or simply someone who values privacy, there’s no reason not to try Foundry Local. The installation is reversible (winget uninstall Microsoft.FoundryLocal), the footprint is modest, and the intellectual reward is a glimpse at the future Microsoft is quietly building. It won’t replace Copilot or ChatGPT tomorrow, but it might just change how you think about where AI lives.