Anthropic’s Claude suite of AI models reached general availability on Microsoft Foundry on June 29, 2026, the companies announced, bringing a new class of enterprise-grade reasoning and agentic capabilities to Azure. The service runs atop Microsoft’s cloud infrastructure exclusively on NVIDIA GB300 Blackwell Ultra systems—engineering that promises a step change in throughput and latency for the most demanding workloads.

Enterprise developers can now provision Claude models directly through Foundry’s unified console, combining Azure’s security and compliance scaffolding with Anthropic’s safety-first model architecture. The launch caps a multi‑year engineering collaboration between Microsoft, Anthropic, and NVIDIA, and it thrusts Azure into the center of the booming agentic‑AI market.

What Is Microsoft Foundry?

Microsoft Foundry is Azure’s fully managed environment for deploying, fine‑tuning, and orchestrating foundation models. Initially released in preview in early 2025, Foundry has evolved into a hub where enterprises can run models from OpenAI, Meta, Mistral, and now Anthropic under a single operational and billing pane. It abstracts the infrastructure layer behind a REST‑compliant API, offering built‑in monitoring, content filtering, and role‑based access control tailored for regulated industries.

Foundry distinguishes itself through deep integration with Azure’s data estate. Customers can ground Claude models against data residing in Azure SQL, Cosmos DB, or blob storage without moving sensitive information outside their tenant boundary. For enterprises that have already standardized on Azure Arc, Foundry’s Claude endpoints become an extension of their hybrid‑ and multi‑cloud management plane.

The NVIDIA GB300 Blackwell Ultra Advantage

The hardware story is inseparable from the software launch. NVIDIA’s GB300 is a Grace‑Blackwell superchip that pairs a 72‑core Grace CPU with a Blackwell GPU die configured in what NVIDIA calls the “Blackwell Ultra” specification. Each GB300 module delivers 2.5 TB/s of interconnect bandwidth and 288 GB of coherent HBM4 memory, allowing inference of large language models at full FP8 precision without tensor‑parallel sharding for all but the very largest architectures.

Microsoft has deployed thousands of GB300 nodes across multiple Azure regions, wired together with Quantum‑X800 InfiniBand. For Foundry users, this fabric means Claude’s 200,000+ token context windows can be served with single‑digit millisecond time‑to‑first‑token latencies, even under concurrent customer demand. The GB300’s built‑in support for NVIDIA’s Nemo Inference Framework further allows automatic model compilation to TensorRT‑LLM, squeezing out an additional 40 percent throughput over vanilla PyTorch inference—gains that translate directly into lower per‑token billing.

Claude Models on Foundry: The Lineup

At launch, Foundry offers three Claude variants:

  • Claude Opus – the largest model, optimized for complex reasoning, scientific analysis, and multi‑step agent tasks.
  • Claude Sonnet – a mid‑size model balancing intelligence and speed, ideal for high‑volume conversational agents and retrieval‑augmented generation (RAG) pipelines.
  • Claude Haiku – a compact, cost‑efficient model suited for simple instruction‑following, classification, and low‑latency edge scenarios when used in conjunction with Azure IoT Hub.

All tiers support a 1‑million‑token input context, a dramatic expansion that enables processing entire codebases, regulatory filings, or multi‑hour meeting transcripts in a single prompt. The models are served with native tool‑use and self‑reflection primitives, allowing agents to call Azure Functions, Dynamics 365 APIs, or custom containers without custom middleware.

Anthropic’s Constitutional AI safeguards are deployed as part of the pipeline. Foundry’s content filtering sits upstream of the model, while Claude’s own refusal logic provides a second layer, creating a defense‑in‑depth posture for sensitive use cases.

Building Enterprise Agents with Claude on Foundry

The most transformative promise of the partnership lies in agentic workflows. Unlike a simple chat completion, an agent powered by Claude on Foundry can plan, decompose tasks, use tools, and iterate over multiple steps while maintaining state. Foundry’s orchestration engine manages the agent loop natively, providing a no‑code visual builder alongside a Python SDK for programmatic control.

Mark Russinovich, CTO of Microsoft Azure, described the vision during a pre‑briefing: “An agent isn’t just a model—it’s a model plus a mission boundary plus memory plus permissions. Foundry gives enterprises a single platform to define all four, and with the GB300 we can run it economically at global scale.”

Early adopters have piloted the service across several verticals:

Financial Services

A tier‑one bank built an analyst agent that reads earnings calls, extracts risk factors, and updates internal risk models. The agent runs on a cron schedule, consuming documents from Azure Blob Storage and writing structured outputs to Azure Data Lake. With Claude Opus, the team reported a 90‑percent reduction in manual review time without any sacrifice in accuracy.

Healthcare

A hospital network deployed a clinical‑trial matching agent. The agent takes an anonymized patient record—embedded in a FHIR bundle stored in Azure Health Data Services—and cross‑references it against ClinicalTrials.gov in real time. Because all processing stays within the tenant’s Azure boundary, the architecture satisfied HIPAA and regional data residency requirements on day one.

Manufacturing

An automotive supplier created a supply‑chain orchestration agent that ingests IoT telemetry from factory floors, negotiates with ERP systems via their REST APIs, and generates natural‑language status reports for plant managers. Running on Claude Sonnet, the agent reduced stock‑out incidents by 27 percent during a three‑month beta.

Common to all three cases is the ability to trace agent decisions exhaustively. Foundry’s monitoring dashboard logs every tool invocation, model reasoning step, and output for auditability—a feature that legal and compliance teams have mandated before green‑lighting production deployment.

Pricing and Availability

Claude on Foundry is available in the Azure East US, West Europe, and Southeast Asia regions as of June 29, with additional regions scheduled for rollout through Q3 2026. Billing follows Azure’s server‑less model: customers pay per 1,000 input or output tokens, with a reservation discount of up to 30 percent for committed‑throughput tiers. The GB300’s efficiency brings token costs down roughly 50 percent compared with the previous A100‑based infrastructure, Microsoft stated, making long‑context, multi‑turn agent sessions economically feasible.

Existing Azure customers can access Claude through the Foundry portal immediately. New users can sign up for an Azure subscription and receive a service credit that includes a limited number of free tokens for experimentation. Premium support plans include dedicated Anthropic‑trained solutions architects who help tune prompts, design agent architectures, and implement monitoring.

Security, Compliance, and Responsible AI

Microsoft has layered Azure’s compliance certifications—SOC 2 Type II, ISO 27001, FedRAMP High, and HIPAA—on top of the Claude deployment. Data processed by the models remains encrypted in transit and at rest, with customer‑managed keys available via Azure Key Vault. No customer prompts or completions are logged for training purposes by default, and organizations can enable full deletion policies that remove all interaction logs within 24 hours.

Both Microsoft and Anthropic stress that the partnership goes beyond infrastructure. Anthropic’s Responsible Scaling Policy governs which model versions can be deployed, and Azure Policy extends these guardrails to customers, enabling IT administrators to block specific use cases or model capabilities at the organizational level. Content filtering can be configured to reject harmful, copyrighted, or out‑of‑policy material before it reaches the model.

Competitive Landscape

The move places Azure in direct competition with AWS Bedrock and Google Cloud’s Vertex AI, both of which already offer Anthropic models. However, the Microsoft‑NVIDIA‑Anthropic triangle adds a hardware exclusivity: for the time being, the GB300 Blackwell Ultra configuration powering Claude on Foundry is unique to Azure, potentially giving it a performance and efficiency edge over rivals still running on H100 or A100 instances.

Meanwhile, the broader agent market is heating up. OpenAI’s Codex Agents and Google’s Bard Agents are vying for developer attention, but many enterprise customers have hesitated to adopt them due to concerns about data sovereignty and operational complexity. Foundry’s managed agent runtime and its tight coupling with the Azure ecosystem could sway those on the fence, particularly organizations already deep in Microsoft 365 and Dynamics estates.

What’s Next

Microsoft and Anthropic have outlined a roadmap that includes fine‑tuning capabilities for Claude on Foundry later this year, allowing enterprises to adapt the models on proprietary datasets using LoRA and other parameter‑efficient techniques. Also on the docket is a “bring‑your‑own‑NIM” feature that will let customers pair Claude with NVIDIA NIM micro‑services for custom retrieval and embedding pipelines.

NVIDIA, meanwhile, plans to release a GB300‑based reference architecture for on‑premises Foundry deployments, targeting industries with extreme data locality requirements. If realized, that would make Foundry one of the few cloud‑first platforms with a viable disconnected option—a differentiator in sectors like defense and critical infrastructure.

For now, the general availability of Claude on Foundry marks the moment when agentic AI moved from pilot programs to the enterprise mainstream. With the raw compute of GB300, the safety architecture of Claude, and the operational maturity of Azure, the three companies have assembled what might be the most complete agent‑builders’ toolkit available anywhere.