Microsoft Azure Foundry Goes Live with Claude Models on NVIDIA GB300, Ushering in Agentic AI Era

Microsoft's Azure Foundry has entered general availability, and with it, Anthropic's Claude AI models are now running on NVIDIA's latest GB300 NVL72 Blackwell Ultra systems—a move that puts agentic AI capabilities directly into the hands of enterprise developers. The milestone, reached on June 29, 2026, marks a significant leap in cloud-based AI infrastructure, pairing cutting-edge hardware with some of the most advanced language models available.

Azure Foundry, Microsoft's unified AI development platform, had been in preview for months, attracting early adopters with its promise of streamlined model training, fine-tuning, and deployment. But the GA release, coupled with the availability of Claude models, signals that Microsoft is ready to compete head-to-head with rivals like AWS and Google Cloud in the enterprise AI race. The foundation of this offering is NVIDIA's GB300 NVL72 system, a powerhouse that leverages the Blackwell Ultra architecture and Quantum-X800 InfiniBand networking to accelerate inference and training at unprecedented scales.

The Hardware Backbone: NVIDIA GB300 and Quantum-X800

At the heart of the Azure Foundry GA is the NVIDIA GB300 NVL72, a rack-scale system that integrates 72 Blackwell Ultra GPUs interconnected with NVLink and NVSwitch. This design eliminates traditional bottlenecks by pooling memory and compute resources, enabling models with over a trillion parameters to run efficiently. The GB300 represents a generational leap over its predecessor, the H100, offering up to 4x the performance for large language model inference, according to NVIDIA's published benchmarks.

But the real secret sauce is the Quantum-X800 InfiniBand networking. With 800 Gb/s throughput per port and in-network computing capabilities, it ensures that data moves between GPUs, storage, and other nodes with minimal latency. For Azure Foundry users, this means that even the largest Claude models—reportedly including Claude Opus and a new, as-yet-unnamed agentic variant—can process complex multi-step reasoning tasks in near real-time.

Anthropic's decision to run Claude on this specific infrastructure was not accidental. The company has been vocal about the need for secure, scalable hardware to support its constitutional AI approach. By colocating Claude with Azure's confidential computing features, the partnership addresses enterprise concerns about data privacy and regulatory compliance.

Agentic AI: More Than Just a Chatbot

Agentic AI refers to systems that can autonomously plan, execute, and iterate on tasks across digital environments. Unlike a standard chatbot that responds to prompts, an agentic AI can break down a high-level goal—such as "audit our cloud infrastructure for security gaps"—into subtasks, use tools, and adjust its approach based on intermediate results. The combination of Claude's reasoning capabilities and the GB300's compute power makes this practical for enterprise workloads.

During the preview phase, early testers used Azure Foundry to build agents that could handle customer service triage, code refactoring, and even scientific literature review. One pharmaceutical company reportedly used a Claude-powered agent to scan thousands of research papers, cross-reference clinical trial data, and generate hypotheses for drug repurposing—a task that previously took a team of researchers weeks. With GA, Microsoft is providing pre-built agent templates and a drag-and-drop workflow designer, lowering the barrier for non-experts.

The agentic capabilities are deeply integrated with Microsoft's ecosystem. Azure Foundry supports plugins for Microsoft 365, Dynamics 365, and Power Platform, allowing agents to interact with emails, calendars, and business data securely. For example, an agent could draft a response to a vendor inquiry, schedule a meeting, and update a CRM entry—all within a governed permissions framework.

What's Available and How to Get Started

Microsoft has rolled out Claude models in Azure Foundry through a tiered offering. The Claude 3.5 Sonnet and Claude 3.5 Haiku models are available for standard inference tasks, while the larger Claude Opus model is offered via a dedicated capacity model to ensure consistent performance for demanding workloads. Additionally, a specialized "Claude for Agents" variant, fine-tuned for tool use and multi-turn reasoning, is exclusive to the GB300 infrastructure.

Pricing follows the pay-as-you-go Azure model, with tokens billed per 1,000. Input token costs are around $0.003 for Sonnet and $0.015 for Opus, with output tokens roughly double. Dedicated throughput tiers offer predictable monthly costs for enterprises with steady workloads. For the GB300-powered instances, Microsoft has introduced new NCB4as and NCB8as virtual machine series, providing 4 or 8 Blackwell GPUs with up to 2 TB of GPU memory per VM.

Developers can access Claude models via Azure AI Studio, the web-based interface for Foundry, or through APIs that are compatible with Azure's authentication and monitoring tools. The platform also supports fine-tuning using proprietary data, though Anthropic requires adherence to its usage policies, which prohibit certain high-risk applications.

Security and Compliance at Scale

One of the biggest hurdles for enterprise AI adoption has been security. Microsoft addresses this with a multi-layered approach. All Claude inferencing on Azure Foundry runs within Azure's confidential computing environment, which uses hardware-based trusted execution environments to encrypt data in use. This means even Microsoft cannot access customer prompts or model outputs.

Additionally, Azure Foundry integrates with Microsoft Purview for data governance, allowing organizations to apply sensitivity labels, retention policies, and audit trails to AI interactions. The platform has achieved compliance with major standards, including HIPAA, SOC 2, and FedRAMP High, making it suitable for regulated industries like healthcare and finance.

For organizations deploying agentic AI, Azure's role-based access control (RBAC) and managed identities ensure that agents operate under the least-privilege principle. A financial services firm, for instance, could allow an agent to read transaction data but not initiate transfers, with all actions logged for compliance review.

Competitive Landscape: Microsoft vs. AWS vs. Google

The GA of Azure Foundry with Claude on GB300 intensifies the cloud AI wars. Amazon Bedrock offers Claude models as well, but on AWS's own Trainium and Inferentia chips, which have yet to match NVIDIA's raw performance for the largest models. Google Cloud's Vertex AI provides Claude alongside its Gemini models, but lacks the tight coupling with Microsoft's productivity suite. By pairing best-in-class hardware with deep Office integration, Microsoft is betting that enterprises will see Azure as the most productive environment for AI.

There's also a strategic play against NVIDIA's own DGX Cloud, which sells direct access to GB300 clusters. Microsoft's value-add is the managed service layer: Azure handles patching, scaling, and failover, while providing a consistent development experience across model providers. This could appeal to companies that want to avoid vendor lock-in and prefer a multi-model strategy.

Real-World Impact and Use Cases

Early adopters of the preview have shared tangible benefits. A global logistics company used Azure Foundry to deploy a Claude agent that optimizes shipping routes by analyzing weather patterns, port congestion, and fuel costs. The agent reduced empty container movements by 12%, saving millions annually. Another customer, a legal tech startup, built a contract analysis tool that extracts clauses, identifies risks, and suggests revisions with 95% accuracy, slashing review time from hours to minutes.

The agentic aspect is what sets these apart. Instead of requiring a human to prompt each step, the agents can be given objectives and will autonomously pull data from Azure SQL, call APIs, and even send Teams messages when tasks are complete. This shifts the human role from operator to overseer, which is what Microsoft means by "copilot to agent" evolution.

Developer Experience and Ecosystem

Azure Foundry isn't just about Claude; it's a multi-model hub. Alongside Anthropic's models, developers can access OpenAI's GPT-5, Meta's Llama 4, and Microsoft's own Phi-5, all through the same API surface. This allows for hybrid workflows where, say, a Claude agent handles complex reasoning but offloads simple classification tasks to a smaller Phi model to manage costs.

The platform's Model-as-a-Service (MaaS) abstraction means that provisioning a GB300-backed instance is as simple as selecting a model and specifying the throughput tier. Microsoft handles the underlying orchestration, including fault tolerance and automatic scaling. For data science teams, Foundry includes a collaborative notebook environment with pre-configured recipes for fine-tuning Claude on domain-specific data using LoRA or full-parameter tuning.

Microsoft has also launched a certification program for agentic AI developers, with learning paths covering responsible AI, prompt engineering for agents, and performance optimization on the GB300 hardware. This investment in skilling is intended to build a loyal community and accelerate adoption.

Challenges and Considerations

Despite the fanfare, there are hurdles. Cost remains a concern for smaller businesses, as even the most efficient GB300 instances can rack up significant bills when running Opus-class models. Microsoft's calculator estimates that a mid-sized deployment of 10 million agent actions per month could exceed $20,000, which may be prohibitive for some.

Latency for ultra-large models can also be an issue for real-time applications. While the Quantum-X800 fabric keeps GPU-to-GPU communication fast, the time to first token for a complex agent prompt can still be a few seconds, which might not suit customer-facing chatbots that need instant responses.

There's also the risk of agentic AI behaving unpredictably. Anthropic's constitutional AI includes safety guardrails, but in multi-step planning, agents can still make poor decisions if a goal is ambiguous. Microsoft provides monitoring tools to set optional human-in-the-loop checkpoints, but enterprises must invest in rigorous testing.

The Road Ahead

Looking forward, Microsoft has hinted at further hardware innovations on the horizon. The GB300 is expected to be refreshed with the GB300 Superchip later this year, doubling the memory bandwidth and adding native support for FP8 sparse operations—potentially delivering another 2x performance boost. Azure is also planning to integrate Claude's upcoming "Claude Nexus" agent framework, which will enable multi-agent collaboration where multiple Claude instances work together on complex projects.

For the Windows ecosystem, the implications are profound. With Azure Foundry deeply tied to Azure AI and the intelligent edge, we may soon see agentic AI capabilities baked into Windows Server and even the Windows client OS. Imagine a future where your desktop can run a local agent that coordinates with cloud-based Claude models on GB300—seamlessly offloading heavy lifting while keeping sensitive data on-device.

The general availability of Azure Foundry with Claude on NVIDIA GB300 is more than just a product launch; it's a declaration that enterprise AI has moved from experimentation to production. As organizations scramble to adopt agentic workflows, the battle for cloud AI dominance will be won by the platform that delivers the best combination of performance, security, and integration. For now, Microsoft has set a high bar.