Microsoft will begin transitioning Copilot Cowork customers to a consumption-based billing model and is actively evaluating a Microsoft-hosted version of DeepSeek’s latest model as a budget-friendly alternative, the company confirmed on June 16, 2026. The dual-pronged strategy aims to give enterprises granular control over soaring AI costs while addressing long-simmering complaints about flat-rate subscriptions that often go underutilized. Under the new framework, organizations will pay only for the AI resources they actually consume—measured by token usage, compute time, or completed task volume—instead of fixed per-user monthly fees.
The shift marks one of the most significant pricing pivots in Microsoft’s AI roadmap since the original Copilot subscription debuted. Copilot Cowork, the enterprise-grade variant woven into Microsoft 365, Teams, and Azure AI Foundry, has faced scrutiny from IT leaders who say fixed per-seat licensing fails to align value with peak and off-peak usage. By moving to metered billing, Microsoft is betting that variable pricing will attract cost-sensitive customers who have held back from broad deployments because of predictable yet inflexible bills.
Microsoft executives framed the change as a direct response to enterprise feedback. “Customers told us they want to connect cost to value,” said Sarah Weinberg, corporate vice president for Microsoft AI Platforms, in a press briefing. “With usage-based billing, a marketing team that runs Copilot Cowork heavily during campaign season pays more in those months, but the legal department using it sparingly doesn’t subsidize that heavy usage. It’s fairer and more transparent.” The company will offer a blended model initially: enterprises can keep existing per-user licenses with a certain pooled consumption floor, then pay overages at a per-unit rate once thresholds are crossed. Pure pay-as-you-go plans will follow later in the year.
Alongside the billing overhaul, Microsoft disclosed it is testing a proprietary deployment of DeepSeek’s large language model within Azure AI Foundry. The Chinese AI lab has gained attention for delivering competitive performance to OpenAI’s GPT-4 class at a fraction of the inference cost. By hosting DeepSeek in Microsoft’s own data centers—stripped of telemetry back to China and wrapped with enterprise governance controls—organizations could tap a model that costs as little as one-tenth the per-token price of GPT-4.5 for tasks like document summarization, code generation, and internal knowledge retrieval.
The hosted DeepSeek option is still in private preview and has not been given a release date, but Microsoft confirmed it will be governed by the same compliance certifications as other Azure AI services. That means data stays within the customer’s selected geography, and all prompts and responses fall under existing data protection addenda. The move may defuse geopolitical concerns that have kept many Western enterprises from using China-based models directly. “We’re not routing anything to external APIs; the model runs on Microsoft infrastructure, with Microsoft’s safety layers—Azure AI content filters, customer-managed keys, and full audit logging,” Weinberg added.
Financial analysts view the two moves as a strategic masterstroke. The AI model market is undergoing rapid commoditization, and Microsoft’s golden handcuffs—access to OpenAI’s latest models at preferred rates—is being challenged by competitors like Google, Amazon, and open-source models. By offering a cheaper, self-hosted DeepSeek, Microsoft can keep customers on Azure while undercutting rivals who might lure them with lower-priced alternatives such as Anthropic’s Claude or Meta’s Llama on other clouds. “This is Microsoft saying: we’ll give you the same cost savings you’d get from a Chinese API, but with all the enterprise controls you require,” said Gartner analyst Linda Song.
For Microsoft, the metered billing piece could prove even more lucrative. Early trials showed that total per-employee AI spend can drop 20–30% for organizations that adopt tight cost governance, but the consumption model also encourages more experimentation. Enterprises that pilot Copilot Cowork for a few power users often expand to hundreds or thousands of seats once they see the productivity lift. With usage-based pricing, those expansions become frictionless: no procurement renegotiations, no license true-ups. The meter simply ticks faster as adoption grows, and Microsoft’s revenue per account may increase even as the per-unit cost falls.
To help enterprises manage the new financial model, Microsoft is rolling out a Cost Intelligence dashboard inside the Microsoft 365 admin center and Azure AI Foundry. The tool provides real-time spend tracking, budget alerts, per-department chargebacks, and AI-powered recommendations to optimize prompt patterns for lower token consumption. Early adopters in the Copilot Cowork Insider program have already built custom policies—like capping daily token budgets for specific user groups—using Azure Policy integration.
Enterprise governance is the scaffolding holding the entire offering together. The billing shift and model diversification both feed into a broader push Microsoft calls “AI responsibly, at scale.” IT administrators can now define fine-grained access controls that determine which employees can use the high-cost GPT-4.5 model versus the cheaper DeepSeek model for different tasks. For example, a policy might route all internal HR chatbot queries to the DeepSeek endpoint while reserving GPT-4.5 for customer-facing communication. This model-routing capability, coupled with spend limits, gives CISOs and CFOs the dual assurance that AI costs won’t spiral out of control and sensitive data stays protected.
The deep integration with Azure AI Foundry, Microsoft’s unified platform for building and governing AI applications, is critical. Customers already using Foundry to deploy custom copilots, retrieval-augmented generation (RAG) applications, and agentic workflows can now mix and match billing models and model endpoints from a single console. The DeepSeek offering appears right alongside OpenAI’s models in the model catalog, complete with benchmark scores and estimated per-token pricing. This transparency aims to make cost-performance trade-offs an everyday operational decision, not a procurement mystery.
Reaction from the enterprise community has been cautiously optimistic. On a popular Windows-centric forum, IT administrators debated the practical impacts. “We’ve been holding off on Copilot Cowork because the flat $30/user/month added up to six figures per year with only a handful of power users actually generating meaningful output,” one forum member wrote. “If we can switch to pay-as-you-go, we could roll it out to everyone and let usage converge organically.” Others raised concerns about budget unpredictability. “What happens if a department runs a poorly optimized prompt loop and racks up a five-figure bill overnight?” asked another. Microsoft says the hard-spend limits and real-time alerts will prevent such overruns, but skeptics want to see it battle-tested.
The DeepSeek announcement triggered a separate wave of discussion about model quality and safety. DeepSeek’s models have performed well on public benchmarks like MMLU and HumanEval, scoring close to GPT-4 on reasoning tasks while requiring far fewer resources. However, enterprise users question whether the model is robust enough for nuanced business logic, legal contracts, or creative content. Microsoft said it is subjecting the hosted DeepSeek to the same red-teaming and content safety evaluations as its first-party models, with results to be published before general availability. In preliminary tests, it handled standard enterprise use cases—summarizing technical reports, generating code snippets from natural language, drafting emails—with acceptable accuracy, though it sometimes lagged behind GPT-4.5 on complex multi-step reasoning.
Underscoring the cost differential is a leaked pricing estimate that circulated shortly after the announcement. According to internal documents reviewed by WindowsNews, DeepSeek’s per-token price for the Microsoft-hosted variant could start at around $0.03 per million tokens, compared to $0.30 for GPT-4.5—a 10x spread. Even accounting for potential latency differences and lower throughput on less optimized hardware, the savings could be transformative for call centers, document processing pipelines, and internal knowledge bases that handle tens of millions of tokens daily.
Microsoft is not abandoning its partnership with OpenAI. The company stressed that GPT-4.5 and upcoming frontier models remain the premium option for high-stakes decision-making. The goal, rather, is to give customers a gradient of choice—much like Azure’s existing tiered compute instances. “In the same way you pick a virtual machine size based on workload, you’ll now pick an AI model based on task criticality and budget,” the briefing document explained.
As the Copilot Cowork billing changes start rolling out, Microsoft also announced a 90-day grace period during which customers can run both the old per-user plan and the new metered option side by side. This dual-run window lets finance teams model what their bills would have been under the new scheme before committing. Microsoft’s partners, including Accenture and EY, are building migration tooling and cost-optimization consulting practices around the shift.
The competitive landscape is shifting rapidly. Google last month introduced pay-per-use Ai Enterprise plans for Workspace, and Salesforce just unveiled a token-based pricing for its Einstein GPT. Amazon is expected to follow with consumption-based SageMaker Studio Lab pricing. By acting now, Microsoft seizes first-mover advantage in the enterprise copilot space while setting a precedent that AI is an operational expense like any other cloud service.
Looking ahead, the hosted DeepSeek initiative might signal a broader multi-model strategy. Insiders hint that Microsoft could eventually host other open or licensed models—such as Mistral’s Mixtral or Cohere’s Command R—within the same Azure AI Foundry governance framework, turning the platform into a one-stop model marketplace. That would put it head-to-head with Amazon Bedrock and Google Vertex AI’s model garden. For IT buyers, it means more leverage and less lock-in, shifting the battleground from model exclusivity to platform integration.
What does this mean for the everyday Windows user? While Copilot Cowork is enterprise-targeted, the pricing model and multi-model infrastructure could trickle down to Microsoft 365 Copilot for business and even consumer offerings. Imagine a future where your personal Copilot app lets you pick “fast & cheap” for quick tasks and “deep reasoning” for important analyses, all billed against a prepaid credit pool. If successful, the metered billing experiment could reshape how all Microsoft AI services are packaged.
For now, enterprises eager to slash their AI spend will watch closely as the private preview expands. Microsoft plans to share performance data and case studies from early testers at its Ignite conference in November. Until then, the company is betting that a mix of financial flexibility, model choice, and ironclad governance will keep enterprise customers building on Azure—and keep the Copilot brand dominant.