Meta Reportedly Readying ‘Meta Compute’ Cloud Service to Sell AI GPU Access and Hosted Models

Meta Platforms is secretly developing a cloud infrastructure service designed to sell raw artificial intelligence computing power and hosted large language models to outside customers, a move that could directly challenge Amazon Web Services, Microsoft Azure, and Google Cloud. Codenamed “Meta Compute,” the service would allow businesses and developers to rent access to the company’s massive fleet of Nvidia H100 GPUs and eventually its custom silicon, according to a July 1 Bloomberg report that cited people familiar with the plans. The initiative represents a dramatic shift for a company that has traditionally kept its infrastructure locked down for internal workloads.

The service would not only provide on-demand GPU rental for training and inference workloads but also offer a suite of Meta-built AI models, such as the Llama family, through API endpoints. This would put Meta in direct competition with existing AI-as-a-service offerings like Amazon Bedrock and Microsoft Azure AI Studio, and it could undercut them on price thanks to Meta’s immense scale and ownership of the underlying models.

Meta’s AI Infrastructure Blitz

Few companies have invested as aggressively in AI hardware as Meta. Chief Executive Mark Zuckerberg has publicly stated that the company will have 350,000 Nvidia H100 GPUs online by the end of 2024, with total compute capacity approaching 600,000 H100 equivalents when counting other chips. Those figures dwarf the clusters operated by specialized cloud providers like CoreWeave and Lambda Labs, and they rival the internal training environments of Microsoft and Amazon. Meta’s AI Research SuperCluster (RSC), which serves as the backbone for training Llama 3 and other models, already ranks among the most powerful computing systems on Earth.

Most of that capacity, however, remains dedicated to Meta’s own product teams—powering news feed recommendations, generative AI features across Instagram and WhatsApp, and the company’s open-source model development. The Meta Compute rumor suggests leadership sees a lucrative opportunity to monetize excess capacity during off-peak hours or to build out additional infrastructure specifically for external customers.

What Meta Compute Could Look Like

According to the Bloomberg report, Meta has been quietly building the service with a small team and aims to launch a limited preview in late 2024 or early 2025. The offering would likely include three tiers:

Bare-metal GPU instances: customers rent dedicated physical servers equipped with H100 GPUs interconnected via high-bandwidth InfiniBand fabrics, similar to what AWS offers with its P5 instances.
Managed training clusters: higher-level orchestration that automates distributed training across thousands of GPUs, targeting AI startups and enterprises that lack in-house MLOps expertise.
Hosted model APIs: REST endpoints for Meta’s own models—Llama 3 and future iterations—along with fine-tuning and retrieval-augmented generation (RAG) capabilities.

Pricing remains speculative, but one source suggested Meta could undercut current market rates by 30–40 percent. On-demand H100 instances from major cloud providers typically sell for $2.50–$3.00 per GPU-hour without long-term commitments. Meta, which purchases Nvidia chips at massive scale and builds its own data centers, could offer them at $1.50 or less and still maintain healthy margins.
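To make that arithmetic concrete, the short sketch below applies the rumored 30–40 percent discount to the $2.50–$3.00 range quoted above. The function name and structure are purely illustrative assumptions and do not reflect any Meta tooling or confirmed pricing.

```python
def discounted_range(low, high, min_discount, max_discount):
    """Return the (cheapest, priciest) hourly rate after a discount band.

    The cheapest outcome pairs the low market rate with the deepest discount;
    the priciest pairs the high market rate with the shallowest discount.
    """
    cheapest = low * (1 - max_discount)
    priciest = high * (1 - min_discount)
    return round(cheapest, 2), round(priciest, 2)


# Figures taken from the article: $2.50-$3.00 per GPU-hour, 30-40 percent off.
low, high = discounted_range(2.50, 3.00, 0.30, 0.40)
print(f"Discounted H100 rate: ${low:.2f}-${high:.2f} per GPU-hour")
```

Under those assumptions the discounted band runs from $1.50 to $2.10 per GPU-hour, so a $1.50 rate sits at the aggressive end: 40 percent below $2.50 and 50 percent below $3.00.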

The Open-Source Angle

Meta’s strategy hinges on more than just cheap compute. The company has bet heavily on open-source AI, releasing Llama 2 and Llama 3 under relatively permissive community licenses that have fostered a vibrant ecosystem of derivative models and tooling. By offering hosted versions of those models, Meta Compute could funnel developers toward the company’s stack while collecting usage data to improve future releases.

“If you’re already fine-tuning Llama on your own hardware, moving to Meta’s cloud and paying pennies per token is a no-brainer,” said a cloud architect who requested anonymity. “And because the model weights are open, you’re not locked in—you can always take your custom checkpoint elsewhere. That’s a powerful story.”

Contrast that with the closed-source approaches of OpenAI (via Azure) and Anthropic (via AWS). A Meta Compute service would be the first major cloud to put open-weight models front and center, potentially accelerating enterprise adoption of non-proprietary alternatives.

Competitive Landscape: Giants and Upstarts

The cloud AI market is already crowded. AWS controls roughly 32 percent of global cloud infrastructure spending, followed by Azure at 23 percent and Google Cloud at 11 percent. Each has baked AI into its platform: AWS with SageMaker and Bedrock; Azure with its OpenAI partnership and AI Studio; Google Cloud with Vertex AI and its TPU offerings.

But a new wave of GPU-specialist cloud providers, including CoreWeave, Lambda, and RunPod, has proven that there is massive unmet demand for raw compute, particularly from AI startups that find the big hyperscalers too expensive or too restrictive. CoreWeave, which began as a crypto mining operation, is now valued at $19 billion and counts Microsoft as its largest customer. Given the size of its GPU fleet, Meta’s entry could outscale those players, potentially triggering a price war.

“When Meta enters a market, it doesn’t tiptoe,” said an industry analyst who tracks cloud infrastructure. “They have the balance sheet to run this at break-even for years if they wanted to, and they have homegrown models that nobody else can match in terms of permissive licensing. AWS and Azure should be worried.”

Technical and Operational Hurdles

Yet building a public cloud service is not merely a matter of plugging servers into a billing portal. Meta would need to invest heavily in multi-tenancy, network isolation, identity and access management, and compliance certifications (SOC 2, ISO 27001) that enterprise customers demand. The company has historically struggled with product reliability and customer support, areas where AWS and Azure have well over a decade of experience.

There is also the question of internal conflict. Meta’s AI product teams compete fiercely for compute resources, and carving out a portion of the fleet for external customers could slow internal development if not managed carefully. The company might avoid this by purchasing additional GPUs specifically for the Compute offering, but that would require billions more in capex at a time when Wall Street is already scrutinizing Meta’s AI spending.

Windows Ecosystem Implications

For Windows enthusiasts and developers, a Meta Compute service could open new avenues for AI experimentation. Microsoft has tightly integrated its Copilot stack with Windows 11, and Azure AI Studio provides a managed environment for building AI applications that run on Windows workstations or in the cloud. A low-cost, open-model competitor could pressure Microsoft to reduce Azure AI pricing or to accelerate the rollout of on-device AI features that rely on local Llama derivatives.

Developers working in Visual Studio Code, including those who use GitHub Copilot, could call Meta Compute APIs directly, without needing to switch ecosystems. And if Meta releases Windows-native client tooling akin to the Azure CLI, managing GPU instances and model endpoints from a Windows terminal would be straightforward. The mere threat of Meta’s entry might benefit Windows users by spurring more competitive offers from Microsoft’s cloud division.

Antitrust and Geopolitical Risks

Meta’s ambitions would not exist in a regulatory vacuum. U.S. and EU antitrust authorities have taken a sharp interest in the concentration of AI infrastructure, and a Meta-owned cloud could raise concerns about vertical integration—especially if the company ties its social media data to the Compute platform. Export controls on advanced GPUs to China could also limit Meta’s addressable market, though the service would likely launch in North America and Europe first.

Furthermore, cloud partnerships could become awkward. Meta currently uses AWS for some workloads and has a deep relationship with Microsoft through the PyTorch Foundation. Becoming a direct competitor might strain those ties, though both hyperscalers are accustomed to frenemy dynamics.

The Bottom Line

Meta Compute remains unconfirmed, and a company spokesperson declined to comment on the Bloomberg report. But the logic behind it is sound: Meta sits atop one of the world’s largest GPU fleets, owns some of the most popular open-weight AI models, and faces pressure from investors to show a return on its enormous infrastructure outlays. If executed well, the service could turn AI compute into a commodity, much as Amazon Web Services did for general-purpose computing nearly two decades ago. For now, the industry watches and waits—but competing cloud CEOs are unlikely to be sleeping soundly.