GitHub Copilot Bring-Your-Own-Key Lands: Tap OpenAI, Azure, Anthropic, and Local Models Per Session

On June 23, 2026, GitHub stunned the developer world by introducing bring-your-own-key (BYOK) support to the GitHub Copilot app, allowing programmers to route agent sessions through their own accounts with OpenAI, Azure OpenAI, Microsoft Foundry, Anthropic, LM Studio, and Ollama. The move shatters the long-standing model lock-in and gives teams direct control over which AI models power their coding assistance, right down to running fully local models with no cloud dependency.

This isn't a simple model picker. BYOK lets developers supply their own API keys for each supported provider, meaning one Copilot installation can run a session on GPT-5 on Azure for enterprise compliance, another on Claude 4 from Anthropic for nuanced code review, and a third on a local Mixtral variant through Ollama when working offline. For the first time, Copilot becomes a hub for diverse AI models rather than a walled garden.

What exactly is bring-your-own-key?

BYOK flips the traditional SaaS AI model. Instead of GitHub buying model capacity on behalf of all users and baking it into a flat subscription, developers now bring their existing relationships with AI providers — and their own credits or quotas — directly into Copilot. The feature, accessed from within the GitHub Copilot app settings, accepts API keys or connection strings for each of the six supported endpoints. Once configured, the developer selects which provider powers a given agent session, whether that’s a quick code completion, an extended refactoring dialogue, or a multi-step task orchestrated by Copilot’s agentic features.

Critically, BYOK operates per session, not globally. A developer might start a morning coding block using Azure OpenAI to stay within a corporate VNet, then flip to Anthropic for a creative brainstorming session, and end the day with LM Studio on an air-gapped machine. That session-level granularity is an industry first among major coding assistants.
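To make the per-session idea concrete, here is a minimal sketch of sessions that each carry their own provider. Every name in it (Provider, Session, the kind strings, the model identifiers, and the example.invalid endpoint) is a hypothetical stand-in chosen for illustration; none of it reflects Copilot's actual internals or configuration format.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical identifiers for the six backends named in the announcement.
SUPPORTED = {"openai", "azure-openai", "foundry", "anthropic", "lm-studio", "ollama"}


@dataclass(frozen=True)
class Provider:
    """One configured backend (illustrative shape, not Copilot's schema)."""
    kind: str
    model: str
    endpoint: Optional[str] = None  # local runners and Azure need one

    def __post_init__(self):
        if self.kind not in SUPPORTED:
            raise ValueError(f"unsupported provider kind: {self.kind}")


@dataclass
class Session:
    """A session is bound to exactly one provider when it is created."""
    name: str
    provider: Provider
    messages: list = field(default_factory=list)

    def send(self, text: str) -> dict:
        # Record which backend would serve this message; no network call is made.
        routed = {"provider": self.provider.kind, "model": self.provider.model, "text": text}
        self.messages.append(routed)
        return routed


# Two sessions on the same machine, each with its own backend.
morning = Session("refactor", Provider("azure-openai", "gpt-5", "https://example.invalid/openai"))
evening = Session("offline", Provider("ollama", "mixtral", "http://localhost:11434"))
```

The point of the design is that the provider lives on the session object rather than in a global setting, so two sessions can run side by side against different backends.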

Which providers and models are supported?

The initial rollout covers a broad spectrum, from hyperscale clouds to local runners:

OpenAI – Direct access to the standard OpenAI API, supporting models like GPT-5, GPT-4o, and o3. Users need their own OpenAI API key and will be billed directly by OpenAI.
Azure OpenAI Service – For enterprises already provisioned on Azure, this allows Copilot to invoke models deployed in private instances, with all the networking, security, and compliance benefits of Azure.
Microsoft Foundry – Often referred to as Azure AI Foundry, this gives teams the ability to connect fine-tuned or custom models hosted on Microsoft’s model platform, including proprietary models trained on internal codebases.
Anthropic – Claude 4 and other Anthropic models become first-class citizens inside Copilot. Developers who prefer Anthropic’s coding style or longer context windows can use their own API key.
LM Studio – A locally installed model runner with a local server API. Copilot can detect a running LM Studio instance and route requests to any model loaded there, from Llama 4 to DeepSeek variants.
Ollama – The popular cross-platform local LLM runner. By specifying a localhost endpoint, developers can use any Ollama-managed model for completions, embeddings, and more — entirely offline.

This lineup covers every use case from maximum security (local models) to bleeding-edge performance (latest OpenAI) to tailored enterprise knowledge (Foundry custom models).

How does it work in practice?

Setting up BYOK takes minutes. Inside the Copilot app’s provider settings, a new “Custom provider” section lists each supported backend. The user enters the API key, endpoint URL if applicable, and optional model selection preferences. Local providers like Ollama require only the base URL (e.g., http://localhost:11434) and the model name. GitHub Copilot then validates the connection and remembers the configuration for future sessions.
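Before entering a local provider, it can help to confirm that the runner is reachable and the model is installed. The sketch below does that against Ollama's model-listing endpoint (GET /api/tags). It is one way a user might check their own setup, not a description of how Copilot validates connections; the helper names are made up for this example.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def tags_url(base_url: str) -> str:
    """Build the URL of Ollama's model-listing endpoint from a base URL."""
    return urljoin(base_url.rstrip("/") + "/", "api/tags")


def installed_models(payload: dict) -> list[str]:
    """Extract model names from a decoded /api/tags response."""
    return [m["name"] for m in payload.get("models", [])]


def has_model(payload: dict, wanted: str) -> bool:
    """True if `wanted` matches an installed model, with or without its tag."""
    for name in installed_models(payload):
        if name == wanted or name.split(":", 1)[0] == wanted:
            return True
    return False


def check_ollama(base_url: str, model: str, timeout: float = 3.0) -> bool:
    """Query a running Ollama server; raises URLError if it is unreachable."""
    with urlopen(tags_url(base_url), timeout=timeout) as resp:
        return has_model(json.load(resp), model)
```

With Ollama running, check_ollama("http://localhost:11434", "mixtral") reports whether a Mixtral model has been pulled. The parsing helpers are kept separate from the network call so they can be exercised without a server.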

During a coding session, the bottom-right provider indicator now shows a dropdown with all configured backends. Switching is instantaneous: the current conversation retains its context while the next message routes to the newly selected provider. This fluidity means no more reloading the app or losing agent state. For agentic workflows, where Copilot might autonomously edit multiple files, run terminal commands, or query APIs, the initial release requires a single provider per session, so different subtasks cannot yet be assigned to different backends.

Behind the scenes, Copilot acts as a proxy. All API traffic still goes through GitHub’s infrastructure to maintain the rich editor integration, diff rendering, and agent orchestration. But the actual model inference is handed off to the user-specified provider. GitHub never sees the API key; it is encrypted and stored locally. For Azure and Foundry, enterprise customers can enforce that traffic stays within their virtual network, addressing data residency concerns.

Why this matters for Windows developers

The timing aligns with the surge in local AI tooling on Windows. LM Studio and Ollama have matured into first-class Windows applications, with native builds that support GPU acceleration on Windows 11. The ability to run an entire Copilot agent session on an offline, air-gapped Windows machine using a large local Llama 4 model, with zero data leaving the device, is a game changer for defense, finance, and healthcare developers. Previously, offline copilots were a compromise. Now, with BYOK and a capable local model, devs get the same agentic workflow as the cloud version, right down to automated PR descriptions and test generation, although result quality still depends on the model chosen.

For Windows-centric development, this means Azure DevOps pipelines can be debugged with a local model that has your proprietary code in context, without ever shipping that code to a third-party API. Large Windows shops with strict compliance requirements can now adopt Copilot’s full capabilities while retaining control.

Privacy, security, and enterprise controls

BYOK targets one of the biggest enterprise concerns with Copilot: code telemetry. When using a local model through Ollama, not a single line of source code leaves the developer’s machine. Even with cloud providers, if the organization already has a contractual agreement with Azure OpenAI or Anthropic, Copilot’s traffic is just another API call under the same data processing terms. GitHub’s intermediary role is minimal: it passes the prompt and receives the model’s response without retaining either for training.

GitHub has added enterprise admin controls to manage which backends are allowed within an organization. Administrators can restrict teams to Azure OpenAI only, or to a specific Foundry endpoint, while blocking public OpenAI and local models. Audit logs now show which provider was used for each session, aiding compliance teams. Additionally, because the API key is held by the customer, GitHub’s platform risk is reduced: a breach on GitHub’s side would not expose customers’ provider keys, and with them Copilot inference spend or model access.
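The allow-list behavior described above can be pictured as a small policy check. The following is an illustrative sketch only: OrgPolicy, permits, audit_record, the kind strings, and the example.invalid endpoints are all assumptions made for this example and do not correspond to any documented GitHub policy schema or audit-log format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OrgPolicy:
    """Hypothetical organization policy: allowed backends plus optional endpoint pins."""
    allowed_kinds: frozenset
    # Maps a provider kind to the single endpoint it may use, if pinned.
    pinned_endpoints: dict = field(default_factory=dict)

    def permits(self, kind: str, endpoint: Optional[str] = None) -> bool:
        if kind not in self.allowed_kinds:
            return False
        pinned = self.pinned_endpoints.get(kind)
        # No pin means any endpoint of an allowed kind is acceptable.
        return pinned is None or pinned == endpoint


def audit_record(session_id: str, kind: str, endpoint: Optional[str], policy: OrgPolicy) -> dict:
    """Build the kind of per-session entry an audit log might hold (assumed shape)."""
    return {
        "session": session_id,
        "provider": kind,
        "endpoint": endpoint,
        "allowed": policy.permits(kind, endpoint),
    }


# Example: Azure OpenAI anywhere, Foundry only at one approved endpoint.
policy = OrgPolicy(
    allowed_kinds=frozenset({"azure-openai", "foundry"}),
    pinned_endpoints={"foundry": "https://foundry.example.invalid/models"},
)
```

Under this policy, a session that tries public OpenAI or a localhost Ollama endpoint is rejected, and the attempt still leaves a record for compliance review.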

What the community is saying

Although today’s announcement caught many by surprise, leaked preview builds had sparked heated threads on Reddit and GitHub Community. Developer sentiment overwhelmingly favors the move, with many praising the flexibility to avoid vendor lock-in and the cost transparency that comes with paying providers directly. “This is what Copilot should have been from day one,” one influential developer tweeted. “Now I can use my existing Azure credits instead of paying GitHub even more.”

Early adopters on the Windows Insider Dev channel have already tested local-only sessions with LM Studio’s latest GPU-accelerated runner, reporting near-instant inline completions with 7B parameter models. The primary pain point so far is that local models lack the tool-calling and agentic skills of cloud-hosted GPT-5 or Claude 4; however, the community is already fine-tuning small coding models for Copilot’s new protocol.

Potential stumbling blocks and limitations

Not everything works out of the box. The initial release limits BYOK to the GitHub Copilot app — the standalone desktop client — with IDE extension support expected in a future update. This means Visual Studio Code and JetBrains users must run the Copilot app in parallel for BYOK features, while the built-in extension still uses the standard subscription model. GitHub acknowledged this gap and committed to parity within six months.

Another friction point: agentic features such as multi-file refactoring and codebase search are currently optimized for GitHub’s default models. With a BYOK provider, the agent’s results may vary in quality, especially with smaller local models that lack strong instruction-following. GitHub provides a compatibility checklist and recommends starting with large models (at least 30B parameters) for local agent use.

Cost management also shifts to the user. A single long agent session with GPT-5 could rack up significant API charges on the user’s own key. GitHub offers a spending alert system that warns when a session’s token usage exceeds a configurable threshold, but there is no hard cap in the initial BYOK implementation.
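A warn-but-do-not-block alert of the kind described here can be sketched in a few lines. SpendMonitor, its fields, and the price passed to estimated_cost are hypothetical; the sketch only illustrates the stated behavior (a configurable token threshold with no hard cap) and is not GitHub's implementation.

```python
from dataclasses import dataclass


@dataclass
class SpendMonitor:
    """Tracks one session's token usage against a configurable alert threshold."""
    token_threshold: int
    used: int = 0
    alerted: bool = False

    def record(self, prompt_tokens: int, completion_tokens: int) -> bool:
        """Add one request's usage; return True the first time the threshold is exceeded."""
        self.used += prompt_tokens + completion_tokens
        if self.used > self.token_threshold and not self.alerted:
            self.alerted = True
            return True
        # Usage keeps accumulating after the alert: this warns, it does not cap.
        return False


def estimated_cost(tokens: int, usd_per_million: float) -> float:
    """Rough cost estimate; the per-million price is supplied by the caller."""
    return tokens / 1_000_000 * usd_per_million
```

Because record never refuses a request, a long agent run can keep spending after the warning fires, which is why users on metered keys may also want limits set on the provider's side.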

The bigger picture: Copilot becomes a platform

This release signals a strategic pivot: GitHub Copilot is no longer just an intelligent assistant; it’s becoming a model-agnostic platform that orchestrates AI coding agents. By decoupling the model from the experience, GitHub positions itself as the universal interface for developer AI, much like an operating system for AI coding tools. The addition of local model support through LM Studio and Ollama breaks the final dependency on cloud compute, allowing Copilot to reach previously off-limits environments.

Rumors suggest that the next step will be a plugin marketplace for model connectors, letting third-party providers — Mistral, Cohere, or even self-hosted models — plug into the BYOK framework. If that materializes, Copilot could evolve into a true ecosystem, where enterprises mix and match models to optimize cost, speed, and privacy on a per-task basis.

For now, the June 23 update is live for all GitHub Copilot subscribers on Windows, macOS, and Linux. Developers can download the latest GitHub Copilot app, navigate to Settings > Provider > Add custom provider, and start experimenting. The era of single-model AI coding assistants is officially over.