Local Newspaper Coalition Sues Microsoft and OpenAI Over AI Training on Copyrighted Work

A coalition of 35 newspaper publishers representing nearly 400 local and regional titles sued Microsoft and OpenAI in federal court in New York on Tuesday, alleging the companies used copyrighted articles to train their AI models without permission. The lawsuit, one of the largest of its kind, claims that products like Microsoft Copilot and ChatGPT are built on a foundation of intellectual property theft, reproducing or closely paraphrasing reporting from community newspapers across the country.

The Lawsuit: Nearly 400 Newspapers, 35 Publishers, One Claim

Filed on June 24, 2026, in the Southern District of New York, the complaint asserts that OpenAI and Microsoft systematically scraped decades of local journalism—news stories, features, obituaries, and editorials—without consent or compensation. The plaintiffs, a consortium of small and mid-sized publishers, argue that the defendants’ actions violate federal copyright law, the Digital Millennium Copyright Act, and state unfair competition statutes.

According to the suit, both Microsoft Copilot and ChatGPT were trained on vast corpora that include copyrighted works from the plaintiffs’ websites, and the models can generate outputs that mirror the publishers’ original reporting. The publishers seek statutory damages, which could potentially total billions of dollars, and a permanent injunction barring further use of their content for training or inference.

Microsoft has not yet publicly responded, but the company has consistently defended AI training practices as fair use. OpenAI, which has struck licensing deals with some larger publishers in the past, may argue that its methods fall within the bounds of the law. The case is before Judge Paul G. Gardephe, and the first hearing is expected later this year.

Who’s Suing and Why

While the names of all 35 publishers were not immediately released, the group describes itself as the backbone of American local journalism. Many operate in underserved communities where a daily paper is often the only source of reliable news. They allege that AI models, by ingesting and repurposing their content, threaten their economic viability. When users ask Copilot or ChatGPT for local news summaries, weather, or event information, the tools can provide answers that replicate the publishers’ work, diverting traffic and revenue from the original sources.

“Our member papers have invested in reporting that strengthens democracy,” a spokesperson for the coalition said in a statement. “Microsoft and OpenAI are commercializing that investment without paying a dime.”

The suit follows a pattern: major media outlets like The New York Times, News Corp, and The Intercept have filed similar complaints. But the sheer scale of this action—hundreds of newspapers united—is unprecedented. It also marks the first time a geographically distributed group of small publishers has banded together to confront technology giants over AI training data.

Windows Users: Immediate Impact and Long-Term Questions

For the millions who use Microsoft Copilot daily—built into Windows 11, Edge, Bing, and Microsoft 365—the lawsuit does not immediately change what the assistant can do. Copilot will continue to answer questions, summarize documents, and generate text as before. But beneath the surface, the litigation introduces uncertainty that could eventually reshape the product.

If the publishers prevail on the merits or secure a preliminary injunction, Microsoft might be forced to restrict Copilot’s access to certain types of content. For instance, queries about local events, public records, or specific newspaper articles might yield less specific or lower-quality responses. The company could also preemptively remove training data from contested sources, altering the model’s knowledge base and affecting performance in some domains.

For enterprise users, the stakes are higher. Microsoft’s Copilot Copyright Commitment says the company will defend customers against copyright infringement claims that arise from using its AI services, provided they follow certain guardrails. But a ruling against Microsoft could test the limits of that indemnity. If a court determines that training on copyrighted data is not fair use, the legal landscape for all enterprise AI deployments becomes murkier, potentially slowing adoption or increasing costs.

IT administrators should watch for updated guidance from Microsoft on data provenance and copyright filters. The company may introduce new tools that allow organizations to exclude certain data sources from model training or generation, similar to content crediting features already in preview.

How We Got Here: A Timeline of Tensions

The legal battle over AI and copyright has escalated steadily since ChatGPT’s launch in late 2022.

January 2023: Getty Images sued Stability AI over the use of millions of photos in training.
Mid-2023: Multiple authors, including Sarah Silverman and Michael Chabon, sued over books used for training.
September 2023: The Authors Guild sued OpenAI, representing 17 writers.
December 2023: The New York Times filed a landmark complaint against OpenAI and Microsoft, alleging mass copyright infringement.
Spring 2024: OpenAI announced commercial deals with publishers (e.g., Le Monde, Financial Times) for content access, building on earlier agreements with the Associated Press and Axel Springer and signaling a willingness to pay for content.
March 2025: A U.S. federal judge allowed most of the NYT complaint to proceed, largely rejecting OpenAI’s motion to dismiss.
2025–2026: Dozens of smaller suits from individual artists, coders, and publications entered various stages of litigation.

Microsoft’s deep integration of Copilot into Windows and its suite of productivity tools has made the company a primary target. Copilot relies on large language models—often GPT-4 variants—through a partnership with OpenAI, in which Microsoft has invested over $13 billion. Notably, Copilot is advertised as Windows’ AI assistant, meaning that the outcome of these cases could have direct implications for the operating system itself.

The current lawsuit amplifies a critical question: Do web scraping and AI training on publicly available but copyrighted content constitute fair use? The “purpose and character of the use,” the first of the four statutory fair use factors, is hotly debated. Microsoft and OpenAI contend that their use is transformative, as the models learn patterns rather than reproduce articles verbatim. Publishers counter that the outputs sometimes replicate substantial portions of their stories, and that the real purpose is commercial gain, not public benefit.

Practical Steps for Copilot Users and IT Admins

Though the case will take years to resolve, there are measures individuals and organizations can take now.

For everyday Windows users

Be mindful of the source: When Copilot provides news or factual information, treat it as a starting point, not a primary source. It may reproduce passages from copyrighted articles, and you could inadvertently disseminate protected content.
Use Bing or Edge citations: Copilot often includes links to sources. Click through to verify information and support original journalism.
Consider privacy settings: Windows Copilot may use your interactions to improve the service. Review privacy controls under Settings > Privacy & security > Copilot (the exact location may vary by Windows and Copilot version) to opt out of model training if you have concerns about how your data is used.

For power users and developers

Test for memorization: If you regularly use GitHub Copilot in Visual Studio or other editors to generate code or documentation, run the outputs through a plagiarism or code-similarity checker when copyright is a concern.
Stay informed on model updates: Microsoft frequently updates its content filtering and data handling. Follow the Message center in the Microsoft 365 admin center and Microsoft’s official Copilot blogs for the latest on Copilot’s compliance posture.
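
For readers who want a rough, do-it-yourself version of the memorization test suggested above, the following is a minimal sketch rather than a real plagiarism checker. It assumes you already have the text of a suspected source on hand and measures how many word sequences in an AI-generated passage also appear in that source. The function names and the default window of eight words are illustrative choices, not part of any Microsoft or OpenAI tool.

```python
import re


def word_ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """Return the set of lowercase word n-grams found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # If the text has fewer than n words, the range is empty and so is the set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(generated: str, reference: str, n: int = 8) -> float:
    """Fraction of the generated text's n-grams that also occur in the reference.

    1.0 means every n-word sequence in the generated text appears in the
    reference; 0.0 means none do (or the generated text is shorter than n words).
    """
    generated_ngrams = word_ngrams(generated, n)
    if not generated_ngrams:
        return 0.0
    shared = generated_ngrams & word_ngrams(reference, n)
    return len(shared) / len(generated_ngrams)
```

A high ratio only flags a passage for human review. It cannot say whether a use is infringing, and a low ratio does not rule out close paraphrase.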

For IT professionals and business decision-makers

Audit AI content policies: Review your organization’s acceptable use policy for AI-generated content. Ensure employees know not to input or generate text that could infringe third-party rights.
Leverage Microsoft’s indemnity—but read the fine print: The Copilot Copyright Commitment covers commercial customers under specific terms. Verify coverage with your Microsoft representative and consider supplemental insurance if your exposure is high.
Explore content filtering: Microsoft’s Azure OpenAI Service offers content filtering that can block certain types of output. Investigate whether similar controls can be applied at the Copilot level through Microsoft Purview.
Prepare for discovery: If your company is in a heavily regulated industry, document how and when AI tools are used in content creation. This could matter in any future legal challenge.
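
As one way to act on the documentation advice above, the sketch below shows a simple append-only usage log. It is an illustration under stated assumptions, not a compliance product: the field names, the `build_usage_record` and `append_record` helpers, and the JSON Lines file format are all hypothetical choices, and your legal or records team may require something different. It stores a SHA-256 hash of the generated text rather than the text itself, so the log can later show which output was produced without duplicating the content.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def build_usage_record(user, tool, purpose, output_text, when=None):
    """Build one log entry describing a single use of an AI tool.

    All field names here are illustrative assumptions.
    """
    when = when or datetime.now(timezone.utc)
    return {
        "timestamp": when.isoformat(),
        "user": user,
        "tool": tool,
        "purpose": purpose,
        # Hash of the output, so the entry can be matched to a document later.
        "output_sha256": hashlib.sha256(output_text.encode("utf-8")).hexdigest(),
    }


def to_json_line(record):
    """Serialize a record as a single JSON line."""
    return json.dumps(record, sort_keys=True)


def append_record(log_path, record):
    """Append the record to a JSON Lines file, creating it if needed."""
    with Path(log_path).open("a", encoding="utf-8") as log_file:
        log_file.write(to_json_line(record) + "\n")
```

A call such as `append_record("ai_usage.jsonl", build_usage_record("j.doe", "Copilot", "draft press summary", generated_text))` would add one line per use. Here `generated_text` stands for whatever output the tool returned.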

What’s Next for AI and Journalism

This lawsuit will likely not be the last of its scale. As local newspapers face existential threats from declining advertising revenue and consolidation, many see the unauthorized use of their content by AI as a final blow. The publishers are well-organized, and their collective action may attract political attention—already, several senators have voiced support for stronger copyright protections against AI training.

In the coming months, expect Microsoft and OpenAI to file motions to dismiss or transfer the case. A ruling on a preliminary injunction could come as early as 2027, and an injunction, if granted, might force immediate changes to Copilot. If the publishers gain even a partial victory, it could trigger a wave of licensing deals similar to those in the music industry after sampling lawsuits. For Windows users, that likely means Copilot will become more cautious, and possibly more expensive, as companies pass on the cost of content access.

The broader significance is clear: AI’s relationship with journalism is at a crossroads, and the courts will decide who pays for the training data that powers the next generation of Windows features.