Governor Josh Shapiro dropped a major AI update at the AI Horizons Summit in Pittsburgh, telling over 900 technology and business leaders that Pennsylvania is adding Microsoft Copilot to its existing ChatGPT Enterprise deployment. The move widens access to generative AI for qualified state employees, creating what the administration calls “the most advanced suite of generative AI tools offered by any state.”
This isn’t a pie-in-the-sky experiment. A year-long pilot with OpenAI’s ChatGPT Enterprise began in January 2024, involving roughly 175 employees across 14 agencies. The Office of Administration ran the pilot in partnership with Carnegie Mellon University and OpenAI. Participants reported saving an average of 95 minutes per day on tasks like drafting emails, summarizing long documents, researching policy, and basic coding. The administration now uses those numbers to justify scaling up.
The rollout pairs two enterprise-grade AI assistants—ChatGPT Enterprise and Microsoft Copilot—under a single governance umbrella. It also cements Pennsylvania’s position as a national leader in public-sector AI readiness. An independent assessment by Code for America already ranks the commonwealth among the top three states for AI capacity and leadership.
From pilot to purchase order: what changed
The pilot gave the state hard data on how workers actually use generative AI. Exit surveys and telemetry showed large perceived time savings, but outputs needed human verification and editing—a reality check for anyone expecting fully autonomous AI. Still, the pilot convinced leadership that the tools could boost productivity without replacing jobs.
Now the state is moving from a limited test to a managed, enterprise-grade deployment. Microsoft Copilot adds deep integration with the Microsoft 365 apps agencies already rely on. Word, Outlook, PowerPoint, Excel, and Teams will all gain an AI assistant that can summarize emails, draft documents, generate slides, and automate repetitive tasks. That complements ChatGPT Enterprise’s strength in open-ended Q&A, research, and creative drafting.
Both products run under a “human-in-the-loop” model. The AI assists, but final decisions and official documents stay with trained employees and reviewers. That principle is baked into Executive Order 2023-19, which established a Generative AI Governing Board and set core values of accuracy, transparency, privacy, and human oversight.
Governance gets a labor voice
The administration isn’t just throwing tools at workers. It’s creating a Generative AI Labor and Management Collaboration Group to give unions and employees a formal seat at the table. The group will help design workflows, address concerns about automation, and ensure AI augments rather than replaces jobs. Shapiro’s public line is that AI is a “job enhancer, not a job replacer.”
Mandatory training comes with the tools. Employees must complete competency-based programs covering prompt engineering, data privacy, and verification techniques before they can use the AI. The Generative AI Governing Board, formed under the executive order, retains authority over policy, vendor contracts, and expansion plans. It’s a centralized control model that lets agencies run specialized pilots under clear guardrails.
This hybrid approach—strong central policy plus worker collaboration—mirrors best practices in other states that have moved beyond skunkworks projects. Code for America’s Government AI Landscape Assessment specifically cited Pennsylvania’s leadership in pairing policy with capacity building.
The security checklist IT pros need
Deploying AI in government demands a fortress around data. The state’s technical blueprint offers a practical checklist:
- Classify data and apply sensitivity labels before AI access.
- Route high-sensitivity and controlled unclassified information (CUI) only through secure tenancies like Azure Government.
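- Enable robust audit logs, retention policies, and eDiscovery to support transparency and public records requests (in Pennsylvania, these fall under the state Right-to-Know Law rather than the federal FOIA).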
- Deploy least-privilege access and phishing-resistant multifactor authentication (MFA) for accounts using Copilot or ChatGPT.
- Require prompt provenance logs and mandate human verification for legal, benefits, or safety-critical outputs.
ChatGPT Enterprise provides tighter administrative controls and restricts vendor training on state data. Microsoft Copilot operates within the state’s existing Microsoft 365 environment, leveraging Purview classification, Data Loss Prevention (DLP) policies, and audit logging. Both vendors emphasize enterprise-grade compliance, but the state’s procurement officials must still negotiate explicit portability and audit rights.
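The checklist above can be read as a gate that every AI request passes through. The following is a minimal sketch of that idea, not the state's actual implementation: the `Sensitivity` levels, the `Request` fields, the tenancy names, and the set of review-required use cases are all hypothetical, and audit logging is left out for brevity.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Tuple


class Sensitivity(IntEnum):
    # Hypothetical label scheme; a real deployment would use its own taxonomy.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    CUI = 3  # controlled unclassified information


@dataclass(frozen=True)
class Request:
    has_phishing_resistant_mfa: bool
    label: Optional[Sensitivity]  # None means the data was never classified
    tenancy: str                  # assumed values: "commercial" or "government"
    use_case: str                 # e.g. "email_draft", "benefits"


# Assumed use cases whose outputs always need human verification.
HUMAN_REVIEW_USE_CASES = {"legal", "benefits", "safety"}


def evaluate(req: Request) -> Tuple[bool, bool, str]:
    """Return (allowed, needs_human_review, reason) for one AI request."""
    if not req.has_phishing_resistant_mfa:
        return False, False, "phishing-resistant MFA required"
    if req.label is None:
        return False, False, "data must be classified before AI access"
    if req.label >= Sensitivity.CONFIDENTIAL and req.tenancy != "government":
        return False, False, "high-sensitivity data requires the secure tenancy"
    return True, req.use_case in HUMAN_REVIEW_USE_CASES, "ok"
```

Ordering the checks from identity to classification to tenancy means an unlabeled document is rejected before any routing decision is made, which matches the "classify first" item in the checklist.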
$10 million for AI governance research
The summit also showcased a broader economic play. BNY (formerly BNY Mellon) and Carnegie Mellon University announced a five-year, $10 million partnership to establish the BNY AI Lab at CMU’s School of Computer Science. The lab will focus on governance, trust, and accountability for mission-critical AI, feeding expertise directly into public-sector deployments. It’s a strategic move to build a regional AI cluster, blending academic research with practical government applications.
Google joined in with a statewide AI Accelerator for small businesses. Free training and toolkits aim to help entrepreneurs cut costs and scale operations using AI. This public-private skills push dovetails with the administration’s narrative that AI investments fuel economic growth.
The administration claims these efforts have helped attract over $25 billion in private-sector commitments and between 11,000 and 12,400 new jobs since Shapiro took office. The numbers vary slightly across press releases, reflecting fast-moving announcements and aggregated reporting. Procurement officials should verify specific projects against the Department of Community & Economic Development’s database rather than rely on headline totals.
The 95-minute claim: what it really means
The most eye-catching number from the pilot is the self-reported 95-minute daily time savings. It comes from exit surveys and interviews with pilot participants. While that’s a strong signal of perceived productivity gains, it’s not an independent audit. Public-sector reporting on the pilot noted that AI outputs still required significant human checking and editing. For any organization benchmarking this figure, treat it as a proof-of-concept metric, not a guarantee of system-wide efficiency.
Similarly, the “most advanced suite” claim is aspirational. No independent authority ranks states by AI tool bundles. Pennsylvania’s dual-vendor approach is indeed rare, but the phrase is better read as promotional positioning than as objective fact.
Risks that keep CIOs up at night
Any public-sector AI rollout faces hard realities:
- Accuracy and hallucinations: Generative models confidently fabricate facts. For any decision affecting legal rights, benefits, or safety, human verification is mandatory.
- Privacy and FOIA: AI inputs and outputs are likely public records. Contracts must specify data retention, exportability, training reuse restrictions, and vendor obligations for FOIA requests.
- Vendor lock-in: Relying on a single cloud or assistant can become a long-term trap. Procurement must demand data egress clauses, measurable SLAs, and audit rights.
- Equity and bias: Models trained on imbalanced data can produce biased outputs. Regular fairness testing, diverse red-team reviews, and public audit reports are essential.
- Workforce disruption: Even “augmentative” AI changes roles. The Labor Collaboration Group is a good start, but the state will need robust retraining programs and transparent outcome measurements.
These aren’t distant threats. They’ve tripped up federal and international pilots already. Good governance is the dividing line between a productivity leap and a public relations disaster.
Playbook for other states
For CIOs and digital service teams eyeing a similar path, Pennsylvania’s experience yields a practical cheat sheet:
- Run a short, instrumented proof-of-value on high‑impact, low‑risk workflows.
- Capture baseline metrics (average handle time, throughput, error rates) so savings can be verified, not just self-reported.
- Set clear human-review thresholds and version all prompt logs.
- Build training programs with measurable competency goals.
- Negotiate contracts with portability, audit rights, and no‑training‑on‑your‑data clauses.
- Work with unions and HR early to design role‑redesign pathways.
These steps echo federal recommendations and harden the deployment against the common failures seen in early AI pilots.
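To make the baseline-metrics step concrete, here is a small sketch of how a team might compute measured savings instead of relying on survey answers. The function names and all numbers are illustrative assumptions, not data from the Pennsylvania pilot; the key point is that human review time is subtracted before any savings are claimed.

```python
from statistics import mean


def minutes_saved_per_task(baseline_minutes, assisted_minutes):
    """Difference in mean handle time between baseline and AI-assisted work."""
    return mean(baseline_minutes) - mean(assisted_minutes)


def net_minutes_saved_per_day(baseline_minutes, assisted_minutes,
                              review_minutes_per_task, tasks_per_day):
    """Measured daily savings after subtracting human review time per task."""
    per_task = minutes_saved_per_task(baseline_minutes, assisted_minutes)
    return (per_task - review_minutes_per_task) * tasks_per_day


# Illustrative handle times in minutes, invented for this example.
baseline = [30, 34, 28, 32]   # per task, before the tool was introduced
assisted = [18, 22, 20, 20]   # per task, with the tool
net = net_minutes_saved_per_day(baseline, assisted,
                                review_minutes_per_task=4, tasks_per_day=6)
print(f"Net measured savings: {net} minutes per day")
```

A figure produced this way can be compared directly against self-reported numbers such as the 95-minute claim, and a large gap between the two is itself a useful finding.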
What to watch next
The real test is execution. The state must deliver on training, DLP enforcement, tenancy configuration, and centralized auditing at scale. Independent third‑party audits will be crucial to validate the productivity claims beyond exit surveys.
Procurement transparency will matter, too. Watch for contract details that reveal how portable the AI solutions really are. If the state doesn’t lock in strong data‑use limitations, it risks losing long‑term control over its own data and running into cost overruns.
Public trust hinges on openness. Pennsylvania should publish red‑team results, governance board minutes, and annual transparency reports detailing deployments, incidents, and outcomes. Without that, the “most advanced” label will ring hollow.
Pennsylvania’s AI Horizons Summit marks a significant moment. It’s a carefully orchestrated shift from cautious experimentation to enterprise adoption, backed by workforce engagement, academic firepower, and private-sector muscle. The ambition is clear. Now the hard work begins.