Security teams must stop treating AI models as isolated black boxes and start red teaming the entire application stack—data connections, identity systems, automation layers, and logs—if they want to uncover the most dangerous vulnerabilities. That’s the message from Craig Nelson, a lead in Microsoft’s AI red teaming practice, shared in recent guidance to enterprise security professionals. Nelson, who spearheads offensive security efforts across Microsoft's AI portfolio, argues that attackers don't care about academic nuances of model architecture. They go after the weakest link, and that link is rarely the model itself.
Security researchers and IT admins have spent countless hours probing large language models (LLMs) for prompt injection flaws and jailbreaks. Those exercises are important, but they represent only a sliver of the attack surface. Nelson stresses that true AI red teaming requires a holistic view—one that encompasses the data pipelines feeding the model, the identity fabric controlling access, the automation layers that enable agents to take action, and the logs that record every move. Without this full-stack mindset, organizations leave dangerous blind spots that adversaries are all too willing to exploit.
The Limits of Model-Centric Red Teaming
Traditional AI red teaming often fixates on the model endpoint. Testers craft mischievous prompts, embed hidden commands in images or documents, and search for ways to make the AI behave badly. While this uncovers genuine flaws—such as a chatbot leaking its system prompt or a code assistant offering malicious snippets—it misses the bigger picture. Real-world AI applications are not just models; they are complex systems with databases, APIs, back-end services, and integrated authentication.
Consider a typical customer service chatbot. The model might be secure against direct prompt injection, but what if an attacker tampers with the training data stored in a misconfigured blob storage container? Or what if the bot’s identity token is over-privileged, allowing it to read sensitive HR files when a user asks about vacation days? These are data and identity risks that never touch the model’s inference logic. Nelson’s team has repeatedly seen that attackers chain together such non-model vulnerabilities to achieve far greater impact than a simple jailbreak.
Data: The Foundation of AI Risk
Data is the fuel for AI, and compromising that fuel is often easier than breaking the engine. Red teams must examine every phase of the data lifecycle: ingestion, storage, transformation, and consumption. Common attack patterns include:
- Data poisoning: Injecting malicious samples into training or fine-tuning datasets to skew model behavior. For example, subtly altering spam classification examples so the AI marks phishing emails as safe.
- Exfiltration via inference: Sending thousands of carefully crafted queries to reconstruct private training data, revealing sensitive corporate information.
- Supply chain attacks: Manipulating data coming from third-party sources or public datasets before they reach the AI pipeline.
Nelson emphasizes that red teamers need to look at data connections. That means testing the platforms where data is stored (like Azure Blob Storage, AWS S3, or on-premises databases) and the ingestion pipelines (like Azure Data Factory or AWS Glue). Are these services properly authenticated? Are encryption keys managed securely? Can a low-privileged developer modify a feature store that silently corrupts a recommendation algorithm? These are the questions a full-stack approach asks.
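One way to probe the "can someone silently modify the data" question is to check whether the pipeline would even notice tampering. The following is a minimal sketch, assuming the team keeps a trusted manifest of SHA-256 digests for its dataset files. The manifest format and the function names are hypothetical illustrations, not part of Microsoft's guidance.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_tampered_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Compare files under data_dir against a trusted manifest.

    `manifest` maps relative file names to expected SHA-256 digests
    (a hypothetical format). Returns the names that are missing or changed.
    """
    problems = []
    for name, expected in sorted(manifest.items()):
        candidate = data_dir / name
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(name)
    return problems
```

A red team could modify one file in a test copy of the dataset and confirm that a check like this flags it before retraining. Note that this sketch does not detect extra files that are absent from the manifest, and it only helps if the manifest itself is stored somewhere the attacker cannot write.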
One often overlooked risk is the data that AI systems generate themselves. Logs, telemetry, and output caches can become treasure troves for attackers if not properly secured. A red team might discover that an AI application logs user prompts in plain text alongside internal system messages, exposing sensitive internal APIs.
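To make the plain-text logging risk concrete, here is a rough sketch of how a tester might scan log lines for material that should have been redacted, and how a developer might scrub it before writing. The patterns are illustrative assumptions only; a real deployment would need a vetted, much broader list.

```python
import re

# Illustrative patterns only (assumptions, not a complete secret taxonomy).
REDACTION_PATTERNS = [
    (re.compile(r"(?i)bearer\s+[a-z0-9._~+/=-]+"), "Bearer [REDACTED]"),
    (re.compile(r"https?://[^\s\"']+"), "[URL]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
]


def redact(text: str) -> str:
    """Replace bearer tokens, URLs, and email addresses with placeholders."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


def contains_sensitive(text: str) -> bool:
    """Return True if the text still holds something redact() would change."""
    return redact(text) != text
```

Running `contains_sensitive` over a sample of production log lines gives a quick signal of whether prompts, tokens, or internal endpoints are being written out verbatim.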
Identity: Who Controls the AI?
Identity is the new perimeter for AI. An AI application rarely operates with only its own identity; it also inherits permissions from the services it connects to and the context of the user or agent triggering it. Flaws in identity architecture are among the most severe vulnerabilities Nelson’s red teams uncover.
A typical enterprise AI app might interact with Microsoft Graph for calendar data, SharePoint for documents, and a custom HR API. The application’s service principal or managed identity often holds a powerful set of permissions, sometimes far beyond what’s needed. In a red team exercise, a seemingly innocuous prompt like “Summarize my next meeting” could be hijacked if the attacker can manipulate the identity flow. For example, if the app uses OAuth but fails to validate the token audience, an attacker might present a token that was issued for a different service, possibly one carrying admin privileges, and have it accepted.
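The audience check mentioned above is small enough to sketch. The example below assumes the token's signature has already been verified by a trusted JWT library and that the decoded claims are available as a dictionary; it does not verify signatures itself. The function name and the example audience and issuer values are hypothetical.

```python
import time
from typing import Optional


def check_token_claims(
    claims: dict,
    expected_audience: str,
    expected_issuer: str,
    now: Optional[float] = None,
) -> list[str]:
    """Return reasons to reject the token; an empty list means acceptable.

    Assumes `claims` came from a token whose signature was already verified.
    """
    now = time.time() if now is None else now
    problems = []

    # The JWT "aud" claim may be a single string or a list of strings.
    audience = claims.get("aud")
    audiences = audience if isinstance(audience, list) else [audience]
    if expected_audience not in audiences:
        problems.append("audience mismatch")

    if claims.get("iss") != expected_issuer:
        problems.append("issuer mismatch")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        problems.append("expired or missing exp")
    return problems
```

A red teamer can use the same logic in reverse: present a validly signed token minted for some other resource and see whether the application accepts it.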
Agentic AI systems, where the AI can make decisions and execute actions autonomously, amplify these risks. A red team must test scenarios like:
- Over-permissioned agents: An agent designed to manage emails might have the ability to delete entire mailboxes.
- Chained authentication: An agent that calls other agents, each with its own identity, might allow an attacker to pivot across services.
- Cross-tenant attacks: In multi-tenant setups, a flaw in identity federation could let a tenant access another’s AI data.
Nelson’s team uses a simple rule: treat AI applications like any other cloud-native app and apply zero-trust principles. Verify every request explicitly, use least-privilege access, and assume breach. Red teamers must probe the identity layer by attempting to steal tokens, escalate privileges, and move laterally between services. Tools like ROADtools and Stormspotter, often used in red teaming Azure AD (now Microsoft Entra ID), are just as relevant here.
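Least privilege is easy to state and easy to drift away from. As a rough illustration, the sketch below compares what each agent identity has been granted against what its documented job requires. The identity names and the two input mappings are hypothetical; the permission strings are written in the style of Microsoft Graph permission names purely as examples.

```python
def audit_identities(
    granted: dict[str, set[str]],
    required: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Map each identity to the permissions it holds but does not need.

    Identities with no entry in `required` are treated as needing nothing,
    so every permission they hold is reported.
    """
    findings = {}
    for identity, permissions in granted.items():
        excess = set(permissions) - set(required.get(identity, set()))
        if excess:
            findings[identity] = sorted(excess)
    return findings


# Hypothetical inventory for two agents.
granted = {
    "email-agent": {"Mail.Read", "Mail.ReadWrite", "Files.ReadWrite.All"},
    "calendar-agent": {"Calendars.Read"},
}
required = {
    "email-agent": {"Mail.Read"},
    "calendar-agent": {"Calendars.Read"},
}
report = audit_identities(granted, required)
```

Here `report` would list `Files.ReadWrite.All` and `Mail.ReadWrite` as excess for the email agent, which is exactly the kind of finding a red team would then try to exploit.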
Automation: When AI Acts on Its Own
Automation is the natural evolution of AI from a passive query engine to an active participant in business processes. AI agents now send emails, modify files, update CRM records, and even trigger financial transactions. For red teamers, this opens a new frontier of risk because the consequences of a compromised instruction can ripple far beyond a text response.
Microsoft’s guidance highlights several automation-related attack vectors:
- Instruction injection via data: An email containing a hidden command that an AI agent reads and then executes, perhaps forwarding a confidential document to an external address.
- Workflow hijacking: Manipulating the logic of an AI-driven automation flow, like an approval process that skips a critical step.
- Unintended side effects: An AI tasked with cleaning up storage might misinterpret a file’s importance and delete critical data.
Red teaming automation demands simulating human-in-the-loop bypasses. Testers should try to craft prompts that cause the AI to perform unauthorized operations without alerting supervisors. They should also assess whether the automation system has proper guardrails—like confirmation prompts, rate limits, and rollback mechanisms. Nelson notes that many teams only test the AI’s output, not the downstream consequences of that output in an automated workflow. That’s a costly mistake.
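The guardrails named above (confirmation prompts and rate limits) can be pictured as a gate that sits between the agent's decision and the real action. This is a minimal sketch under stated assumptions: the action names, the `ActionGate` class, and its thresholds are all hypothetical, and rollback is left out.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical set of actions that should never run without a human sign-off.
HIGH_RISK_ACTIONS = {"delete_file", "send_external_email", "transfer_funds"}


@dataclass
class ActionGate:
    max_actions: int = 5
    window_seconds: float = 60.0
    _timestamps: list = field(default_factory=list, init=False, repr=False)

    def decide(
        self,
        action: str,
        approved_by_human: bool = False,
        now: Optional[float] = None,
    ) -> str:
        """Return 'allow', 'hold: ...', or 'deny: ...' for a requested action."""
        now = time.monotonic() if now is None else now
        # Keep only the actions that happened inside the sliding window.
        self._timestamps = [
            t for t in self._timestamps if now - t < self.window_seconds
        ]
        if len(self._timestamps) >= self.max_actions:
            return "deny: rate limit"
        if action in HIGH_RISK_ACTIONS and not approved_by_human:
            return "hold: needs human approval"
        self._timestamps.append(now)
        return "allow"
```

A red team exercise would then focus on whether injected instructions can reach the underlying action without passing through a gate like this, or can convince the system that approval was already given.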
Logs: The Blind Spot
Logging is often an afterthought in AI development, but it is the backbone of detection and forensics. Without comprehensive logs, a red team’s successful attack might go undetected for months. Nelson stresses that red teamers must not only try to cover their tracks but also evaluate whether the current logging infrastructure would catch them.
Key logging deficiencies his team often finds:
- Insufficient detail: Logs that record “AI query received” without capturing the full prompt, the user context, or the model’s decision path.
- Missing auth events: Failed authentication attempts against the AI endpoint are not logged, allowing brute-force or token-replay attacks to go unnoticed.
- No correlation: Security events are siloed. A suspicious AI prompt and a simultaneous spike in API calls from the same IP might be visible only in separate systems, so no alert fires.
A robust logging strategy for AI red teaming should capture all interactions with the model, all identity and authorization events, data access patterns, and automation decisions. Red teamers should actively test logging coverage by executing known attack patterns and verifying that the resulting log entries trigger appropriate alerts. Nelson advises organizations to treat AI logs as part of their SIEM (security information and event management) strategy, integrating with tools like Microsoft Sentinel or Splunk.
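The correlation gap described in this section can be illustrated with a few lines of code. The sketch below pairs suspicious-prompt alerts with API-spike events that share a source IP and fall within a short time window. The event schema (dictionaries with `ip` and `ts` keys) and the five-minute default are assumptions for illustration; a real SIEM would express this as a correlation rule rather than application code.

```python
from collections import defaultdict


def correlate(prompt_alerts, api_spikes, window_seconds=300):
    """Pair events from two sources that share an IP within a time window.

    Each event is a dict with 'ip' (str) and 'ts' (epoch seconds).
    Returns a list of (prompt_alert, api_spike) tuples.
    """
    spikes_by_ip = defaultdict(list)
    for spike in api_spikes:
        spikes_by_ip[spike["ip"]].append(spike)

    matches = []
    for alert in prompt_alerts:
        for spike in spikes_by_ip.get(alert["ip"], []):
            if abs(alert["ts"] - spike["ts"]) <= window_seconds:
                matches.append((alert, spike))
    return matches
```

If the two event streams live in separate systems and nothing like this join ever runs, the combined signal never becomes an alert.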
Microsoft’s AI Red Teaming Playbook
Microsoft has been building AI red teaming capabilities since at least 2018 and formalized its approach in the AI Red Team program. Drawing from that experience, Nelson’s team follows a methodology that includes:
- Threat modeling the full stack: Identify all components—data stores, model endpoints, identity services, middle tiers—and map potential attack paths.
- Using adversary personas: Role-play specific threat actors, from malicious insiders with limited access to sophisticated external attackers with deep pockets.
- Red teaming in layers: Start with the data layer, move to identity, then automation, and finally the model. Each layer’s findings inform the next.
- Automating attacks: Where possible, automate routine attack simulations using tools like Microsoft’s Counterfit or custom scripts. This allows for continuous red teaming.
- Collaborating with blue teams: Share tactics, techniques, and procedures (TTPs) with defenders to improve detection and response playbooks.
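The first step in the list, mapping attack paths across the full stack, can be sketched as a simple graph search. The component names and the adjacency mapping below are hypothetical; the edges stand for "a foothold here can reach there."

```python
def attack_paths(graph, start, target, path=None):
    """Enumerate all cycle-free paths from start to target (depth-first).

    `graph` maps a component name to the list of components reachable from it.
    """
    path = (path or []) + [start]
    if start == target:
        return [path]
    found = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # skip components already on this path
            found.extend(attack_paths(graph, nxt, target, path))
    return found


# Hypothetical stack: an internet-facing chat UI, a model, a plugin, and a CRM.
stack = {
    "internet": ["chat_ui"],
    "chat_ui": ["model", "plugin"],
    "model": ["plugin"],
    "plugin": ["crm"],
}
paths_to_crm = attack_paths(stack, "internet", "crm")
```

In this toy example, one of the two paths to the CRM never touches the model at all, which is the point of threat modeling the whole stack.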
Microsoft also emphasizes the importance of diversity in the red team. AI red teaming isn’t just for traditional security researchers; it requires data scientists, ML engineers, and even sociotechnical experts to understand failures of responsible AI. Nelson’s team regularly pulls in domain specialists to test for bias, fairness, and privacy issues that go beyond pure security.
Implications for Windows and Enterprise Users
For Windows enthusiasts and IT professionals managing Windows environments, these recommendations are not academic. Microsoft is rapidly infusing AI into the Windows ecosystem—Copilot in Windows 11, AI capabilities in Microsoft 365, Azure AI services, and even on-device AI with NPU-powered features. Every new AI feature expands the attack surface that defenders must understand and test.
Consider a Windows enterprise that deploys Copilot for Microsoft 365. The AI interacts with user emails, files, and calendars, and can be extended with plugins that automate actions in third-party apps. A red team exercise that only probes the Copilot chat interface for prompt injections misses the risk of a compromised plugin token that gives an attacker access to a backend CRM. Similarly, on-device AI features like Windows Studio Effects might process personally identifiable information (PII), which creates a data exfiltration risk if those data streams are not properly protected.
Nelson’s advice is clear: when rolling out any AI feature, Windows admins should include it in their existing security testing cadence. Use tools like Windows Defender Application Control or Attack Surface Reduction rules to constrain what AI agents can touch, and enable detailed logging via Windows Event Forwarding. And don’t assume the AI vendor has already done the full-stack red teaming—verify.
The Road Ahead: Continuous Full-Stack Red Teaming
AI systems are living software, constantly retrained, updated, and integrated with new data sources. Static, point-in-time red teaming is insufficient. Nelson advocates for continuous red teaming embedded in the MLOps lifecycle. That means automated security tests run with every model retraining and every application deployment, covering data validation, identity posture, automation guardrails, and log fidelity.
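In practice, "security tests run with every deployment" often comes down to a gate in the pipeline that runs a set of named checks and blocks the release if any fail. The sketch below is one possible shape for such a gate. The check names and the lambda placeholders are hypothetical; real checks would call into the data, identity, automation, and logging tests the team already maintains.

```python
from typing import Callable


def run_security_gate(checks: dict[str, Callable[[], bool]]) -> list[str]:
    """Run every check and return the names of those that failed.

    A check that raises an exception is counted as a failure, so a broken
    test cannot silently let a deployment through.
    """
    failures = []
    for name, check in checks.items():
        try:
            passed = bool(check())
        except Exception:
            passed = False
        if not passed:
            failures.append(name)
    return failures


# Placeholder checks standing in for real test suites.
checks = {
    "data_validation": lambda: True,
    "identity_posture": lambda: True,
    "automation_guardrails": lambda: False,
    "log_fidelity": lambda: True,
}
failed = run_security_gate(checks)
```

A pipeline step would exit with a non-zero status whenever `failed` is non-empty, which keeps the four layers Nelson describes under test on every retraining and deployment.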
The message from Microsoft’s top AI red teamer is a wake-up call. As AI moves from pilot projects to mission-critical infrastructure, the gap between model-centric testing and full-stack red teaming will be the difference between a secure deployment and a headline-making breach. Security teams that follow this advice will not only harden their AI today but also build the muscle memory to tackle the next generation of agentic AI threats—where the lines between data, identity, automation, and logs blur further.
For Windows administrators, the takeaway is simple: start treating AI features like any other high-value asset. Inventory them, threat model them, and test them relentlessly—not just the model, but the entire stack that makes it go.