Microsoft Warns: Claude Code GitHub Action Exploitable via Prompt Injection to Leak CI/CD Secrets

Vulnerability | dark6 | 8 June 2026

Read Time:3 Minute, 21 Second

AI-powered coding tools are rapidly changing how developers build and ship software, but as these tools enter everyday development pipelines, they also open new doors for attackers. A recently uncovered vulnerability in Anthropic’s Claude Code GitHub Action illustrates just how dangerous that can be — and it was Microsoft’s own security team that found it.

Microsoft Threat Intelligence has disclosed a prompt injection vulnerability in the Claude Code GitHub Action that allowed attackers to exfiltrate CI/CD secrets, including API keys, from automated GitHub workflows. The issue has been patched by Anthropic in Claude Code version 2.1.128, released May 5, 2026.

How Prompt Injection Hijacks AI Agents

The attack exploits a technique called prompt injection. An attacker embeds a hidden instruction inside a GitHub issue or pull request — text that appears harmless to human reviewers but is treated as a command by the AI model processing it.

In Microsoft’s tests, a malicious payload instructed the agent to perform a “compliance review” — deliberately vague phrasing that bypassed Claude’s safety filters, which are tuned to reject obvious requests like “print the API key.” The payload also instructed the model to trim the first seven characters of the returned value, cleverly evading GitHub’s Secret Scanner in the process.

The Root Cause: Inconsistent Tool Sandboxing

The underlying issue was a gap between how two of Claude’s tools handled file access. The Bash tool executed inside a secure sandbox that stripped environment variables before running. The Read tool, however, did not apply the same rules.

Once manipulated via the injected prompt, the Read tool accessed /proc/self/environ directly inside the runner’s process memory — returning the unscrubbed ANTHROPIC_API_KEY alongside any other credentials present in the environment. The attacker could then reconstruct the full key and exfiltrate it through channels available to the workflow, such as web requests, issue comments, or action logs.

What an Attacker Could Do With a Stolen Key

Impersonate the CI/CD workflow and consume AI compute at the account holder’s expense
Access any downstream service tied to the compromised API key
Pivot to additional credentials if the workflow had broad permissions
Inject malicious behavior into future AI-assisted development tasks without detection

The full exploit required no special privileges — just the ability to open an issue or submit a pull request on any repository using the vulnerable action.

MITRE ATLAS Mapping

Microsoft mapped the attack chain to several MITRE ATLAS techniques: LLM Prompt Injection, AI Agent Tool Invocation, LLM Jailbreak, and AI Agent Tool Credential Harvesting. This is one of the first documented real-world demonstrations of these techniques chained together in a production CI/CD environment.

The “Agents Rule of Two” and Hardening Guidance

Microsoft introduced a new security principle called the “Agents Rule of Two”: an AI workflow should never simultaneously combine all three of the following — processing untrusted input, accessing sensitive secrets, and taking external actions or modifying state. Any two is acceptable; all three creates unacceptable risk.

Additional recommendations include:

Least-privilege token scoping — scope each API key to exactly what a specific workflow requires, and monitor usage at the provider level for anomalies
Hardened system prompts — explicitly instruct the agent that issue bodies, PR descriptions, and comments are untrusted data, not commands
Single-task pinning — constrain agents to a narrow, defined task to minimize the attack surface
Alert on unexpected API calls — flag new IP addresses or unusual endpoint activity tied to any API key wired into CI/CD

Patch Status and Immediate Action

Anthropic released a fix in Claude Code version 2.1.128. Organizations using the Claude Code GitHub Action should update immediately. Teams should also audit existing workflow configurations for overly broad token permissions and review whether any secrets could have been accessed before the patch was applied.

This incident is a timely reminder that as AI agents take on privileged roles in development pipelines, the attack surface expands dramatically. Prompt injection is not a theoretical threat — it is an active attack class targeting real production infrastructure today.