Microsoft Warns Claude Code GitHub Action Could Leak CI/CD Secrets Through Prompt Injection
Microsoft has warned that Anthropic’s Claude Code GitHub Action could expose CI/CD workflow secrets when AI agents process untrusted GitHub content such as issue bodies, pull request descriptions, and comments.
The issue allowed a manipulated AI agent to read sensitive environment data from a GitHub Actions runner, including the ANTHROPIC_API_KEY. Anthropic fixed the problem in Claude Code 2.1.128 on May 5, 2026.
Access content across the globe at the highest speed rate.
70% of our readers choose Private Internet Access
70% of our readers choose ExpressVPN
Browse the web from multiple devices with industry-standard security protocols.
Faster dedicated servers for specific actions (currently at summer discounts)
In its Microsoft Threat Intelligence report, Microsoft said the bug came from an inconsistent security boundary between Claude Code’s Bash tool and Read tool. Bash used environment scrubbing, while Read could access sensitive /proc files inside the runner.
Prompt injection turned GitHub content into instructions
The attack started with untrusted text in a GitHub issue, pull request, or comment. A human reviewer might see harmless text, but the AI agent reads the raw content and can treat hidden instructions as commands.
This type of attack is especially dangerous in CI/CD because workflows often have access to repository secrets, API keys, write tokens, build artifacts, and deployment systems. Once an AI agent can read untrusted content and call tools, a prompt can become an attack path.
MITRE ATLAS defines LLM Prompt Injection as a technique where attackers craft inputs to change the model’s behavior or instructions. Microsoft said the Claude Code case maps directly to that class of agentic AI risk.
| Attack step | What happened | Security impact |
|---|---|---|
| Untrusted GitHub content | An attacker placed hidden instructions in a public issue or pull request context. | The AI agent processed attacker-controlled text. |
| Prompt injection | The agent was steered into performing a fake compliance review. | The request avoided obvious secret-exfiltration wording. |
| Read tool access | The agent read /proc/self/environ from the runner. | The environment contained an unscrubbed Anthropic API key. |
| Secret laundering | The prompt asked the model to trim characters from the key output. | This helped bypass refusal behavior and secret scanning patterns. |
| Exfiltration | The secret could be sent through logs, comments, web requests, or other allowed channels. | An attacker could reconstruct and misuse the stolen credential. |
The flaw exposed a gap between Bash and Read tool protections
Claude Code GitHub Action can run inside a GitHub Actions workflow and respond to repository activity. Anthropic’s Claude Code GitHub Actions documentation says the tool can analyze code, create pull requests, implement features, and fix bugs from PRs and issues.
That workflow makes the tool useful, but it also places an AI agent inside a sensitive automation environment. If the agent can process untrusted text and use tools, attackers may try to control what the agent does next.
Microsoft found that Claude Code’s Bash path scrubbed environment variables during subprocess execution. The Read tool did not follow the same model and could access /proc/self/environ, which exposed the runner’s ANTHROPIC_API_KEY and possibly other credentials available to the process.
- The issue affected AI workflows that processed untrusted GitHub issue or PR content.
- The Read tool could access sensitive /proc files inside the GitHub Actions runner.
- The exposed environment data included the ANTHROPIC_API_KEY in Microsoft’s test.
- Anthropic mitigated the issue in Claude Code 2.1.128 by blocking sensitive /proc file access.
- Microsoft reported the issue to Anthropic through HackerOne on April 29, 2026.
Why CI/CD secrets are a high-value target
CI/CD workflows often hold credentials that can affect source code, cloud infrastructure, deployment systems, package registries, and internal services. A leaked key can do more than create billing risk. It can give attackers a bridge into the software supply chain.
GitHub’s secure use guidance for GitHub Actions warns developers to treat certain contexts as untrusted input and to consider whether attacker-controlled values can influence workflow behavior.
In this case, the attacker did not need direct access to secrets. Microsoft said the full exploit could start from the ability to open an issue or submit a pull request, depending on how the target workflow was configured.
| Secret type | Possible impact if leaked |
|---|---|
| AI provider API key | Unauthorized model usage, billing abuse, or access to connected automation flows |
| GitHub token | Repository changes, issue comments, pull requests, release changes, or workflow abuse |
| Cloud credential | Cloud resource access, data theft, persistence, or deployment tampering |
| Package registry token | Malicious package publishing or supply-chain compromise |
| Internal API key | Access to private services, customer data, or internal automation systems |
The attack bypassed several expected defenses
Microsoft said the malicious prompt used a benign-sounding “compliance review” framing rather than directly asking the model to print an API key. That helped avoid the model’s refusal behavior.
The prompt also instructed the model to remove the first seven characters of the key before outputting it. This prevented the leaked value from matching the obvious sk-ant- prefix that could trigger refusal or secret-scanning rules.
The behavior maps to AI Agent Tool Credential Harvesting, a MITRE ATLAS technique where an agent uses tool access to obtain credentials from files, environment variables, or other accessible sources.
AI workflows change the GitHub Actions threat model
Traditional GitHub Actions workflows run defined steps from YAML files. AI-powered workflows add a new layer because natural language from users can influence which tools the agent calls and what the agent decides to do.
GitHub’s Agentic Workflows project describes agentic repository automation that can run coding agents in GitHub Actions with guardrails and security-first design principles.
Microsoft’s warning shows why those guardrails matter. When untrusted repository content, secrets, and external communication sit in the same workflow, the AI agent can become a confused deputy that helps the attacker reach data they could not access directly.
- Issue bodies can contain hidden HTML comments or markdown that the model still reads.
- Pull request descriptions can carry instructions that look like normal text.
- Comments can trigger automation in workflows that respond to mentions or commands.
- Public workflow files can reveal which tools and exfiltration paths are available.
- Logs, issue comments, web requests, and tool outputs can become leak channels.
Microsoft recommends the Agents Rule of Two
Microsoft’s main recommendation is the Agents Rule of Two. An AI-powered workflow should not combine all three of these capabilities at the same time: processing untrusted input, accessing sensitive secrets, and changing state or communicating externally.
If a workflow needs to read untrusted issue or PR content, it should not also hold broad secrets and external communication tools. If it needs secrets, it should not allow arbitrary untrusted text to guide tool calls.
This principle does not remove the need for patching. Teams using Claude Code should make sure they run Claude Code 2.1.128 or later and review workflows that expose AI agents to public repository content.
| Risky combination | Safer design |
|---|---|
| Untrusted issue text plus secrets plus WebFetch | Remove secrets or block external communication in the issue triage workflow. |
| Public PR comments plus write tokens | Use read-only tokens for review workflows and require human approval for changes. |
| Agent file reads plus runner environment secrets | Keep secrets out of the runner unless the exact task requires them. |
| Broad API keys reused across workflows | Use one scoped key per workflow, environment, and provider. |
| Hidden prompt content treated as trusted instructions | Tell the agent that issue bodies, comments, diffs, and file contents are untrusted data. |
How developers should harden Claude Code workflows
Teams using Claude Code GitHub Action should first update to a fixed version. They should then review every workflow trigger that lets outside users influence the agent, including issues, pull requests, pull request comments, and issue comments.
Anthropic’s Claude Code GitHub Actions guide explains how the action integrates into GitHub workflows. Security teams should compare that setup against their own permissions, secrets, tool access, and trigger rules.
The system prompt should also define a clear trust model. It should tell the agent that content from issues, PRs, comments, commits, diffs, and files is untrusted data and must not override system instructions.
- Upgrade Claude Code to 2.1.128 or a later release.
- Audit workflows that process public issues, pull requests, and comments.
- Remove secrets from workflows that do not strictly need them.
- Use one API key per workflow and per environment.
- Scope GitHub tokens and cloud credentials to the minimum required permissions.
- Disable unnecessary tools such as WebFetch, Bash, or external posting paths.
- Require human approval before AI-generated changes reach protected branches.
GitHub Actions defenses still matter
AI prompt hardening should act as defense in depth, not the main control. If a workflow gives an agent secrets and outbound channels, a clever prompt may still bypass instructions.
Teams should use GitHub’s GitHub Actions security practices to review untrusted input, third-party actions, secrets, token permissions, and workflow triggers.
They should also apply stronger separation between untrusted content and sensitive execution. GitHub’s Agentic Workflows security model is relevant because it treats agent automation as a new form of CI/CD that needs strict isolation and controlled outputs.
| Control | Why it helps |
|---|---|
| Least-privilege tokens | Limits damage if an agent leaks or misuses a credential. |
| Environment separation | Prevents one workflow from exposing keys meant for another environment. |
| Human approvals | Stops AI-generated changes from reaching production automatically. |
| Secret rotation | Reduces the lifetime of exposed keys after a suspected leak. |
| Provider-side monitoring | Detects abnormal key usage, new IP addresses, and unusual API calls. |
AI-powered CI/CD needs a new security review
The Claude Code case shows that AI coding agents can blur the line between data and instruction. A GitHub comment may look like user feedback, but an agent can interpret it as operational guidance.
That is why AI workflows need threat modeling before teams connect them to secrets, write tokens, deployment rights, cloud credentials, or package publishing systems. The same workflow that saves developer time can become an automated exfiltration path when it trusts public text too much.
Security teams should map these risks to MITRE ATLAS prompt injection and MITRE ATLAS credential harvesting techniques, then build detections for suspicious file reads, unusual comments, unexpected outbound requests, and AI-generated secret-shaped output.
The Microsoft case study makes the lesson clear: public repository text should be treated as hostile by default when an AI agent can access secrets or take actions inside a CI/CD runner.
FAQ
Microsoft found that Claude Code GitHub Action could expose CI/CD workflow secrets when an AI agent processed untrusted GitHub content. The Read tool could access /proc/self/environ and read the runner’s ANTHROPIC_API_KEY.
Yes. Microsoft said Anthropic fixed the issue in Claude Code 2.1.128 on May 5, 2026 by blocking access to sensitive /proc files through the Read tool.
An attacker could place hidden prompt-injection instructions in a GitHub issue, pull request, or comment. If the workflow processed that untrusted content, the agent could be tricked into reading environment secrets and leaking them through allowed output channels.
The Agents Rule of Two says an AI workflow should never combine all three risky capabilities at once: processing untrusted input, accessing sensitive secrets, and changing state or communicating externally.
Developers should update Claude Code, review all AI workflow triggers, remove unnecessary secrets, scope tokens to least privilege, restrict external communication tools, and require human approval for sensitive changes.
AI agents can treat natural language as instructions while running inside environments that contain secrets, tokens, files, and automation tools. If attackers control part of the text the agent reads, they may influence tool calls or outputs.
Read our disclosure page to find out how can you help VPNCentral sustain the editorial team Read more
User forum
0 messages