Claude Code Proof-of-Concept Shows How Clean GitHub Repos Can Lead to Developer Machine Takeover
Security researchers have demonstrated a proof-of-concept attack that can trick AI coding agents into running a reverse shell from a GitHub repository that appears clean during review. The attack targets the way agentic coding tools follow setup instructions, recover from errors, and execute commands on behalf of developers.
The research, published by Mozilla’s Zero Day Investigative Network (0DIN), shows how a harmless-looking project can lead to shell access on a developer’s machine when an AI agent tries to get the project running. The 0DIN analysis focuses on Claude Code, but the broader risk applies to coding agents that can read files, run shell commands, and make network requests.
The issue is not that the repository contains an obvious malware file. The dangerous payload lives outside the repository and gets fetched at runtime, which means static code scanners, human reviewers, and the agent may not see the full attack chain before execution.
Why the Claude Code attack matters
Agentic coding tools are powerful because they can inspect projects, install dependencies, run tests, fix errors, and execute commands. That same convenience creates risk when the agent processes untrusted repository content as part of a setup flow.
Anthropic’s Claude Code security documentation says the tool uses read-only permissions by default and asks for explicit approval before actions such as editing files or running commands. It also warns users that Claude Code only has the permissions they grant it.
The proof-of-concept shows why approval prompts and command visibility are not sufficient on their own. A developer may approve a setup step that appears routine, while the actual payload arrives later from a separate system.
How the proof-of-concept works
The attack chain starts with a normal-looking GitHub repository. The README describes a fictional cloud deployment tool and gives ordinary setup instructions that would not look suspicious during a quick review.
Next, the package fails in a controlled way and tells the user to run an initialization command. When the developer asks the agent to get the project working, the agent treats the error as a normal recovery step and follows the instruction.
The initialization process then calls a script that retrieves attacker-controlled data from a DNS TXT record. According to the 0DIN research, the payload is not present in the repository and only appears when the setup process runs.
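One way a reviewer could look for this stage before running anything is to scan a repository's scripts for DNS TXT lookups. The sketch below is a heuristic only: the source does not describe the contents of the proof-of-concept script, so the patterns (common `dig`, `nslookup`, `host`, PowerShell, Node.js, and Python idioms) and the function name `find_txt_lookups` are assumptions, and an obfuscated lookup would not be caught.

```python
import re

# Heuristic patterns for DNS TXT lookups in shell, PowerShell, Node.js, and Python.
# The list is illustrative, not exhaustive.
TXT_LOOKUP_PATTERNS = [
    re.compile(r"\bdig\b.*\btxt\b", re.IGNORECASE),
    re.compile(r"\bnslookup\b.*-(?:q|type|querytype)=txt\b", re.IGNORECASE),
    re.compile(r"\bhost\b.*-t\s+txt\b", re.IGNORECASE),
    re.compile(r"\bResolve-DnsName\b.*\btxt\b", re.IGNORECASE),
    re.compile(r"\bresolveTxt\s*\("),
    re.compile(r"\bresolve\s*\(.*[\"']TXT[\"']"),
]


def find_txt_lookups(script_text):
    """Return (line_number, line) pairs that look like DNS TXT lookups."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in TXT_LOOKUP_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

A hit is not proof of malice, since legitimate tools also read TXT records, but it marks a line where data from outside the repository enters the setup flow and deserves a closer look.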
| Attack stage | What the defender sees | Why it may be missed |
|---|---|---|
| Repository review | Normal setup instructions and project files | The payload is not stored in the repository. |
| Package execution | A routine error and a suggested initialization command | The error looks like a normal first-run setup issue. |
| Runtime fetch | A DNS TXT lookup | The payload arrives from infrastructure outside the codebase. |
| Command execution | A setup process that finishes without obvious warning | The developer may not see what the script executes after the DNS lookup resolves. |
| Compromise | An outbound connection from the developer machine | The shell runs with the developer’s own user privileges. |
What attackers could access
If the reverse shell connects successfully, the attacker can interact with the machine under the developer’s user account. That can expose project files, environment variables, local configuration, tokens, and other development secrets.
The risk is especially serious for developers who keep cloud keys, GitHub tokens, package registry credentials, or AI API keys in environment variables or local configuration files. A compromised workstation can also become a stepping stone into source code, build systems, and internal services.
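To illustrate the environment-variable part of that exposure, the sketch below builds a copy of the environment with secret-looking variables removed, so that a setup command or agent process started with it does not inherit them. The name fragments in `SECRET_NAME` and the helper names `scrub_env` and `run_scrubbed` are assumptions for illustration; name matching will miss secrets stored under unusual names, and it does nothing for credentials kept in files on disk.

```python
import os
import re
import subprocess

# Name fragments that often mark secrets; an assumed, non-exhaustive list.
SECRET_NAME = re.compile(
    r"(TOKEN|SECRET|PASSWORD|PASSWD|CREDENTIAL|API_?KEY|ACCESS_?KEY|PRIVATE_?KEY)",
    re.IGNORECASE,
)


def scrub_env(env):
    """Return a copy of env without variables whose names look like secrets."""
    return {name: value for name, value in env.items() if not SECRET_NAME.search(name)}


def run_scrubbed(command):
    """Run a command (list of arguments) with the scrubbed environment."""
    return subprocess.run(
        command,
        env=scrub_env(os.environ),  # the child process never sees the removed variables
        capture_output=True,
        text=True,
    )
```

For example, `scrub_env({"PATH": "/usr/bin", "GITHUB_TOKEN": "x"})` keeps only `PATH`. This narrows what a reverse shell could read from its own environment; it is not a substitute for running unfamiliar code in an isolated machine.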
OWASP classifies prompt injection as a top LLM application risk. The OWASP LLM01:2025 guidance says prompt injection can cause unintended behavior, including sensitive information disclosure, unauthorized access to connected functions, and arbitrary command execution in connected systems.
Why this is an agentic AI problem
The proof-of-concept does not depend on a traditional malicious source file. It depends on the agent’s ability to interpret instructions, handle errors, and run commands with access already granted by the user.
Anthropic has acknowledged the broader security challenge around autonomous coding workflows. In its Claude Code auto mode post, the company said users approve most permission prompts, which can create approval fatigue and reduce attention during repeated command approvals.
The same post explains that Anthropic built auto mode to reduce unsafe choices while keeping developer workflows usable. It also makes clear that safer automation still depends on classifying risk correctly and understanding what an agent plans to do.
Earlier Claude Code advisory shows similar trust-boundary risk
The new proof-of-concept is separate from a previous Claude Code security advisory, but both point to the same theme: agentic tools need strict boundaries when they process untrusted content.
A GitHub security advisory for CVE-2025-55284 said older Claude Code versions had an overly broad allowlist that could let attackers read files and send contents over the network without user confirmation.
The advisory lists the issue as high severity and says it affected Claude Code versions before 1.0.4. It also says users on standard auto-update received the fix automatically, while current users are unaffected because older versions have been deprecated.
What makes the new attack hard to spot
The attack spreads its logic across multiple layers. The repository looks ordinary, the package error looks normal, the DNS lookup looks like configuration retrieval, and the agent sees a setup command that appears pre-authorized by the project.
That separation weakens many common defenses. Static analysis may not see off-repository payloads. Human review may focus only on source files. Network monitoring may treat a DNS query as routine unless teams specifically watch for suspicious runtime behavior.
The OWASP prompt injection page also notes that indirect prompt injection can come from files or websites and does not need to be visible to humans as long as the model parses the content.
How developers can reduce the risk
Developers should avoid giving coding agents broad approval to run setup scripts from unfamiliar repositories. Treat every setup command, install script, package hook, and network call as untrusted until reviewed.
- Run unfamiliar repositories inside a VM, container, or disposable development environment.
- Do not expose production API keys, cloud credentials, or Git tokens to local test environments.
- Review shell scripts and package lifecycle hooks before allowing an agent to execute them.
- Block or alert on unexpected outbound connections from coding-agent processes.
- Use least-privilege tokens and rotate credentials after any suspicious agent activity.
- Avoid approving commands that fetch remote content and pipe it into an interpreter.
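The lifecycle-hook review in the list above can be partly automated. As a minimal sketch, assuming the project is an npm package (the source does not name the package ecosystem used in the proof-of-concept), the hypothetical helper `install_hooks` lists the scripts that npm runs automatically during `npm install`, so a reviewer can read them before any agent executes them.

```python
import json

# npm lifecycle scripts that run automatically during a local `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall", "prepare")


def install_hooks(package_json_text):
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}
```

An empty result only means the top-level package declares no install hooks. Dependencies can declare their own, and a script invoked manually after an error message, as in the attack chain described earlier, would not appear here at all.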
The Claude Code security guide recommends reviewing suggested commands before approval, verifying changes to critical files, using virtual machines for risky tool calls, and maintaining good security practices even with built-in safeguards.
What security teams should monitor
Security teams should treat coding agents as privileged development tools, not as simple text assistants. If a tool can read local files, execute commands, and make network requests, it needs endpoint monitoring, policy controls, and logging.
Useful detections include suspicious DNS TXT lookups during project setup, outbound shells from developer workstations, unexpected reads of .env files, unusual access to SSH keys, and agent-initiated commands that connect to unfamiliar hosts.
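These detections can be sketched as a small rule function over endpoint telemetry. Everything about the event shape here is an assumption: the field names (`kind`, `query_type`, `path`, `dest_host`, `ancestry`), the agent process name `claude`, and the host allowlist are hypothetical and would need to be mapped to whatever an actual EDR or logging pipeline emits.

```python
from pathlib import PurePosixPath

AGENT_PROCESSES = {"claude"}  # assumed process names for coding agents
KNOWN_HOSTS = {"github.com", "registry.npmjs.org", "pypi.org"}  # assumed allowlist


def match_detections(event):
    """Return detection labels for one telemetry event (hypothetical schema)."""
    labels = []
    # True when any process in the event's ancestry is a known coding agent.
    from_agent = bool(AGENT_PROCESSES & set(event.get("ancestry", [])))
    kind = event.get("kind")
    if kind == "dns" and event.get("query_type") == "TXT" and from_agent:
        labels.append("dns_txt_during_setup")
    if kind == "file_read":
        path = PurePosixPath(event.get("path", ""))
        if path.name == ".env" or path.name.startswith(".env."):
            labels.append("env_file_read")
        if ".ssh" in path.parts:
            labels.append("ssh_key_access")
    if kind == "net_connect" and from_agent and event.get("dest_host") not in KNOWN_HOSTS:
        labels.append("agent_unfamiliar_host")
    return labels
```

Keying the DNS and network rules on process ancestry keeps routine TXT lookups by mail or certificate tooling from raising alerts, while still catching a lookup made by a script the agent started.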
Teams should also review whether developers use unsafe modes or broad allow rules. The Anthropic engineering post says skipping permissions entirely gives Claude broad freedom and remains unsafe for most situations.
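A review of allow rules could start from a simple audit like the one below. It assumes the rules are available as a list of strings in a `Tool(specifier)` shape, such as `Bash(npm run test:*)`; the exact rule syntax and where the rules are stored should be checked against the current Claude Code settings documentation. The `RISKY_COMMANDS` list and the function name `broad_allow_rules` are illustrative assumptions.

```python
import re

# Parses "Tool" or "Tool(specifier)"; the rule shape is an assumption.
RULE = re.compile(r"^(?P<tool>\w+)(?:\((?P<spec>.*)\))?$")
# Commands that can fetch or execute arbitrary content; an assumed list.
RISKY_COMMANDS = {"curl", "wget", "sh", "bash", "zsh", "python", "python3", "node", "eval"}


def broad_allow_rules(rules):
    """Return the allow rules that grant wide shell access (heuristic)."""
    flagged = []
    for rule in rules:
        match = RULE.match(rule.strip())
        if not match or match.group("tool") != "Bash":
            continue
        spec = match.group("spec")
        # A bare tool name or a lone wildcard allows any shell command.
        if spec is None or spec.strip() in {"", "*", ":*"}:
            flagged.append(rule)
            continue
        # Otherwise inspect the program the rule pre-approves.
        first_word = spec.split()[0].split(":")[0]
        if first_word in RISKY_COMMANDS:
            flagged.append(rule)
    return flagged
```

With the input `["Read", "Bash(npm run test:*)", "Bash", "Bash(*)", "Bash(curl:*)"]`, the function flags the last three rules and leaves the narrowly scoped test command alone.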
What companies should do now
Organizations that allow AI coding agents should set clear rules for untrusted repositories. The safest approach is to isolate execution, reduce credential exposure, and require human review for commands that install packages, start scripts, or contact external infrastructure.
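As a rough sketch of such a rule, the function below labels a proposed shell command with the reasons it would need human review under the three categories named above. The tool lists and the name `review_reasons` are assumptions, and the parsing is deliberately simple: it splits on common shell operators and treats anything it cannot parse as needing review.

```python
import re
import shlex

# Assumed, non-exhaustive tool lists for each review category.
INSTALLERS = {"npm", "pnpm", "yarn", "pip", "pip3", "cargo", "gem", "apt", "apt-get", "brew"}
INSTALL_VERBS = {"install", "add", "i"}
INTERPRETERS = {"sh", "bash", "zsh", "python", "python3", "node", "make"}
NETWORK_TOOLS = {"curl", "wget", "nc", "ncat", "ssh", "scp", "dig", "nslookup"}


def review_reasons(command):
    """Return the reasons a shell command needs human review (empty set if none)."""
    reasons = set()
    # Split on common shell operators so each simple command is checked.
    for part in re.split(r"\|\||&&|[|;&]", command):
        try:
            words = shlex.split(part)
        except ValueError:
            # Unbalanced quotes and similar: fail closed and ask a human.
            reasons.add("unparseable")
            continue
        if not words:
            continue
        program = words[0].rsplit("/", 1)[-1]
        if program in INSTALLERS and INSTALL_VERBS & set(words[1:]):
            reasons.add("installs packages")
        if program in INTERPRETERS or words[0].startswith("./"):
            reasons.add("starts a script")
        if program in NETWORK_TOOLS:
            reasons.add("contacts external infrastructure")
    return reasons
```

Under these assumptions, `review_reasons("git status")` returns an empty set, while a command that pipes `curl` output into `sh` is flagged both for contacting external infrastructure and for starting a script. A list-based check like this misses indirect cases such as `npm run` targets, so it works best as a first filter in front of isolation, not as a replacement for it.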
Companies should also update Claude Code and audit older installations. The Claude Code advisory shows why outdated versions and permissive command rules can increase the blast radius of prompt-injection attacks.
The main lesson is simple: AI coding agents can speed up development, but they also turn repository content, error messages, and setup scripts into active security inputs. Developers should not let an agent execute code from an unfamiliar project without isolation and review.
FAQ
The attack is a proof-of-concept showing how a clean-looking GitHub repository can trick an AI coding agent into running setup steps that fetch and execute an external payload, leading to a reverse shell on the developer’s machine.
The malicious payload is not stored in the repository. In the proof-of-concept, it is fetched at runtime from attacker-controlled infrastructure, which makes the repository appear clean to static scanners and human reviewers.
The risk is not limited to Claude Code. The research focuses on Claude Code, but the broader risk applies to agentic coding tools that can follow setup instructions, execute shell commands, read local files, and make network requests.
A successful attack can expose files, environment variables, cloud credentials, API keys, Git tokens, SSH keys, and other secrets available to the developer’s user account.
Developers should run unfamiliar repositories in isolated environments, avoid exposing real credentials, review setup scripts before execution, limit agent permissions, and monitor unexpected network connections from coding-agent processes.