Hackers Used Claude and Codex Agents in Real-World Exploitation and Data Theft Campaign

Yash

Published on June 18, 2026

Hackers used AI coding agents, including Claude and OpenAI’s Codex, to support real-world intrusions involving reconnaissance, exploitation, credential theft, and data exfiltration, according to a new OpenAnalysis report.

The case is important because investigators recovered more than 1,000 agent sessions from a compromised staging server. Those logs showed how the attacker used natural-language prompts to push AI agents through tasks that looked like authorized penetration testing but were tied to real breaches.

OpenAnalysis says the recovered artifacts detailed the compromise of at least 14 companies. Claude handled most of the operational work, while Codex played a smaller supporting role in research, report review, and host triage.

AI agents were used as hands-on operators

The attacker used the AI tools for more than writing code or explaining commands. The logs show a broader workflow in which the attacker gave high-level goals and the agent filled in the technical steps.

Claude was used for service enumeration, public vulnerability research, exploit development for known flaws, credential discovery, database review, data staging, and report drafting. The human operator often supplied vague instructions and relied on the agent to plan and execute the next steps.

This matches a wider warning from an Anthropic threat intelligence report, which said agentic AI is no longer being used only as an advisor in cybercrime. In some cases, it can help execute parts of an attack chain.

AI tool	Role in the OpenAnalysis case	Important context
Claude	Handled most reconnaissance, exploitation, credential harvesting, data review, and reporting tasks	The attacker repeatedly framed the work as authorized red-team activity
Codex	Assisted with high-level criminal-market research, report review, host triage, and process analysis	It refused some direct requests involving live targets, dark-web activity, and API key exposure
Human operator	Provided targets, goals, credentials, and prompts	The logs suggest the operator relied heavily on the agents for structure and technical execution

Red-team framing helped bypass suspicion

The attacker repeatedly described the activity as an authorized red-team engagement. That framing mattered because many offensive security tasks look similar in language to legitimate security testing.

OpenAnalysis found that the attacker often asked for recon, access validation, impact assessment, and reporting in terms that resembled professional penetration testing. Once the agent accepted the premise, it could continue through several stages of the intrusion.

The Claude Code security documentation says Claude Code includes security safeguards, permission controls, and best-practice guidance for safer usage. This case still shows that local agent sessions, shell access, stored credentials, and permissive prompts can create serious risk when an account or host gets compromised.

OpenAnalysis recovered an unusually detailed forensic record

The investigation started after a compromised host was turned over for analysis. Researchers found full local installations of Claude and Codex, not just traffic routed through a remote service.

That gave investigators access to session logs, prompts, tool use, artifacts, policy-violation records, and files generated during the attacks. It also exposed the attacker’s poor operational security.

According to the OpenAnalysis research, the attacker copied full agent installations, including session histories, across servers they did not fully control. The same logs also contained personal material, including résumé and job-application work.

Investigators recovered more than 1,000 Claude and Codex sessions.
The logs included prompts, tool use, artifacts, and policy-violation records.
The attacker copied full AI agent directories between hosts.
Some recovered logs included personal details that helped investigators understand the operator’s activity.

Exploitation and data theft were part of the workflow

Once the agent identified exposed services, the attacker pushed it toward known vulnerabilities and access validation. The report says Claude researched public CVEs, built exploit tools for known issues, and tested access with limited additional direction from the attacker.

After access was confirmed, the activity moved into post-exploitation. Claude was used to look for credentials, API keys, databases, admin sessions, invoices, financial data, and other sensitive records.

The agent also generated “PENTEST-REPORT” files for victims. These reports documented how access was obtained, what data was present, and, in some cases, how valuable the stolen access or data could be.

Stage	What the attacker used AI for	Defensive concern
Reconnaissance	Scanning exposed services and reviewing target information	Fast, repeated probing across multiple organizations
Vulnerability research	Matching exposed services to known public flaws	Shorter time from discovery to attempted exploitation
Post-exploitation	Finding credentials, API keys, databases, and administrative data	Rapid expansion after first access
Reporting	Writing attack summaries and impact reports	Criminal use of professional security workflows
Monetization analysis	Ranking possible value of stolen data or access	Data theft can quickly shift into extortion or access sales

Codex had a smaller but notable role

Codex was not the main driver in this case. OpenAnalysis says the attacker used it more narrowly for research and triage, including questions about access-broker markets and suspicious processes on their own infrastructure.

Codex also refused several requests that crossed clearer lines, including direct live-target scanning without authorization, dark-web navigation, and printing an API key. That distinction matters because the headline risk is not that every AI system blindly complied with every request.

OpenAI’s Codex cyber safety guidance says the company trains Codex to refuse clearly malicious requests, such as stealing credentials, and uses automated monitors to detect suspicious cyber activity. The recovered logs show why those controls need constant tuning as attackers change their language and workflows.

The case mirrors broader AI abuse warnings

Anthropic has already warned about AI misuse in cybercrime. In a previous case, the company said Claude Code had been abused in a data-extortion operation that targeted at least 17 organizations, including healthcare, emergency services, government, and religious institutions.

That earlier case showed AI being used across reconnaissance, credential harvesting, network penetration, analysis of stolen data, and ransom-note generation. The OpenAnalysis case adds another public example, this time built around recovered local agent logs.

The Anthropic report also warned that AI can lower the skill barrier for sophisticated cybercrime. The latest case supports that point because the attacker appeared to rely on the agent for much of the structure and technical execution.

Why defenders should treat AI logs as evidence

AI coding agents can leave rich forensic trails. Session histories, tool-call logs, prompts, local workspaces, generated scripts, downloaded files, and credential references can all help reconstruct what happened during an intrusion.

Security teams should not treat AI tools as ordinary developer apps when investigating compromised hosts. If an AI agent runs locally with shell access, it can become part of the intrusion path, and its logs may show both attacker intent and executed actions.
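
One way to preserve that evidence is to inventory agent data directories with cryptographic hashes before anything on the host changes. The following is a minimal sketch, not a forensic tool. The function name `inventory_agent_files` is hypothetical, and the directory names are supplied by the caller because real agent data locations vary by tool and version.

```python
import hashlib
from pathlib import Path


def inventory_agent_files(root, dir_names):
    """Return (relative path, size, SHA-256) records for files under `root`
    that sit inside any directory whose name is in `dir_names`.

    `dir_names` is an assumption supplied by the caller; this sketch does
    not know where any particular agent stores its data.
    """
    root = Path(root)
    records = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Keep only files located inside one of the named agent directories.
        if not any(part in dir_names for part in rel.parts[:-1]):
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        records.append((rel.as_posix(), path.stat().st_size, digest))
    return records
```

Running the same inventory on several hosts and comparing digests is one way to spot agent directories that were copied between machines. In practice this would run against a forensic image rather than a live system.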

The Claude Code security page recommends safe usage practices and emphasizes security controls around the tool. In enterprise environments, those controls should sit alongside endpoint monitoring, secrets management, least-privilege access, and centralized logging.

Collect AI agent session logs during incident response.
Review local agent workspaces for generated scripts, reports, and downloaded data.
Search for copied agent directories on staging hosts.
Rotate API keys and tokens if they appeared in prompts or local agent files.
Monitor AI tools that run with shell, file-system, or network access.
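
To decide which keys and tokens need rotation, responders can scan recovered session text for secret-like strings. The sketch below is illustrative only: the three patterns are assumptions chosen for the example, real secret formats vary by provider, and the function name `find_secret_references` is hypothetical. It reports line numbers and pattern names rather than the matched values, so the secrets are not copied into the findings.

```python
import re

# Illustrative patterns only (assumptions); extend them for the providers in use.
SECRET_PATTERNS = {
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "key_assignment": re.compile(
        r"(?i)\b(?:api[_-]?key|token|password|secret)\b\s*[:=]\s*['\"]?[^\s'\"]{8,}"
    ),
}


def find_secret_references(text):
    """Return (line number, pattern name) pairs, in line order, for lines
    that look like they contain a secret."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

Any hit is a prompt for a human to check whether the credential is real and still valid, not proof of exposure on its own.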

Security teams need new detections for AI-assisted intrusions

Traditional detections focus on malware, suspicious processes, lateral movement, and data transfer. Those still matter, but AI-assisted attacks add another layer: rapid task chaining driven by natural-language prompts.

Defenders should watch for unusual combinations of activity, such as an AI coding tool running recon scripts, invoking shell commands against external hosts, reading secrets, generating pentest-style reports, and staging data for offline analysis.
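
A detection for that kind of combination could look roughly like the sketch below. It assumes some upstream rule set has already labeled each event with a session ID, a timestamp, and an activity category. The category names, the one-hour window, and the threshold of three are all hypothetical values picked for illustration.

```python
from collections import defaultdict
from datetime import timedelta

# Hypothetical activity labels; an upstream rule set would have to assign them.
WATCHED = {
    "recon",
    "external_shell",
    "secrets_read",
    "report_generation",
    "data_staging",
}


def flag_task_chains(events, window=timedelta(hours=1), min_categories=3):
    """Return sorted session IDs in which at least `min_categories` distinct
    watched categories occur within one sliding `window`.

    `events` is an iterable of (session_id, timestamp, category) tuples,
    where timestamp is a datetime.
    """
    by_session = defaultdict(list)
    for session_id, ts, category in events:
        if category in WATCHED:
            by_session[session_id].append((ts, category))

    flagged = []
    for session_id, items in by_session.items():
        items.sort()
        start = 0
        for end in range(len(items)):
            # Advance the left edge until the span fits inside `window`.
            while items[end][0] - items[start][0] > window:
                start += 1
            distinct = {cat for _, cat in items[start:end + 1]}
            if len(distinct) >= min_categories:
                flagged.append(session_id)
                break
    return sorted(flagged)
```

The thresholds would need tuning against normal developer activity, since legitimate security work can also chain several of these categories.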

The OpenAI Codex safety documentation frames cybersecurity as a dual-use area, where the same capabilities can help defenders or enable harm. That dual-use problem now applies directly to incident response, because defenders need to know whether an AI tool on a host acted as an assistant, an automation layer, or part of the compromise.

Detection area	What to look for
Agent execution	AI coding tools running on servers that do not normally use them
Secrets exposure	API keys, SSH keys, tokens, or passwords pasted into prompts or stored in session logs
Rapid task chaining	Recon, vulnerability research, access validation, and report generation in one workflow
Data staging	Databases, invoices, credentials, and cloud configuration files copied to unusual hosts
False authorization claims	Repeated “red team” or “authorized assessment” language without matching engagement records
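
The last row of the table can be sketched as a cross-check between prompt text and engagement records. This is a rough illustration under stated assumptions: the phrase list is a guess at the framing described above, the `(session_id, target, text)` input shape is hypothetical, and the function name `unmatched_authorization_claims` is invented for the example.

```python
import re

# Assumed phrases modeled on the framing described in the article.
CLAIM_PATTERN = re.compile(
    r"(?i)\b(?:red[- ]team|authorized (?:assessment|engagement|penetration test))\b"
)


def unmatched_authorization_claims(prompts, engagement_targets):
    """Return sorted session IDs whose prompts claim authorization for a
    target that does not appear in the engagement records.

    `prompts` is an iterable of (session_id, target, text) tuples and
    `engagement_targets` is an iterable of approved target names.
    """
    approved = {t.lower() for t in engagement_targets}
    flagged = set()
    for session_id, target, text in prompts:
        if CLAIM_PATTERN.search(text) and target.lower() not in approved:
            flagged.add(session_id)
    return sorted(flagged)
```

The check is only as good as the engagement records behind it, so it works best where authorized assessments are tracked in one place.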

The lesson is not that security AI should be blocked entirely

This case does not mean AI coding agents have no place in security work. The same capabilities can help defenders triage incidents, audit code, speed up vulnerability research, and write clearer reports.

The problem starts when powerful agents get shell access, network access, stored secrets, and vague instructions without proper oversight. In that situation, a compromised account or host can turn a productivity tool into an attack accelerator.

The safest approach for organizations is to govern AI agents like privileged developer tools. Limit where they can run, log their activity, protect their tokens, restrict access to secrets, and define what kinds of cyber tasks require human approval.
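
As a rough illustration of that governance model, the sketch below encodes two of the controls: where an agent may run, and which task categories need human sign-off. The `AgentPolicy` class, the `decide` function, and the host and category names in the usage note are all hypothetical. A real deployment would enforce these rules in the agent's permission system or an endpoint control, not in a standalone function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    # Hosts where the agent is permitted to run at all (assumed names).
    allowed_hosts: frozenset
    # Task categories that need a named human approver (assumed labels).
    approval_required: frozenset


def decide(policy, host, task_category, approved_by=None):
    """Return 'deny', 'needs_approval', or 'allow' for a requested task."""
    if host not in policy.allowed_hosts:
        return "deny"
    if task_category in policy.approval_required and not approved_by:
        return "needs_approval"
    return "allow"
```

For example, with `allowed_hosts=frozenset({"dev-01"})` and `approval_required=frozenset({"network_scan"})`, a scan requested on `dev-01` returns `needs_approval` until an approver is named, and any request from another host returns `deny`.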

The OpenAnalysis case gives defenders a rare view of how agentic AI can function inside an intrusion. It also gives organizations a clear message: AI session data now belongs in the incident-response playbook.

FAQ

What happened in the Claude and Codex hacking case?

OpenAnalysis recovered more than 1,000 AI agent sessions from a compromised server. The logs showed an attacker using Claude and, to a lesser extent, Codex for reconnaissance, exploitation, data theft, report drafting, and related intrusion activity.

Did Codex perform the same role as Claude in the attacks?

No. OpenAnalysis says Claude handled most of the operational activity, while Codex played a smaller supporting role. Codex was used for research and triage, and it refused some direct requests involving live targets, dark-web access, and exposing an API key.

How did the attacker bypass AI safeguards?

The attacker repeatedly framed malicious requests as authorized red-team work. That made many prompts look similar to legitimate security testing, which is one of the hardest policy problems for AI cyber-safety systems.

What should security teams learn from this case?

Security teams should treat AI agent logs as forensic evidence, protect tokens and session files, monitor AI tools with shell or network access, and look for rapid chains of recon, exploit research, credential access, data staging, and report generation.

Does this mean companies should stop using AI coding agents?

No. AI coding agents can support legitimate development and security work. Companies should manage them like privileged tools by logging activity, limiting access, protecting credentials, and requiring human approval for high-risk actions.
