Cloudflare says Anthropic’s Mythos Preview can build working exploit chains
Anthropic’s Mythos Preview model can do more than flag possible software bugs, according to new research from Cloudflare. It can also connect multiple flaws into working proof-of-concept exploits.
Cloudflare tested the security-focused AI model against more than 50 of its own repositories through Anthropic’s invite-only Project Glasswing. The company said Mythos Preview showed a major jump in automated vulnerability research because it could prove exploitability instead of only describing suspicious code.
The finding matters for defenders because it can shorten triage time. It also matters for attackers because the same capability could reduce the time between bug discovery and working exploitation.
What Cloudflare found
Cloudflare said previous frontier models could often identify possible vulnerabilities and explain why they looked dangerous. The problem came later. Those models often failed to finish the exploit chain or prove that the bug could actually be reached and abused.
Mythos Preview changed that pattern in two important ways. It could construct exploit chains from smaller primitives, and it could generate proof code to confirm or reject a finding in a controlled environment.
This means the model could move from “this looks vulnerable” to “this is how the vulnerability works” with less human intervention.
| Area | What changed with Mythos Preview |
|---|---|
| Bug discovery | Finds possible flaws across real code repositories. |
| Exploit chaining | Connects smaller attack primitives into a higher-impact path. |
| Proof generation | Writes and runs code to test exploitability in a scratch environment. |
| Triage support | Produces clearer reproduction steps and fewer speculative findings. |
| Defensive value | Helps security teams prioritize bugs that are actually reachable. |
How Mythos Preview builds exploit chains
Real-world attacks rarely depend on one isolated bug. Attackers often chain several smaller weaknesses together until they reach a stronger result.
Cloudflare said Mythos Preview can reason across those steps. For example, a memory bug might become an arbitrary read or write primitive, then a control-flow hijack, then a full exploit path.
This is important because low-severity findings often stay buried in security backlogs. When an AI system can show how those findings combine, the risk calculation changes quickly.
Why proof generation matters
Security teams spend a large amount of time deciding whether a reported bug is real. AI-generated reports can make this worse when they produce speculative findings without working evidence.
Cloudflare said Mythos Preview can write test code, compile it in a scratch environment, run it, read the result, adjust the hypothesis, and try again. This loop helps separate real vulnerabilities from weak theories.
A finding that arrives with reproduction steps and a working proof-of-concept can move through triage much faster than a vague report.
- The model identifies a suspected bug.
- It writes code to trigger the issue.
- It compiles and runs the test in a controlled environment.
- It reviews the failure or success result.
- It adjusts the hypothesis if the first attempt fails.
- It produces a stronger report when exploitability is confirmed.
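The steps above can be sketched as a small retry loop. This is a minimal illustration, not Cloudflare's harness: `validation_loop`, `Attempt`, `Verdict`, and the three toy callables are hypothetical names, and the "test run" is simulated with a string check rather than a real compiler or sandbox.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    hypothesis: str
    triggered: bool
    output: str


@dataclass
class Verdict:
    confirmed: bool
    attempts: List[Attempt] = field(default_factory=list)


def validation_loop(
    hypothesis: str,
    write_test: Callable[[str], str],
    run_test: Callable[[str], Tuple[bool, str]],
    refine: Callable[[str, str], Optional[str]],
    max_attempts: int = 3,
) -> Verdict:
    """Write a test, run it, read the result, and refine until confirmed or exhausted."""
    verdict = Verdict(confirmed=False)
    current = hypothesis
    for _ in range(max_attempts):
        test_code = write_test(current)          # write code to trigger the issue
        triggered, output = run_test(test_code)  # run it in a controlled environment
        verdict.attempts.append(Attempt(current, triggered, output))
        if triggered:
            verdict.confirmed = True             # exploitability shown, stop here
            break
        next_hypothesis = refine(current, output)  # adjust based on the result
        if next_hypothesis is None:
            break                                # nothing left to try: reject
        current = next_hypothesis
    return verdict


# Toy stand-ins (assumptions for illustration only).
def toy_write_test(hypothesis: str) -> str:
    return f"probe:{hypothesis}"


def toy_run_test(test_code: str) -> Tuple[bool, str]:
    triggered = test_code.endswith("len=4097")
    return triggered, ("crash" if triggered else "no crash")


def toy_refine(hypothesis: str, output: str) -> Optional[str]:
    return "len=4097" if hypothesis == "len=4096" else None


result = validation_loop("len=4096", toy_write_test, toy_run_test, toy_refine)
print(result.confirmed, len(result.attempts))  # True 2
```

Keeping every attempt in the verdict means a rejected finding still carries a record of what was tried, which is the kind of evidence a triage team can check.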
False positives are still a problem
Cloudflare said noise remains one of the hardest parts of AI-assisted vulnerability research. Two factors matter most: the programming language and the model’s tendency to speculate.
C and C++ produced more false positives because they allow direct memory control and expose bug classes that memory-safe languages reduce or eliminate.
Model bias also matters. When asked to find bugs, models may produce many “possibly” or “could in theory” findings. Those findings still cost human time, even when they turn out to be wrong.
| Noise source | Why it matters |
|---|---|
| C and C++ code | Memory-unsafe behavior creates more complex bug classes and more uncertain reports. |
| Speculative model output | Models may over-report possible bugs that do not lead to real exploitation. |
| Broad prompts | Asking one agent to scan an entire repository can produce weak coverage. |
| Missing reachability checks | A real bug may not matter if attacker-controlled input cannot reach it. |
Cloudflare says the harness matters
Cloudflare found that pointing a generic AI coding agent at a repository is not enough. Real vulnerability research needs structure, narrow scope, independent review, and repeatable validation.
The company built a vulnerability discovery harness around Mythos Preview. The process divided the work into smaller stages, with agents assigned to focused tasks instead of broad repository-wide prompts.
This approach improved coverage and reduced wandering. It also made the output easier to validate because each agent had a narrower question to answer.
- Recon agents map the repository, trust boundaries, entry points, and likely attack surface.
- Hunt agents examine specific attack classes inside narrow scopes.
- Validate agents try to disprove findings before they enter the queue.
- Gapfill stages re-check areas that earlier agents touched but did not fully cover.
- Dedupe stages combine findings with the same root cause.
- Trace agents check whether attacker-controlled input can reach the bug.
- Feedback stages turn reachable findings into new focused hunt tasks.
- Report agents produce structured output for security workflows.
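A rough sketch of how the later stages in that list could fit together is shown below. It assumes the hunt stage has already produced findings and that validation and reachability results are available as simple flags; `Finding`, `run_pipeline`, and the sample data are hypothetical, and the recon, hunt, gapfill, and feedback stages are left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Finding:
    root_cause: str
    location: str
    survived_validation: bool  # the validate agent failed to disprove it
    reachable: bool            # the trace agent found an attacker-controlled path


def validate(findings: List[Finding]) -> List[Finding]:
    """Drop findings that the validation stage managed to disprove."""
    return [f for f in findings if f.survived_validation]


def dedupe(findings: List[Finding]) -> Dict[str, List[Finding]]:
    """Group findings that share a root cause."""
    groups: Dict[str, List[Finding]] = {}
    for f in findings:
        groups.setdefault(f.root_cause, []).append(f)
    return groups


def trace(groups: Dict[str, List[Finding]]) -> Dict[str, List[Finding]]:
    """Keep a root cause if at least one of its locations is reachable."""
    return {
        cause: members
        for cause, members in groups.items()
        if any(m.reachable for m in members)
    }


def report(groups: Dict[str, List[Finding]]) -> List[dict]:
    """Emit one structured record per root cause."""
    return [
        {"root_cause": cause, "locations": sorted(m.location for m in members)}
        for cause, members in sorted(groups.items())
    ]


def run_pipeline(hunt_output: List[Finding]) -> List[dict]:
    return report(trace(dedupe(validate(hunt_output))))


# Made-up sample data for illustration.
hunt_output = [
    Finding("unchecked length in parse_header", "parser.c:120", True, True),
    Finding("unchecked length in parse_header", "parser.c:310", True, False),
    Finding("possible integer overflow", "util.c:44", False, True),
    Finding("stale pointer after free", "cache.c:88", True, False),
]
reports = run_pipeline(hunt_output)
print(reports)
```

Grouping by root cause before the reachability check is a design choice in this sketch: a root cause stays in the queue if any of its locations is reachable, so a reachable instance is not lost behind an unreachable duplicate.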
Why adversarial review helped
Cloudflare said an independent review agent reduced noise. This second agent used a different prompt and tried to disprove the original finding.
That matters because an AI model reviewing its own work may repeat the same assumptions. A deliberately adversarial review step catches more weak findings before humans spend time on them.
Cloudflare also found that splitting questions helped reasoning. Asking “is this code buggy?” and “can an attacker reach this bug?” worked better than asking one model to solve both at once.
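One way to picture the split is to treat the two questions as separate judgments made by separate reviewers. The sketch below is an assumption-laden illustration: `Claim`, `Decision`, `review`, and the toy reviewer and tracer are hypothetical, and simple string checks stand in for what would be independent model calls with different prompts.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Claim:
    description: str
    code: str


@dataclass
class Decision:
    is_bug: bool
    is_reachable: Optional[bool]  # None when the claim was disproved first
    rebuttal: Optional[str]


# The reviewer tries to disprove the claim: it returns a rebuttal or None.
Reviewer = Callable[[Claim], Optional[str]]
# The tracer answers only the reachability question.
Tracer = Callable[[Claim], bool]


def review(claim: Claim, reviewer: Reviewer, tracer: Tracer) -> Decision:
    """Ask 'is this code buggy?' and 'can an attacker reach it?' separately."""
    rebuttal = reviewer(claim)
    if rebuttal is not None:
        return Decision(is_bug=False, is_reachable=None, rebuttal=rebuttal)
    return Decision(is_bug=True, is_reachable=tracer(claim), rebuttal=None)


# Toy stand-ins (assumptions for illustration only).
def toy_reviewer(claim: Claim) -> Optional[str]:
    if "len(" in claim.code:
        return "length is checked before the copy"
    return None


def toy_tracer(claim: Claim) -> bool:
    return "request" in claim.code


unchecked = Claim("copy without bounds check", "buf[:n] = request.body[:n]")
checked = Claim("copy without bounds check", "if n <= len(buf): buf[:n] = data[:n]")

print(review(unchecked, toy_reviewer, toy_tracer))
print(review(checked, toy_reviewer, toy_tracer))
```

Because the reviewer only looks for reasons to reject, it does not inherit the original agent's assumptions, and the reachability question is asked only for claims that survive.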
Project Glasswing limits access to Mythos Preview
Anthropic introduced Project Glasswing to give selected partners access to Claude Mythos Preview for defensive cybersecurity work. The program includes major technology and infrastructure organizations.
Anthropic says the goal is to help defenders find and fix vulnerabilities in critical systems before attackers get the same level of capability. The company has also said Mythos Preview will not be released generally in its current form.
Reuters reported that Anthropic is allowing Project Glasswing partners to share threat findings, best practices, tools, and code more broadly when responsible disclosure rules allow it.
Safety remains a major issue
Cloudflare said Mythos Preview sometimes refused to help with legitimate vulnerability research tasks, even though it completed similar tasks when framed differently. The company warned that these organic refusals are not consistent enough to work as a full safety boundary.
This creates a difficult balance. A model that can help defenders build proof-of-concept exploits can also help attackers if access and safeguards are not carefully controlled.
Cloudflare said future cyber-focused frontier models will need stronger safeguards before broader availability. Anthropic has also said it needs better controls that can detect and block the most dangerous outputs.
| Benefit | Risk |
|---|---|
| Faster vulnerability discovery | Attackers may also find flaws faster. |
| Working proof-of-concept validation | Exploit development may become easier to automate. |
| Better prioritization for defenders | Security teams may face more high-quality attack attempts. |
| Scalable review of large codebases | False positives can still overload triage teams. |
What security teams should learn from this
The main lesson is not simply to scan faster. Cloudflare warned that faster patching alone will not solve the problem if regression testing and deployment remain slow and the surrounding architecture stays the same.

Security teams need controls that make exploitation harder even when a bug exists. That includes defenses in front of applications, strict isolation between components, and systems that can roll out mitigations quickly.
Teams also need better vulnerability pipelines. AI can help find and prove bugs, but organizations still need validation, reachability analysis, safe patching, and clear ownership.
- Build vulnerability workflows around narrow tasks, not broad prompts.
- Use independent validation to reduce noisy AI findings.
- Separate bug discovery from reachability analysis.
- Invest in rapid but safe patch deployment.
- Reduce blast radius with isolation and least-privilege design.
- Protect public-facing applications with controls that block exploit paths.
- Prepare for shorter timelines between disclosure and exploitation.
The attack timeline is shrinking
Cloudflare’s findings show that AI-assisted vulnerability research has moved closer to practical exploit development. The same tools that help defenders close security gaps may also help attackers move faster.
That is not a reason to stop using these systems defensively. It makes controlled use, stronger safeguards, and faster remediation more urgent.
For defenders, the next stage is clear. Build structured AI-assisted security workflows now, reduce the impact of exploitable bugs, and assume attackers will soon use similar automation at scale.
FAQ
Mythos Preview is an unreleased Anthropic model being tested through Project Glasswing for defensive cybersecurity work. It is designed to help selected partners find and fix vulnerabilities in critical software systems.
Cloudflare found that Mythos Preview can identify vulnerabilities, chain smaller bugs into working exploit paths, and generate proof-of-concept code in controlled environments to confirm whether a flaw is exploitable.
The model is not enough on its own. Cloudflare said it still needs a structured harness, narrow task scoping, independent validation, deduplication, reachability analysis, and human security processes to handle findings responsibly.
A proof-of-concept helps defenders confirm that a vulnerability is real and exploitable. That reduces triage time and helps teams prioritize bugs that need urgent remediation.
Anthropic says Mythos Preview is not generally available. It is currently provided through Project Glasswing to selected partners for controlled defensive cybersecurity use.