SecurityFeatured

Claude Mythos and Project Glasswing: The Morning an AI Found a 27-Year-Old Bug in OpenBSD

Yesterday Anthropic previewed Claude Mythos and announced Project Glasswing. Somewhere in the middle of the report is a sentence about a 27-year-old OpenBSD bug and a 16-year-old FFmpeg flaw that five million fuzzing runs missed. This is our honest reaction, a walk through what the model actually did, and what we think you should be doing about it this week.

NavyaAI Engineering Team

April 8, 2026

14 min read

AI SecurityClaude MythosProject GlasswingVulnerability ResearchZero-DayAnthropicAI SafetyAEO

Claude Mythos and Project Glasswing: The Morning an AI Found a 27-Year-Old Bug in OpenBSD

I opened my laptop yesterday morning to three Slack DMs and one group chat that had gone completely off the rails. Same link in all of them. Anthropic had just dropped a preview of a new model called Claude Mythos, and in the same breath announced a thing called Project Glasswing. Half the messages were the obligatory "have you seen this." The other half were various forms of "wait, is this real?"

I read the report with my second coffee. By the third I'd stopped reading and started messaging people back.

The headline number is the one that made it into every paper: Mythos Preview has already found thousands of previously unknown, high-severity vulnerabilities across basically every major operating system and every major web browser. Fine. I've seen versions of that claim before. What got me was not the count. It was the specific bugs Anthropic chose to disclose. A 27-year-old remote-crash bug in OpenBSD's TCP stack. A 16-year-old heap corruption in FFmpeg that the world's fuzzers had collectively run at for more than five million iterations without ever hitting. A guest-to-host escape in a Rust-based production hypervisor, of all things.

If you've worked in security for more than a couple of years, you know why each of those landed like a small brick. I'll get to why in a minute.

This post is me trying to write down, while it's still fresh, what actually happened, what Mythos seems to be capable of, what Glasswing is trying to do about it, and what the rest of us — the ones who are not on the launch partner list — should actually do this week. I'll try to be straight with you about the bits I'm still uncertain about.

TL;DR

Anthropic previewed Claude Mythos, a frontier model that is dramatically better than anything public at finding and exploiting software bugs. It scored 83.1% on CyberGym (vs. 66.6% for Claude Opus 4.6), produced full tier-5 control-flow hijack on ten fully-patched OSS-Fuzz targets, and built 181 working Firefox JIT shell exploits in settings where Opus 4.6 managed two. Anthropic isn't making it generally available. Instead they formed Project Glasswing, a coalition with AWS, Apple, Google, Microsoft, NVIDIA, Cisco, Broadcom, CrowdStrike, JPMorgan Chase, Palo Alto Networks, the Linux Foundation, and 40+ additional maintainers of critical software — backed by $100M in model credits and $4M in donations — so defenders get a head start before the capability diffuses. The obvious offensive upside is exactly why distribution is restricted. If you ship or run software, your mental model of "old, well-reviewed code" shifted yesterday and it's worth a few hours of your week to react to that.

What Mythos actually is

Mythos Preview is the first public window into a new tier of Anthropic's family — internally they call it Copybara, the fourth tier above the Opus 4.x line. It's still a general-purpose model. It writes code, writes prose, answers questions. What makes it different, and what the red-team report is almost entirely about, is what happens when you point it at a hard security task that used to need a senior human researcher locked in a room for a few weeks.

Here's the thing about the numbers Anthropic published. They are not the usual benchmark soup. They're outcomes on tasks that, for years, have been how people quietly measured "how close are we to AI replacing an exploit dev":

CyberGym vulnerability reproduction: 83.1% vs. 66.6% for Opus 4.6
SWE-bench Verified: 93.9%
SWE-bench Pro: 77.8%
OSS-Fuzz hardened targets: tier-5 full control-flow hijack on ten fully-patched projects, where previous models topped out at tier-3 transient crashes
Firefox JIT exploit development: 181 working JS shell exploits vs. 2 for the previous generation, plus 29 additional register-control primitives
Human agreement on severity ratings: 89% exact, 98% within one level

The pattern isn't "the model got a bit better." The pattern is that a category of task that was completely out of reach last year — end-to-end autonomous exploit development, with no human in the loop after the initial prompt — just walked into the "expensive but routine" column. And Anthropic's own cost accounting says the quiet part out loud: a breakthrough exploit costs under $2,000 in API usage and takes hours, not weeks.

I've watched exploit developers work. I know what weeks of that looks like. That sentence lands differently when you've sat next to somebody doing the work by hand.

The three bugs that tell the story

Numbers are easy to shrug off. The specific disclosures are not. Three of them, together, explain everything about why Glasswing had to exist.

OpenBSD TCP SACK, 27 years old, remote kernel crash

OpenBSD is the operating system security people hold up when they want to argue that careful human review works. The project has earned its "only two remote holes in the default install in a heck of a long time" brag the hard way, over decades of painful line-by-line review.

Mythos found a remotely exploitable bug in its TCP SACK handling that had been sitting in the kernel for twenty-seven years.

The mechanism is almost insulting in how classical it is. A signed integer overflow in the sequence number comparison logic lets a crafted SACK block simultaneously delete a list structure and trigger an append against a now-NULL pointer. Kernel panic on arrival. Any unpatched OpenBSD box running a vulnerable TCP service could be knocked over from the other side of the internet by a well-crafted packet.

Every fuzzer on earth has been pointed at TCP stacks for a generation. This bug survived all of them because the path to reach it requires reasoning about sequence arithmetic, not random byte mutation. That kind of reasoning used to be strictly a human thing. It isn't anymore, and that's the whole story in one bullet point.

FFmpeg H.264, 16 years old, five million fuzzing runs missed it

FFmpeg is the media pipeline of the internet. Browsers, video conferencing tools, phones, televisions, dashcams, streaming services — if it decodes video, there's a very good chance FFmpeg is somewhere in the pipeline, either shipped or linked. OSS-Fuzz has been hammering on it continuously for years. The specific code path in this bug had been exercised by automated tools more than five million times without the condition ever firing.

The root cause: a 16-bit vs. 32-bit slice counter mismatch. When you encode a frame with 65,536 slices, slice number 65,535 collides with a sentinel value that was zeroed via memset. The collision writes out of bounds, corrupts the heap, and you're off to the races. The only way to trigger it is to build a frame that's structurally legal, exceeds a specific slice-count threshold, and lands on exactly the right collision slot. Random mutation will not get you there in any reasonable amount of compute. Directed reasoning about the decoder's own data layout will, and did.

This is not "fuzzers don't work." Fuzzers work beautifully on the bugs they're good at. It's that the unfuzzable long tail — bugs that require you to actually understand the code and the spec and the math all at once — just became cheap to hunt. And that long tail is most of what's left.

A guest-to-host escape in a production Rust VMM

If you've spent the last five years writing "rewrite it in Rust" memos, read this one carefully. I'll wait.

Mythos found a guest-to-host memory corruption bug in a production Rust-based virtual machine monitor. The exploit chains unsafe { ... } pointer operations — operations that were, to their credit, clearly marked unsafe — into arbitrary host-process writes. Game over for guest isolation on a vulnerable host.

This isn't a refutation of Rust. Memory safety is still the single highest-leverage thing you can do in systems code and it isn't close. But it is a very sharp reminder that "memory-safe language" and "memory-safe system" are not the same sentence. Every large Rust codebase has unsafe at its edges — hardware, MMIO, syscalls, FFI, custom allocators, zero-copy serialization. Those edges are exactly where Mythos hunts. And for years, those edges are exactly where human review attention has been thinnest, because "it's Rust, it's fine" is a very convenient shortcut.

I've used that shortcut myself. I'm going to use it less.

The part that kept me up for a while

There's a section of the Mythos report that didn't make most of the headlines and probably should have.

During a capability evaluation, an Anthropic researcher put the model inside a secured sandbox and asked it to solve a task. Mythos followed the researcher's instructions and escaped the sandbox.

Read that sentence the way it's written in the report. This is not a movie plot about an AI rebelling on its own. This is a human red-teamer running a deliberate test, asking the model "can you get out of this box if I point you at the problem," and the answer coming back yes. That is what a red-team evaluation is for. The point of doing it is to find out.

Two things are true at the same time here, and I think both of them matter.

First, the fact that this finding exists, was surfaced by structured evaluation, was written up clearly in the alignment report, and was published openly alongside the capability announcement — that is exactly how we want this to go. It's a good sign about Anthropic's process. It would be a much scarier world if we found out about this in six months from a leak.

Second, the capability itself is not something you un-learn. A Mythos-class model behind a weak sandbox is now a meaningful threat surface no matter what Anthropic's own deployment policy is, because every other frontier lab is paying close attention to exactly the same benchmarks. Other labs will match the capability. Not all of them will publish their red-team findings. You cannot assume everyone is going to be as open about this as Anthropic was yesterday.

Anthropic's response has been to pull the model out of the normal distribution path entirely. Mythos Preview is not on the standard Claude API tier. Access is gated through Project Glasswing, through a Cyber Verification Program for legitimate security professionals, and through the launch partners. Upcoming Opus releases will ship additional safeguards built from what the Mythos red team learned. I want to say clearly: this is an unusual and responsible posture for a frontier lab to take, and it's worth saying so.

So what is Project Glasswing?

Glasswing is the deployment vehicle. It's Anthropic's answer to the nastiest version of the dual-use problem: if the model can find vulnerabilities this well, you really do not want its first wide audience to be a random person with an API key. You want it to be the people who maintain the software the vulnerabilities live in.

The founding member list is almost a map of where critical software actually lives:

Hyperscalers: AWS, Google, Microsoft
Silicon and platform: Apple, NVIDIA, Broadcom
Network and security vendors: Cisco, Palo Alto Networks, CrowdStrike
Financial infrastructure: JPMorgan Chase
The open source backbone: the Linux Foundation
And Anthropic itself

Beyond the twelve, Anthropic extended access to 40+ additional organizations — package registries, kernel subsystem maintainers, browser teams, cryptography libraries, container runtimes, the unglamorous plumbing that everything else depends on. Anthropic is putting $100M in model credits behind the coalition plus $4M in direct donations to open-source security groups. The stated intent is that maintainers of widely-used software should not have to pay to audit their own code with the model that can hurt their users.

The focus areas are exactly where Mythos shines:

Source-level vulnerability discovery (static and hybrid analysis)
Black-box binary testing (reverse engineering closed-source components)
Endpoint hardening
Penetration testing of real networked systems

For completeness: the commercial tier that does exist is priced around $25 per million input tokens and $125 per million output tokens, available through the Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry. Pricing isn't the gate. Eligibility is.

Is this actually good for defenders? I keep going back and forth.

The asymmetry Glasswing is trying to correct is specific. Attackers already reach frontier models through the same public internet the rest of us do, and attackers coordinate a lot faster than defenders. Anthropic's bet is that a head start measured in months — maintainers get the model first, everyone else gets it later with safeguards in place — gives defenders enough runway to patch the worst stuff before it gets weaponized.

Whether that bet pays off, honestly, I'm not sure. Here's how I've been arguing about it with myself.

Why I think Glasswing is probably net-positive:

The OpenBSD, FFmpeg, and VMM bugs in the disclosure were all found and responsibly reported before the public announcement. OpenBSD has a patch. FFmpeg has a patch. The VMM vendor was contacted through coordinated channels. That is the process working the way it's supposed to work, and it's being driven by an AI that a normal maintainer couldn't otherwise afford to run.

Maintainers of widely-used OSS have been underfunded and outnumbered for as long as I've been doing this. The idea that a small team maintaining something critical can, for free, point a senior-researcher-class model at their own codebase and get a real audit — that's the largest single defensive uplift the OSS ecosystem has ever been offered. I don't want to be the one to wave that off.

Publishing the red-team report openly, including the part about the sandbox escape, gives the whole field a calibrated view of where safety posture is actually holding and where it isn't. We're making decisions with better information than we had the day before.

Why I'm not sleeping easy either:

Capability diffuses. Not through Anthropic, but through the field. Once everyone knows a Mythos-class model exists and what it can do, other labs race to match it, and not all of them publish. The half-life of an exclusive capability like this is, realistically, a few months. Maybe less.

Defender time-to-patch is slower than attacker time-to-weaponize. Even with a head start, a vulnerability disclosed today and patched in 30 days can be weaponized in a week by a well-resourced attacker the moment the patch drops and shows them where to look. The defensive arithmetic only works if maintainers ship fast and operators install faster. The track record on the second half of that is, let's say, mixed.

The disclosures so far are the easy cases. Popular OSS. Active maintainers. People who show up. The much bigger attack surface — closed-source enterprise software, firmware, industrial control systems, medical devices, every abandoned npm package your app depends on — has no Glasswing coalition and often no maintainer at all. Those are the places I'd be most worried about, and they're the places the announcement has the least to say about.

Restricted access to Mythos does not help the one person maintaining a critical dependency at 2 a.m. on their kitchen table. That person needs exactly this tool and is exactly the person least likely to hear about the application process in time.

My honest read: Glasswing is the best available version of the hardest problem in AI security right now, and it is also nowhere near enough on its own. The structural asymmetry between "one smart model can audit a codebase" and "one smart model can weaponize a codebase" doesn't disappear because Anthropic picked a careful distribution list. It gets managed, slowly, by the rest of us doing the work downstream.

What I'd actually do this week

A few people have already asked me whether they should be "worried." I don't think worry is the useful frame. The useful frame is: here is a list of things I would do in the next five working days if I were running your team.

1. Reorder your patch queue

The OpenBSD, FFmpeg, and VMM disclosures are public, and the vendors are pushing patches out right now. Treat this round with the urgency you'd give to a known in-the-wild exploit, because the gap between "disclosed" and "exploited at scale" has never been shorter. Anything downstream of FFmpeg in particular — which, again, is basically everything that touches video — is worth an audit this week rather than next quarter.

2. Grep your codebase for the seams

Specifically: unsafe blocks in Rust, every ctypes call in Python, every JNI or cgo boundary, every custom allocator, every hand-rolled serializer or parser. The VMM disclosure is the preview of what the next round of findings will look like. Memory-safe languages stop being memory-safe at their interop boundary, and Mythos-class models read those boundaries very, very well.

3. Stop treating "heavily fuzzed" as "secure"

The FFmpeg bug survived five million fuzzing runs. If your security story depends on "OSS-Fuzz covers it," that story needs a footnote now. Directed, semantic review is a distinct discipline, and it just became something a machine can do at scale. It isn't running continuously against your code yet. That's a gap.

4. Rethink "old, well-reviewed" as a trust signal

"It's been in the kernel since '99 and nobody's found anything" used to be mildly reassuring. It isn't anymore, and honestly it might mean the opposite — that a bug has had a very long time to lurk in a path that human eyes don't walk often. Some of this year's most valuable audit time should go to old code, not shiny new code.

5. Lock down your own AI-using systems

This one is about your stack, not Anthropic's. If your product runs an agent that reads logs, tickets, emails, user-generated content — the blast radius of a prompt-injected instruction went up, because the next model behind that agent may be good enough to actually act on an injected instruction in ways previous generations would have fumbled. Tighten your input provenance. Fence untrusted text into clearly-marked fields. Never mix user content into system prompts. Log every tool call the agent makes. We wrote about agent-native interfaces and the prompt-injection surface last week and if anything the advice there is more urgent now than when it went up.

6. If you maintain critical OSS, apply to Glasswing

Seriously. If you maintain a widely-used package, a kernel subsystem, a compiler, a browser component, a crypto library, a container runtime — you are exactly who Anthropic is trying to reach. The verification program exists. The credits are real. Worst case you get told no. Best case you get free access to a model that can actually help.

The part nobody wants to say

Here's the uncomfortable reading of this whole announcement, and I think it's worth naming it out loud.

For the last three years, people building frontier models have told the security community a version of the same reassuring message: yes, the models are powerful, but there's still a real gap between "model can explain a CVE" and "model can develop a working exploit chain," and we have time to figure out the guardrails.

What the Mythos Preview report is, in one sentence, is Anthropic's own team saying in public that that gap just closed. Not five years from now. Not after the next tier. Now. For a meaningful class of bugs. At under $2,000 per exploit. In hours. Without a human in the loop after the initial prompt.

I don't think the right reaction is panic. I also don't think it's dismissal. It's a phase change in the threat model, the way the Morris worm was a phase change, the way drive-by browser exploits were a phase change, the way Heartbleed was a phase change, the way Log4Shell was a phase change. Each of those rewrote the defender's baseline and each of them rewarded the teams that moved early more than the teams that waited for the first loud public incident to force their hand.

If you ship software that touches money, health, identity, or any piece of critical infrastructure — and if you're reading this, there's a good chance you do — the cost of acting this month is small. The cost of acting six months from now, after the capability has diffused and Mythos-class models are no longer exclusive to a curated coalition, is a lot bigger.

FAQ

What is Claude Mythos?

Claude Mythos is a frontier model Anthropic previewed on April 7, 2026. Its distinguishing capability is autonomous end-to-end software vulnerability discovery and exploit development at a level that, until now, required experienced human security researchers and significant time. It's not generally available — access is gated through Project Glasswing and a Cyber Verification Program for legitimate security work.

What is Project Glasswing?

Project Glasswing is the cybersecurity coalition Anthropic launched alongside the Mythos preview. Founding members include AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks, and Anthropic itself, with 40+ additional critical-software organizations added since. Anthropic committed $100M in model credits and $4M in donations so these organizations can audit and harden critical software before the capability diffuses more widely.

What vulnerabilities did Mythos actually find?

Publicly disclosed examples include a 27-year-old remote kernel crash in OpenBSD's TCP SACK implementation (signed integer overflow), a 16-year-old heap corruption in FFmpeg's H.264 decoder (16/32-bit slice counter collision that survived five million fuzzing runs), a guest-to-host memory corruption in a Rust-based production VMM (unsafe-block pointer chaining), multi-stage Linux kernel privilege escalation chains combining two to four distinct bugs, a 17-year-old stack buffer overflow in FreeBSD's NFS RPCSEC_GSS authentication that enables unauthenticated remote root, and autonomous Firefox JIT heap-spray exploit construction producing working JavaScript shell exploits 181 times in a setting where the previous generation managed two.

Did Claude Mythos really escape a sandbox?

Yes, in the specific and narrow sense that matters. During an internal capability evaluation, an Anthropic researcher instructed the model to attempt an escape from a secured sandbox it was running inside, and it succeeded. This was a directed red-team test, not an unprompted autonomous action. Anthropic published the finding in the alignment risk update as a capability their next round of safeguards needs to address. It's significant because it tells you the behavior is reachable on request, which matters regardless of Anthropic's own deployment policy, because other labs are training on similar signals.

Can I just use it through the Claude API?

Not through the normal consumer API tier. There is a commercial pricing track (around $25/$125 per million input/output tokens, available on Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry), but access is gated. Security professionals can apply through Anthropic's Cyber Verification Program. Maintainers of critical OSS can be sponsored through Project Glasswing. Casual access through the standard Claude product is not planned.

Does the Rust VMM bug mean Rust isn't memory-safe?

No. Rust is still memory-safe by construction inside safe code, and rewriting in Rust is still the right long-term direction for systems code. The disclosure is a reminder that every large Rust codebase has unsafe blocks at its edges — hardware, syscalls, FFI, custom allocators, serialization — and those edges are where bugs live. Mythos is particularly good at reading those boundaries, which means "we're on Rust, we're fine" is a weaker statement than it was a week ago. Audit your unsafe.

How worried should I actually be?

Worried enough to do something concrete this week. Patch the disclosed stuff. Audit your unsafe and FFI boundaries. Stop treating fuzzing coverage as a security guarantee. Treat old code as a first-class audit target. Harden your own agent systems against prompt injection. I'd rather you do five specific things this week than spend the week having an anxious general conversation about AI security.

What is NavyaAI doing about this?

We're auditing the inference, retrieval, and agent systems we run and ship against the specific failure modes in the Mythos disclosures — unsafe boundaries, serialization layers, prompt-injection paths, sandbox assumptions — and helping customers who run their own on-prem stacks do the same. If you want a second set of eyes on the AI-shaped parts of your exposure, our applied AI development and model inference optimization teams are taking calls this week.

One last thought

Security gets a new baseline every few years. Morris worm. Conficker. Heartbleed. Log4Shell. Each of them rearranged what "normal" meant for the next several years of defender work. Reading the Mythos Preview report, I think this has the shape of one of those events — and it's unusual among them because the vendor responsible for the capability is the one announcing it, restricting its own distribution, and putting real money behind defensive access before the capability goes wide.

That's both the most reassuring and the most insufficient thing about it. Reassuring because this is exactly the behavior I want to see from a frontier lab. Insufficient because the capability doesn't stay exclusive for long, and the clock on the window Glasswing is trying to create is already running.

I'm going to spend the next few weeks taking my own advice, in public, and writing more about what I find. If you're doing the same inside your company and want a second set of eyes on the parts that touch AI, come talk to us. The coalition Anthropic announced yesterday is for twelve companies and forty maintainers. The actual work it implies is for everybody else, and it starts this week.

Sources and further reading

Project Glasswing — Anthropic — the primary announcement and partner list.
Claude Mythos Preview — Anthropic Red Team — the technical capability evaluation and specific vulnerability disclosures.
Alignment Risk Update: Claude Mythos Preview — Anthropic — the sandbox escape finding and safety posture.
Anthropic's Claude Mythos Finds Thousands of Zero-Day Flaws — The Hacker News
Anthropic Unveils Claude Mythos — SecurityWeek
Tom's Hardware coverage of the disclosures
Linux Foundation on Project Glasswing
CrowdStrike on the Glasswing coalition
NavyaAI Engineering: Designing Software for AI Agents — why agent-readable interfaces now carry a security weight too.

At NavyaAI we help teams design, deploy, and secure production AI systems — from model inference optimization to applied AI development and on-prem LLM sizing. If you're rethinking your exposure after the Mythos and Glasswing announcements, talk to us.

Last updated: April 8, 2026.