Posted by dominicq 7 hours ago
We're literally talking about the biggest computers on the planet ever, trained with the biggest amount of data ever available to a system, with the biggest investment ever made by man or close to it and...
The subtlest security bug it can find required: going 28 years in the past and find a...
Denial-of-service?
A freaking DoS? Not a remote root exploit. Not a local exploit.
Just a DoS? And it had to go into 28 years old code to find that?
So kudos, hats off, deep bow not to Mythos but to OpenBSD? Just a bit, no!?
I mean isn't that most of it? If you put a snippet of code in front of me and said "there's probably a vulnerability here" I could probably spend a few hours (a much lower METR time!) and find it. It's a whole other ballgame to ask me with no context to come up with an exploit.
It also sounds like that is how mythos works too. Which makes sense - the linux kernel is too big to fit in context
find ./ \( -name '*.c' -o -name '*.cpp' \) -exec agent.sh -p "can you spot any vulnerabilities in {}" \;Finding a needle in a haystack is easy if someone hands you the small handful of hay containing the needle up front, and raises their eyebrows at you saying “there might be a needle in this clump of hay”.
Gating access is also a clever marketing move:
Option A: Release it but run out of capacity, everyone is annoyed and moves on. Drives focus back to smaller models.
Option B: A bunch of manufactured hype and putting up velvet ropes around it saying it’s “too dangerous” to let near mortals touch it. Press buys it hook, like, and sinker, sidesteps the capacity issues and keeps the hype train going a bit longer.
Seems quite clear we’re seeing “Option B” play out here.
"""
Your task is to study the following directive, research coding agent prompting, research the directive's domain best practices, and finally draft a prompt in markdown format to be run in a loop until the directive is complete.
Concept: Iterative review -- study an issue, enumerate the findings, fix each of the findings, and then repeat, until review finds no issues.
<directive>
Your job is to run a security bug factory that produces remediation packages as described below. Design and apply a methodology based on best practices in exploit development, lean manufacturing, threat modeling, and the scientific method. Use checklists, templates, and your own scripts to improve token efficiency and speed. Use existing tools where possible. Use existing research and bug findings for the target and similar codebases to guide your search. Study the target's development process to understand what kind of harness and tools you need for this work, and what will work in this development environment. A complete remediation package includes a readme documenting the problem and recommendations, runnable PoC with any necessary data files, and proposed patch.
Track your work in TODO.md (tasks identified as necessary) LOG.md (chronological list of tasks complete and lessons) and STATUS.md (concise summary of the current work being done). Never let these get more than a few minutes out of date. At each step ensure the repo file tree would make sense to the next engineer, and if not reorganize it. Apply iterative review before considering a task complete.
Your task is to run until the first complete remediation package is ready for user review.
Your target is <repo url>.
The prompt will be run as follows, design accordingly. Once the process starts, it is imperative not to interrupt the user until completion or until further progress is not possible. Keep output at each step to a concise summary suitable for a chat message.
``` while output=$(claude -p "$(cat prompt.md)"); do echo "$output"; echo "$output" | grep -q "XDONEDONEX" && break; done ```
</directive>
Draft the prompt into prompt.md, and apply iterative review with additional research steps to ensure will execute the directive as faithfully as possible.
"""
> “Opus 4.6 is currently far better at identifying and fixing vulnerabilities than at exploiting them.” Our internal evaluations showed that Opus 4.6 generally had a near-0% success rate at autonomous exploit development. But Mythos Preview is in a different league. For example, Opus 4.6 turned the vulnerabilities it had found in Mozilla’s Firefox 147 JavaScript engine—all patched in Firefox 148—into JavaScript shell exploits only two times out of several hundred attempts. We re-ran this experiment as a benchmark for Mythos Preview, which developed working exploits 181 times, and achieved register control on 29 more.
We prepare security measures based on the perceived effort a bad actor would need to defeat that method, along with considering the harm of the measure being defeated. We don't build Fort Knox for candy bars, it was built for gold bars.
These model advances change the equation. The effort and cost to defeat a measure goes down by an order of magnitude or more.
Things nobody would have considered to reasonably attempt are becoming possible. However. We have 2000-2020s security measures in place that will not survive the AI models of 2026+. The investment to resecure things will be massive, and won't come soon enough.