Posted by ricksunny 3 days ago
> "On June 11th Mark Warner, the vice-chair of the Senate Intelligence Committee, said that General Joshua Rudd, who leads the National Security Agency and the Pentagon’s Cyber Command, had told him that Mythos “broke into almost all of our classified systems, not in weeks, but in hours”"
Why:
1. It's a paraphrase of a second-hand conversation, and (at least) the last two 'telephone game' recipients are a U.S. Senator and a general, not security-domain or IT experts.
2. Motivated communication: the Senator claimed this to justify the necessity of unprecedented restrictions that he agrees with.
3. The original testimony to the Intelligence Committee was almost certainly detailed, nuanced, and highly classified, making this an extreme paraphrase.
In saying this, I'm not claiming that Mythos isn't a security issue or that something directionally like this wasn't reported. But given the indirect, circuitous path, it's quite easy to imagine the original testimony was more like "Mythos identified a potential vulnerability we rated 'Severe' in a critical system, and we believe it could find similar vulnerabilities in any of our systems."
>An update. A US official tells me that Sen. Warner misunderstood the NSA director Gen. Rudd in this case. Rudd did use the 'hours, not weeks' wording, but the use of Mythos in this context was—as widely assumed—part of a red-teaming effort, i.e. testing the security of internal networks
State-sponsored, non-public penetration fine-tunes (possibly of public models) can likely do it even faster.
An unsupervised penetration RL loop is an ideal setup, similar to an optimization one: it's relatively easy to gain function on it.
And the fact that all our systems are riddled with security holes shouldn't be too much of a surprise, given how we all know software is developed and how tech debt and chores are constantly underbudgeted. (Plus, I think this underscores that any one human's knowledge and attention are inherently limited, and even the best PR review is going to let all kinds of security holes through.)
And the threat actors that would find that information "useful" already know it.
All of our IT security is a mess, the NSA director is just confirming what should be common knowledge.
- With a weaker model, the time to break into the system might grow so large that it becomes infeasible. This is similar to how password hashes can be brute-forced, but if the password is long enough, that is not going to happen in our lifetime.
- There might be problems which are inherently unsolvable with a lower level of intelligence. For example, your dog won't derive calculus from scratch, even if it lived forever.
- LLMs might be biased in such a way that they never explore the entire solution space, no matter how many attempts are made. Some models are notorious for getting stuck in a loop, trying small variations of the same approach every time, even though it is doomed to fail. This can be counteracted somewhat with higher sampling temperature, but that hurts reasoning capabilities.
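The password-hash comparison in the first bullet can be made concrete with back-of-the-envelope arithmetic. A minimal sketch, assuming a 62-character alphabet and a hypothetical attacker rate of 10^10 guesses per second (neither figure comes from this thread):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def years_to_exhaust(length: int, alphabet_size: int = 62,
                     guesses_per_second: float = 1e10) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    # The keyspace grows exponentially with length.
    keyspace = alphabet_size ** length
    return keyspace / guesses_per_second / SECONDS_PER_YEAR

for n in (8, 12, 16):
    print(f"{n} chars: {years_to_exhaust(n):.3g} years")
```

Under those assumed numbers, 8 characters fall within hours, while 16 characters take on the order of 10^11 years. Each extra character multiplies the cost by the alphabet size, which is the sense in which a slower attacker can be priced out entirely.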
The ability to reproduce an exact copy of Hamlet does not make one Shakespeare. A monkey on a typewriter may very well generate Shakespeare eventually, but it wouldn't understand Shakespeare then any more than it could immediately. Likewise, a dog may put together some string of text that includes a derivation of calculus, but at no time will it be able to apply that derivation to solve mathematical problems.
It's a line of reasoning meant to shut off empathy for the here and now. It sounds good, along the lines of Baywatch: if you're jumping into a life-saving situation and have to choose between further harming your victim and being harmed yourself, you choose harming the victim, because without you there to save both of you, it's fatal. The difference is that here you are directly or indirectly pushing your victim into the water and then claiming you'll altruistically save them at a later date.
It's just delusions to keep moving forward.
https://www.csun.edu/~dgray/BE528/Pennigs2003Dogs_Calculus.p...
Let's just take GPT 5.5 and Opus 4.8 as an example. Both are worse than Mythos 5, but they're capable of quite a bit when the guardrails are lifted and they're paired with a skilled human operator. They're more than "good enough" to reach the same result with the addition of some human effort.
We're not talking about dogs, but LLM systems.
Mythos is not exploring entire solution space either.
Usually looping is solved by repetition/frequency/presence/n-gram penalties/DRY/min-p sampling, not temperature but we're not talking about small models that have those classes of issues here.
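For readers unfamiliar with the penalties listed above, here is a minimal sketch of frequency and presence penalties applied to raw logits before sampling. The function name, the toy vocabulary, and the penalty values are illustrative assumptions, not any particular inference library's API:

```python
from collections import Counter

def apply_penalties(logits: dict[str, float], generated: list[str],
                    frequency_penalty: float = 0.5,
                    presence_penalty: float = 0.3) -> dict[str, float]:
    """Lower the logit of every token that already appeared in the output."""
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        # Frequency penalty scales with how often the token appeared;
        # presence penalty is a flat cost for having appeared at all.
        adjusted[token] = (logit
                           - frequency_penalty * count
                           - presence_penalty * (1 if count > 0 else 0))
    return adjusted

# Toy example: the model has emitted "retry" three times in a row.
logits = {"scan": 2.0, "retry": 2.5, "pivot": 1.0}
history = ["retry", "retry", "retry"]
print(apply_penalties(logits, history))
```

In this toy case the repeated token drops below both alternatives, which is how such penalties nudge generation out of a loop without touching temperature.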
I am not talking about literally bruteforcing passwords (although LLMs are being used for that, too), but bruteforcing passwords and solving verifiable domain tasks have quite a few similarities, especially when considering rule-based and probabilistic bruteforce methods.
> We're not talking about dogs, but LLM systems.
Well, clearly dogs are not LLM systems. It is an analogy. If there is an important point on your mind that makes the analogy break down, feel free to spell it out.
> Mythos is not exploring entire solution space either.
Yes, but weaker models do not find the solution right away, so they need to try more often. But if they only try the same thing every time, they will never succeed, so we need some kind of guarantee that they try something different every time.
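That argument can be put in numbers. If each independent attempt succeeds with probability p, the chance of at least one success in k attempts is 1 - (1 - p)^k. A small sketch, with values of p made up purely for illustration:

```python
def success_within(k: int, p: float) -> float:
    """Probability of at least one success in k independent attempts."""
    return 1.0 - (1.0 - p) ** k

for p in (0.5, 0.001, 0.0):
    print(p, [round(success_within(k, p), 4) for k in (1, 100, 10_000)])
```

A small but non-zero p can be compensated for with more attempts, whereas p = 0 stays at zero no matter how large k gets. The formula also assumes the attempts are independent, which is exactly what a model stuck repeating the same approach violates.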
> Usually looping is solved by repetition/frequency/presence/n-gram penalties/DRY/min-p sampling, not temperature but we're not talking about small models that have those classes of issues here.
Those might help to reduce looping (at the cost of biasing the generation), but to guarantee that a model can generate all possible generations, we need non-zero probabilities for all tokens, not lower probabilities for likely tokens.
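The difference between reshaping a distribution and truncating it can be shown on a toy example. This is a minimal sketch with made-up logits: temperature-scaled softmax leaves every token with a non-zero probability (up to floating-point underflow), while a min-p style filter, as it is commonly described, removes low-probability tokens outright:

```python
import math

def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def min_p_filter(probs: list[float], min_p: float = 0.1) -> list[float]:
    """Zero out tokens below min_p * (top probability), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

logits = [4.0, 3.0, 0.0, -2.0]  # hypothetical values
hot = softmax(logits, temperature=1.5)
filtered = min_p_filter(softmax(logits), min_p=0.1)
print("temperature only:", hot)
print("with min-p:      ", filtered)
```

With the filter applied, the two least likely tokens can never be sampled, however many attempts are made. With temperature alone they remain reachable, just rarely.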
They are? Seems like a much worse way to brute force than a tight loop written in a compiled language.
https://huggingface.co/papers/2306.01545
Although most activity is likely hidden (blackhat or state)
The only thing I disagree on is that we lost that knowledge. We did not. There isn't much to capabilities; they actually simplify OS design, IMO.
It's my belief that we can have general purpose, easy to use, secure computing for everyone.
No UAC crap, or horrible systems like AppArmor, no virus scanners, etc... just computers that do what you want, and only what you want.
We could have had it decades ago, if things had happened in a slightly different sequence around the flood of personal computers.
And hardware glitches are a thing (edit: and supply chain attacks).
But I do agree that verified correct software can offer very strong guarantees that go well beyond those of commonly deployed software. We could have been in a much better place today.
Still not immune to being hacked, ofc. I think the last step would be making it commonplace again to build these things custom. That way threat actors would need more specific information to exploit you, and it'd be harder to have generic methods affecting millions of systems.
Regardless, there are no silver bullets, and tradecraft/opsec will always be a thing. Most compromises happen because people hand out keys unwittingly, rather than through 0days and crazy sploits. (Those do happen, but they're more expensive than phishing and just logging on under some dude's credentials.)
But there's much synergy there. Each enhances the other.
My brain hurts. How is a system where you can run whatever you want, however you want, but still keep sensitive things safely isolated possible?
Either you have restrictions on what you can run or access (in which case those limit sandboxed capabilities) or you have a hypothetically secure system, the security features of which you never leverage (because sandboxes have absolute freedom).
Unless you were talking about the ability to guarantee a monitor-only hypervisor or resource slice a machine into multiple tenants? (i.e. no/light touch hypervisor situations)
This is both the downside and the upside of isolation machines.
It's hard to make a completely isolated machine for all workflows and keep all data inaccessible to exploits at all times. But because each user has their own ways, it's more likely that 'your particular way of breaking the model' is not known or exploitable (yet).
A lot of holes you open are one-time actions from within a restricted domain.
In Qubes you have cross-domain tools in dom0 for this, and dom0 is very hard to reach (but not impossible).
And then supply chain is also hard. Qubes has canaries, but I think most ISOs that people copy into their dom0 and spin VMs off of don't go through such rigorous checks (depends on what you use, ofc).
This depends on the chosen level of compartmentalization. For most people, it might be sufficient to store passwords in a dedicated, offline VM and do everything else in another one. This will already be a huge improvement.
The dom0 has no network and doesn't manage, e.g., USB devices.
By definition, the latter implies limits on the former.
Either you have complete freedom to run whatever you want, however you want, or you enforce limits to guarantee system behavior and enforce isolation.
And if you do the latter... then you don't have the former.
Last VM escape in VT-d was discovered in 2006 by the Qubes founder, so I really feel safe on Qubes, https://en.wikipedia.org/wiki/Blue_Pill_(software)
I thought your original point above was that VMs freed you from having to come up with policy-based isolation rules (which have always been a UX weakness of policy-based isolation systems).
The point I was making is that VMs don't provide any security guarantees unless you also use the trusted hypervisor layer to enforce something.
At lightest touch, this might be time-slicing resources and ensuring they're evenly split between VMs, regardless of what individual VMs try to do.
But to provide policy-like granular security control on VMs, you fundamentally have to generate similar rules. E.g. network can only be used by this VM in this way, etc.
Which gets you right back to having to define policies.
From an architecture security perspective, sure, having a trusted hypervisor enforcing the rules is nice. But it doesn't fundamentally fix the problem of getting policies right... if you're trying to guarantee the same level of control.
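To illustrate why per-VM control still comes down to writing policy, here is a minimal default-deny rule check. Everything in it (the `Rule` type, the VM names, the host name) is hypothetical and is not the policy format of Qubes or any real hypervisor:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    vm: str        # hypothetical VM name
    resource: str  # e.g. "network", "usb", "clipboard"
    target: str    # e.g. a host name, or "*" for any target

# Default-deny: anything not listed is refused by the (hypothetical) hypervisor.
POLICY = {
    Rule("work", "network", "vpn.example.com"),
    Rule("browsing", "network", "*"),
}

def is_allowed(vm: str, resource: str, target: str) -> bool:
    """Return True only if an explicit rule permits this access."""
    return (Rule(vm, resource, target) in POLICY
            or Rule(vm, resource, "*") in POLICY)

print(is_allowed("vault", "network", "example.org"))     # "vault" has no rules
print(is_allowed("work", "network", "vpn.example.com"))  # explicitly listed
```

The enforcement point is trusted, but someone still has to decide what goes into `POLICY`, and every workflow that crosses a VM boundary needs another rule.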
They also plan to replace Fedora in dom0 with something minimized https://github.com/QubesOS/qubes-issues/issues/1919#issuecom.... Is this a problem for you?
There are some BSD spinoffs like 5BSD which might end up with a good capability model, but even there things like Capsicum have their limits, and IOMMU-based isolation is still a dream (because the entire OS kernel is in one privilege level, accessible as the root user, so DMA-capable devices kill a lot of those protections).
(My OS puts every subsystem, service, device driver, app, etc. in its own hardware VM; there will likely still be IPC bugs or hypercall bugs in that case, though.)
Nowadays with AI it's getting to a point where people can actually build these systems for themselves. Maybe that is a bigger threat to these big corporate tech companies than some security things. It will allow nations and companies to detach from their Tech...
From outside? Or did you have a shit ton of unpatched systems that only internal users could access?
Those "tapes" DOGE took away? Nothing on them can be considered private any more. That's how brute force risk happens. Mythos's risk is showing doorways to exfiltration, surely? Why bother when you can walk out the door with a data dump?
The NSA is just a highly specific subclass of the problem. Their traditional publicly stated approach to security is "nothing electronic which enters our domain leaves" and yet somehow they have assessed these systems as capable of breaching their walls? That's super bad.
I suspect they ran an analogue/instance inside their protection rings. I doubt they ran a test outside on the global internet. If they have actually lost control of their boundary, that's a bigger story (which I doubt), and contextually he could have been referring to information systems in the NSA's duty of care, not things inside Ft Meade.
In the end I got to help write up the issue but to my knowledge they never patched it as it would have caused major issues with maintenance by closing off access needed for some legacy software patches.
Not taking a dig at people, it was not a terrible choice earlier. Not like these models are inventing net new ways to exploit systems.
I would bet a large sum of money that Mythos was put on the same local network as the "systems" (i.e. it had access to services like UPnP brokers that were never meant for the outside internet), and that "broke into" is just a blanket term for finding some bug, which can range from simply crashing the program to actual remote code execution. And it's probably mostly the former. It used to be that cybersecurity research was all about finding ways to crash the program, which then implied that you could inject shellcode, so the two became synonymous with vulnerability, but these days that's very much not the case.