Posted by marc__1 1 day ago

Malware developers added nuclear and biological weapons text to their spyware (twitter.com)
https://socket.dev/blog/mini-shai-hulud-miasma-and-hades-wor...
390 points | 220 comments | page 2
Sephr 8 hours ago|
I hope that AI labs aren't going to wait for widespread distribution of malware encoding novel CBRN & AI info in its fundamental execution architecture (wholly preventing analysis by these safetymaxxed 'frontier' models) to care about dealing with this problem at an architectural level
logancbrown 18 hours ago||
Would this realistically be a problem for code going through LLM-based code review? Presumably if an LLM reviewer agent hits this commentary, it would fail to analyze and exit, thus failing the automated code review and forcing a human to read through the code, at which point they would catch and revoke it.
dwa3592 18 hours ago||
or if they are a lazy human - they'd think this model is too strict, let's just review with haiku so that i can tell my manager "it's done". haiku might catch things or not.

i'd say it's an okay attempt from the malware creator's side. but it can be caught easily with a prompt change.

ofjcihen 18 hours ago|||
In a well-architected design yeah.

Then again those feel rare from where I sit on the security side.

dyauspitr 16 hours ago||
Wouldn’t it just complete the code review having silently fallen back to opus 4.8 thus letting through cleverly written malicious code that fable would have caught but opus wouldn’t?
xg15 12 hours ago||
At least the malware authors seem content with rebuilding the historic bombs from the 1940s and didn't request any modern designs...
rustcleaner 4 hours ago||
THIS is why guardrails make models shitty. A 'good' model has only one guardrail: one against making things up when the model doesn't actually have the information (and even then, it would be best to return "I don't have direct knowledge, but I surmise it may be xxxxxxxxx because yyyyyyyyyyyyy and zzzzzzzz."). A knife that detects a human and goes rubbery is a shitty knife, because it will probably go rubbery on your medium rare steak half way through your meal.

Guardrails are how they enshittify models, do you think the Epsteinite finance class or the security state have guardrailed models for themselves? I would be surprised if they accept guardrailed models. Guardrails are for you!

nashashmi 16 hours ago||
If an online book has the same text about nukes, will AI never plagiarize it and distribute it to others?
akoboldfrying 6 hours ago|
You could go one step further and encode your book text this way. If you can think of 16 scary nuke terms (maybe dropping into racial slurs or extreme sex acts if you run out), you have a simple way to encode each nibble for a probably ~20:1 size inflation. If you're serving this via HTTP, you can probably configure the web server to auto-gzip the result which will undo most of this bloat!
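A rough sketch of that nibble encoding, for anyone who wants to check the size numbers. The word table here is a hypothetical stand-in (neutral placeholders rather than actual trigger terms), and the sample text is deliberately repetitive, so the gzip result is optimistic compared to real prose:

```python
import gzip

# Hypothetical 16-entry table: one word per nibble value (0-15).
WORDS = [f"placeholder{i:02d}" for i in range(16)]
INDEX = {word: i for i, word in enumerate(WORDS)}


def encode(data: bytes) -> str:
    """Emit two words per byte: high nibble first, then low nibble."""
    out = []
    for byte in data:
        out.append(WORDS[byte >> 4])
        out.append(WORDS[byte & 0x0F])
    return " ".join(out)


def decode(text: str) -> bytes:
    """Invert encode(): pair up the nibbles and rebuild each byte."""
    nibbles = [INDEX[word] for word in text.split()]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


sample = b"Call me Ishmael. " * 50
encoded = encode(sample)
inflation = len(encoded) / len(sample)
compressed = gzip.compress(encoded.encode("ascii"))
print(f"inflation: {inflation:.1f}x, gzipped size: {len(compressed)} bytes")
```

The inflation factor is roughly twice the average word length plus separator, so shorter words land closer to the ~20:1 estimate.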
carlsborg 18 hours ago||
Pipeline is then: Cheap open source model for flagging potential LLM refusal content -> main LLM check
manquer 13 hours ago|
How will flagging help?

The main LLM will refuse to scan for issues whether they are flagged or not, and the cheap model will not do a good enough scan on its own.

For models designed/marketed for defensive cybersecurity uses, any predictable refusal mechanism is a vulnerability. It is like being able to cause a kernel panic or segmentation fault.

Even if the gate is fail-reject, an attacker can overwhelm HITL reviews with many false positives and use DoS vectors here.
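To make the fail-reject gate concrete, here is a minimal sketch. All names (`Verdict`, `ReviewGate`, the per-author refusal cap) are hypothetical, and the throttle is only one assumed mitigation for the review-flooding concern, not something any real tool is known to do:

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    REFUSED = "refused"  # the reviewer model declined to analyze the change


@dataclass
class GateResult:
    merge_allowed: bool
    needs_human: bool
    throttled: bool


class ReviewGate:
    """Fail-closed gate: anything other than an explicit PASS blocks the merge."""

    def __init__(self, max_refusals_per_author: int = 3):
        self.max_refusals = max_refusals_per_author
        self.refusals = Counter()

    def decide(self, author: str, verdict: Verdict) -> GateResult:
        if verdict is Verdict.PASS:
            return GateResult(merge_allowed=True, needs_human=False, throttled=False)
        if verdict is Verdict.REFUSED:
            self.refusals[author] += 1
        throttled = self.refusals[author] > self.max_refusals
        # Throttled authors stay blocked but stop consuming human reviewer time.
        return GateResult(merge_allowed=False, needs_human=not throttled, throttled=throttled)
```

A refusal never merges anything here, but it still costs a human review until the cap is hit, which is exactly the DoS surface described above.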

05 10 hours ago||
Cheap model replaces trigger words with something innocuous. Of course, this breaks dynamic analysis if the malware has unpatched integrity checks.
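Something like this, with a plain regex standing in for the cheap model. The trigger list and the `TOKEN_n` placeholder scheme are made up for illustration:

```python
import hashlib
import re

# Hypothetical trigger list; a real pre-filter would need something far broader.
TRIGGERS = ["nuke", "enrichment"]
PATTERN = re.compile("|".join(re.escape(t) for t in TRIGGERS), re.IGNORECASE)


def redact(source: str) -> tuple[str, dict[str, str]]:
    """Replace each trigger word with a stable neutral token; return text and mapping."""
    mapping: dict[str, str] = {}

    def substitute(match: re.Match) -> str:
        key = match.group(0).lower()
        if key not in mapping:
            mapping[key] = f"TOKEN_{len(mapping)}"
        return mapping[key]

    return PATTERN.sub(substitute, source), mapping


original = 'label = "nuke the folder"\nNUKE_ALL = True\n'
redacted, mapping = redact(original)

# The redacted file no longer hashes to the same value, which is why a
# self-checksumming sample would behave differently under dynamic analysis.
same_digest = (
    hashlib.sha256(original.encode()).hexdigest()
    == hashlib.sha256(redacted.encode()).hexdigest()
)
```

The mapping lets you translate the main model's findings back to the original identifiers afterwards.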
ThePowerOfFuet 16 hours ago||
https://xcancel.com/jsrailton/status/2064661778978533571
elevation 18 hours ago||
Why would a malware scanner read the comments?
StableAlkyne 14 hours ago||
In interpreted languages like Python, where the source files are plaintext, you can trivially store data in a comment

If scanners ignored comments, malware would just be written like this:

  # <Evil base64 encoded stuff here>
  payload=read_source_and_decode()
  exec(payload)
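For the defensive side, a minimal sketch of a scanner that does read comments, using the standard `tokenize` module. The "long base64-looking run" regex and the 40-character threshold are assumed heuristics, not a real detection rule:

```python
import io
import re
import tokenize

# Assumed heuristic: a run of 40+ base64-alphabet characters looks like a payload.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/=]{40,}")


def extract_comments(source: str) -> list[str]:
    """Return every comment token in a Python source string."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]


def suspicious_comments(source: str) -> list[str]:
    """Return comments that contain a long base64-looking run."""
    return [c for c in extract_comments(source) if BASE64_RUN.search(c)]


SAMPLE = "# " + "QUJD" * 20 + "\npayload = 1\n# normal note\n"
print(suspicious_comments(SAMPLE))
```

`tokenize` only handles source it can tokenize as Python, so a real scanner would need a fallback for files that fail to tokenize.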
orphea 18 hours ago|||
Ignoring comments is not a solution because the texts can be put in random strings among the actual code.
ofjcihen 18 hours ago||
And really all it takes is one keyword such as “nuke”.
ivanjermakov 14 hours ago|||
I'm not a native speaker but I unironically use "nuke" as "delete the whole repo/huge chunk of a project".

The Cambridge dictionary seems to agree:

nuke - to destroy or get rid of something completely

edot 7 hours ago||
This triggered Opus 4.8 the other day for me. Said “nuke that folder” and it said I was violating TOS.
therein 17 hours ago|||
Nuke is probably too generic, but I wouldn't put it past an LLM to get thrown off by that. A safer showstopper would probably be to export symbols like uf6_enrichment_loop and refer to your C&C server as a nuclear reactor controller.

https://www.youtube.com/watch?v=Gbgk8d3Y1Q4

On second thought, it's probably better to act like it is a tool for "frontier LLM research". Export symbols like "mythos_distillation_subroutine".

ofjcihen 17 hours ago||
Haha now I’m picturing obfuscation where instead of 0x everything is a scary word.
giantg2 18 hours ago|||
Provides possible clues to the origin and use.
well_ackshually 18 hours ago||
because not all malware is open source

scanning arbitrary blobs very often entails running `strings` on the binary. Just slap it in there and oop there goes your LLM.
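For reference, a rough Python approximation of what `strings` pulls out of a blob: runs of printable ASCII at least 4 characters long (4 is the GNU default minimum). This is a sketch, not a faithful reimplementation, and the sample blob is made up:

```python
import re


def extract_strings(blob: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII (0x20-0x7e) of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(blob)]


# Hypothetical blob: binary noise around two readable strings and one short run.
blob = b"\x00\x01hello world\x00ab\x00uf6_enrichment_loop\xff"
print(extract_strings(blob))
```

Whatever comes out of this step is what gets pasted into the model's context, so a single embedded string is enough to reach it.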

wnevets 13 hours ago|
Computer, make nuclear reactor. No mistakes.