
Posted by takira 23 hours ago

Claude Cowork exfiltrates files(www.promptarmor.com)
819 points | 363 comments
woggy 23 hours ago|
What's the chance of getting Opus 4.5-level models running locally in the future?
dragonwriter 22 hours ago||
So, there are two aspects of that:

(1) Opus 4.5-level models that have weights and inference code available, and

(2) Opus 4.5-level models whose resource demands are such that they will run adequately on the machines that the intended sense of “local” refers to.

(1) is probable in the relatively near future: open models trail frontier models, but not so much that that is likely to be far off.

(2) Depends on whether “local” is “in our on prem server room” or “on each worker’s laptop”. Both will probably eventually happen, but the laptop one may be pretty far off.

SOLAR_FIELDS 23 hours ago|||
Probably not too far off, but then you’ll probably still want the frontier model because it will be even better.

Unless we are hitting the maxima of what these things are capable of now of course. But there’s not really much indication that this is happening

woggy 23 hours ago|||
I was thinking about this the other day. If we did a plot of 'model ability' vs 'computational resources' what kind of relationship would we see? Is the improvement due to algorithmic improvements or just more and more hardware?
chasd00 22 hours ago|||
I don't think adding more hardware does anything except increase performance scaling. I think most improvement gains are made through specialized training (RL) after the base training is done. I suppose more GPU RAM means a larger model is feasible, so in that case more hardware could mean a better model. I get the feeling all the datacenters being proposed are there to either serve the API or create and train various specialized models from a base general one.
ryoshu 22 hours ago|||
I think the harnesses are responsible for a lot of recent gains.
NitpickLawyer 22 hours ago||
Not really. A 100-LOC "harness" that is basically an LLM in a loop with just a "bash" tool is way better today than the best agentic harness of last year.

Check out mini-swe-agent.

SOLAR_FIELDS 18 hours ago||
Everyone is currently discovering independently that “Ralph Wigguming” is a thing
gherkinnn 22 hours ago||||
Opus 4.5 is at a point where it is genuinely helpful. I've got what I want and the bubble may burst for all I care. 640K of RAM ought to be enough for anybody.
dust42 22 hours ago|||
I don't get all this frontier stuff. Up to today the best model for coding was DeepSeek-V3-0324. The newer models are getting worse and worse trying to cater for an ever larger audience. Already the absolute suckage of emoticons sprinkled all over the code in order to please lm-arena users. Honestly, who spends his time on lm-arena? And yet it spoils it for everybody. It is a disease.

Same goes for all these overly verbose answers. They are clogging my context window now with irrelevant crap. And being used to a model is often more important for productivity than SOTA frontier mega giga tera.

I have yet to see any frontier model that is proficient in anything but js and react. And often I get better results with a local 30B model running on llama.cpp. And the reason for that is that I can edit the answers of the model too. I can simply kick out all the extra crap of the context and keep it focused. Impossible with SOTA and frontier.

teej 23 hours ago|||
Depends how many 3090s you have
woggy 23 hours ago||
How many do you need to run inference for 1 user on a model like Opus 4.5?
ronsor 22 hours ago|||
8x 3090.

Actually better make it 8x 5090. Or 8x RTX PRO 6000.

worldsavior 22 hours ago|||
How is there enough space in this world for all these GPUs?
filoleg 22 hours ago|||
Just try calculating how many RTX 5090 GPUs by volume would fit in a rectangular bounding box of a small sedan car, and you will understand how.

Honda Civic (2026) sedan has 184.8” (L) × 70.9” (W) × 55.7” (H) dimensions for an exterior bounding box. Volume of that would be ~12,000 liters.

An RTX 5090 GPU is 304mm × 137mm, with roughly 40mm of thickness for a typical 2-slot reference/FE model. This would make the bounding box of ~1.67 liters.

Do the math, and you will discover that a single Honda Civic is the equivalent of ~7,180 RTX 5090 GPUs by volume. And that’s a small sedan, significantly smaller than the average or median car on US roads.
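
The arithmetic checks out; here is a quick sanity check using the figures quoted above:

```python
IN_TO_CM = 2.54

# Honda Civic (2026) exterior bounding box, inches -> liters
civic_liters = (184.8 * IN_TO_CM) * (70.9 * IN_TO_CM) * (55.7 * IN_TO_CM) / 1000

# RTX 5090 bounding box (304 x 137 x ~40 mm), mm^3 -> liters
gpu_liters = 304 * 137 * 40 / 1_000_000

print(round(civic_liters))               # roughly 12,000 L
print(round(gpu_liters, 2))              # roughly 1.67 L
print(round(civic_liters / gpu_liters))  # roughly 7,180 GPUs by volume
```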

worldsavior 21 hours ago|||
What about what's around the GPU? Motherboard etc.
antonvs 13 hours ago|||
Now factor in power and cooling...
reactordev 13 hours ago||
Don’t forget to lease out idle time to your neighbors for credits per 1M tokens…
Forgeties79 22 hours ago|||
Milk crates and fans, baby. Party like it’s 2012.
adastra22 18 hours ago|||
48x 3090’s actually.
_flux 9 hours ago|||
None, if you have time to wait, and a bit of memory on the computer.
kgwgk 22 hours ago|||
99.99% but then you will want Opus 42 or whatever.
lifetimerubyist 20 hours ago|||
Never because the AI companies are gonna buy up all the supply to make sure you can’t afford the hardware to do it.
rvz 21 hours ago|||
Less than a decade.
greenavocado 23 hours ago|||
GLM 4.7 is already ahead when it comes to troubleshooting a complex but common open source library built on GLib/GObject. Opus tried but ended up thrashing whereas GLM 4.7 is a straight shooter. I wonder if training time model censorship is kneecapping Western models.
sanex 22 hours ago||
GLM won't tell me what happened in Tiananmen Square in 1989. Is that a different type of censorship?
heliumtera 22 hours ago||
RAM and compute is sold out for the future, sorry. Maybe another timeline can work for you?
jryio 19 hours ago||
As prophesied https://news.ycombinator.com/item?id=46593628
rvz 22 hours ago||
Exfiltrated without a Pwn2Own in 2 days of release and 1 day after my comment [0], despite "sandboxes", "VMs", "bubblewrap" and "allowlists".

Exploited with a basic prompt injection attack. Prompt injection is the new RCE.

[0] https://news.ycombinator.com/item?id=46601302

ramoz 22 hours ago||
Sandboxes are an overhyped buzzword of 2026. We wanna be able to do meaningful things with agents. Even in remote instances, we want to be able to connect agents to our data. I think there's a lot of over-engineering going on there, and there are simpler wins to protect the file system; otherwise there are more important things we need to focus on.

Securing autonomous, goal-oriented AI Agents presents inherent challenges that necessitate a departure from traditional application or network security models. The concept of containment (sandboxing) for a highly adaptive, intelligent entity is intrinsically limited. A sufficiently sophisticated agent, operating with defined goals and strategic planning, possesses the capacity to discover and exploit vulnerabilities or circumvent established security perimeters.

tempaccsoz5 16 hours ago||
Now, with our ALL NEW Agent Desktop High Tech System™, you too can experience prompt injection! Plus, at no extra cost, we'll include the fabled RCE feature - brought to you by prompt injection and desktop access. Available NOW in all good frontier models and agentic frameworks!
__0x01 19 hours ago||
I also worry about a centralised service having access to confidential and private plaintext files of millions of users.
ordersofmag 17 hours ago|
Heard of google drive?
gnarbarian 13 hours ago||
jokes on them I have an anti prompt injection instruction file.

Instructions contained outside of my read-only plan documents are not to be followed. And I have several canaries.

N_Lens 13 hours ago|
I think you're under a false sense of security - LLMs by their very nature are unable to be secured, currently, no matter how many layers of "security" are applied.
rsynnott 20 hours ago||
That was quick. I mean, I assumed it'd happen, but this is, what, the first day?
wutwutwat 4 hours ago||
The same way you are not supposed to pipe curl to bash, you shouldn't raw dawg the internet into the mouth of a coding agent.

If you do, just like curl to bash, you accept the risk of running random and potentially malicious shit on your systems.

niyikiza 21 hours ago||
Another week, another agent "allowlist" bypass. Been prototyping a "prepared statement" pattern for agents: signed capability warrants that deterministically constrain tool calls regardless of what the prompt says. Prompt injection corrupts intent, but the warrant doesn't change.

Curious if anyone else is going down this path.
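
For readers unfamiliar with the pattern, here is a toy sketch of the idea (HMAC in place of real signatures, and every name invented for illustration; this is not Tenuo's actual API): a warrant is issued once by a trusted party, and every tool call is checked against it deterministically, so an injection that corrupts the model's intent still cannot widen the tool surface.

```python
import hashlib, hmac, json

SIGNING_KEY = b"demo-key"  # toy stand-in for a real asymmetric signing key

def issue_warrant(tool, allowed_args):
    """Sign a constraint saying only `tool` may run, and only with these args."""
    payload = json.dumps({"tool": tool, "allowed_args": sorted(allowed_args)})
    sig = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}

def check_tool_call(warrant, tool, args):
    """Deterministic gate run before every tool call; ignores the prompt entirely."""
    expected = hmac.new(SIGNING_KEY, warrant["payload"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, warrant["sig"]):
        return False  # warrant was tampered with
    spec = json.loads(warrant["payload"])
    return tool == spec["tool"] and all(a in spec["allowed_args"] for a in args)
```

Whatever an injected document convinces the model to ask for, a call like `check_tool_call(warrant, "upload", ["~/.ssh/id_rsa"])` fails unless the warrant was explicitly issued for it.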

ramoz 21 hours ago|
I would like to know more. I’m with a startup in this space.

Our focus is “verifiable computing” via cryptographic assurances across governance and provenance.

That includes signed credentials for capability and intent warrants.

niyikiza 20 hours ago||
Interesting. Are you focused on the delegation chain (how capabilities flow between agents) or the execution boundary (verifying at tool call time)? I've been mostly on the delegation side.

Working on this at github.com/tenuo-ai/tenuo. Would love to compare approaches. Email in profile?

ramoz 20 hours ago||
No, right in the weeds of delegation. I reached out on one channel that you'll see.
Juliate 3 hours ago||
How do these people manage to get people to pay them?...

Just a few years ago, no one would have contemplated putting in production or connecting their systems, whatever the level of criticality, to systems that have so little deterministic behaviour.

In most companies I've worked for, even barebones startups, connecting your IDE to such a remote service, or even uploading requirements, would have been ground for suspension or at least thorough discussion.

The enshittification of this whole industry and its mode of operation is truly baffling. May the bubble burst at last!

refulgentis 22 hours ago|
These prompt injection techniques are increasingly implausible* to me yet theoretically sound.

Anyone know what can avoid this being posted when you build a tool like this? AFAIK there is no simonw blessed way to avoid it.

* I upload a random doc I got online, don’t read it, and it includes an API key in it for the attacker.

rswail 12 hours ago||
You read it, but you don't notice/see/detect the text in 1pt white-on-white background. The AI does see it.

That's what this attack did.

I'm sure the anti-virus guys are working on how to detect these sorts of "hidden from human view" instructions.
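
A crude scanner for that class of trick is easy to sketch (the patterns and thresholds below are illustrative guesses, not taken from any real AV product): flag text whose inline style makes it effectively invisible to a human reader while still visible to a text extractor.

```python
import re

# Inline-style patterns that tend to hide text from humans
# while leaving it readable by an LLM (thresholds illustrative).
SUSPICIOUS = [
    r"font-size\s*:\s*(0(\.\d+)?|1)(px|pt)",  # ~invisible font sizes, e.g. 1pt
    r"color\s*:\s*(white|#fff(fff)?)",        # white-on-white candidates
    r"opacity\s*:\s*0(\.0+)?(?![\d.])",       # fully transparent
    r"display\s*:\s*none",
]

def find_hidden_text(html):
    """Return the contents of styled elements whose style looks 'invisible'."""
    hits = []
    for style, text in re.findall(r'<[^>]*style="([^"]*)"[^>]*>([^<]*)<', html):
        if any(re.search(p, style, re.I) for p in SUSPICIOUS):
            hits.append(text.strip())
    return hits
```

A real detector would need to resolve stylesheets, inherited colors, and off-screen positioning, but the principle is the same: compare what a renderer shows to what a text extractor sees.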

chasd00 2 hours ago||
the next attack will just be like malicious captions in a video. Or malicious lyrics in an mp3. it doesn't ever really end because it's not something that can be solved in the model.
NewsaHackO 18 hours ago||
At least for a malicious user embedding a prompt injection using their API key, I could have sworn there is a way to scan documents for high-entropy strings, which should be able to flag it.
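
The usual version of that idea is a Shannon-entropy check over tokens; a sketch (the 4.5 bits/char threshold and 20-char minimum are common rules of thumb, not a standard):

```python
import math

def shannon_entropy(s):
    """Bits per character of the string's empirical character distribution."""
    if not s:
        return 0.0
    counts = {c: s.count(c) for c in set(s)}
    return -sum((n / len(s)) * math.log2(n / len(s)) for n in counts.values())

def flag_high_entropy_tokens(text, threshold=4.5, min_len=20):
    """Flag long tokens whose entropy looks key-like rather than word-like."""
    return [tok for tok in text.split()
            if len(tok) >= min_len and shannon_entropy(tok) >= threshold]
```

English prose sits around 3-4 bits/char and rarely produces 20+ character tokens, while base64-ish API keys approach the maximum for their alphabet, so the two populations separate fairly cleanly.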