Posted by rbanffy 1 day ago

LLMs corrupt your documents when you delegate (arxiv.org)
428 points | 169 comments
pickleRick243 12 hours ago|
With this paper by Microsoft and the infamous paper by Apple last year, it seems the tech giants that don't have their own models are getting a bit insecure.
twobitshifter 22 hours ago||
I thought this was going to be about a problem we saw recently. Someone used an LLM to update the comment block at the start of each source file, and the LLM wrote its own tool, which ended up changing ALL of the line endings when it wrote each file back out with the corrected comment block. Instead of an LLM we could have used find and replace, but people now treat an LLM as the only tool.
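
The find-and-replace version is a few lines of Python. A rough sketch, assuming a leading block of "#" comment lines and a placeholder replacement line; the newline="" part is exactly the line-ending behavior the LLM's tool got wrong:

    import re

    def update_header(path: str, new_first_line: str) -> None:
        # newline="" disables newline translation, so CRLF files stay CRLF
        with open(path, newline="") as f:
            text = f.read()
        eol = "\r\n" if "\r\n" in text else "\n"      # reuse the file's own ending
        block = re.match(r"(?:#[^\n]*\r?\n)+", text)  # leading run of comment lines
        tail = text[block.end():] if block else text
        with open(path, "w", newline="") as f:
            f.write(new_first_line + eol + tail)
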
madprops 13 hours ago||
Simple quick blind test for fun: https://w.merkoba.com/pickabot/
deferredgrant 14 hours ago||
Delegation needs a boundary. If the task is "improve this section," the system should make it very obvious what it touched and what it left alone.
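
A minimal sketch of that boundary check using stdlib difflib (the file names are placeholders): anything the model touched shows up in the diff, and anything absent from it was left alone.

    import difflib

    before = open("section.md").read().splitlines(keepends=True)
    after = open("section.llm.md").read().splitlines(keepends=True)

    # review every touched line before accepting the model's version
    for line in difflib.unified_diff(before, after,
                                     fromfile="section.md", tofile="section.llm.md"):
        print(line, end="")
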
woeirua 23 hours ago||
It's an interesting paper, but I'd like to see a lot more about the types of errors that the LLM makes. Are they happening in the forward pass or the inverse pass? My guess is the inverse pass.
daveguy 18 hours ago|
This sounds like wishful thinking to me.

The tasks are designed to be reversible. Whether it stochastic parrots in the forward direction or reverse direction is irrelevant. Especially considering these are inference engines. Every pass is a forward pass from the perspective of the LLM / agent. There is no feedback loop, which is part of the reason it's so easy for these things to mangle tasks. They are plausible-sounding sentence/sequence generators.
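
A crude sketch of why reversibility makes the corruption measurable (llm_edit is a hypothetical stand-in for the delegated model call, and the prompts are placeholders):

    def round_trip_corruption(original: str, llm_edit) -> int:
        forward = llm_edit(original, "apply the requested transformation")
        back = llm_edit(forward, "undo the transformation exactly")
        a, b = original.splitlines(), back.splitlines()
        # lines that changed, plus lines that went missing or appeared
        return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))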

y3ahd0g 18 hours ago||
Yeah so I run my agents as a different user that does not have write perms to my /home

Then I can diff what they wrote with my copy

Users are the OG container. On Linux it's possible to constrain a user with network namespaces and cgroups.

BPF can be used like docker compose to ensure a service running under a user stays running.

TL;DR a lot of the userspace cruft we import to run software has been rolled into the kernel over the last 10-15 years.

Ignore the terminology "user". Under the hood, all the same constraint and boundary setting you want exists without downloading the entire internet.
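
A minimal sketch of launching an agent under such a constrained account from Python (the agent CLI and paths are placeholders; subprocess has accepted user=/group= on POSIX since 3.9, and the launcher needs privileges to switch uid):

    import subprocess

    result = subprocess.run(
        ["my-agent", "--task", "rewrite-docs"],  # hypothetical agent CLI
        user="agent",            # unprivileged account with no write perms on /home
        cwd="/srv/agent-work",   # scratch tree the agent user can write to
        capture_output=True, text=True,
    )
    print(result.stdout)

Then diff the scratch tree against your own copy, as above.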

adampunk 23 hours ago||
LLMs will make mistakes on every turn. The mistakes will have little to no apparent connection to "difficulty" or what may or may not be prevalent in the training data. They will be mistakes at all levels of operation, from planning to code writing to reporting. Whether those mistakes matter and whether you catch them is mostly up to you.

I have yet to find a model that does not make mistakes each turn. I suspect that this kind of error is fundamentally incorrigible.

The most interesting thing about LLMs is that despite the above (and their non-determinism) they're still useful.

simonw 22 hours ago||
> I have yet to find a model that does not make mistakes each turn

What kind of mistakes are you talking about here?

pyrolistical 23 hours ago||
As a human I make typos all the time
dangus 23 hours ago|||
A human can sit down and say “I’m going to make sure this is correct on the first pass and make sure I make an exact copy.”

They have cognitive awareness of which tasks are highly critical and need more checking and re-checking without being prompted to think that way.

For a human, time doesn’t stop when the first pass of the prompt and response is over. An LLM effectively wipes its memory of what it just did, unless something keeps track of it in a highly resource-constrained context window.

An LLM is like an author of a book that immediately closes its eyes and wipes its memory after writing a chapter. Sure, it can pull some of that back in the next query via context, and it can regain context very quickly, but it effectively has no memory of the exact thing it just did.

When a human is doing these tasks there is a lot of room for mistakes but there’s also a wildly higher capacity for flowing through time.

adampunk 23 hours ago||
Ok, and?
simonh 22 hours ago|||
Humans understand what mistakes are and can reason about what constitutes a mistake and what doesn’t. LLMs can’t do that.

It’s for the same reason that they will invent bullshit instead of saying “I don’t know”, when they don’t know. They don’t have a concept of accuracy of facts.

dangus 22 hours ago|||
And that’s why I’m paid six figures and my LLM is paid $20/month.
leptons 19 hours ago||||
The LLM makes typos for me all the time using AI autocomplete. It's caused a lot of frustration while coding, because it makes mistakes that I would not. When it does help, it's great, but the errors waste as much time as the LLM saves me. Even using agentic coding, AI is mostly break-even for me.
adampunk 23 hours ago|||
I do too! I also make higher level design errors and get too enthusiastic about projects before code is written.

We are, in a sense, fallible machines who have designed a planet-wide computational fabric around that fact.

peyton 23 hours ago||
[flagged]
rao-v 19 hours ago||
May your contexts always be short
tieTYT 17 hours ago|
Before I read some of the study, I thought that was relevant too, but each "step [was] conducted as an independent, single-turn session."
carterschonwald 22 hours ago||
this is literally just “leave a child at the work computer with a real doc open, playing office”. otoh it is good to design benchmarks to ground these things.

on the flip side, if you’re literally just using a bare-bones harness on top of a stochastic parrot, of course stochastic errors accumulate.

there’s a lot of ways to improve text faithfulness through harness tool design, and my incremental experiments seem promising.

but unless work is gated on shit like “the script used must be type-checked GHC Haskell or Lean 4”, unsupervised stuff is gonna decay
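
A sketch of that kind of gate (ghc -fno-code runs the type checker without generating code; the path handling is a placeholder):

    import subprocess

    def accept_script(path: str) -> bool:
        # reject the agent's output unless its script type-checks
        check = subprocess.run(["ghc", "-fno-code", path],
                               capture_output=True, text=True)
        if check.returncode != 0:
            print("rejected:", check.stderr)
        return check.returncode == 0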

rhubarbtree 20 hours ago|
It’s not a stochastic parrot.
kgwgk 16 hours ago||
It’s a stochastic goblin.
carterschonwald 14 hours ago||
good one
cyanydeez 1 day ago|
I played around with a local LLM to try to build a wiki-like DAG. It made a lot of stupid errors, from guessing content based on file names to not following redirects and placing the redirect response in the page.

I've also had them convert something like an Excel-formatted document to markdown. It worked pretty well as long as I was examining the output. But the longer it ran in context, the more likely it was to slip in things that seemed related but weren't part of the breakdown.

The only way I've found to mitigate some of it is to make every file a small, purpose-built doc. That way you can use git to revert changes, and the damage from each touch is limited to that small context.
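
A rough sketch of that revert workflow (the allowlist is a placeholder): list what the agent touched and restore anything outside the one doc it was supposed to edit.

    import subprocess

    ALLOWED = {"docs/ingest-pipeline.md"}  # the one small-purpose doc for this task

    touched = subprocess.run(["git", "diff", "--name-only"],
                             capture_output=True, text=True, check=True).stdout.split()

    for path in touched:
        if path not in ALLOWED:
            subprocess.run(["git", "checkout", "--", path], check=True)
            print("reverted unexpected change:", path)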

Anyone who thinks they're a genius creating docs or updating them isn't actually reading the output.

sebastiennight 23 hours ago|
> I've also had them convert to markdown something like an excel formatted document.

This looks like a task where the LLM would be best used to write a deterministic script or program that then does the conversion.

Trusting an LLM to make the change without tools is like telling the smartest person you know to recite the converted document out loud from memory. At some point they'll get distracted, get something wrong, or unwittingly inject their own biases and ideas whenever the source data is counter-intuitive to them.
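
A sketch of that deterministic script (openpyxl, the file names, and the header-row assumption are all placeholders; the point is that the conversion itself is not a model call):

    from openpyxl import load_workbook

    wb = load_workbook("input.xlsx", data_only=True)  # cached values, not formulas
    rows = [["" if c.value is None else str(c.value) for c in row]
            for row in wb.active.iter_rows()]

    header, body = rows[0], rows[1:]  # assumes the first row is the header
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(r) + " |" for r in body]

    with open("output.md", "w") as f:
        f.write("\n".join(lines))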

trollbridge 23 hours ago|||
I see people cut and paste from Excel into a chat, as an image, and ask it to sum up numbers.
somewhatgoated 21 hours ago||
I’ve seen people drink their own recycled piss and inject coffee into their ass - what’s your point?
sebastiennight 19 hours ago||
In the first half, I thought you were an astronaut, but the second half has me second-guessing myself.
somewhatgoated 15 hours ago||
I used to be a connoisseur of weird Facebook groups - I would advise everyone to never look into aged urine, coffee enemas or targeted individuals - makes you lose your faith in humanity
cyanydeez 22 hours ago|||
it was, but the formatting was garbage, so it ran again to fix the format.