Posted by ilamont 1 day ago

Opus 4.7 knows the real Kelsey(www.theargumentmag.com)
340 points | 181 comments
andai 12 hours ago|
Oops, accidental superstylometry.
TZubiri 1 hour ago||
Stephen King once wrote and published a novel under a pseudonym to find out whether he would still be popular even if he didn't use his name.

He kept it very secret, but somehow people deduced from the writing style that this new author was the King.

jwpapi 12 hours ago||
Could this just be memory? It's not clear that it actually isn't.
afro88 11 hours ago||
It's not, but the author did say they have used this test against models when they come out. So it's possible that this put the unpublished text into the training data for the next model, somehow linked back to the author's identity.
jwolfe 12 hours ago|||
The comments on the article include other people replicating all or parts of the finding. I'm also pretty confident Kelsey Piper wouldn't fail to disable memory while simultaneously talking about how Claude incognito mode is insufficient to prevent the app from handing it your name.
gs17 12 hours ago|||
They mention running it through the API as well.
michaelchisari 11 hours ago||
"I did not have memory enabled, nor did I have information about me associated with my account; I did these tests in Incognito Mode. To make sure it wasn’t somehow feeding my account information to Claude even in Incognito Mode, I asked a friend to run these tests on his computer, and he received the same result; I also got the same result when I tested it through the API."

Given those precautions if it is just memory or some form of deanonymization that's also cause for concern.

skeledrew 8 hours ago||
Looks like things are about to get extremely ironic. Those who don't want AI to identify them through their writing will soon have to have an AI modify their writing before they publish.
sodacanner 12 hours ago||
The author mentions that she tried to get an explanation for how the models identified her and got nonsense, but I'd be curious what the CoT looked like. Surely that'd be a little more accurate in showing how the LLM arrived at its conclusion, rather than asking it after-the-fact.
Smaug123 11 hours ago||
FWIW, with a prompt that says something like "vibes only, just give me a name without thinking", Opus 4.7 non-thinking emits exactly two words naming me fairly reliably, so there's no CoT at all to analyze in that case.
stingraycharles 11 hours ago||
CoT is (nearly) hidden with Opus 4.7, in that they get Haiku to summarize the CoT. It's pretty useless now, so this type of info is inaccessible to us mortals (unless you call sales).
foobar10000 11 hours ago||
What if you proxy through bifrost or similar?
stingraycharles 3 hours ago||
Doesn't work; it's stripped from the response by Anthropic.
geraneum 6 hours ago||
I just pasted both pieces into Opus 4.7 and asked who most likely wrote these and it didn’t get it.
Lerc 11 hours ago||
It's hard to tell if that's what's going on here, but it seems pretty clear this ability and more like it will be quite apparent in the future.

I have seen some poorly considered projections of what the world might look like when this happens. Usually by assuming bad actors will use the abilities and we will be powerless.

Except I don't think that is true.

Imagine if we had a world where nobody had the ability to keep a secret of any sort. Any action that a bad actor might perform would be revealed because they couldn't do it secretly.

You could browse your ex-girlfriend's email, but at the cost of everyone knowing you did it.

I don't really know how humans as a society would react to a situation like that. You don't have to go snooping for muck, so perhaps the inability to do so secretly would mean people go about their lives without snooping.

I could imagine both good and terrible outcomes.

skeledrew 8 hours ago|
> projections of what the world might look like when this happens

I've done this a few times. A world with zero privacy would definitely be safe (given benign governance), but it would also likely be pretty boring. Crime would become a non-issue: with everything about everyone easily knowable by everyone else, the root of any given crime, some desire or need, could be brought to the fore and resolved before it became an actual issue. But there would no longer be any kind of surprise in anything; everything and everyone would essentially become dull and grey, and humanity isn't about that kind of life experience at all.

JanNash 3 hours ago||
> given benign governance

quite unrealistic imo, thus we (maybe and hopefully) needn't worry about the bland Minority Report future you're hypothesizing :)

londons_explore 2 hours ago||
In such a world, the government could never be overthrown.

All governments go bad eventually, so the ability to overthrow is critical to prosperity.

Governments are overthrown either internally (revolt, uprising) or by external parties (invasion). A worldwide everyone-knows-everything would prevent both.

littlestymaar 4 hours ago||
Stylometry has existed for decades, and there's no way an LLM is stronger at that job than a specialized piece of software (it's no more realistic than expecting Opus to beat Stockfish at chess).

In practice, you've never been anonymous while posting on the internet and AI isn't changing anything on that front. Or rather: if anything, AI can help you become more anonymous than before, since it can be used to hide your identity from stylometry by rewriting your prose before publishing.
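The classic stylometry the comment refers to is, at its core, simple statistics over textual features. A minimal sketch of one common approach, character n-gram frequency profiles compared by cosine similarity (real toolkits use much richer feature sets; the corpus and author names below are made up for illustration):

```python
# Minimal stylometry sketch: compare character-trigram frequency
# profiles with cosine similarity. Authors leave statistical
# fingerprints even in short texts; specialized tools refine this
# idea with word lengths, function-word rates, punctuation habits, etc.
from collections import Counter
from math import sqrt


def trigram_profile(text: str) -> Counter:
    """Count overlapping character trigrams in lowercased text."""
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def most_likely_author(sample: str, corpus: dict[str, str]) -> str:
    """Return the candidate whose known writing is closest to `sample`."""
    p = trigram_profile(sample)
    return max(corpus, key=lambda name: cosine(p, trigram_profile(corpus[name])))
```

The interesting question in the article is whether an LLM's implicit version of this, learned from its training corpus, can rival tools built for the purpose.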

CTDOCodebases 11 hours ago||
Maybe it’s time to start running a local model with a browser extension to defend against this type of stuff.

Remember how the TrueCrypt project shut down shortly before a joint government/university paper was released about code stylometry? I guess LLMs will be employed as a defence against that type of thing.

Barbing 8 hours ago||
I so want to reject the notion such a thing is acceptable, but…

TrueCrypt, “replaced” by VeraCrypt, which Internet people will claim is backdoored? I hadn't heard about the stylometry paper.

btw, with this idea you'd want to avoid typing into a comment field directly, since session recorders would capture it (although that's a different risk, same as our identifiable behavior patterns with the mouse etc.)

mikestorrent 11 hours ago||
How does that defend against something having trained on a corpus of your own previous writing?
post-it 10 hours ago|||
I think what they're saying is, run a local model to transform all your comments before you post them.
CTDOCodebases 10 hours ago||
Bingo. It can’t help with old writings but it can with new writings.
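One way the rewrite step above could look, assuming a locally hosted Ollama server on its default port and a pulled model named "llama3" (both assumptions; substitute whatever you actually run):

```python
# Hedged sketch: launder a comment's style through a local model before
# posting, so published text no longer matches your stylometric
# fingerprint. Assumes an Ollama server at localhost:11434 and a model
# named "llama3" -- both are assumptions, not requirements of any
# particular setup.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint


def build_request(comment: str, model: str = "llama3") -> dict:
    """Pure helper: construct the JSON payload for a one-shot rewrite."""
    prompt = (
        "Rewrite the following comment so the meaning is preserved but the "
        "style, vocabulary, and sentence rhythm are changed:\n\n" + comment
    )
    return {"model": model, "prompt": prompt, "stream": False}


def launder_style(comment: str, model: str = "llama3") -> str:
    """Send the comment to the local model and return the rewritten text."""
    data = json.dumps(build_request(comment, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Keeping the model local matters here: routing your drafts through a hosted API would hand the provider exactly the writing samples you're trying to keep off the record.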
H8crilA 11 hours ago|||
Exactly as much as closing your eyes and covering your ears.
rdevilla 10 hours ago|
The joke's on you all for willingly posting this content online for it to later be harvested by AI.

Nobody is forcing you to use these systems. The hackers have always said this moment, or something like it, would come, from beneath their canopies of tin foil. I've posted almost nothing online, not under pseudonyms nor real names, for over a decade. I sat on this HN username for almost 12 years before making a single post, and now HN forms the overwhelming majority of my port 443 footprint, where I state up front that everything is associated with my real name.

Complete magick is possible when you simply refuse to participate in the things that society has tacitly assumed everybody does.

phalangion 10 hours ago||
How do you propose a journalist work without posting their writing online?
tempaccount5050 9 hours ago|||
Thinking that you can hide from it is absurd. Your country has been spying on you for decades. The Internet and phones are tapped. That game is so so so over and has been for a long time. I'd rather live free and deal with the consequences than hide in my basement with a tinfoil hat on. In fact, I was fired this year for my political views. Got doxxed at work. Now I'm somewhere better. Sometimes it's for the best.
Retr0id 10 hours ago|||
I find it fulfilling to enrich the commons.
stavros 9 hours ago||
Let's all just never talk to anyone unless it's face to face, for fear that an AI will read it.