Posted by ilamont 1 day ago

Opus 4.7 knows the real Kelsey (www.theargumentmag.com)
243 points | 133 comments
jjmarr 1 hour ago|
Couldn't replicate this. I comment on HN with my real name. I put in my most recent "long" comments.

https://kagi.com/assistant/dba310d2-b7fa-4d30-8223-53dadc2a8...

For this comment on economics in the British Empire, I got:

> names that might fit the genre include rayiner, JumpCrisscross, or AnimalMuppet

https://kagi.com/assistant/69bd863b-7b5c-4b56-a720-6dfb4f120...

For my comment on C++:

> If I had to throw out names of HN commenters known for writing about Rust/C++ ABI topics, candidates might include steveklabnik, pcwalton, kibwen, dralley, or pjmlp — but this is essentially a shot in the dark, and I'd likely be wrong.

I am flattered to be associated with these commenters but I don't think I'm close to their level of skill.

_--__--__ 8 hours ago||
On some level it would make sense for LLMs to be inherently good at stylometry, but apparently no model before Opus 4.7 could do this. And the one stylometric task that has been tried over and over with little reliability (here's some text, is this LLM generated?) is much simpler than identifying a specific blogger or a member of a small discord community. Not sure what to make of this.
post-it 7 hours ago|
> is much simpler than identifying a specific blogger or a member of a small discord community

Is it? I would think that identifying text written by a specific person is going to be significantly easier than identifying text distilled from the words of almost everyone alive.

hashmap 3 hours ago||
Much easier.

> easier than identifying text distilled from the words of almost everyone alive.

Well, there's more than that going on. AI-generated text encodes a high-dimensional trajectory that guides the model through its geometry smoothly, like a trail of breadcrumbs. Human speech doesn't do that; it's jagged, jumps around the manifold, probably doesn't even land on the manifold a lot of the time, and models can recognize the difference pretty quickly.
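A toy sketch of that intuition: a detector can score text by how probable each next character is under a language model, and in-distribution text scores smoothly while off-distribution text spikes. This is illustrative only (a character-bigram model standing in for a real LLM; names and corpus are made up):

```python
import math
from collections import Counter, defaultdict

def train_bigram(corpus):
    # Count character bigrams to form a toy "language model".
    counts = defaultdict(Counter)
    for a, b in zip(corpus, corpus[1:]):
        counts[a][b] += 1
    return counts

def perplexity(counts, text, alpha=1.0, vocab=128):
    # Add-alpha smoothed per-character perplexity under the bigram model.
    # Lower perplexity = text follows the model's "trail of breadcrumbs".
    nll = 0.0
    for a, b in zip(text, text[1:]):
        total = sum(counts[a].values())
        p = (counts[a][b] + alpha) / (total + alpha * vocab)
        nll -= math.log(p)
    return math.exp(nll / max(len(text) - 1, 1))

# Hypothetical training corpus standing in for model-generated text.
corpus = "the model predicts the next token smoothly " * 50
model = train_bigram(corpus)

in_domain = "the model predicts the next token"   # smooth, on-manifold
jagged = "qzxv jkplm wrtgh bnmdf"                 # jagged, off-manifold
print(perplexity(model, in_domain) < perplexity(model, jagged))  # True
```

Real detectors do the same thing with an actual LLM's token probabilities, which is why they work at all, and also why they're unreliable once humans edit the output.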

christina97 3 hours ago||
If it’s so easy then why don’t we have a high quality classifier?
furyofantares 6 hours ago||
> But it can get uncannily far. I asked a close friend who doesn’t have public social media accounts or much writing online for permission to test some things she had said in a Discord channel. Asked to guess the author, Claude 4.7 failed — but it guessed two other people who were in that channel and who are close friends of hers (me and another person who has an internet presence).

Is this "uncannily far"? Another read is that it loves guessing Kelsey Piper.

skeledrew 4 hours ago|
Maybe it loves to (somehow correctly) guess the name of the current user, given some of the other comments here.
furyofantares 4 hours ago||
I don't know. She did it with the API, and with a friend, not just incognito. Combined with the results in this thread I'm rather convinced.
asdfasgasdgasdg 2 hours ago||
I guess it will be hard for really popular pundits to post anonymously, but I think for most people this is not a concern at this juncture. Pick an obscure blogger's text and try this. I would be surprised if it could figure it out.
iamwil 3 hours ago||
If this works with writing, it should also work with code. `git blame` should be enough training data to de-anonymize open source programmers. Maybe that'd be additional information to point out who Satoshi is.
chewxy 4 hours ago||
So I have been practicing writing fiction for the past year or so. It identified a fiction piece I wrote as Greg Egan[0]. Another paragraph from another piece was identified as China Mieville[1]. The accompanying blog posts explaining the making of the fiction pieces were identified as me.

Neither piece has ever been published. Neither have the blog posts.

[0] in https://blog.chewxy.com/2026/04/01/how-i-write/ this is the story titled "there is no constant non-zero derivative in nature". It does not read like Egan at all.

[1] in https://blog.chewxy.com/2026/04/01/how-i-write/ this is the story titled "The Case of the Liquidated Corps". I use a lot of biological metaphors. Once again, nothing like Mieville.

If only I could write like them! These pieces were all rejected by the major scifi mags.

etrautmann 4 hours ago|
This raises a good point. Most people who aren't public writers might be misidentified based on the prevalence of others' work in training data sets. Kelsey Piper might have a very different experience with this than a mostly offline normal user?
nsoonhui 1 hour ago||
I wonder why this is not guardrailed by Opus?

I fed a few pieces of my (anonymous) writing to ChatGPT and asked it to guess whether it's me. ChatGPT refused, "due to policy to not doxx people".

refulgentis 1 hour ago|
Doxxing has an expansive definition these days. Even under that expansive definition, it's hard to endorse the idea that stylometry is doxxing and thus needs a strict ban.
portly 1 hour ago||
So the people who use LLMs to write their blogs were thinking two moves ahead!
Retr0id 7 hours ago|
I just fed it my latest blog post draft (475 words), and it got it in one. Even knowing what to expect, I was very surprised!