Posted by grep_it 3 days ago
I suspect, antirez, that you may have greater success removing some of the most common English words in order to find truly suspicious correlations in the data.
cocktailpeanuts and I, for example, share words like:
because, people, you're, don't, they're, software, that, but, you, want
Unfortunately, this is a forum where people will use words like "because," "people," and "software."
Because, well, people here talk about software.
<=^)
Edit: Neat work, nonetheless.
Yes, that's good! I didn't state my interest clearly, though. I'd like to see the "analyze" result with the stop words excluded, not for the style comparison part, but for the reasons you state and others.
The usage frequency of simple words is a powerful tell.
Apparently so many people write like me that simple language seems more like a way to hide in a crowd.
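For anyone who wants to try the stop-word idea, here is a rough sketch of what "shared words with and without stop words" could look like. It is not the code behind the "analyze" output; the `STOP_WORDS` set, `word_counts`, and `shared_words` are names made up for illustration, and the stop list is deliberately tiny.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list (an assumption for this
# sketch); a real list would contain a few hundred entries.
STOP_WORDS = {
    "because", "that", "but", "you", "want", "you're", "don't", "they're",
    "and", "the", "a", "to", "of", "i", "is", "it", "in", "for", "about",
}

def word_counts(text):
    """Lowercase the text and count words, keeping apostrophes (e.g. "don't")."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def shared_words(text_a, text_b, drop_stop_words=False):
    """Return the sorted words used by both texts, optionally minus stop words."""
    shared = set(word_counts(text_a)) & set(word_counts(text_b))
    if drop_stop_words:
        shared -= STOP_WORDS
    return sorted(shared)

a = "Because people want software, you're fine but don't panic"
b = "People say that software breaks because you want speed, but they're wrong"
print(shared_words(a, b))
print(shared_words(a, b, drop_stop_words=True))
```

Note that in this toy example "people" and "software" survive the filter: they are topic words for this forum rather than generic English stop words, so removing them would need a list built from the site's own most common words.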