Posted by jmsflknr 4 days ago
Grammarly is using our identities without permission, https://www.theverge.com/ai-artificial-intelligence/890921/g..., https://archive.ph/1w1oO
In other words, an LLM can spit out a plausible "output of X"; however, it cannot encode the process that led X to transform their inputs into their output.
I can ask it right now to tell me how to write like person X.
The person in that room, looking things up in a dictionary of Chinese phrases and patterns, certainly follows a process, but it's easy to dismiss the notion that the person understands Chinese. The question is: if you zoom out, is the room itself intelligent because it is following a process, even if it's just a bunch of pattern recognition?
Can you give a specific example of what an LLM can't do? Be specific so we can test it.
Not sure why you need a concrete example to "test", but just think about the fact that the LLM has no idea how a writer brainstorms, iterates on their work, or even comes up with the ideas in the first place.
For example: if I read a lot of Shakespeare, understand his patterns, where he came from, and his biography, I will be able to write like him. Why is it different for an LLM?
Again, I don't get what the point is.
As another example, I can write a story about hobbits and elves in a LotR world with a style that approximates Tolkien. But it won't be colored by my first-hand WW1 experiences, and won't be written with the intention of creating a world that gives my conlangs cultural context, or the intention of making a bedtime story for my kids. I will never be able to write what Tolkien would have written because I'm not Tolkien, and do not see the world as Tolkien saw it. I don't even like designing languages.
That's why we have really good fake Van Goghs that a person can't tell apart from the originals.
Of course you can't do the same as the original person, but you get close enough many times, and as humans we do this frequently.
In the context of this post, I think it is for sure possible to use an LLM to mimic a dead author and give steps to achieve writing that sounds like them, just like a human could.
The LLM does not model text at this meta-level. It can only use those texts as examples; it cannot apply what is written there to its generation process.
Can you provide a _single_ example where an LLM might fail? Let's test this now.
Most importantly, negative but unused signals might not be available if the text does not mention them.
When the “how many ‘r’ in ‘strawberry’” question was all the rage, you could definitely get LLMs to explain the steps of counting, too. It was still wrong.
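For reference, the counting steps in question are trivial when done deterministically. A minimal Python sketch (the function names are made up for illustration and come from no library):

```python
def count_letter(word: str, letter: str) -> int:
    """Count how often a single letter occurs in a word, ignoring case."""
    return word.lower().count(letter.lower())


def count_letter_stepwise(word: str, letter: str):
    """Walk the word one character at a time and record every match.

    Returns the total and a list of (1-based position, running total) pairs,
    i.e. the "explain your steps" version of the same count.
    """
    total = 0
    steps = []
    for position, ch in enumerate(word, start=1):
        if ch.lower() == letter.lower():
            total += 1
            steps.append((position, total))
    return total, steps


if __name__ == "__main__":
    print(count_letter("strawberry", "r"))
    print(count_letter_stepwise("strawberry", "r"))
```

The stepwise version gets its answer from the steps it takes, which is exactly what was missing when a model explained the steps and still got the number wrong.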
This isn't 2023 anymore
I suggest "randomly adjusting parameters while trying to make things better", as that accurately reflects the "precision" that goes into stuffing LLMs with more data.
This Grammarly thing seems to be a bastardized form of that, one that doesn't even spare the dead.
I'd say that there was some incentive for the AI companies to muddy the waters here.
I give the LLM my codebase, and it does learn about it and can answer questions.
Unless you are actually fine-tuning models, in which case sure, learning is taking place.
If I showed a human a codebase and they gave good answers to my questions, then yes, I would say the human learned it. The analogy breaks down at some point because of limited context, but "learning" is a good enough word.
Generative AI is a plague at this point. Everybody is adding it to their wares to see what happens. It's almost like ricing a car. All noise, no go.
Unrelated, but it surprises me that I've found the built-in grammar checking in JetBrains IDEs far more useful at catching grammar mistakes, and it doesn't force me to rewrite entire sentences.
Or do they?
Words paint the picture, but the meaning of the picture is what matters.
It really feels so wrong to spare nobody, not even dead writers/people.
All it's gonna do is something similar to what happened with em-dashes, where people who use them are now getting called LLMs when it was their writing that trained the LLMs (the irony).
If this takes off, hypothetically, we will associate these writing qualities with slop, similar to how Ghibli art is so good but felt so sloppy afterwards. Seeing just about anyone make it made us appreciate the Ghibli art style less.
The sad part is that many of these dead writers/artists were never appreciated by the people of their time. They struggled with so many feelings, and writing/art was their way of expressing that. Van Gogh is an example that comes to mind.[0] Many struggled with depression and other feelings too. To take that, and their expression of it, and turn it into yet another product feels quite depressing for a company to do.
[0]: https://en.wikipedia.org/wiki/Health_of_Vincent_van_Gogh
That train left at full steam when companies scraped the whole internet and claimed it was fair use. Now it's a slippery slope covered with slime.
I believe there'll be no slowing down from now on.
They are doing something amazing, will they ask for permission? /s.
"The work is public, hence the name. It's well known, it's in the data. Who cares".
What will they do next? Create lookalike publications through domain squatting and write all-AI articles under the "public" names?
Is it still fair use, then?
It's very enlightening, if you ask me.