Posted by ilreb 12 hours ago

Where the goblins came from (openai.com)
859 points | 504 comments | page 4
josh-sematic 2 hours ago
I’ve always been fond of describing unexplained program behaviors as gremlins. In this case the gremlin was goblins!
red_admiral 7 hours ago
"goblins showing up in an inappropriate context" is my favourite (para)phrase of the day. It feels like the setting for a D&D campaign - no wonder the "Nerdy" personality is affected.

(For Dwarf Fortress, it would just be a normal day.)

ComputerGuru 11 hours ago
The explanation is very concerning. Lexical tidbits shouldn't be learnt and reinforced across profiles. Here, gremlin and goblin went from being selected for in the nerdy profile to being selected for in all profiles. The solution was easy: don't mention goblins.

But what about when the playful profile reinforces emoji usage and that usage creeps up in all other profiles accordingly? Ban emoji everywhere? Now do the same thing for other words, concepts, approaches? It doesn't scale!

It seems like models can be permanently poisoned.
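To make the scaling worry concrete, here's a minimal sketch of what "don't mention goblins" amounts to if you implement it as a crude decode-time blocklist (purely illustrative; the names and the approach here are my assumption, not OpenAI's actual fix):

    # Hypothetical decode-time blocklist: push banned tokens to -inf before sampling.
    # BANNED and apply_blocklist are made-up names for illustration only.
    BANNED = {"goblin", "goblins", "gremlin", "gremlins"}

    def apply_blocklist(logits: dict[str, float]) -> dict[str, float]:
        """Return a copy of the logits with every blocklisted token made unsampleable."""
        return {
            tok: float("-inf") if tok.lower() in BANNED else score
            for tok, score in logits.items()
        }

    # The catch: emoji habits, pet phrases, and whole "approaches" aren't a finite
    # token list you can enumerate, so the blocklist grows without bound while the
    # cross-profile reinforcement that caused the drift is left untouched.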

AyanamiKaine 8 hours ago
I find it somewhat sad to see personality changes treated as a bug. I don't know why, but it gives me a sad feeling.
weitendorf 7 hours ago
I think if you see it as weird social phases that the model lacks the self-awareness to identify as kinda embarrassing, it makes more sense.

Like if a human were going around saying “for the culture!” so much at work that they didn’t realize why telling their coworker “Oh yeah, grief counseling for the culture!” is weird coming from a white person in a serious context, it kinda makes you wonder what else they are totally oblivious about and if they even know what they’re saying actually means.

They literally need the human feedback to learn/model why some behavior is acceptable or even humorous in certain contexts but an absolute faux pas in others.

I think in the long run, though, we can just give people the option to include access to human facial data/embeddings during conversations so the models can pick up on body language. I kinda agree, in a sense, that direct language policing via SFT feels unnecessarily blunt and rudimentary, since it doesn't help them model the processes behind the feedback (until maybe one day some future model ends up training on the article or code and closes the loop!)

ErroneousBosh 47 minutes ago
> Like if a human were going around saying “for the culture!” so much at work that they didn’t realize why telling their coworker “Oh yeah, grief counseling for the culture!” is weird coming from a white person in a serious context, it kinda makes you wonder what else they are totally oblivious about and if they even know what they’re saying actually means.

Given that this page is the only page on the entire Internet that has that exact phrase on it, I'd say most people are totally oblivious about it.

What do you actually mean?

trumbitta2 7 hours ago
That "Why it matters" heading is starting to make me feel physically sick.
thedailymail 5 hours ago
I'm curious whether this type of goblin epidemic was seen in other language versions of ChatGPT. Did e.g. Japanese users see more yōkai turning up?
tomasantunes89 4 hours ago
"Goblin Mode" was Oxford's 2022 Word of the Year.
Al-Khwarizmi 8 hours ago
This actually sounds quite human-like. I mean, an actual person with a personality will spontaneously develop the habit of using some specific metaphors over others. It's funny how, in the context of an LLM, this is considered a bug.
djyde 5 hours ago
An LLM is like a super-smart 3-year-old, easily shaped by its environment to exhibit corresponding behaviors.