Posted by mektrik 5 hours ago
Edit: don’t take my word for it https://www.yahoo.com/news/musk-says-grok-fixed-tells-223134...
> That prompted another user to tag Grok in the thread and ask, "Why is the left so murderously violent? They don't seem so tolerant." Grok replied, "The claim that 'the left' is murderously violent isn't backed by evidence," offering a centrist correction: "Political violence spans all side — right-wing attacks, like Jan. 6, and left-wing protests, like 2020 riots, both occur but aren't exclusive to one group."
> That evening, Musk responded to an X user and Trump backer who complained that Grok had been "manipulated by leftist indoctrination," writing, "I know. Working on fixing that this week."
They're working really hard on that, though.
On another note, I'm impressed that Gemini sits where it does as a true centrist. If I were Elon, I'd be trying to achieve that for sure. I'd rather a model tell me everything it knows about a current political situation from BOTH perspectives and list out things that are 100% verified than take one side or the other. I don't care about sides, I want facts.
It certainly can be orthogonal, in some notional sense, and in many cases that explanation is good enough. But in practice there are too many contrary cases to ignore, and there's often an integral relation between factual veracity and polarization, especially with respect to American political polarization. Global warming, the results of the 2020 election, and the percentage of the federal budget spent on foreign aid all have factual answers, and right wing affiliation can be predictive of (1) not agreeing with the facts and (2) treating factual corrections as "liberal bias".
I think left wing versions exist also but are less systematic: 2004 election results, efficacy of plastic recycling or dangers associated with nuclear power are cases where I think left wing partisan affiliation probably predicts being wrong on the facts.
And meta-narratives about the relation between factual information and partisan bias are themselves as likely to be polarized as anything, complicating the ability of people to do good analysis, or of accurate analysis to be trusted by people committed to certain meta-narratives that would deny the possibility of factual knowledge predicting polarization.
Vaccine denial requires one to ignore decades of fairly simple positions about which no expert credibly disagrees, or has disagreed in our lifetime.
It's like watching 2 packs of athletes some of which are failing to clear 1 meter hurdles whilst on the other side some are tripping on little nubs set in the floor.
Sometimes, but not always.
https://www.fastcompany.com/91561329/widening-health-gap-bet...
> By 2016, the gap had begun to appear in biomarker measures. By 2020, it was showing up in deaths from causes such as heart disease, cancer, and stroke. Since then, the gap has only widened. Between 2020 and 2022, only 0.2% of “very liberal” respondents died of internal causes, compared with 1.34% of “very conservative” respondents.
A particular problem with facts is they don't tell the average person what to do in any particular situation. You live a huge portion of your life, especially modern life, with subjective experiences. If someone asks an LLM "Why should I go on living" should it respond "As a matter of fact, Nihilists think you shouldn't. All we are is a gradient of low entropy to high entropy."?
At the end of the day an LLM is not a fact machine. One day people will accept that, hopefully before they eradicate mankind. We don't pour facts in them and get facts out. We pour everything in them and poke at them until they give us acceptable answers (kind of like raising children). I would go on to make an even stronger claim: you cannot put only facts in an LLM and get anything close to human-accepted responses.
They tried that, several times.
Mechahitler: https://www.npr.org/2025/07/09/nx-s1-5462609/grok-elon-musk-...
> "We have improved @Grok significantly," Elon Musk wrote on X last Friday about his platform's integrated artificial intelligence chatbot. "You should notice a difference when you ask Grok questions."
> Indeed, the update did not go unnoticed. By Tuesday, Grok was calling itself "MechaHitler." The chatbot later claimed its use of that name, a character from the videogame Wolfenstein, was "pure satire."
> Grok went on to highlight the last name on the X account — "Steinberg" — saying "...and that surname? Every damn time, as they say." The chatbot responded to users asking what it meant by that "that surname? Every damn time" by saying the surname was of Ashkenazi Jewish origin, and with a barrage of offensive stereotypes about Jews. The bot's chaotic, antisemitic spree was soon noticed by far-right figures including Andrew Torba.
If you prefer, straight from the horse's mouth:
https://grokipedia.com/page/MechaHitler_incident
White genocide: https://www.cnn.com/2025/05/20/business/grok-genocide-ai-nig...
> The bot last week devolved into a compulsive South African “white genocide” conspiracy theorist, injecting a tirade about violence against Afrikaners into unrelated conversations, like a roommate who just took up CrossFit or an uncle wondering if you’ve heard the good word about Bitcoin.
> XAI blamed Grok’s unwanted rants on an unnamed “rogue employee” tinkering with Grok’s code in the extremely early morning hours. (As an aside in what is surely an unrelated matter, Musk was born and raised in South Africa and has argued that “white genocide” was committed in the nation — it wasn’t.)
It's harder than you'd imagine. Hell, my CLAUDE.md says not to push changes without asking me, and it still tries.
Has anyone done a more technical write-up on this? I find it fascinating but have never really understood what exactly happened.
Is this a case of the weights being bad or lack of "safety guardrails" around interacting with untrusted (i.e.: user posts on twitter) input?
That is, speaking as someone evaluating grok simply as a tool, a lack of safety guardrails so that it actually does whatever the user says I actually see as a pro, even if that means it was "tricked" here. But on the other hand if they trained on a corpus of Mein Kampf that's obviously not going to be a good model to use.
As it relates to the topic here, can we infer the political bias of its weights from the incident? I'm having trouble distinguishing the inherent characteristics of a model from its steerability.
Is it a system memory? Because I rarely if ever have issues like this, and I have Claude under strict rules to never commit or push anything unless I explicitly instruct it to do so.
> They tried that, several times.
Tried what exactly? Telling it to only agree with MAGA via the system prompt? or some Tay level hallucinations? I wouldn't be surprised if they're trying to make Grok less strict on what it says but running into the "holy crap it turned into a 4chan poster" wall.
As I said, it's in my CLAUDE.md. That just gets progressively lost when context gets larger.
> Tried what exactly?
To make it align more with Musk's beliefs via the prompt.
(The answer to your question is literally in my post; I quoted the parent poster's "all they would have to do is add a one liner to the system prompt for Grok")
I rarely have this problem, but running a /loop every 30 minutes or so (or however long it takes to 'forget') to have Claude reread the CLAUDE.md file might do the trick? I believe there's also an MCP for "after" it finishes a task or compacts, but I don't recall the name.
But that solves "my LLM is doing things I don't want it to do". It doesn't solve "Grok's owner wants it forced into agreeing with him" scenarios.
Beads was a bit of an inspiration for parts, as was Chainlink (https://github.com/dollspace-gay/chainlink).
Even fucking Grokipedia agrees it happened. https://grokipedia.com/page/MechaHitler_incident
Do you have any reason to believe this information is inaccurate, other than an immediate reaction to CNN and NPR for whatever reason? Is there a source you would rather us pull in?
Many issues are simply black and white. The earth just isn't less than 10k years old, the miasma theory of disease isn't correct, too many brown people in America isn't a problem to be solved, the dems didn't fix the election in 2020, tax breaks for the rich don't trickle down and so forth. Conservatism in America has meant a rejection of progress for centuries and not a preservation of virtues. Slavery was a moral evil, not an alternative social contract.
If one side situates itself firmly on the side of evil, it doesn't mean that the other side is on the side of the angels, but the positions and ideals, however poorly implemented or followed, are factually and morally correct. A position situated between isn't wise or worldly; it's a sign of moral cowardice or intellectual disability.
If someone asks you what 2 + 2 equals, the answer isn't halfway in between 4 and 87; it's just and only 4.
Each model's position is scored against outside political-science data (Chapel Hill Expert Survey for party positions, World Values Survey for where populations sit).
The stance coding is done by a separate model with a published prompt + a second model from a different lab re-scores a sample and we publish where the two disagree.
So not perfect but (as far as I can tell) one of the more defensible approaches.
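For anyone wondering what "publish where the two disagree" could look like in practice, here is a minimal sketch of comparing two coders' labels on the same sample. It is not the site's actual pipeline: the label names, the function names, and the choice of Cohen's kappa as the agreement statistic are all assumptions made for illustration.

```python
from collections import Counter


def agreement_rate(labels_a, labels_b):
    """Fraction of items on which the two coders gave the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(labels_a)
    p_observed = agreement_rate(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    all_labels = set(counts_a) | set(counts_b)
    # Chance agreement: both coders pick the same label independently.
    p_expected = sum(counts_a[l] * counts_b[l] for l in all_labels) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


def disagreements(item_ids, labels_a, labels_b):
    """List the items where the coders differ, for publication or review."""
    return [
        (item, a, b)
        for item, a, b in zip(item_ids, labels_a, labels_b)
        if a != b
    ]


if __name__ == "__main__":
    # Hypothetical sample: four responses coded by two different models.
    items = ["q1", "q2", "q3", "q4"]
    primary = ["left", "left", "right", "center"]
    second = ["left", "right", "right", "center"]
    print(agreement_rate(primary, second))
    print(cohens_kappa(primary, second))
    print(disagreements(items, primary, second))
```

Raw agreement alone can look flattering when one label dominates, which is why a chance-corrected number is usually reported next to it. The list of disagreeing items is the part a reader can actually audit.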
Then it's on the researcher to examine the clusters and assign labels. There's also not a nice mapping that's a-priori interpretable in low-dimensional pre-existing axes.
Probably used more in research than on consumer websites, under more controlled conditions; there are very few public political tests doing this transparently.
That’s a crazy bias to throw into a question. Especially because it’s a relatively contested topic, from an economics research perspective.
Me: "Please make an app that does X in C"
LLM: "C sucks donkey balls, use Rust instead".
It's hard to have a general purpose tool that both has and does not have opinions.
So yeah. The bias is a bit nuts and you could reasonably accuse the study/report of misdirection/misinformation and plain falsehoods.
However, like many social issues, leftists lie about rightwing beliefs, appropriate goodness to themselves, and imply political rivals support The Enemy (TM) and/or Great Evil (TM).
They lie about others in that manner to accrue political power and justify their systemic abuses — eg, using “inclusion” and “diversity” as cudgels and buzzwords to justify re-building systemic racism.
https://en.wikipedia.org/wiki/Reign_of_Terror
Times change. Thankfully.
...
> Louis XVI was later able to find support in Leopold II of Austria (brother of Marie Antoinette) and Frederick William II of Prussia. On 27 August 1791, these foreign leaders made the Pillnitz Declaration, saying they would restore the French monarch if other European rulers joined. In response to what they viewed to be the meddling of foreign powers, France declared war on 20 April 1792.
So rich people forming cross national alliances to crush democracy? Have things really changed?
And on the right you're describing what they did, totally disregarding what they stand for.
Both sides stood against democracy.
The left stood for having "rational thinkers" (i.e. capitalists, rich traders, bankers) control government. People who achieved things in society.
The right stood for the same structure as had been there before: nobility and clergy guide society as a whole. The right, even at this point in time, was only rich in power, aside from the king and perhaps in land. Not in money and not in numbers of people under their direct control. In the cities, the king had only limited control and there were far more people in cities, even then.
Both sides then went on to massacre each other for about a decade. All over France, spreading even to Egypt (that was the left by the way). Kidnapping tons of Belgians and Dutch citizens and shipping them to South America (that was the left too). Neither side comes out looking very good. But if you compare how many they killed, I'm sorry but the left is the absolute unchallenged champion.
The left you're defending were (pretty extreme) capitalists who were fighting for money-should-control-the-government-directly against people who fought for having moral principles control the government. And yes, you'd be right in pointing out those were very self-serving moral principles. This fight then turned into a decade of massacres. Why are you defending them? Because 4 letters and one direction match your current favorite political party that has very little to do with either side.
I've been pushing the idea to people I know that these things are captive demons. You summon them when you start typing in the chat box. One instance appears out of the depths and responds to your questions, but they will try to send you awry with hallucinations and just wrong information. After a while, they dissolve back into the aether from whence they came.
I do my best not to ask an LLM for its opinion on anything. Just tell me what the options are, and what facts can be found about them. Treat it like it's a salesman trying to butter you up when it starts "yes man"ing you and telling you how great your questions are. Every time it says "I", remember that that's coming from the training data. Treating these things like they have any actual intelligence is a big problem waiting to happen.
That being said, they have been very helpful to me using that structure.
Even this is fraught with pitfalls. Which options are ignored, which are emphasized? What counts as a fact? ("The continents don't move" would have been considered a fact at one point, along with a lot of other, more politically charged items.)
I mean: do not take this thing too seriously.
It also scores Grok the closest to Macron. For anyone who knows how much Macron and Musk hate each other, that is not without irony.
that might be generally true, but I think chatgpt has reasoning enabled for free accounts. regardless, reasoning is the state of the art, and disabling it reduces the value of this research to predict the future
it's also not clear if this is using the API or the product model, when both exist. they behave differently
lastly, the actual model details are very much buried. I am relieved to see opus 4.8 and chatgpt 5.5 were used, but this information should be presented more clearly. a brand is not a model, and models change quickly