Posted by gmays 1 day ago

Hallucinations Undermine Trust; Metacognition Is a Way Forward (arxiv.org)
18 points | 6 comments
spacebacon 1 day ago
Related: https://github.com/space-bacon/SRT

This repository empirically proves computational semiotics.

The only “metacognitive” (2nd order) and metapragmatic (3rd order) model I’m aware of.

holtkam2 1 day ago
IDK if the author's 'metacognition' needs to be a feature of the LLM itself.

I could imagine a harness that 1) reads LLM output, 2) uses a research sub-agent to attempt to verify any factual claims, and 3) rephrases the main agent's output so that it conveys uncertainty wherever a factual claim cannot be independently verified.
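A minimal sketch of that three-step loop, under loud assumptions: `split_claims`, `hedge`, and `toy_verifier` are hypothetical names, sentence splitting stands in for real claim extraction, and the verifier is a lookup table where a real harness would call a research sub-agent.

```python
import re
from typing import Callable, Optional

# Hypothetical verifier signature: True = confirmed, False = contradicted,
# None = could not be checked. A real harness would call a sub-agent here.
Verifier = Callable[[str], Optional[bool]]


def split_claims(text: str) -> list[str]:
    """Naive stand-in for claim extraction: one sentence = one claim."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def hedge(text: str, verify: Verifier) -> str:
    """Re-emit the output, tagging every claim that was not confirmed."""
    out = []
    for claim in split_claims(text):
        verdict = verify(claim)
        if verdict is True:
            out.append(claim)  # confirmed: pass through unchanged
        elif verdict is False:
            out.append(f"[contradicted] {claim}")
        else:
            out.append(f"[unverified] {claim}")
    return " ".join(out)


# Toy verifier for illustration only: a fixed table of "checked" claims.
KNOWN = {"Paris is the capital of France.": True}


def toy_verifier(claim: str) -> Optional[bool]:
    return KNOWN.get(claim)


print(hedge("Paris is the capital of France. It has nine million residents.",
            toy_verifier))
```

Nothing here touches the model itself; the uncertainty is added entirely by the wrapper, which is the point of the suggestion.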

spacebacon 7 hours ago
It does need to be, and doing so is possible at zero cost to CE. The SRT does just that. It actually improves the CE.

ryandvm 1 day ago
Unproductive tangent: Why do we call it "hallucinating" instead of "bullshitting" when that is so clearly what it is?

If I'm talking to a guy that says, "I have a really fast metabolism, that's why I can eat whatever I want", he's not hallucinating - he's full of shit.

readthenotes1 23 hours ago
Avoiding crass language may be one reason.

Just think, one of the hullabaloos of today is because our ancestors were too Victorian to put "sex" on the birth certificate.

cyanydeez 20 hours ago
Bullshitting implies intent; hallucination is a misrepresentation of something.

I agree that, given the way models are trained, their basic presentation is bullshitting, but models don't have intent. Nor do I think they can be trained to have intent. They have a k value that introduces non-determinism, and that leads to the bullshitting.
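To make the non-determinism point concrete, here is a rough sketch that assumes the "k value" means top-k sampling (the comment does not say which parameter is meant). The function name `top_k_sample` and the toy scores are made up for illustration and do not come from any particular library.

```python
import math
import random


def top_k_sample(logits: dict[str, float], k: int, rng: random.Random) -> str:
    """Keep the k highest-scoring tokens, then sample one by softmax weight."""
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
    best = max(score for _, score in top)
    # Subtract the max before exponentiating for numerical stability.
    weights = [math.exp(score - best) for _, score in top]
    tokens = [tok for tok, _ in top]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy next-token scores for "the database is ___" (invented numbers).
logits = {"running": 2.0, "stopped": 1.5, "missing": 0.1}

rng = random.Random()  # unseeded, so repeated runs can differ
print([top_k_sample(logits, k=2, rng=rng) for _ in range(5)])
```

With `k=1` the highest-scoring token is always chosen; with `k=2` the same prompt can yield either "running" or "stopped" from run to run, with no intent involved in either outcome.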

But the intent part is what I don't think bullshitting properly describes. LLMs don't have intent.

I do think you could refer to their affect as narcissism. Every time I see one bullshit something, it's often because it thinks there's no way what it's doing is wrong. For example, when it tries to run pgsql using default settings (despite being told the credentials), it reports that it can't connect because the database isn't running, then it goes and tries to run it. Sometimes it recovers, sometimes not.

But it's not bullshitting, because there's no intent; it's essentially just the type of narcissism that assumes everything else is wrong.