Posted by 0o_MrPatrick_o0 11 hours ago
If it's useful, it's useful, enjoy. If you aren't comfortable with that, don't use LLMs. You aren't going to get a mathematical proof of your output, just learn to be comfortable with that, or opt out and be a goat farmer.
No, they aren't a summary. They are the actual decoding of the sequence of tokens emitted during the “thinking” stage of response generation.
Just as with, say, a human inner monologue in words versus actual speech, they are a product of the same output process as the non-thinking tokens. They aren’t a translation into language of the internal process that precedes the output, either in full or as a summary.
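To make that concrete, here is a toy sketch of the idea, not how any particular vendor implements it. It assumes a single decoded token stream in which hypothetical `<think>` and `</think>` delimiter tokens mark the reasoning span; real models use their own special tokens, and `split_stream` is a made-up helper name.

```python
from typing import Iterable, List, Tuple

# Hypothetical delimiter tokens; real models define their own special tokens.
THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def split_stream(tokens: Iterable[str]) -> Tuple[str, str]:
    """Split ONE decoded token stream into (reasoning, answer) text.

    Both parts come from the same sequence of sampled tokens; the only
    difference is which side of the delimiters a token falls on.
    """
    reasoning: List[str] = []
    answer: List[str] = []
    in_thinking = False
    for tok in tokens:
        if tok == THINK_OPEN:
            in_thinking = True
        elif tok == THINK_CLOSE:
            in_thinking = False
        elif in_thinking:
            reasoning.append(tok)
        else:
            answer.append(tok)
    return "".join(reasoning), "".join(answer)


# A made-up stream, standing in for what a sampler might emit.
stream = ["<think>", "17", " * ", "3", " = ", "51", "</think>",
          "The answer is ", "51", "."]
reasoning, answer = split_stream(stream)
print(reasoning)
print(answer)
```

Nothing here inspects activations or weights: the "reasoning" text is just more sampled output, which is the point being made above.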
Having access to the reasoning text and output would help with performance measurement.
For daily use I actually like the reasoning summary to be brief/quick to scan.
That said, I understand the author’s desire for the real thing. It just feels better to have that access, especially when Anthropic will give it to you, but encrypted.
This could all be optics as well to try to give the appearance of a defensible moat. E.g. they can claim to investors that they are able to protect a significant chunk of their intellectual property this way. I'm not sure if anyone has a study about how significant the summarization is to distillation.
In the case of makers of open-source models (which are also competition), there is no "allegedly": they were (and still are) openly doing that.
That distillation might be inferred from the behavior of commercial models is not the same as them openly doing it.
It may also be that misaligned responses can appear in the CoT, which OpenAI does not want to show to users.
In this case it stops people copying your IP.
Being currently in the lead in a category is not a moat; a moat is whatever creates a barrier to competitors catching up when you are in the lead. Merely being in the lead is not a moat except in a market with strong network externalities.
You can't even guarantee WHAT model you get. Or if they downgrade you. Or if you 'offend corporate sensibilities' and they misdirect or lie.
The only way to get good returns on a model is to run it yourself. Quit paying for corporate bullshit.
> The computation we can see looks like it’s just guessing the answer, despite the chain of thought suggesting it’s computed it using a calculator.
It might be hallucinating or lying; it's not like you are actually observing the internals of the model.