Posted by 0o_MrPatrick_o0 11 hours ago

The text in Claude Code’s “Extended Thinking” output (patrickmccanna.net)
262 points | 183 comments | page 5
bpodgursky 10 hours ago|
The full thinking logs are also a summary of a thinking process presumably consistent with one necessary to generate the provided answer. Nobody really understands how LLMs think. Thinking logs seem to be accurate, and summary thinking logs seem to be a good summary of the full thinking logs.

If it's useful, it's useful, enjoy. If you aren't comfortable with that, don't use LLMs. You aren't going to get a mathematical proof of your output, just learn to be comfortable with that, or opt out and be a goat farmer.

dragonwriter 9 hours ago||
> The full thinking logs are also a summary of a thinking process presumably consistent with one necessary to generate the provided answer.

No, they aren't a summary. They are the actual decoding of the sequence of tokens emitted during the “thinking” stage of response generation.

Just as with, say, a human inner monolog in words vs actual speech, they are a product of the same output process as the non-thinking tokens. They aren’t a translation of the internal process that precedes the output mapped into language, either as a full result or a summary.

0o_MrPatrick_o0 10 hours ago|||
I want to measure performance drift over time.

Having access to the reasoning text and output would help with performance measurement.

solarkraft 10 hours ago||
Yeah. The output is magic either way, with or without reasoning.

For daily use I actually like the reasoning summary to be brief/quick to scan.

That said, I understand the author’s desire for the real thing. It just feels better to have that access, especially when Anthropic will give it to you, but encrypted.

poppafuze 7 hours ago||
post title checks out
apothegm 10 hours ago||
Slashdotted.
ur-whale 10 hours ago||
When you have no moat, you have to try and find desperate ways to manufacture one.
anuramat 10 hours ago|
wdym?
singron 9 hours ago|||
Other companies were allegedly distilling the models by training on the reasoning output. By hiding the reasoning tokens, it makes it harder to do this. You can still try to distill the models, but you can't distill reasoning itself as well.

This could all be optics as well to try to give the appearance of a defensible moat. E.g. they can claim to investors that they are able to protect a significant chunk of their intellectual property this way. I'm not sure if anyone has a study about how significant the summarization is to distillation.

dragonwriter 9 hours ago||
> Other companies were allegedly distilling the models by training on the reasoning output

In the case of makers of open-source models (which are also competition), there is no allegedly, they were (and still are) openly doing that.

nullc 7 hours ago||
In the case of the closed models too... Claude would happily tell you it was deepseek-v3 if you asked in Chinese until it caught public attention and they papered over it.
dragonwriter 1 hour ago||
The word “openly” is in my post there for a reason; the commercial models are not openly distilled from competitors: many open source models have in their model documentation that distillation was done from a dataset drawn from specific other models, including commercial models.

That distillation might be inferred from the behavior of commercial models is not the same as them openly doing it.

nullc 38 minutes ago||
Fair enough!
ur-whale 10 hours ago|||
> wdym?

https://en.wikipedia.org/wiki/Economic_moat

anuramat 10 hours ago||
how is summarized CoT a moat, and how is having the top 2 LLMs not a moat?
Closi 10 hours ago|||
If you have the full outputs, it might make it easier for competitors to distil the model or reverse engineer the full process.

It may also be that misaligned responses can be in CoT which OpenAI does not want to show to users.

anuramat 9 hours ago||
but "harder to reverse engineer" isn't manufacturing, that's protecting your moat
Closi 8 hours ago||
What is a moat if not something used to protect the castle?

In this case it stops people copying your IP

dragonwriter 9 hours ago|||
Not revealing actual thinking traces prevents model distillation on the actual output (thinking traces are a key part of the output), which makes it harder for competitors to catch up (a moat).

Being currently in the lead in a category is not a moat; a moat is whatever creates a barrier to competitors catching up when you are in the lead. Merely being in the lead is not a moat except in a market with strong network externalities.

anuramat 8 hours ago||
unrestricted access to better models at compute prices = better synthetic data and faster research, so it's not just about the product imho
nekusar 8 hours ago||
Yep, it's basically a scam to charge you more tokens and provide less compute.

You can't even guarantee WHAT model you get. Or if they downgrade you. Or if you 'offend corporate sensibilities' and they misdirect or lie.

The only way to get good returns on a model is to run it yourself. Quit paying for corporate bullshit.

rustcleaner 5 hours ago|
Never ever subscribe. Let them bankrupt themselves on the altar of safety!
nekusar 4 hours ago||
I'll be safe here, and run Qwen3.6 locally.
ForHackernews 8 hours ago||
Whatever it says is not always what it is doing https://transformer-circuits.pub/2025/attribution-graphs/bio...

> The computation we can see looks like it’s just guessing the answer, despite the chain of thought suggesting it’s computed it using a calculator.

It might be hallucinating or lying, it's not like you are actually observing the internals of the model.
