Posted by anabranch 14 hours ago

Anonymous request-token comparisons from Opus 4.6 and Opus 4.7 (tokens.billchambers.me)
464 points | 476 comments
razodactyl 13 hours ago|
If anyone's had 4.7 update any documents so far: notice how concise it is, getting straight to the point. It rewrote some of my existing documentation (using Windsurf as the harness). I'm not sure I liked the decrease in verbosity (it removed columns and combined/compressed concepts), but it makes sense given that the model outputs less to save cost.

To me this seems more like it's trained to be concise by default, which I guess can be countered with preference instructions if required.

What's interesting to me is that they're using a new tokeniser. Does it mean they trained a new model from scratch? Used an existing model and further trained it with a swapped out tokeniser?

The looped model research / speculation is also quite interesting - if done right there's significant speed up / resource savings.
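
On the tokenizer question: swapping tokenizers changes the token count (and therefore the bill) for identical text, regardless of how the model was trained. A toy sketch in Python, with made-up segmentation schemes standing in for real vocabularies:

```python
# Toy illustration (NOT Anthropic's actual tokenizers): the same text
# yields different token counts under different segmentation schemes,
# which is why a tokenizer swap alone changes billed tokens.
text = "Refactor the authentication middleware to cache session tokens."

def word_tokenize(s: str) -> list[str]:
    # one token per whitespace-separated word
    return s.split()

def subword_tokenize(s: str, chunk: int = 4) -> list[str]:
    # crude fixed-width splitter standing in for a subword (BPE) vocabulary
    return [s[i:i + chunk] for i in range(0, len(s), chunk)]

print(len(word_tokenize(text)))     # 8 "tokens"
print(len(subword_tokenize(text)))  # 16 "tokens" for the identical string
```

Real subword vocabularies differ far less crudely than this, but the mechanism is the same: the per-token price can stay fixed while the number of billed tokens moves.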

fumar 3 hours ago||
On API use, I am noticing verbose output across the board. When I task it with plans, it now creates more detailed task counts and task descriptions. It also sticks more closely to its directions than 4.6 did.
andai 13 hours ago||
Interesting. In conversational use, it's noticeably more verbose.
ianberdin 11 hours ago||
Opus 4.6 is the main model on https://playcode.io.

Not a secret: the model is the best in the world. Yet it is crazy expensive, and this 35% is huge for us. $10,000 becomes $13,500. Don't forget, Anthropic's tokenizer also counts way more tokens than other providers'.

We have experimented a lot with GLM 5.1. It is kinda close, but with downsides: no images, a context that's only adequate up to 100K, and poor text writing. However, it's a great designer. So there is no replacement. We pray.

WarmWash 2 hours ago||
Gemini is strong and cheap: it's 90% of 4.6 at 20% of the tokens.
sneak 11 hours ago||
How much human developer time can you buy for that $13.5k?

They’ve got us by the balls and they know it.

Frannky 5 hours ago||
My subscription was up for renewal today. I gave it a shot with OpenCode Go + the Xiaomi model. So far, so good: it seems I can get stuff done just the same.
nickvec 5 hours ago||
For all intents and purposes, aren't the "token change" and "cost change" metrics effectively the same thing?
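
They coincide exactly when the per-token price is unchanged, and diverge only if pricing moves too. A minimal sketch, with a hypothetical price:

```python
# Cost scales linearly with billed tokens, so at a fixed per-token price
# the two percentages are the same number; they differ only if the price
# per token changes as well. The price below is made up for illustration.
PRICE_PER_MTOK = 15.0  # hypothetical $/million input tokens

def cost(tokens: int, price_per_mtok: float = PRICE_PER_MTOK) -> float:
    return tokens / 1_000_000 * price_per_mtok

old_tokens, new_tokens = 1_000_000, 1_350_000  # same text, +35% tokens
token_change = new_tokens / old_tokens - 1
cost_change = cost(new_tokens) / cost(old_tokens) - 1
print(token_change, cost_change)  # both ~0.35 while the price is fixed
```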
coldtea 14 hours ago||
This, the push toward per-token API charging, and the rest are just a sign of things to come once they finally establish a moat and a full monopoly/duopoly, which is also what all the specialized tools like Designer and the integrations are about.

It's going to be a very expensive game, and the masses will be left with subpar local versions. It would be as if we reversed the democratization of compilers and coding tooling that happened in the 90s and 00s, and the polished, more capable tools were once again all proprietary.

danny_codes 11 hours ago||
I doubt that's the case. My guess is we'll hit asymptotic returns from transformers, but price-to-train will fall along Moore's law.

So over time older models will be less valuable, but new models will only be slightly better. Frontier players, therefore, are in a losing business. They need to charge high margins to recoup their high training costs. But latecomers can simply train for a fraction of the cost.

Since performance is asymptotic, eventually the first-mover advantage is entirely negligible and LLMs become a simple commodity.

The only moat I can see is data, but distillation proves that this is easy to subvert.

There will probably be a window, though, where insiders get very wealthy by offloading onto retail investors, who will be left holding the bag.

coldtea 9 hours ago||
>I doubt that's the case. My guess is we'll hit asymptotic returns from transformers, but price-to-train will fall along Moore's law.

There hasn't been a real Moore's law for a good while even before LLMs.

And memory isn't getting less expensive either...

quux 13 hours ago|||
If only there were an Open AI company whose mandate, built into the structure of the company, were to make frontier models available to everyone for the good of humanity.

Oh well

slowmovintarget 13 hours ago||
Things used to be better... really.

OpenAI was built as you say. Google had a corporate motto of "Don't be evil" which they removed so they could, um, do evil stuff without cognitive dissonance, I guess.

This is the other kind of enshittification, where the businesses turn into power accumulators.

throwaway041207 13 hours ago||
Yep. Between this, the pricing for the code review tool that was released a couple of weeks ago (15-25 a review), the usage pricing, and the very expensive cost of Claude Design, I do wonder whether Anthropic is making a conscious, incremental effort to raise the baseline price for AI engineering tasks, especially for enterprise customers.

You could call it a rug pull, but they may just be doing the math and realizing this is where pricing needs to shift before going public.

zozbot234 13 hours ago||
There's been speculation that the code review might actually be Mythos. It would seem to explain the cost.
monkpit 13 hours ago||
Does this have anything to do with the default xhigh effort?
QuadrupleA 12 hours ago||
One thing I don't see mentioned often: the OpenAI API's automatic prompt caching results in MASSIVE cost savings on agent work, while Anthropic's explicit caching is a pain in comparison. Wish they'd just keep the KV cache hot for 60 seconds or so, so we don't have to pay the input costs over and over for each conversation turn as the history grows.
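
For a sense of scale: without caching, each turn re-submits the whole growing history, so cumulative input tokens grow roughly quadratically with turn count. A sketch with hypothetical per-turn sizes and prices:

```python
# Without caching, turn k re-submits all prior context, so total input
# tokens across a conversation grow ~O(n^2) in the number of turns; an
# idealized hot KV cache would charge only the new tokens each turn.
# Per-turn size and price are hypothetical.
TOKENS_PER_TURN = 1_000  # new tokens added each turn (made up)
PRICE_PER_MTOK = 3.0     # $/million input tokens (made up)

def input_cost(turns: int, cached: bool) -> float:
    total_tokens = 0
    history = 0
    for _ in range(turns):
        history += TOKENS_PER_TURN
        # cached: pay for new tokens only; uncached: the full history
        total_tokens += TOKENS_PER_TURN if cached else history
    return total_tokens / 1_000_000 * PRICE_PER_MTOK

print(input_cost(50, cached=False))  # full-history resend every turn
print(input_cost(50, cached=True))   # idealized always-hot cache
```

With these made-up numbers the uncached 50-turn conversation costs about 25x the cached one, which is why agent loops feel the caching model so acutely.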
aray07 13 hours ago||
Came to a similar conclusion after running a bunch of tests on the new tokenizer.

It was on the higher end of Anthropic's range, closer to 30-40% more tokens.

https://www.claudecodecamp.com/p/i-measured-claude-4-7-s-new...

alphabettsy 12 hours ago|
I’m trying to understand how this is useful information on its own.

Maybe I missed it, but it doesn’t tell you whether the model is more successful for less overall cost.

I can easily make Sonnet 4.6 cost way more than any Opus model: while it’s cheaper per prompt, it might take 10x more rounds to solve a problem (or never solve it at all).
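
In other words, the comparison that matters is expected total cost, not per-round price. A toy sketch, all figures hypothetical:

```python
# Per-round price alone is misleading: expected total cost is price per
# round times expected rounds to a solution. All figures are made up.
def total_cost(cost_per_round: float, rounds: int) -> float:
    return cost_per_round * rounds

opus_like = total_cost(1.00, 2)     # pricey per round, converges fast
sonnet_like = total_cost(0.20, 12)  # cheap per round, needs many rounds
print(opus_like, sonnet_like)       # the "cheap" model costs more overall
```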

senordevnyc 12 hours ago|
Everything in AI moves super quickly, including the hivemind. Anthropic was the darling a few weeks ago after the confrontation with the DoD, but now we hate them because they raised their prices a little. Join us!