Posted by twapi 4 days ago

Claude Token Counter, now with model comparisons (simonwillison.net)
224 points | 84 comments
mudkipdev 4 days ago|
Why do you need an API key to tokenize the text? Isn't it supposed to be a cheap step that everything else in the model relies on?
kouteiheika 4 days ago||
I'd guess it's because they don't want people to reverse engineer it.

Note that they're the only provider which doesn't make their tokenizer available offline as a library (i.e. the only provider whose tokenizer is secret).

stingraycharles 4 days ago||
Anthropic is somewhat becoming the Apple of AI in terms of closed ecosystem. Not saying I blame them, I just don't like it as a customer.

The fact that it's impossible to get the actual thinking tokens anymore, and we have to make do with a rewritten summary, is extremely off-putting. I understand that it's necessary for consumer users, but when writing agentic applications yourself, it's super annoying not to have the agent's actual reasoning available to understand failure modes.

aftbit 3 days ago||
It's _not_ that it's necessary for users. It's that Anthropic got Opus 4.6 ripped off so hard by MiniMax that they no longer want to expose true thinking tokens to random developers. If you're one of the blessed class, you can still get real thinking tokens, but you need to be a major enterprise customer, like the companies they gave Mythos access to.
weird-eye-issue 4 days ago|||
To prevent abuse? It's a completely free endpoint so I don't understand your complaint.
tethys 3 days ago||
It may be free, but it cannot be used without credits.

  Error: {"type":"error","error":{"type":"invalid_request_error","message":"Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits."},"request_id":"req_011CaGaBf6uTHfbmdZ39nx1Z"}
weird-eye-issue 2 days ago||
Again, it is to help prevent abuse, so I don't really see how this is a valid concern. Tokenization is actually fairly CPU-intensive.
simonw 4 days ago||
I'd love it if that API (which I do not believe Anthropic charge anything for) worked without an API key.
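For reference, the endpoint under discussion is Anthropic's `POST /v1/messages/count_tokens`. A minimal sketch of calling it with only the standard library (the model name is just an example; as the error above shows, the call currently requires a valid API key and a positive credit balance):

```python
import json
import os
import urllib.request

COUNT_URL = "https://api.anthropic.com/v1/messages/count_tokens"


def build_count_request(model: str, text: str) -> dict:
    """Build the JSON body for a count_tokens call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
    }


def count_tokens(model: str, text: str, api_key: str) -> int:
    """POST the request and return the reported input_tokens."""
    body = json.dumps(build_count_request(model, text)).encode()
    req = urllib.request.Request(
        COUNT_URL,
        data=body,
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["input_tokens"]


api_key = os.environ.get("ANTHROPIC_API_KEY")
if api_key:
    print(count_tokens("claude-opus-4-7", "Hello, world", api_key))
else:
    # Without a key we can still show the payload shape.
    print(json.dumps(build_count_request("claude-opus-4-7", "Hello, world")))
```

The official Python SDK wraps the same endpoint as `client.messages.count_tokens(...)`; the raw request above just makes the key requirement explicit.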
ilioscio 3 days ago||
Anthropic was pulling ahead of their peers, but if they can't hear their customers' complaints about value regressing between releases, they're going to undermine their position until no advantage is left.
tomglynch 4 days ago||
Interesting findings. Might need a way to downsample images on upload to keep costs down.
simonw 4 days ago|
Yeah that should work - it looks like the same image at smaller pixel dimensions has about the same token cost for 4.6 and 4.7, so the image cost increase only kicks in if you use larger images that 4.6 would presumably have resized before inspecting.
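A client-side downsampling sketch of the mitigation discussed above. It assumes Anthropic's documented heuristics of roughly (width × height) / 750 tokens per image and a ~1568px longest-edge limit; check the current docs, since these numbers may have changed between model versions:

```python
def downsampled_dims(width: int, height: int, max_edge: int = 1568) -> tuple[int, int]:
    """Scale dimensions so the longest edge is at most max_edge,
    preserving aspect ratio."""
    scale = min(1.0, max_edge / max(width, height))
    return round(width * scale), round(height * scale)


def estimate_image_tokens(width: int, height: int) -> int:
    """Rough token estimate per Anthropic's documented heuristic."""
    return round(width * height / 750)


# A 4000x3000 photo before and after capping the longest edge:
w, h = downsampled_dims(4000, 3000)
print((w, h), estimate_image_tokens(w, h), "vs", estimate_image_tokens(4000, 3000))
```

To actually resize the file before upload you could pass the computed dimensions to something like Pillow's `Image.thumbnail`; the sketch only does the cost arithmetic.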
tpowell 4 days ago||
I just asked Claude about defaulting to 4.6 and there are several options. I might go back to that as default and use --model claude-opus-4-7 as needed. The token inflation is very real.
sergiopreira 4 days ago||
An interesting question is whether the tokenizer is better at something measurable or just denser. A denser tokenizer with worse alignment to semantic boundaries costs you twice, higher bill and worse reasoning. A denser tokenizer that actually carves at the joints of the model's latent space pays for itself in quality. Nobody outside Anthropic can answer which it is without their eval suite, so the rugpull read is fair but premature. Perhaps the real tell will be whether 4.7 beats 4.6 on the same dollar budget on the benchmarks you care about, not on the per-token ones Anthropic publishes.
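The "same dollar budget" comparison above reduces to simple arithmetic. A sketch of the cost side, where every number is a placeholder for illustration, not Anthropic's actual pricing or a measured inflation rate:

```python
# Hypothetical per-million-input-token prices (placeholders, not real).
PRICE_PER_MTOK = {"claude-opus-4-6": 15.0, "claude-opus-4-7": 15.0}


def cost_usd(model: str, input_tokens: int) -> float:
    """Dollar cost of a prompt at the hypothetical per-million-token price."""
    return input_tokens / 1_000_000 * PRICE_PER_MTOK[model]


# Same prompt; assume the denser 4.7 tokenizer emits ~20% more tokens.
tokens = {"claude-opus-4-6": 1_000_000, "claude-opus-4-7": 1_200_000}
for model, n in tokens.items():
    print(f"{model}: {n} tokens -> ${cost_usd(model, n):.2f}")
```

With equal per-token prices, a 20% denser token stream is a 20% higher bill for the same prompt; the quality side of the comparison still needs the benchmarks the comment describes.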
cubefox 3 days ago||
Okay, but what about output tokens?