Posted by latchkey 21 hours ago

Performance per dollar is getting faster and cheaper (www.wafer.ai)
325 points | 128 comments | page 2
AussieWog93 19 hours ago
The 2600 tok/s is an "aggregate", not the actual throughput.
technoabsurdist 19 hours ago
yes it is 213 tok/s single stream (so per user)
unrvl22 13 hours ago
That 213 wasn't achieved when saturated, though. It was probably more like 30 tok/s per stream when doing 2.6k tok/s in aggregate.
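A rough sketch of the arithmetic behind this sub-thread, assuming aggregate throughput is simply per-stream throughput times the number of concurrent streams. The 2600 and 213 tok/s figures are the ones quoted above; the 30 tok/s figure is only unrvl22's guess, and the function name is hypothetical.

```python
def implied_streams(aggregate_tps: float, per_stream_tps: float) -> float:
    """Concurrent streams implied by an aggregate rate and a per-stream rate."""
    if per_stream_tps <= 0:
        raise ValueError("per_stream_tps must be positive")
    return aggregate_tps / per_stream_tps


AGGREGATE_TPS = 2600.0       # aggregate figure quoted in the thread
SINGLE_STREAM_TPS = 213.0    # single-stream figure quoted in the thread
SATURATED_TPS_GUESS = 30.0   # a commenter's rough guess, not a measurement

streams_at_peak_rate = implied_streams(AGGREGATE_TPS, SINGLE_STREAM_TPS)
streams_at_guess = implied_streams(AGGREGATE_TPS, SATURATED_TPS_GUESS)

print(f"{streams_at_peak_rate:.1f} streams if every stream ran at 213 tok/s")
print(f"{streams_at_guess:.1f} streams if every stream ran at ~30 tok/s")
```

Under that assumption, the two per-stream figures imply very different levels of concurrency (roughly 12 versus roughly 87 streams), which is why the aggregate number alone says little about what a single user would see.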
3836293648 19 hours ago
So per subagent*.
alienbaby 17 hours ago
*per stream, I guess, is more accurate than either?
conorcleary 6 hours ago
*especially as many currencies weaken
johanvts 10 hours ago
That sounds literally impossible.
dtgriscom 5 hours ago
Agreed. The writer is pretty loose with their comparisons:

* What does it mean for "performance per dollar" to get faster? Higher, maybe; rise faster than it has in the past, maybe, but just "faster"? Nope.

* The article cites some equipment as being "2x cheaper". I think they mean "half the cost", but if so they should say it.

oDot 20 hours ago
Do these providers have 80+% gross margins or is something eating into them? Maybe utilization?
technoabsurdist 19 hours ago
Hi, I work at Wafer. No, the margins are lower, averaging about 40%. Utilization is one of the highest-order bits in determining margins here, yes.
adammarples 8 hours ago
Slight criticism of the headline there: you can't get cheaper per dollar.
hahahaa 11 hours ago
What is a knee, in performance talk?
kgwgk 11 hours ago
A place where the slope/derivative/incremental-performance-per-price changes.
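One way to make that definition concrete is a minimal sketch that picks the interior point where the incremental slope changes the most. The data below is invented purely for illustration (it is not from the article), and `find_knee` is a hypothetical helper, not an established library function.

```python
def find_knee(xs, ys):
    """Return the index of the interior point where the slope changes the most."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three (x, y) points")
    # Slope of each segment between consecutive points.
    slopes = [
        (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]
    # Absolute change in slope at each interior point.
    changes = [abs(slopes[i + 1] - slopes[i]) for i in range(len(slopes) - 1)]
    # changes[i] belongs to point i + 1, the point shared by segments i and i + 1.
    return changes.index(max(changes)) + 1


# Made-up example: throughput scales linearly with concurrency, then flattens.
concurrency = [1, 2, 4, 8, 16, 32]
throughput = [100, 200, 400, 800, 900, 950]

knee_index = find_knee(concurrency, throughput)
print(f"knee at concurrency={concurrency[knee_index]}")
```

With this toy data the slope is constant up to a concurrency of 8 and drops sharply afterwards, so that point is reported as the knee. Real curves are noisier, and a smoothed or curvature-based method would likely be needed.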
nnevatie 11 hours ago
I used to be high-performance like you, then I took an arrow to the knee?
alienbaby 17 hours ago
I'm interested whether anyone knows how much legwork the assumed 60% cache hit rate, plus running a quantised model, is doing, especially compared to what the headline half-implies is a full-fat GLM5.2.
ilaksh 5 hours ago
Can you actually rent an MI355X per hour anywhere right now?
killingtime74 17 hours ago
No word on what this actually means for a consumer. What's the price? Is it lower than NVIDIA serving?
mixtureoftakes 14 hours ago
They seem to be serving it at 3x the price while also struggling to maintain uptime on OpenRouter, while the Vercel router advertises even higher speeds but has no clear uptime stats.

I guess you really do have to try it for at least some time to actually know.

BurningFrog 5 hours ago
So... the headline is about performance per dollar per dollar?