Posted by albelfio 18 hours ago
I'm almost sure it's possible to custom-build a machine as powerful as their red v2 within a $9k budget. And have a lot of fun along the way.
So, context is probably worth more $/programming than inference speed.
$12,000, $65,000, $10,000,000.
the town near my hometown has 650 – 800 houses (according to chatgpt).
crazy.
A typical home consumes rather little energy, now that LED lighting and heat-pump heating/cooling have become the norm.
We're not all solidly middle-class (especially in Southern and Eastern Europe) and as such we cannot afford those heat pumps. But we'll have to eat the increased energy costs brought by insane server configurations like the ones from the article, so, yeey!!!
My brother in Christ, you vastly overestimate southern europe
Do you live in a deprived rural village in a very poor country? Because you can't even run a heater and the oven with 3kW.
Most power contracts give you a 3 kW supply for a residential home. That's the standard.
Bumping to 4.5 or 6 kW must be requested explicitly and costs extra on the base power supply bill
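Back-of-the-envelope sketch of the point above: whether a multi-GPU inference box even fits under a standard 3 kW residential contract. All the wattage figures here are my own rough assumptions, not numbers from the article.

```python
# Hypothetical power budget for a 6-GPU rig on a 3 kW residential contract.
CONTRACT_KW = 3.0          # standard residential supply mentioned above
GPU_WATTS = 350            # assumed per-GPU draw under load (my guess)
NUM_GPUS = 6
SYSTEM_OVERHEAD_W = 300    # CPU, fans, PSU losses (rough estimate)

rig_kw = (GPU_WATTS * NUM_GPUS + SYSTEM_OVERHEAD_W) / 1000
print(f"rig draw: {rig_kw:.1f} kW")                                  # 2.4 kW
print(f"headroom on the contract: {CONTRACT_KW - rig_kw:.1f} kW")    # 0.6 kW
```

Under these assumptions the rig alone eats most of the contract, leaving almost nothing for the heater and the oven, which is the complaint upthread.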
With 6 GPUs you have to deal with risers, PCIe retimers, dual PSUs and a custom case, so the value proposition there was much better IMO
I'm currently shopping for offline hardware and it is very hard to estimate the performance I will get before dropping $12K. I would love to have a baseline I can always count on out of the box, e.g. 40 tok/s running GPT-OSS-120B with Ollama on Ubuntu.
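For measuring that kind of baseline yourself: Ollama's `/api/generate` response includes `eval_count` (tokens generated) and `eval_duration` (in nanoseconds), which is enough to compute a decode tok/s figure. The sample response below is fabricated for illustration.

```python
# Compute decode throughput from Ollama's timing fields.
def decode_tok_per_s(response: dict) -> float:
    """Generated tokens per second from an Ollama /api/generate response."""
    return response["eval_count"] / (response["eval_duration"] / 1e9)

# Made-up sample: 400 tokens generated in 10 seconds.
sample = {"eval_count": 400, "eval_duration": 10_000_000_000}
print(f"{decode_tok_per_s(sample):.1f} tok/s")  # 40.0 tok/s
```

Running `ollama run <model> --verbose` prints the same timing breakdown interactively, so you can sanity-check a box against a 40 tok/s target without any scripting.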
"Likely" doesn't inspire much confidence. Surely they have those numbers, and if they were favorable, they'd publicize the comparisons.
Can they/someone else give more details on which workloads PyTorch is more than 2x slower than what the hardware provides? Most papers use standard components, and I assume PyTorch already implements them at 50+% of the extractable performance of typical GPUs.
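To make the 50% vs 2x-slower claim concrete, here is a trivial utilization calculation. The TFLOPS numbers are invented for illustration, not measurements of any real chip or framework.

```python
# Fraction of a chip's peak throughput a framework actually sustains.
def utilization(achieved_tflops: float, peak_tflops: float) -> float:
    return achieved_tflops / peak_tflops

# Hypothetical: 500 TFLOPS sustained on a 1000 TFLOPS dense peak is 50%
# utilization; a "PyTorch is 2x slower" claim would put it nearer 25%.
print(f"{utilization(500, 1000):.0%}")  # 50%
print(f"{utilization(250, 1000):.0%}")  # 25%
```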
If they mean more esoteric stuff that requires writing custom kernels to get good performance out of the chips, then that's a different issue.
Not revolutionary in any way, but nice. Unless I'm missing something here?