Was my $48K GPU server worth it?

Posted by apwheele 3 days ago

(rosmine.ai)

92 points | 78 comments

tombert 2 hours ago||

I have four old 24gb Nvidia cards. They're not great but they're not useless either. The problem is that I haven't really figured out a good way to actually use them.

Genuine question; would anyone here recommend any specific motherboard to best utilize these cards?

mciancia 2 hours ago|

Depends what you want to do and which cards you have, but usually going with any older (3rd gen+) Threadripper Pro setup will give you a lot of PCIe lanes.

I run a Gigabyte TRX40 Aorus Xtreme myself, but since it's a regular Threadripper (not Pro), with 4 GPUs two of them run at x16 and two at x8 speeds.

amarant 1 hour ago||

The research that's presented in another article on the same site is way more interesting than the Betteridge's law article linked here. It'll be very useful in my own latest project if this research is incorporated into some model I can rent by the token!

jmyeet 2 hours ago||

So some things have changed since this rig was first built (2024). The most relevant is that the $6800 RTX 6000 Ada 48GB has arguably been supplanted by the $9500 RTX 6000 Pro 96GB.

The Ada has a memory bandwidth of 960GB/s. The Pro has 1.8TB/s and about 40-50% better performance, so it is at least equivalent in processing power, much better in memory bandwidth (important for inference), and can hold larger models on a single card.

I've considered buying a rig with 1-2 6000 Pros for similar reasons, but I want to see what happens with this year's Mac Studios with a likely M5 Ultra. Macs have a shared memory architecture, whereas Nvidia segments the market based on max memory: the biggest consumer card (RTX 5090) has 32GB of VRAM but still excellent memory bandwidth (1.8TB/s). The conventional wisdom seems to be that an RTX 5090 rig will still trounce a Mac Studio. Despite being able to hold larger models and being able to chain Mac Studios on TB5, their lower memory bandwidth (~900GB/s) and lower overall GFLOPS mean they still come out behind.
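To make the "bandwidth matters for inference" point concrete, here is a rough sketch. It assumes single-stream decoding is memory-bound, so every generated token has to read all the weights once, which gives a ceiling of bandwidth divided by weight size. The bandwidth figures are the ones quoted in the comment above; the 20 GB weight size is a hypothetical model picked so it fits on every card mentioned. Real throughput will be lower than these ceilings.

```python
# Rough ceiling on single-stream decode speed when memory-bound:
# each generated token reads every weight once, so
# tokens/s <= memory bandwidth / weight size.

def max_tokens_per_second(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on tokens/s; ignores compute, KV cache, and overhead."""
    return bandwidth_gb_s / weights_gb


# Assumption: a hypothetical model whose weights take 20 GB.
WEIGHTS_GB = 20.0

# Bandwidth figures (GB/s) as quoted in the comment above.
bandwidths = {
    "RTX 6000 Ada": 960.0,
    "RTX 6000 Pro / RTX 5090": 1800.0,
    "Mac Studio": 900.0,
}

for name, bw in bandwidths.items():
    ceiling = max_tokens_per_second(bw, WEIGHTS_GB)
    print(f"{name}: at most ~{ceiling:.0f} tokens/s")
```

Under these assumptions the ratio between cards is just the ratio of their bandwidths, which is why the 1.8TB/s parts come out at roughly twice the ceiling of the 900-960GB/s ones.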

That being said, the current Mac Studios are relatively long in the tooth, being released in 2024.

I'm still not sure any of this is really worth it because things are still changing so fast. I think there's a decent chance of a number of large AI companies going bust in the next 2-3 years such that you'll be able to buy enterprise AI hardware at cents on the dollar, a bit like how Google bought data centers in the post-dot-com crash.

But anyway, nowadays I'd be looking at the RTX 6000 Pro as the sweet spot, having anywhere from 1-4 in a single server.

The electrical issues the author mentions are interesting. I hadn't really thought about the max amperage on a residential circuit. In a DC, these would typically operate on three phase power and much higher overall amperage. I wonder if there's a device you can buy that can combine multiple residential circuits into a single power source for a server this power hungry?
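For a sense of the numbers behind the amperage problem, here is a back-of-the-envelope sketch. It assumes US wiring, the common rule of thumb that a continuous load should stay at or below 80% of the breaker rating, 300W per RTX 6000 Ada, and a hypothetical 400W for the rest of the system. None of these figures come from the article itself, so treat them as placeholders.

```python
# Back-of-the-envelope check of a GPU server against household circuits.
# Assumptions (not from the article): US voltages, an 80% continuous-load
# derating, 300 W per GPU, and a hypothetical 400 W for CPU, fans, drives.

def continuous_watts(volts: float, breaker_amps: float, derate: float = 0.8) -> float:
    """Sustained power a circuit can supply under the derating rule."""
    return volts * breaker_amps * derate


GPU_COUNT = 6
GPU_WATTS = 300.0          # assumption: per-card board power
REST_OF_SYSTEM_WATTS = 400.0  # hypothetical

server_watts = GPU_COUNT * GPU_WATTS + REST_OF_SYSTEM_WATTS

circuits = {
    "120 V / 15 A outlet": continuous_watts(120, 15),
    "120 V / 20 A outlet": continuous_watts(120, 20),
    "240 V / 30 A dryer circuit": continuous_watts(240, 30),
}

print(f"Estimated server draw: {server_watts:.0f} W")
for name, budget in circuits.items():
    verdict = "enough" if budget >= server_watts else "not enough"
    print(f"{name}: {budget:.0f} W continuous, {verdict}")
```

With these placeholder numbers a single 120V outlet falls short, which is consistent with the author having to split the rig across two circuits.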

freediddy 2 hours ago||

I have the MacBook M5 Max with 128 GB of RAM. I put its performance at roughly equivalent to the RTX 5070 Ti. The M3 Ultra 512 GB for me is about half the performance of the RTX 5070 Ti, but obviously it has the ability to do more because of the increased memory.

I don't think anything compares to the nVidia chips at all.

nextos 2 hours ago|||

I am also considering buying 3-4x RTX 6000 Pro 96GB plus some Ryzen workstation with a grant.

Is this the best general-purpose choice as of 2026 with $50k for training, fine-tuning and running large open models?

trevithick 2 hours ago||

You would install a 240v circuit (in the US) like for an electric clothes dryer.

Edit: I now see the author was in an apartment and couldn't do this, so I concede this is not responsive here.

doctorpangloss 2 hours ago||

> Because of this I got a motherboard with slow GPU interconnect. It’s good for running many small experiments in parallel (which is my main use case) but horrible for any models split across gpus.

:( you paid a professional pc builder and you weren't told this?

shout5 58 minutes ago||

> paid a professional pc builder

They did not. That's a mining rig not a workstation. It's visible from the photo and the chart showing multiple failures over a short period of time including the risers -- which are visibly very low quality -- failing twice.

You have 50K, you call a real expert like Puget Systems or Digital Storm.

mciancia 2 hours ago|||

I wonder why using 2 PSUs resulted in having slower interconnect.

There are no specs in this blog post regarding CPU/motherboard choice, but Threadripper Pro has had 128 PCIe lanes for some time now, so running all GPUs at full speed shouldn't be a problem.
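A quick lane-budget sketch of the point above. The 128-lane figure and the 6-GPU count come from this thread; the per-lane bandwidth assumes PCIe 4.0 (16 GT/s with 128b/130b encoding), which may not match the actual platform, and it ignores lanes taken by NVMe drives or the chipset.

```python
# Lane budget and rough link bandwidth for a multi-GPU build.
# Assumption: PCIe 4.0, i.e. 16 GT/s per lane with 128b/130b encoding.

PCIE4_GB_S_PER_LANE = 16 * (128 / 130) / 8  # GB/s per lane, per direction


def lanes_needed(gpu_count: int, lanes_per_gpu: int = 16) -> int:
    """Total CPU lanes required to give every GPU a full-width link."""
    return gpu_count * lanes_per_gpu


def link_bandwidth_gb_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth of a link with `lanes` lanes."""
    return lanes * PCIE4_GB_S_PER_LANE


CPU_LANES = 128  # figure quoted in the comment above
needed = lanes_needed(6)
print(f"6 GPUs at x16 need {needed} of {CPU_LANES} lanes; fits: {needed <= CPU_LANES}")

for lanes in (16, 8):
    print(f"x{lanes} link: ~{link_bandwidth_gb_s(lanes):.1f} GB/s per direction")
```

The second loop shows why dropping a card from x16 to x8 halves its theoretical host bandwidth.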

zozbot234 2 hours ago|||

If you split models using pipeline/layer parallelism you don't have to care about a slow interconnect, you're just slowed down a lot when running a single inference at a time as opposed to a fully pipelined minibatch. But tensor parallelism requires much faster interconnects than you could get in your average server, so I'm not sure that a different motherboard would help all that much.
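The "slowed down a lot when running a single inference" part can be sketched with the usual idealized pipeline model. It assumes every stage takes the same time and that transfers between stages are free, so a run of m microbatches through p stages takes m + p - 1 steps and each stage is busy for m of them. The 6-stage figure is an assumption (one stage per GPU in a 6-GPU box).

```python
# Idealized pipeline-parallel utilization.
# Assumptions: equal time per stage, zero transfer cost between stages.

def pipeline_utilization(stages: int, microbatches: int) -> float:
    """Fraction of time each stage is busy.

    The whole run takes (microbatches + stages - 1) steps, and each stage
    does useful work during `microbatches` of them.
    """
    return microbatches / (microbatches + stages - 1)


STAGES = 6  # assumption: one pipeline stage per GPU

for m in (1, 4, 24):
    busy = pipeline_utilization(STAGES, m)
    print(f"{m:>2} microbatches in flight: each GPU busy {busy:.0%} of the time")
```

With a single request in flight each GPU is busy only about one sixth of the time in this model; keeping the pipeline full is what recovers the throughput, and none of it depends on a fast interconnect.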

m-hodges 2 hours ago|||

what is a "professional pc builder" in 2026

ok_dad 2 hours ago||

A guy on Facebook with more confidence and better insurance

CamperBob2 2 hours ago|||

Consumer motherboards can still make sense even if you leave some performance on the table. Running an actual 8x GPU server is not something you'd want to do in an apartment. Imagine the old Lucasfilm "THX" trailer where an unearthly-sounding foghorn whine rises to a sweeping crescendo at reference level, only without the decay at the end.

At the time he put this rig together, there weren't a lot of open-weight LLMs that could run well on 6x48=288 GB, so it probably wasn't a huge loss. There still aren't, really.
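One way to sanity-check what fits in 6x48=288 GB is to count weight bytes only. This is a sketch: the parameter counts below are illustrative sizes, not a claim about any specific model, and it ignores KV cache, activations, and framework overhead, all of which eat into the budget.

```python
# Weight-only memory footprint versus total VRAM.
# Ignores KV cache, activations, and runtime overhead.

def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """Size of the weights in GB for a given parameter count and precision."""
    return params_billion * bits_per_param / 8


TOTAL_VRAM_GB = 6 * 48  # the 6x48=288 GB rig discussed above

# Illustrative parameter counts (in billions); assumptions for the example.
for params in (70, 405, 671):
    for bits in (16, 8, 4):
        size = weights_gb(params, bits)
        verdict = "fits" if size <= TOTAL_VRAM_GB else "does not fit"
        print(f"{params}B at {bits}-bit: {size:.1f} GB, {verdict} in {TOTAL_VRAM_GB} GB")
```

Even where the weights fit, the headroom left for context is what decides whether a model runs well rather than merely loads.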

Right now I'm in the process of cramming Blackwell cards into an old DDR4-based Milan server, where the important thing is to be able to run large models at all. The GPU fans alone burn over 400 watts at full throttle.

storus 2 hours ago||

Did you think about Max-Q cards? 300W and they aren't that noisy either, with 14% lower perf than the non-Max-Q card.

CamperBob2 2 hours ago||

That was an option, but having decided on a true server chassis for other reasons, it made sense to use server-edition cards to take advantage of all those fans. I downclock them to 300W anyway for longevity, but it's nice to have the option to go to 600W if needed.

The server is going to live in the garage, so I'm not that concerned with noise. But I had no idea what to expect when I flipped the switch for the first time. It sounds like something out of the Book of Revelation. No way, no how could something like this be used in an inhabited area.

ginko 2 hours ago|||

Don't those Ada 6000 GPUs support NVLink? I think I can even see the cover for the connectors in OP's pic.

edit: Hm, finding mixed information online on whether that's still supported or not. Apparently it was removed in workstation GPUs.

mciancia 2 hours ago|||

Nope, they don't support it. And afair even if they did, you would be limited to connecting only in pairs, not all 6 together

ryandrake 52 minutes ago||

Honestly, I made the same mistake when I added a GPU to my (not $48K) existing homelab. I got an Ada 4000 for its slim form factor and low wattage, but realized after I bought it that it does not support NVLink, so I can't really effectively double it up later if I wanted to. Live and learn. I suppose you might research that a little before blowing that much money though LOL :)

pelasaco 2 hours ago||

out of curiosity, did you check how much it would cost to rent a cage in a colocation space? Having to power your computer from two different outlets sounds wild..

forsalebypwner 52 minutes ago|

the very last line of the article:

"If I were to do this again, I wouldn’t do a custom build like this. I would buy a standard datacenter server and rent space in a colocation center. But then I would miss saying Hi to grumbl once in a while."

pelasaco 37 minutes ago||

Yes, I mean, he could rent a cage and run grumbl there. It doesn't have to be a standard datacenter server, even though a standard datacenter server would be better and cheaper.

gosub100 2 hours ago|

It doesn't cover risk. If one or more GPUs die, who pays for it? If you rent, you are guaranteed to be insulated from this risk. But when owning, you might not have the best return policy from the vendor. And if you are actually at fault for breaking it, they have every right to deny a return. Or if your apartment is burglarized or catches fire (possibly from overloading the circuit), you are out the entire investment.

0xbadcafebee 2 hours ago||

Also a lightning strike or surge from the electric utility could fry the whole rig. Proper protection costs thousands, and even then it's not guaranteed to protect everything
