Posted by sabareesh 3 days ago
The story is interesting but it’s hard to read because it’s hard to tell which parts are meaningful and which parts are filler.
E.g. “we pulled the card cold - straight from the rig to the workbench”. Okay, but why would going straight from the rig to the workbench make it cold? If anything it would be warm. But it turns out the temperature is meaningful in your story.
The business of protecting individual power cords was handled by an Eaton PDU that had a 30A twist-lock plug on one side and a couple of rows of current-limited IEC C13 sockets on the other side.
Certainly on my panel the only "single outlet" breakers are hot water, AC, oven/stove, dryer.
Also, another place where you might already have this outlet: some older houses that used larger window AC units had 240V 20A outlets for them. Not common these days, but you can still buy these types of window AC units.
If you're gonna get rewired you may as well install a 240V circuit, and some 120V 20A sockets while you're at it.
I'm very close to just running a cord over or devising a way to put my machine closer to a second circuit, because my rental is horribly set up and both my bedroom AC and living room desktop (which also doubles as an ML training box) end up on the same circuit.
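For a rough sense of why that combination is a problem, here's a back-of-the-envelope check. It assumes a 15A, 120V circuit and the common rule of thumb of keeping continuous loads under 80% of the breaker rating. The wattages for the AC and the desktop are made-up placeholders, not measurements from the comment above.

```python
def circuit_headroom_watts(breaker_amps, volts, loads_watts, continuous_fraction=0.8):
    """Return the watts left over on a circuit (negative means overloaded).

    continuous_fraction=0.8 is the usual rule of thumb for continuous loads;
    treat it as an assumption and check your local code.
    """
    budget = breaker_amps * volts * continuous_fraction
    return budget - sum(loads_watts)


# Hypothetical loads: a window AC (~900 W) and a GPU desktop (~1000 W).
headroom = circuit_headroom_watts(15, 120, [900, 1000])
print(f"Headroom: {headroom:.0f} W")
```

A negative result means the two loads together exceed what the circuit should carry continuously, which is the argument for moving one of them to a second circuit.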
It would require an additional run of 14/2G Romex (12/2G for 20A) and a single-pole breaker, but it lets you skip cutting in an old work box to add a second duplex receptacle.
You could possibly replace the existing 14/2G with 14/4G which has enough conductors for both circuits.
The receptacle is the easy part, running the new circuit is the hard part.
Or you know, install a new 240V receptacle.
If I have to:
1) Run wire
2) Get a bigger breaker box
3) To do it legally, hire an electrician and maybe get a permit
Replacing the receptacle is like, <1% of what's involved there.
I'm air cooling, so I set -pl 450 to avoid running them all at the full 600W.
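In case it helps anyone, here's a minimal sketch of applying that cap to several cards. It only builds the `nvidia-smi -i <index> -pl <watts>` commands and does not run them. The GPU count of four is an assumption for illustration, and the helper names are hypothetical.

```python
def power_limit_commands(gpu_indices, watts):
    """Build one `nvidia-smi -i <index> -pl <watts>` command per GPU."""
    return [["nvidia-smi", "-i", str(i), "-pl", str(watts)] for i in gpu_indices]


def total_cap_watts(gpu_count, watts):
    """Combined GPU power ceiling once every card is capped."""
    return gpu_count * watts


# Assumed setup: four cards, indices 0-3, each capped at 450 W.
for cmd in power_limit_commands(range(4), 450):
    print(" ".join(cmd))
print(total_cap_watts(4, 450), "W total GPU cap, versus", total_cap_watts(4, 600), "W uncapped")
```

To actually apply the limits you would pass each command to `subprocess.run(cmd, check=True)`. Changing the power limit normally needs root, and the setting generally does not survive a reboot, so it has to be reapplied.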
It cost about £190 in 2006.
Now we have GPUs that cost tens of thousands of pounds and deliver insane performance, but what would their price be without the AI and datacentre squeeze?
Thanks for making me feel older now
Diablo 2 stopped lagging when a necromancer joined the game and summoned all the skeletons...
On an unrelated note, Path of Exile 1 still lags even on a 5090.
Hint: when you have a piece of metal stuck with thermal goop to a lot of components, the force doesn’t “concentrate” on one of them. You need to detach it from each one with however much force is needed to detach it from that component.
The trouble with this though is, what if that is not the only issue with the card? That’s normally my thought process on reaching for RMA. The unit could be an all-round lemon that should not have passed QA etc. (and as noted in the post itself, working for a week on various tasks is not enough to prove it good)
Something went wrong in manufacturing. The solder should have wicked to cover the entire pad, not just a small square, and there should be no (brown) discoloration.
The phrasing is very Claude-like:
"That cracked joint is the whole story. The card had passed initial bring-up and ran fine at light loads for a week."
"That sequencing matters — it’s why we have a story to tell. The pilot card failed, taught us a lesson, and the lesson is the reason the other three went on without incident."
"Driver swaps, CUDA reinstalls, and inference-engine theories were dead ends I spent hours on. The failure pattern itself told the story — listen to it earlier."
Stuff like "it's the whole story," "this part matters," and "it's not X" (when X wasn't ever under discussion to begin with).
They're like a bot characterizing to itself what is important, what is unimportant, or sometimes even arguing with itself. Their presentation seems like bits of the internal thinking mechanism leaking into the output queue.