Posted by secure 6 days ago
For both the cooler and the motherboard, AMD have too much control to look the other way. The chip can measure its own temperature and the conceit of undermining partners by moving things on chip and controlling more of the ecosystem is that things perform better. They should at least perform.
The cooler was under the rated TDP of the platform. That, and it lasted 6 months, and so far this seems to be the only case of it falling over like it did.
Yeah, I'm leaning toward it being user error.
I also find that, as performance improves and tolerances get tighter throughout the system, the set of 'things that can screw your build' grows bigger.
The problem is, it's a huge effort to get there. You really have to tune PBO curves for each core individually, as they can vary so much between cores.
Now the test itself is mostly automatic with tools like OCCT, but of course you have to change the settings in the BIOS between each test and you cannot use the computer during that time, so there's a huge opportunity cost. I'm talking about weeks, not days.
To cut a long story short, I sold the system and just bought an M4 Max Mac Studio. Apple Silicon might not have the top performance of AMD or Intel, but it comes with far fewer headaches and much less opportunity cost, which in the end probably equalizes the difference in purchase cost.
If anyone thinks competition isn't good for the market or that also-rans don't have enough of an effect, just take note. Intel is a cautionary tale. I do agree we would have gotten where we are faster with more viable competitors.
M4 is neat. I won't be shocked if x86 finally gives up the ghost as Intel decides playing in the RISC-V or ARM space is their only hope to get back into an up-cycle. AMD has wanted to do heterogeneous stuff for years. RISC-V might be the way.
One thing I'm finding is that compilers are actually leaving a ton on the table for AMD chips, so I think this is an area where AMD and all of the users, from SMEs on down, can benefit tremendously from cooperatively financing the necessary software to make it happen.
Secondly, what BIOS settings should I be using to run safely? Is XMP/whatever the AMD equivalent is safe? If I don't run XMP then my RAM runs at way below spec (for the stick) default speeds.
Anyone know of a good guide for this stuff?
Maybe the situation is better on DDR5 platforms.
Yet I also use a 7840U in a gaming handheld running Windows, and haven't had any issues there at all. So I think this is related to AMD Linux drivers and/or Wayland. In contrast, my old laptop with an NVIDIA GPU and Xorg has given me zero issues for about a decade now.
So I've decided to just avoid AMD on Linux on my next machine. Intel's upcoming Panther Lake and Nova Lake CPUs seem promising, and their integrated graphics have consistently been improving. I don't think AMD's dominance will continue for much longer.
Make sure it matches the lower of the actual rated spec of the RAM that you bought and what the CPU can do.
I used to get crashes like you are describing on a similar machine. The crashes are in the GPU firmware, making debugging a bit of a crap shoot. If you can run Windows with the crashing workload on it, you’ll probably find it crashes the same way as on Linux.
For me, it was a BIOS bug that underclocked the RAM. Memory tests, etc. passed.
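To make the "min of the two" rule concrete, here is a minimal sketch. The function name and the example speeds are made up for illustration; look up the real numbers on the RAM kit's spec sheet and the CPU's official memory support page.

```python
def safe_memory_speed(ram_rated_mts: int, cpu_supported_mts: int) -> int:
    """Return the lower of the two ratings, in MT/s.

    Hypothetical helper: it only encodes the rule of thumb from the
    comment above, it does not query any hardware.
    """
    return min(ram_rated_mts, cpu_supported_mts)


# Illustrative numbers only, not taken from any spec sheet.
print(safe_memory_speed(ram_rated_mts=6000, cpu_supported_mts=5200))
```

If the BIOS reports a speed that differs from this value in either direction, that is the setting to look at first.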
I suspect there are hard performance deadlines in the GPU stack, and the underclocked memory was causing it to miss them, and assume a hang.
If the RAM frequency looks OK, check all the hardware configuration knobs you can think of. Something probably auto-detected wrong.
But I'll play around with this and the timings, and check if there's a BIOS update that addresses this. Though I still think that AMD's drivers and firmware should be robust enough to support any RAM configuration (within reason), so it would be a problem for them to resolve regardless.
Thanks for the suggestion!
That gave me solid ground for debugging.
Don't know about transcoding though.
Threadripper is built for this. But I am talking about the consumer options if you are on a budget. Intel has significantly more memory bandwidth than AMD in the consumer end. I don't have the numbers on hand, but someone at /r/localllama did a comparison a while ago.
I can't see how that supports your conclusion.
> AMD 7900X - 68.9 GB/sec
> Intel 13900K - 93.4 GB/sec
That's 35% better.
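For anyone who wants to check the arithmetic, a quick sketch using only the two figures quoted above (the function name is just for illustration):

```python
def percent_faster(baseline: float, other: float) -> float:
    """How much larger `other` is than `baseline`, in percent."""
    return (other / baseline - 1.0) * 100.0


amd_7900x_gbs = 68.9      # GB/sec, as quoted above
intel_13900k_gbs = 93.4   # GB/sec, as quoted above

print(f"{percent_faster(amd_7900x_gbs, intel_13900k_gbs):.1f}%")
```

That comes out to roughly 35.6%, so "35% better" is in the right ballpark.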
I had differences of like 20 or more between different cores... i.e. one core might work fine at -20, the other maybe only at +5.
And while all-core CO might not be optimal, based on personal experience and what I've seen across multiple enthusiast communities, more often than not you can get a worthwhile improvement to temps/perf with an all-core CO.
That being said, there are certainly ways to find and set the best CO values per core, but it will certainly take more effort, stress testing and time.
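A rough sketch of what a per-core search looks like as a procedure. Everything here is hypothetical: `is_stable` stands in for "set this offset in the BIOS, reboot, and run a stress test on that core", which in reality is a manual step that takes hours, not a function call. The range and step size are assumptions too.

```python
from typing import Callable, Dict, Iterable, Optional


def find_per_core_offsets(
    cores: Iterable[int],
    is_stable: Callable[[int, int], bool],
    start: int = -30,
    stop: int = 10,
    step: int = 5,
) -> Dict[int, Optional[int]]:
    """For each core, walk from the most aggressive (most negative) offset
    toward safer ones and keep the first offset that passes is_stable."""
    result: Dict[int, Optional[int]] = {}
    for core in cores:
        chosen: Optional[int] = None
        for offset in range(start, stop + 1, step):
            if is_stable(core, offset):
                chosen = offset
                break
        result[core] = chosen  # None means nothing in the range passed
    return result


# Fake stability model for the demo: each core has a lowest offset it tolerates.
fake_limits = {0: -20, 1: 5}


def fake_is_stable(core: int, offset: int) -> bool:
    return offset >= fake_limits[core]


print(find_per_core_offsets([0, 1], fake_is_stable))
```

With the fake limits above, core 0 ends up at -20 and core 1 at +5. The number of `is_stable` calls is where the time goes: every call is a settings change plus a long test run, repeated per core.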
> After switching my PC from Intel to AMD, I end up at 10-11 kWh per day.
It's kind of impressive to increase household electricity consumption by 10% by just switching one CPU.
For a time I ran it 24/7 without suspend. It's a big system, lots of disks, expansion cards, etc. If it doesn't suspend, and doesn't do anything remarkable, it uses about 5 kWh per day. Needless to say, it suspends after 60 minutes now (my daily energy usage went from ~9 to ~4 kWh).
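To put the daily figure in terms of wall power, a quick conversion, assuming the draw is roughly constant over the day (the function name is just for illustration):

```python
def average_watts(kwh_per_day: float) -> float:
    """Average power draw in watts for a given daily energy use."""
    return kwh_per_day * 1000.0 / 24.0


print(round(average_watts(5.0)))  # roughly 208 W around the clock
```

So about 5 kWh per day is a machine sitting at a bit over 200 W on average.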
[1]: https://en.wikipedia.org/wiki/European_countries_by_electric...
I recently hit this testing pre-release kernels on my gaming PC, a 9900X3D: https://lore.kernel.org/lkml/20250623083408.jTiJiC6_@linutro...
A pile of older Skylake machines was never able to reproduce that bug one single time in 100+ hours of running the same workload. The fast new AMD chips would almost always hit it in a few hours.