Posted by mindcrime 21 hours ago
It has been a bit of a nightmare; I had to package like 30+ deps and their heavily customized LLVM, but I finally got the runtime to build this morning.
Things are looking bright for high-security workloads on AMD hardware, since they're working fully in the open, however much of a mess it may be.
It truly is a nightmare to build the whole thing. I got past the custom LLVM fork and a dozen other packages, but eventually decided it had been too much of a time sink.
I’m using llama.cpp with its Vulkan support and it's good enough for my uses. Vulkan is already there and just works. It's probably on your host too, since so many other things rely on it anyway.
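For anyone who wants the same route, the Vulkan build of llama.cpp is roughly this (a sketch from memory; the package names are Debian/Ubuntu ones and may differ on your distro):

    sudo apt install libvulkan-dev glslc              # Vulkan headers + shader compiler
    git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp
    cmake -B build -DGGML_VULKAN=ON
    cmake --build build --config Release -j
    ./build/bin/llama-cli -m model.gguf -ngl 99       # offload layers to the Vulkan device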
That said, I’d be curious to look at your build recipes. Maybe they can help power through the last bits of the Alpine port.
Agreed on the others though. Why's it even installing ncurses, surely that's just expected to be on the system?
There is, however, quite a large list of other issues that have been blocking builds on systems with somewhat more modern toolchains / OSes than whatever the target is here (Ubuntu 24.04, I suspect). I really want to be able to engage directly with TheRock & compile & run it natively on Ubuntu 25.04, and now Ubuntu 26.04 too. The people eager to use the amazing leading-edge capabilities TheRock offers will, I suspect, also be more bleeding-edge users with more up-to-date OS choices. They are currently very blocked.
I know it's not the intent at all. There's so much good work here that seems so close & so well considered, an epic work spanning so many libraries and drivers. But this mega thread of issues gives me such vibes of the bad awful no good Linux4Tegra, where it's really one bespoke special Linux that has to be used, that nothing else works. In this case you can download the tgz and it will probably work on your system, but that means you don't have any chance to improve or iterate or contribute to TheRock, that it's a consume only relationship, and that feels bad and is a dangerous spot to be in, not having usable source.
I'd really, really like to see AMD have CI test matrices we can see that show the state of the build on a variety of Linux OSes. This would provide the discipline and the trust that keep situations like the one we have here from arising. The status quo obviously cannot hold forever; Ubuntu 24.04 is not acceptable as a build machine in perpetuity, so these problems eventually have to be tackled, but what really needs to happen is a commitment to not making the build work on only one blessed image. This situation should not have developed; for TheRock to be accepted and useful, the build needs to work on a variety of systems. We need fixes right now to make that true, and AMD needs to show that their commitment to that goal is real, ideally by running a build matrix CI where we can see that it does compile.
You don't trust Nvidia because the drivers are closed source?
I think Nvidia's pledged to work on the open source drivers to bring them closer to the proprietary ones.
I'm hoping Intel can catch up; at 32GB of VRAM for around $1000 it's very accessible.
It's like running Windows in a VM and calling it an open source Windows system. The bootstrapping code is all open, but the code that's actually being executed is hidden away.
Intel has the same problem AMD has: everything is written for CUDA or other brand-specific APIs. Everything needs wrappers and workarounds to run before you can even start to compare performance.
https://developer.nvidia.com/blog/nvidia-transitions-fully-t...
Thus being open source isn't of much help without it.
For some workloads, the Arc Pro B70 actually does reasonably well when cached.
With some reasonable bring-up, it also seems to be more usable than the 32GB R9700.
I had to rebuild llama.cpp from source with the SYCL and CPU specific backends.
Started with a barebones Ubuntu Server 24 LTS install, used the HWE kernel, pulled in the Intel dependencies for hardware support/oneapi/libze, then built llama.cpp with the Intel compiler (icx?) for the SYCL and NATIVE backends (CPU specific support).
In short, built it based mostly on the Intel instructions.
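In case it helps anyone else, once the oneAPI bits are installed the SYCL build boils down to something like this (a sketch; check llama.cpp's SYCL docs for the exact flags for your setup):

    source /opt/intel/oneapi/setvars.sh               # puts icx/icpx and the Level Zero runtime in the env
    cmake -B build -DGGML_SYCL=ON -DGGML_NATIVE=ON \
          -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
    cmake --build build --config Release -j
    ./build/bin/llama-cli -m model.gguf -ngl 99       # layers offloaded to the Arc GPU via SYCL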
...I have a feeling you might not be at liberty to answer, but... Wat? The hell kind of "I must apparently resist Reflections on Trusting Trust" kind of workloads are you working on?
And what do you mean "binaries only built using a single compiler"? Like, how would that even work? Compile the .o's with compiler-specific suffixes then do a tortured linker invocation to mix different .o's into a combined library/ELF? Are we talking about mixing two different C compilers? The same compiler, two different bootstraps? A regular/cross mix?
I'm sorry if I'm pushing for too much detail, but as someone who's actually bootstrapped compilers/user spaces from source, your use case intrigues me just by the phrasing.
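(For what it's worth, the purely mechanical side of mixing objects from two compilers isn't exotic; something like the following links into one ELF. This is just an illustration of the question, not a guess at your actual workflow, and the file names are made up:)

    # hypothetical: build each translation unit with a different, independently bootstrapped compiler
    gcc   -c alpha.c -o alpha.gcc.o
    clang -c beta.c  -o beta.clang.o
    gcc alpha.gcc.o beta.clang.o -o mixed             # link the mixed objects into a single binary

What I'm curious about is which variant of that, if any, your threat model actually calls for.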
For information on stagex and how we do signed deterministic compiles across independently operated hardware see https://stagex.tools
Stagex is used by governments, fintech, blockchains, AI companies, and critical infrastructure all over the internet, so our threat model must assume at least one computer or maintainer is compromised at all times, and we cannot trust any third-party compiled code anywhere in the supply chain.
Beyond the fact they're competing with the most valuable companies in the world for talent while being less than a decade past "Bet the company"-level financial distress?
I would love AMD to be competitive. The entire industry would be better off if NVIDIA was less dominant. But AMD did this to themselves. One hundred percent.
according to public information NVIDIA started working on CUDA in 2004, before AMD made the ATI acquisition.
my suspicion is that back then ATI and NVIDIA had very different orientations. neither AMD nor ATI were ever really that serious about software. so in that sense i guess it was a match made in heaven.
so you have a cultural problem, which is bad enough, then you add in the lean years AMD spent in survival mode. forget growing the software team, they had to cling on to fewer people just to get through.
now they're playing catch-up in a cutthroat market that's moving at light speed compared to 20 years ago.
we're talking about a major fumble here, so it's easy to lose context and forget that things were a little more complex than they appeared.
oneAPI is much more than a plain old SYCL distribution, and still.
People like to talk about Apple CPUs, but keep forgetting they don't sell chips, and the overall desktop market is around 10% worldwide.
ARM is mostly about phones and tablets; good luck finally getting those Windows on ARM or GNU/Linux desktop cases or laptops.
Servers depend pretty much on which hyperscaler we are talking about.
RISC-V is still to be seen on the desktop, laptops, and servers.
Where AMD is doing great is game consoles.
maybe on some level, but not the level you're describing. pretty much everyone at AMD understands the situation, and has for a while.
Look where Apple Silicon managed going in the same time frame...
Because of this, I would never consider another AMD GPU for a long time. Gaming isn't everything I want my GPU doing. How do they keep screwing this up? Why isn't it their top priority?
Right now the entire source base is literally "throw a bunch of crap into the ROCm brand and hope it builds together" vs. some overarching architecture. Presumably the entire spend is also tied to "whatever big Co's evaluation needs this week" when it comes to developing with it.
Must have been by waiting for each of the 1000 complainers to die of old age, because I do not know what old hardware they have added support for.
ROCm Device Support Wishlist (205 points, 107 comments)
I'm not talking gaming CUDA, but CV and data science workloads seem to scale well on Arc and work well on Edge on Core Ultra 2/3.
I am critical of AMD for not fully supporting all GPUs based on RDNA1 and RDNA2. While more backwards compatibility is always better for the consumer, the RX 580 was a lightly-updated RX 480, which came out in 2016. Yes, ROCm technically came out in 2016 as well, but I don't mind acknowledging that supporting the GCN architecture is a different beast from the RDNA/CDNA generations that followed (Vega feels like it is off on an island of its own, and I don't even know what to say about it).
As cool as it would be to repurpose my RX 580, I am not at all surprised that GCN GPUs are not supported for new library versions in 2026.
I would be MUCH more annoyed if I had any RDNA1 GPU, or one of the poorly-supported RDNA2 GPUs.
It's not ideal. CUDA for comparison still supports Turing (two years older than RDNA 2) and if you drop down one version to CUDA 12 it has some support for Maxwell (~2014).
Especially when as you say, the latest generation is slow to gain support, while they are simultaneously dropping old generations, leaving you with a 1-2 year window of support.
The Vulkan backend of Ollama works fine for me, but it took a year or two for them to officially support it.
A few years ago I thought I had used the ROCm drivers/libraries with hashcat on an RX 580.
Now it’s obsolete?
(1) Supporting only server-grade hardware and ignoring laptop/consumer-grade GPUs/APUs for ROCm was a terrible strategic mistake.
A lot of developers experiment first and foremost on their personal laptops and scale up to expensive, professional-grade hardware later. In addition, some developers simply do not have the money to buy server-grade hardware.
By locking ROCm to server-grade GPUs only, you restrict the potential list of contributors to your OSS ROCm ecosystem to a few large AI users and a few HPC centers... meaning virtually nobody.
A much more sensible strategy would be to provide degraded performance for ROCm on top of consumer GPUs, and this is exactly what Nvidia is doing with CUDA.
This is changing, but you need to send a clear message there: EVERY newly released device should be properly supported by ROCm.
(2) Supporting only the last two generations of architecture is not what customers want to see.
https://rocm.docs.amd.com/projects/install-on-linux/en/docs-...
People with an existing GPU codebase invest a significant amount of effort to support ROCm.
Telling them two years later, "Sorry, you are out of updates now!" when the ecosystem is still unstable is unacceptable.
CUDA excels at backward compatibility. The fact that you ignore it entirely plays against you.
(3) Focusing exclusively on Triton and making HIP a second-class citizen is nonsensical.
AI might get all the buzz and the money right now, we get it.
It might look sensible on the surface to focus on Python-based, AI-focused tools like Triton, and supporting them is definitely necessary.
But there is a tremendous amount of code relying on C++ and C to run on GPUs (HPC, simulation, scientific computing, imaging, ...), and it will remain there for the multiple decades to come.
Ignoring that means losing, again, customers to CUDA.
It is pretty ironic to see a move like that considering that AMD GPUs currently tend to be highly competitive on FP64, meaning good for these kinds of applications. You are throwing away one of your own competitive advantages...
(4) Last but not least: please focus a bit on the packaging of your software solution.
There have been complaints about this for the last 5 years and not much has changed.
Working with distribution packagers and integrating with them does not cost much... This would currently give you a competitive advantage over Nvidia.
NVidia is acknowledging Python adoption, with cuTile and MLIR support for Python, allowing the same flexibility as C++, using Python directly even for kernels.
They seem to be supportive of having similar capabilities for Julia as well.
The IDE and graphical debugger integration, the library ecosystem, which now also has Python variants.
As someone who only follows GPGPU on the side, due to my interests in graphics programming, it is hard to understand how AMD and Intel keep failing to understand what CUDA, the whole ecosystem, is actually about.
Like, just take the schedule of a random GTC conference: how much of it can I reproduce on oneAPI or ROCm as of today?
NVIDIA is making the same mistake today by deprioritizing the release of consumer-grade GPUs with high VRAM in favour of focusing on server markets.
They already have a huge moat, so it's not as crippling for them to do so, but I think it presents an interesting opportunity for AMD to pick up the slack.
Then in a random Linux kernel update they changed the GPU driver. Trying to run ROCm now hard-crashed the GPU requiring a restart. People in the community figured out which patch introduced the problem, but years later... Still no fix or revert. You know, because it's officially unsupported.
So "Just use HSA_OVERRIDE_GFX_VERSION" is not a solution. You may buy hardware based on that today, and be left holding the bag tomorrow.