Posted by JonChesterfield 11/12/2025

Why CUDA translation wont unlock AMD (eliovp.com)
88 points | 81 comments | page 2
doctorpangloss 11/20/2025|
All they have to do is release air cooled 96GB GDDR7 PCIe5 boards with 4x Infinity Link, and charge $1,900 for it.
musicale 11/22/2025||
If you can run PyTorch well, isn't that good enough for a lot of people?
jmward01 11/20/2025||
Right now we need diversity in the ecosystem. AMD is finally getting mature, and hopefully that will lead to them truly getting a second, strong opinion into the ecosystem. The friction this article talks about is needed to push new ideas.
buggyworld 11/20/2025||
[flagged]
KetoManx64 11/20/2025||
Which LLM did you use to write this?
throwaway31131 11/20/2025||
I don't think this was the point of the post at all.

Their bottom line summed it up perfectly.

"We’re not saying “never use CUDA-on-AMD compilers or CUDA-to-HIP translators”. We’re saying don’t judge AMD based on them."

measurablefunc 11/20/2025||
[flagged]
jsheard 11/20/2025||
The article is literally about how rote translation of CUDA code to AMD hardware will always give sub-par performance. Even if you wrangled an AI into doing the grunt work for you, porting heavily NV-tuned code to not-NV hardware would still be a losing strategy.
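
To make that concrete, here is a minimal, hypothetical sketch (not from the article or any real codebase): a block-sum reduction written the way a lot of NV-tuned CUDA is, with the warp size hard-coded to 32. A rote HIP port of this will typically still compile and run, but AMD's CDNA wavefronts are 64 lanes wide, so the shuffle width, the shared-memory layout, and the "one partial per warp" structure are all tuned for the wrong machine.

    // Hypothetical example: a reduction with the NVIDIA warp size baked in.
    // A hipify-style 1:1 translation keeps every one of these 32s on hardware
    // whose wavefronts are 64 lanes wide.
    #include <cstdio>
    #include <vector>

    constexpr int WARP = 32;  // NV assumption carried along by a rote port

    __global__ void blockSum(const float* in, float* out, int n) {
        float v = 0.0f;
        for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
             i += gridDim.x * blockDim.x)
            v += in[i];

        // Warp-level tree reduction; shuffle width chosen for 32-lane warps.
        for (int off = WARP / 2; off > 0; off >>= 1)
            v += __shfl_down_sync(0xffffffffu, v, off, WARP);

        // One partial per 32-thread warp; on 64-wide wavefronts this granularity
        // (and the shared-memory sizing) is tuned for the wrong hardware.
        __shared__ float partial[1024 / WARP];
        if (threadIdx.x % WARP == 0) partial[threadIdx.x / WARP] = v;
        __syncthreads();

        if (threadIdx.x == 0) {
            float s = 0.0f;
            for (int w = 0; w < (int)blockDim.x / WARP; ++w) s += partial[w];
            atomicAdd(out, s);
        }
    }

    int main() {
        const int n = 1 << 20;
        std::vector<float> h(n, 1.0f);
        float *d_in, *d_out, zero = 0.0f, result = 0.0f;
        cudaMalloc(&d_in, n * sizeof(float));
        cudaMalloc(&d_out, sizeof(float));
        cudaMemcpy(d_in, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(d_out, &zero, sizeof(float), cudaMemcpyHostToDevice);
        blockSum<<<256, 256>>>(d_in, d_out, n);
        cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
        printf("sum = %.0f (expected %d)\n", result, n);
        cudaFree(d_in);
        cudaFree(d_out);
        return 0;
    }

Getting it to run isn't the hard part; re-deriving the tuning choices for the other vendor's hardware is.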
measurablefunc 11/20/2025||
The point of AI is that it is not a rote translation & 1:1 mapping.
jsheard 11/20/2025||
> Take the ROCm specification, take your CUDA codebase, let one of the agentic AIs translate it all into ROCm

...sounds like asking for a 1:1 mapping to me. If you meant asking the AI to transmute the code from NV-optimal to AMD-optimal as it goes along, you could certainly try doing that, but the idea is nothing more than AI fanfic until someone shows it actually working.

measurablefunc 11/20/2025||
Now that I have clarified the point about AI optimizing the code from CUDA to fit AMD's runtime, what is your contention about the possibility of such a translation?
bigyabai 11/20/2025||
There is an old programmer's joke about writing abstractions and expecting zero-cost.
measurablefunc 11/20/2025||
How does that apply in this case? The whole point is that the agentic AI/AGI skips all the abstractions & writes optimized low-level code for each GPU vendor from a high-level specification. There are no abstractions other than whatever specifications GPU vendors provide for their hardware which are fed into the agentic AI/AGI to do the necessary work of creating low-level & optimized code for specific tasks.
cbarrick 11/20/2025|||
Has this been done successfully at scale?

There's a lot of handwaving in this "just use AI" approach. You have to figure out a way to guarantee correctness.

measurablefunc 11/20/2025||
There are tons of test suites, so if the tests pass then that provides a reasonable guarantee of correctness. Although it would be nice if there were also a proof of correctness for the compilation from CUDA to AMD.
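
For a single kernel, "the tests pass" might look something like this (a hypothetical harness, not any existing suite): run the translated kernel, run a plain host reference, and compare within a floating-point tolerance, since a re-tuned kernel reassociates the math and bit-exact equality is the wrong oracle.

    // Hypothetical correctness check for one translated kernel; the
    // runTranslatedKernel() call is a stand-in for launching the ported code.
    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Reference result computed on the host in double precision.
    double cpuSum(const std::vector<float>& x) {
        double s = 0.0;
        for (float v : x) s += v;
        return s;
    }

    // Relative tolerance rather than bit-exact equality: a re-tuned kernel will
    // reorder the floating-point operations and still be correct.
    bool closeEnough(double got, double want, double relTol = 1e-5) {
        return std::fabs(got - want) <= relTol * std::fabs(want);
    }

    int main() {
        std::vector<float> x(1 << 20, 0.5f);
        double want = cpuSum(x);
        // double got = runTranslatedKernel(x);  // hypothetical device launch
        double got = want;                       // stubbed so the sketch runs standalone
        printf("%s: got %.6f, want %.6f\n",
               closeEnough(got, want) ? "PASS" : "FAIL", got, want);
        return closeEnough(got, want) ? 0 : 1;
    }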
bee_rider 11/20/2025|||
The AI is too busy making Ghibli profile pictures or whatever the thing is now.

We asked it to make a plan for how to fix the situation, but it got stuck.

“Ok, I’m helping the people build an AI to translate NVIDIA codes to AMD”

“I don’t have enough resources”

“Simple, I’ll just use AMD chips to run an AI code translator, they are under-utilized. I’ll make a step by step process to do so”

“Step 1: get code kernels for the AMD chips”

And so on.

measurablefunc 11/20/2025||
The real question is whether it will be as unprofitable to do this type of automated runtime translation from one GPU vendor to another as it is to generate Mario clips & Ghibli images.
j16sdiz 11/20/2025|||
The same as "Why not just outsource it to <some country>?"

AI ain't magic.

You need more effort to manage, test, and validate that.

measurablefunc 11/20/2025||
[flagged]
j16sdiz 11/20/2025|||
So, your strategy for solving this is: convert it to another, harder problem (AGI). Now it is somebody else's problem (the AI researchers').

This is outsourcing the task to AI researchers.

measurablefunc 11/20/2025||
They keep promising that this kind of capability is right around the corner, and they keep showing how awesome they are at passing math exams, so why is this a more difficult problem than solving problems in abstract algebra & scheme theory on Humanity's Last Exam or whatever is the latest & greatest benchmark for mathematical capabilities?
Daedren 11/20/2025||
They all have to make promises and have to dream big to keep the AI bubble from popping.
measurablefunc 11/20/2025||
I agree which is why it's a bit odd that so many people still think that Sam Altman & Elon Musk are honest technologists instead of unscrupulous grifters.
j16sdiz 11/20/2025||||
I am not saying this is impossible, but I am down voting this because this is _not an interesting discussion_.

The whole point of having an online discussion forum is to exchange and create new ideas. What you are advocating is essentially "maybe we can stop generating new ideas because we don't have to; we should just sit and wait"... Well, yes, no, maybe. But this is not what I expect to get from here.

measurablefunc 11/20/2025||
You can do whatever you want, & I didn't ask you to participate in my thread, so unless you are going to address the actual points I'm making instead of telling me it is not interesting, we don't have anything to discuss further.
nutjob2 11/20/2025|||
> Isn't AGI around the corner?

There isn't even a concrete definition of intelligence, let alone AGI, so no, it's not.

That's just mindless hype at this point.

measurablefunc 11/20/2025||
[flagged]
colonCapitalDee 11/20/2025|||
No. This is far beyond the capabilities of current AI, and will remain so for the foreseeable future. You could let your model of choice churn on this for months, and you will not get anywhere. It will be able to reach a somewhat working solution quickly, but it will soon reach a point where for every issue it fixes, it introduces one or more issues or regressions. LLMs are simply not capable of scaffolding complexity like a human, and lack the clarity and rigorousness of thought required to execute an *extremely* ambitious project like performant CUDA to ROCm translation.
impossiblefork 11/20/2025|||
I don't think it really is, especially not if it's turned into a system, with multiple prompts, verification, etc.

Humans have problems with IMO problems, and this kind of kernel translation is a problem which is easier for humans, where there's probably actually more data, and a problem where the system can get feedback by simply running it and measuring memory use, runtime, etc. (sketched below).

It'd be a system and no one has developed it, but I think it can be done with present LLMs as a core mechanism. They just need to be trained with RL on this specific problem.

Anyone with a good LLM, from Google to Mistral could probably do this, but it'd be a project.
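
A minimal sketch of that feedback signal (hypothetical, not an existing tool): launch a candidate kernel, time it with device events, and hand the number back to whatever proposes the next variant. The same code should build as HIP with the usual cuda-to-hip renames.

    // Hypothetical timing harness: the runtime of a candidate kernel is the
    // feedback signal (lower is better) for whatever generates the next variant.
    #include <cstdio>

    __global__ void candidate(float* x, int n) {
        // Stand-in for a generated kernel variant.
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] = x[i] * 2.0f + 1.0f;
    }

    float timeCandidateMs(float* d_x, int n) {
        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);
        cudaEventRecord(start);
        candidate<<<(n + 255) / 256, 256>>>(d_x, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        return ms;
    }

    int main() {
        const int n = 1 << 22;
        float* d_x;
        cudaMalloc(&d_x, n * sizeof(float));
        cudaMemset(d_x, 0, n * sizeof(float));
        printf("candidate kernel: %.3f ms\n", timeCandidateMs(d_x, n));
        cudaFree(d_x);
        return 0;
    }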

measurablefunc 11/20/2025|||
[flagged]
colonCapitalDee 11/20/2025|||
Well, that's your problem. Here's a tip: just because someone says something doesn't mean you have to listen to them.
measurablefunc 11/20/2025||
[flagged]
bigyabai 11/20/2025|||
This explains everything.
imtringued 11/20/2025|||
The AI needs a mental model of the hardware for that to work.
measurablefunc 11/20/2025||
Algorithms do not have mental models of anything.
Blackthorn 11/20/2025|||
I don't know why you're being downvoted, because even if you're Not Even Wrong, that's exactly the sort of thing that people trying to sell AI have endlessly presented as something AI will absolutely do for us.
measurablefunc 11/20/2025||
[flagged]
bigyabai 11/20/2025||
It's hard to catch on to a deliberately dishonest pretense. You could clone 10,000 John Carmacks to do the job for you; Nvidia would still be a $5 trillion business next time you wake up.
measurablefunc 11/20/2025||
[flagged]
bigyabai 11/20/2025||
I'm not talking to them. I am responding to you - your sardonic piss-take is against HN guidelines and written in bad faith.
measurablefunc 11/20/2025||
[flagged]
bigyabai 11/20/2025||
Sure, and thieves probably recommend that the cops move on & refrain from following where they're headed.

Be honest and you won't have to fend off accusations of bad faith. I'm inclined to agree with your overall point of AI being overhyped, but you've gutted your own logic so hard in the process that your stance is unrecognizable. You've attached a meaningfully ambiguous stance to an elaborate and deeply incorrect series of arguments.

measurablefunc 11/20/2025||
[flagged]
bigyabai 11/20/2025||
I didn't even read the first iteration of your profile. If your stance can't be substantiated without hidden subtext, you're not making a good point.

Your future comments are definitely going to be flagged unless you switch to a good-faith writing style.

measurablefunc 11/20/2025||
Doesn't bother me either way, but you can keep trying to pathologize instead of making substantive points to address anything I have actually clearly laid out.
bigyabai 11/20/2025||
Because it doesn't work like that. TFA is an explanation of how GPU architecture dictates the featureset that is feasibly attainable at runtime. Throwing more software at the problem would not enable direct competition with CUDA.
measurablefunc 11/20/2025||
I am assuming that is all part of the specification that the agentic AI is working with & since AGI is right around the corner I think this is a simple enough problem that can be solved with AI.
pixelpoet 11/20/2025|
The actual article title says "won't"; "wont" is a word meaning habit or proclivity.
InvisGhost 11/20/2025|
In situations like this, I try to focus on whether the other person understood what was being communicated rather than splitting hairs. In this case, I don't think anyone would be confused.
philipallstar 11/20/2025||
Probably best to just fix the spelling.
Eliovp 11/20/2025||
That's what you get when you don't use AI to write an article :p