I hope the IHVs have a look at it, because current DX12 seems semi-abandoned: it still doesn't support buffer pointers even though every GPU made in the last 10 (or more!) years can do pointers just fine. Meanwhile Vulkan won't do a 2.0 release that cleans things up, so it carries a lot of baggage and, especially, tons of drivers that don't implement the extensions that really improve things.
If this API existed, you could emulate OpenGL on top of it faster than the current OpenGL-to-Vulkan layers do, and something like SDL3 GPU would get a 3x/4x boost too.
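For reference, "buffer pointers" here means something like Vulkan's VK_KHR_buffer_device_address (core since 1.2): you ask the driver for a raw 64-bit GPU address and dereference it in the shader. A minimal, hedged host-side sketch, assuming a buffer created with the device-address usage bit:

    #include <vulkan/vulkan.h>

    // Returns a raw GPU address for a buffer created with
    // VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT; the shader side can then
    // dereference it via GL_EXT_buffer_reference.
    VkDeviceAddress GetBufferPointer(VkDevice device, VkBuffer buffer) {
        VkBufferDeviceAddressInfo info{};
        info.sType  = VK_STRUCTURE_TYPE_BUFFER_DEVICE_ADDRESS_INFO;
        info.buffer = buffer;
        return vkGetBufferDeviceAddress(device, &info);
    }
    // Push the address to the shader (e.g. in a push constant) and it can
    // walk the buffer like ordinary pointer-chasing code. DX12 has no
    // equivalent, which is the complaint above.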
Vulkan is another mess. Even if there were a 2.0, how are devs supposed to actually use it, especially on Android, the biggest consumer Vulkan platform?
I think this puts a floor on supported hardware though, like Nvidia 30xx and Radeon 5xxx. And of course motherboard support is a crapshoot on anything made before 2020 or so.
Bindless textures never needed any kind of resizable BAR; you've been able to use them since the early 2010s in OpenGL through an extension. Buffer pointers have never needed it either.
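For context, that extension is GL_ARB_bindless_texture (2013-era hardware). A minimal sketch of the host side, assuming GLEW or a similar extension loader is available:

    #include <GL/glew.h>

    // Turn an ordinary GL texture into a 64-bit handle that shaders can
    // sample directly, with no descriptor binding and no resizable BAR.
    GLuint64 MakeBindlessHandle(GLuint texture) {
        GLuint64 handle = glGetTextureHandleARB(texture);
        glMakeTextureHandleResidentARB(handle);  // must be resident before use
        return handle;  // stash it in a buffer; GLSL reads it as a sampler2D
    }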
This isn't really the case, at least on desktop side.
All three desktop GPU vendors support Vulkan 1.4 (or most of the features via extensions) on all major platforms even on really old hardware (e.g. Intel Skylake is 10+ years old and has all the latest Vulkan features). Even Apple + MoltenVK is pretty good.
Even mobile GPU vendors have pretty good support in their latest drivers.
The biggest issue is that Android consumer devices don't get GPU driver updates so they're not available to the general public.
But soon? Hopefully
The different, separate engine variants for mobile and desktop users, on the other hand, can be based on the same graphics API; they'll just use different features from it in addition to having different algorithms and architecture.
...so you'll have different code paths for desktop and mobile anyway. The same can be achieved with a Vulkan vs VulkanES split which would overlap for maybe 50..70% of the core API, but significantly differ in the rest (like resource binding).
And beyond that if you look at historical trends, mobile is and always has been just "desktop from 5-7 years ago". An API split that makes sense now will stop making sense rather quickly.
And this is the reason why mobile and desktop should be separate graphics APIs. Mobile is holding desktop back not just feature-wise; it also fucks up the API.
Vulkan is the actual barrier. On Windows, DirectX does an average job at supporting it. Microsoft doesn't really innovate these days, so NVIDIA largely drives the market, and sometimes AMD pitches in.
It has been mostly NVidia in collaboration with Microsoft; even HLSL traces back to Cg.
In hindsight it really would have been better to have a separate VulkanES which is specialized for mobile GPUs.
There are also some ARM laptops that just run Qualcomm chips, the same as some phones (basically tablets with a keyboard, but a bit more "PC"-like due to running Windows).
AFAICT the fusion seems likely to be an accurate prediction.
Only if you're ignoring mobile entirely. One of the things Vulkan did which would be a shame to lose is it unified desktop and mobile GPU APIs.
In this context, both the old Switch and the Switch 2 have full desktop-class GPUs. They don't need to care about the API problems that mobile vendors imposed on Vulkan.
Also, there is nothing inherent that blocks extensions by default. I feel like a reasonable core that can optionally do more, similar to CPU extensions (e.g. vector extensions), could be the way to go here.
Mobile vendors insisting on closed, proprietary drivers that they refuse to keep updated and stay on top of is the actual issue. If you have a GPU capable of cutting-edge graphics, you have to have a top-notch driver stack. Nobody gets this right except AMD and NVIDIA (and both have their flaws). Apple doesn't even come close, and they are ahead of everyone else except AMD/NVIDIA. AMD seems to do it best, NVIDIA a distant second, Apple third, and everyone else tenth.
What about Intel?
I remember a time about 15 years ago when they were famous for reporting OpenGL capabilities as supported when they were actually only available via software rendering, which defeated the purpose of using such features in the first place.
> It is quite telling how good their iGPUs are at 3D that no one counts them in.
I'm not so certain about this: in
> https://old.reddit.com/r/laptops/comments/1eqyau2/apuigpu_ti...
APUs/iGPUs are compared, and here Intel's integrated GPUs seem to be very competitive with AMD's APUs.
---
You of course have to compare dedicated graphics cards with each other, and similarly for integrated GPUs, so let's compare (Intel's) dedicated GPUs (Intel Arc), too:
When I look at
> https://www.tomshardware.com/reviews/gpu-hierarchy,4388.html
the current Intel Arc generation (Intel Arc B-series, "Battlemage") seems to be competitive with the entry-level GPUs from NVidia and AMD, i.e. you can get much more powerful GPUs from NVidia and AMD, but for a much higher price. I thus clearly would not say Intel's dedicated GPUs are so bad "at 3D that no one counts them in".
Ironically, a lot of the time these new APIs end up being slower in practice (something confirmed by gaming benchmarks), probably exactly because of the issues outlined in the article: having precompiled 'pipeline states' instead of the good ol' state machine has forced devs to precompile a truly staggering number of states, and even then compilation can still happen at runtime, leading to the well-known stutters.
The other issue is synchronization. As the article mentions, Vulkan synchronization is unnecessarily heavy, and devs aren't really experts in it, nor do they have the time to figure out when to use what kind of barrier, so they adopt a 'better safe than sorry' approach, leading to unnecessary flushes and pipeline stalls that can tank performance in real-life workloads.
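To make the "better safe than sorry" pattern concrete, here is a rough, illustrative Vulkan 1.x sketch of the maximally conservative barrier many codebases reach for, next to the precise one a compute-write / fragment-read hand-off actually needs:

    #include <vulkan/vulkan.h>

    // Wait for everything before anything continues: simple, safe, slow.
    void ConservativeBarrier(VkCommandBuffer cmd) {
        VkMemoryBarrier barrier{};
        barrier.sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
        barrier.srcAccessMask = VK_ACCESS_MEMORY_WRITE_BIT;
        barrier.dstAccessMask = VK_ACCESS_MEMORY_READ_BIT | VK_ACCESS_MEMORY_WRITE_BIT;
        vkCmdPipelineBarrier(cmd,
            VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,   // all prior work must finish
            VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,   // all later work must wait
            0, 1, &barrier, 0, nullptr, 0, nullptr);
    }

    // Only block the consuming stage on the producing stage.
    void PreciseBarrier(VkCommandBuffer cmd) {
        VkMemoryBarrier barrier{};
        barrier.sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
        barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
        barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
        vkCmdPipelineBarrier(cmd,
            VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,   // producer: compute writes
            VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,  // consumer: fragment reads
            0, 1, &barrier, 0, nullptr, 0, nullptr);
    }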
This is definitely a huge issue combined with the API complexity, leading many devs to use wrappers like the aforementioned SDL3, which is definitely very conservative when it comes to synchronization.
Old APIs with smart drivers could either figure this out better, or GPU driver devs looked at the workloads and patched up rendering manually on popular titles.
Additionally, by the early-to-mid 2010s, when these new APIs started getting released, crafty devs armed with new shader models and OpenGL extensions had made it possible to render tens of thousands of varied and interesting objects (essentially the whole scene's worth) in a single draw call. The most sophisticated and complex of these approaches was AZDO, which I'm not sure ever made it into a released game, but even with much less sophisticated approaches (combined with ideas like PBR materials and deferred rendering), you could pretty much draw anything.
This meant much of the perf bottleneck of the old APIs disappeared.
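For reference, the core trick behind those single-draw-call approaches was OpenGL's multi-draw indirect: per-object draw parameters live in a GPU buffer and one call submits the whole scene. A hedged sketch, assuming GLEW or a similar loader and a VAO/index buffer already bound:

    #include <GL/glew.h>

    // indirectBuffer holds one DrawElementsIndirectCommand per object
    // (count, instanceCount, firstIndex, baseVertex, baseInstance),
    // typically filled by a culling pass on the CPU or GPU.
    void DrawWholeScene(GLuint indirectBuffer, GLsizei objectCount) {
        glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBuffer);
        glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT,
                                    nullptr,      // read commands from the bound buffer
                                    objectCount,
                                    0);           // commands are tightly packed
    }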
There are some interesting GPU improvements coming down the pipeline, like a possible out-of-order part from AMD (if certain credible leaks are valid). However, it's crickets from Microsoft, and NVIDIA just wants vendor lock-in.
Yes, we need a vastly simpler API. I'd argue even simpler than the one proposed.
One of my biggest hopes for RT is that it will standardize like 80% of stuff to the point where it can be abstracted to libraries. It probably won't happen, but one can wish...
What does Microsoft then intend to use to replace the functionality that DirectX provides?
> Graphics APIs and shader languages have significantly increased in complexity over the past decade. It’s time to start discussing how to strip down the abstractions to simplify development, improve performance, and prepare for future GPU workloads.
Meaning ... SSDs initially reused IDE/SATA interfaces, which had inherent bottlenecks because those standards were designed for spinning disks.
To fully realize SSD performance, a new transport had to be built from the ground up, one that eliminated those legacy assumptions, constraints and complexities.
Unless the link of the article has changed since your comment?
A lot of this post went over my head, but I've struggled enough with GLSL for this to be triggering. Learning is brutal because there's no middle ground between reinventing every shader every time and using an engine that abstracts shaders away from the render pipeline. A lot of open-source projects that use shaders are either allergic to documenting them or proud of how obtuse the code is. Shadertoy is about as good as it gets, and that's not a compliment.
The only way I learned anything about shaders was from someone who already knew them well. They learned what they knew by spending a solid 7-8 years of their teenage/young adult years doing nearly nothing but GPU programming. There's probably something in between that doesn't involve giving up and using node-based tools, but in a couple decades of trying and failing to grasp it I've never found it.
https://lettier.github.io/3d-game-shaders-for-beginners/inde...
I agree on the other points. GPU graphics programming is hard in large part because of terrible or missing documentation.
I also think the way forward is to go back to software rendering, except this time around those algorithms and data structures are actually hardware accelerated, as he points out.
Note that this is already an ongoing trend in the VFX industry; about 5 years ago OTOY ported their OctaneRender to CUDA as the main rendering API.
> “Inbetween” is never written as one word. If you have seen it written in this way before, it is a simple typo or misspelling. You should not use it in this way because it is not grammatically correct as the noun phrase or the adjective form. https://grammarhow.com/in-between-in-between-or-inbetween/
Matthew 7:3 "And why beholdest thou the mote that is in thy brother's eye, but considerest not the beam that is in thine own eye?"
Also called a "back-formation". FWIW, I don't think the existence of corrupted words automatically justifies more corruptions, nor does the fact that something is a corruption automatically invalidate it. When language among a group evolves, everyone speaking that language is affected, which is why written language reads pretty differently looking back every 50 years or so, in both formal and informal writing. Therefore language changes should have buy-in from all users.
And linguists think it would be a bad idea to have one:
https://archive.nytimes.com/opinionator.blogs.nytimes.com/20...
If you intend for people to click the link, then you might just as well delete all the prose before it.
Games like the original Half-Life, Unreal Tournament 2004, etc. ran surprisingly well and at decent resolutions.
With the power of modern hardware, I guess you could do a decent FPS in pure software even with naively written code. Not having to deal with the APIs, and having the absolute creative freedom to say 'this pixel is green', would be liberating.
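A toy illustration of what "this pixel is green" looks like in pure software (just a sketch; the buffer still has to be presented through whatever minimal platform layer you pick):

    #include <cstdint>
    #include <vector>

    int main() {
        const int width = 1280, height = 720;
        // The whole "API": an array of pixels you own outright.
        std::vector<uint32_t> framebuffer(width * height, 0xFF000000);  // opaque black
        framebuffer[100 * width + 200] = 0xFF00FF00;  // this pixel is green (ARGB)
        // ...rasterize the rest of the frame, then blit the buffer to a window.
    }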
Fun fact: due to the divergent nature of the computation, many ray tracers targeting real-time performance were written for the CPU. Even when GPUs were quite powerful, software raytracers were quite good, until the hardware APIs started popping up.
Note that when the parent comment says "software rendering" they're referring to software (compute shaders) on the GPU.
Which is easier to debug.
Going with mesh shaders or GPU compute would be the next step.
Meanwhile GPU raytracing was a purely software affair until quite recently when fixed-function raytracing hardware arrived. It's fast but also opaque and inflexible, only exposed through high-level driver interfaces which hide most of the details, so you have to let Jensen take the wheel. There's nothing stopping someone from going back to software RT of course but the performance of hardware RT is hard to pass up for now, so that's mostly the way things are going even if it does have annoying limitations.
I think it's fair to say that for most gamers, Vulkan/DX12 hasn't really been a net positive: the PSO problem affected many popular games, and while Vulkan has been trying to improve, WebGPU is tricky as it has its roots in the first versions of Vulkan.
Perhaps it was a bad idea to go all in on a low-level API that exposes many details when the hardware underneath is evolving so fast. Maybe CUDA, as the post says in some places, with its more generic compute support, is the right way after all.
For example: https://github.com/StafaH/mujoco_warp/blob/render_context/mu...
(A new, simple raytracer that compiles to CUDA, used for robotics reinforcement learning; it renders at up to 1 million fps at low resolution, 64x64, with textures and shadows.)
Hot take, Metal is more sane than CUDA.
Casey argues for ISAs for hardware, including GPUs, instead of heavy drivers. TFA argues for a graphics API surface that is so lean precisely because it fundamentally boils down to a simple and small set of primitives (mapping memory, simple barriers, etc.) that are basically equivalent to a simple ISA.
If a stable ISA was a requirement, I believe we would have converged on these simpler capabilities ahead of time, as a matter of necessity. However, I am not a graphics programmer, so I just offer this as an intellectual provocation to drive conversation.
In such a world, most software would still probably use something like Vulkan/DX/WebGPU to abstract over such ISAs, the way we use Java/JavaScript/Python today to "abstract" over CPU ISAs. And we would also likely end up with an NVIDIA monopoly similar to x86.
That simple. Now to sell a GPU, the only way is to make an ISA so simple even third parties can make good drivers for it. And the first successful ISA will then force everyone else to implement the same ISA, so the same drivers will work for everyone.
Oh, one other thing that has to go away: patents must no longer apply to ISAs. That way, anyone who wants to make and sell x86, ARM, or whatever GPU ISA that emerges, legally can. No more discussion about which instruction set is open or not, they all just are.
Not that the US would ever want to submit Intel to such a brutal competition.
https://semiengineering.com/knowledge_centers/standards-laws...
There is a constant cycle between domain-specific, hardware-hardcoded algorithm design and programmable, flexible design.
When you look at the DirectX 12 documentation and best-practice guides, you’re constantly warned that certain techniques may perform well on one GPU but poorly on another, and vice versa. That alone shows how fragile this approach can be.
Which makes sense: GPU hardware keeps evolving and has become incredibly complex. Maybe graphics APIs should actually move further up the abstraction ladder again, to a point where you mainly upload models, textures, and a high-level description of what the scene and objects are supposed to do and how they relate to each other. The hardware (and its driver) could then decide what’s optimal and how to turn that into pixels on the screen.
Yes, game engines and (to some extent) RHIs already do this, but having such an approach as a standardized, optional graphics API would be interesting. It would allow GPU vendors to adapt their drivers closely to their hardware, because they arguably know best what their hardware can do and how to do it efficiently.
Because that control is only as good as you can master it, and not all game developers do well on that front. Just check out enhanced barriers in DX12 and all of the rules around them as an example. You almost need to train as a lawyer to digest that clusterfuck.
> The hardware (and its driver) could then decide what’s optimal and how to turn that into pixels on the screen.
We should go in the other direction: have a goddamn ISA you can target across architectures, like an x86 for GPUs (though ideally not as encumbered by licenses), and let people write code against it. Get rid of the whole proprietary driver stack while you're at it.
A GPU ISA wouldn’t fix that, it would push even more of those decisions onto the developer.
An ISA only really helps if the underlying execution and memory model is reasonably stable and uniform. That’s true for CPUs, which is why x86 works. GPUs are the opposite: different wave sizes, scheduling models, cache behavior, tiling, memory hierarchies, and those things change all the time. If a GPU ISA is abstract enough to survive that, it’s no longer a useful performance target. If it’s concrete enough to matter for performance, it becomes brittle and quickly outdated.
DX12 already moved the abstraction line downward. A GPU ISA would move it even further down. The issues being discussed here are largely a consequence of that shift, not something solved by continuing it.
What the blog post is really arguing for is the opposite direction: higher-level, more declarative APIs, where you describe what you want rendered and let the driver/hardware decide how to execute it efficiently on a given GPU. That’s exactly what drivers are good at, and it’s what made older APIs more robust across vendors in the first place.
So while a GPU ISA is an interesting idea in general, it doesn’t really address the problem being discussed here.
So I guess we're stuck with what exists today for a while.