I hope the IHVs have a look at it, because current DX12 seems semi-abandoned: it still doesn't support buffer pointers even though every GPU made in the last 10 (or more!) years can do pointers just fine. Meanwhile Vulkan won't do a 2.0 release that cleans things up, so it carries a lot of baggage and, especially, tons of drivers that don't implement the extensions that really improve things.
If this API existed, you could emulate OpenGL on top of it faster than the current OpenGL-to-Vulkan layers do, and something like SDL3 GPU would get a 3x/4x boost too.
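For reference, "buffer pointers" here means something like Vulkan's VK_KHR_buffer_device_address (core since 1.2): you ask the driver for a raw 64-bit GPU address and dereference it in the shader. A minimal, hedged host-side sketch, assuming a buffer created with the device-address usage bit:

    #include <vulkan/vulkan.h>

    // Returns a raw GPU address for a buffer created with
    // VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT; the shader side can then
    // dereference it via GL_EXT_buffer_reference.
    VkDeviceAddress GetBufferPointer(VkDevice device, VkBuffer buffer) {
        VkBufferDeviceAddressInfo info{};
        info.sType  = VK_STRUCTURE_TYPE_BUFFER_DEVICE_ADDRESS_INFO;
        info.buffer = buffer;
        return vkGetBufferDeviceAddress(device, &info);
    }
    // Push the address to the shader (e.g. in a push constant) and it can
    // walk the buffer like ordinary pointer-chasing code. DX12 has no
    // equivalent, which is the complaint above.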
Vulkan is another mess. Even if there were a 2.0, how are devs supposed to actually use it, especially on Android, the biggest consumer Vulkan platform?
I think this puts a floor on supported hardware though, like Nvidia 30xx and Radeon 5xxx. And of course motherboard support is a crapshoot on anything made before 2020 or so.
Bindless textures never needed any kind of resizable BAR; you've been able to use them since the early 2010s in OpenGL through an extension. Buffer pointers have never needed it either.
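For context, that extension is GL_ARB_bindless_texture (2013-era hardware). A minimal sketch of the host side, assuming GLEW or a similar extension loader is available:

    #include <GL/glew.h>

    // Turn an ordinary GL texture into a 64-bit handle that shaders can
    // sample directly, with no descriptor binding and no resizable BAR.
    GLuint64 MakeBindlessHandle(GLuint texture) {
        GLuint64 handle = glGetTextureHandleARB(texture);
        glMakeTextureHandleResidentARB(handle);  // must be resident before use
        return handle;  // stash it in a buffer; GLSL reads it as a sampler2D
    }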
This isn't really the case, at least on desktop side.
All three desktop GPU vendors support Vulkan 1.4 (or most of the features via extensions) on all major platforms even on really old hardware (e.g. Intel Skylake is 10+ years old and has all the latest Vulkan features). Even Apple + MoltenVK is pretty good.
Even mobile GPU vendors have pretty good support in their latest drivers.
The biggest issue is that Android consumer devices don't get GPU driver updates so they're not available to the general public.
But soon? Hopefully
The different, separate engine variants for mobile and desktop users, on the other hand, can be based on the same graphics API; they'll just use different features from it in addition to having different algorithms and architecture.
...so you'll have different code paths for desktop and mobile anyway. The same can be achieved with a Vulkan vs VulkanES split which would overlap for maybe 50..70% of the core API, but significantly differ in the rest (like resource binding).
And beyond that if you look at historical trends, mobile is and always has been just "desktop from 5-7 years ago". An API split that makes sense now will stop making sense rather quickly.
And this is the reason why mobile and desktop should be separate graphics APIs. Mobile is holding desktop back not just feature-wise; it also fucks up the API.
Vulkan is the actual barrier. On Windows, DirectX does an average job at supporting it. Microsoft doesn't really innovate these days, so NVIDIA largely drives the market, and sometimes AMD pitches in.
It has been mostly NVidia in collaboration with Microsoft; even HLSL traces back to Cg.
In hindsight it really would have been better to have a separate VulkanES which is specialized for mobile GPUs.
There are also some ARM laptops that just run Qualcomm chips, the same as some phones (basically tablets with a keyboard, but a bit more "PC"-like due to running Windows).
AFAICT the fusion seems likely to be an accurate prediction.
Only if you're ignoring mobile entirely. One of the things Vulkan did which would be a shame to lose is it unified desktop and mobile GPU APIs.
In this context, both the old Switch and the Switch 2 have full desktop-class GPUs. They don't need to care about the API problems that mobile vendors imposed on Vulkan.
Also, there is nothing inherent that blocks extensions by default. I feel like a reasonable core that can optionally do more, similar to CPU extensions (e.g. vector extensions), could be the way to go here.
Mobile vendors insisting on closed, proprietary drivers that they refuse to keep updated and stay on top of is the actual issue. If you have a GPU capable of cutting-edge graphics, you have to have a top-notch driver stack. Nobody gets this right except AMD and NVIDIA (and both have their flaws). Apple doesn't even come close, and they are ahead of everyone else except AMD/NVIDIA. AMD seems to do it best, NVIDIA a distant second, Apple third, and everyone else tenth.
What about Intel?
I remember a time about 15 years ago when they were famous for reporting OpenGL capabilities as supported when they were actually only available via software rendering, which defeated the purpose of using such features in the first place.
> It is quite telling how good their iGPUs are at 3D that no one counts them in.
I'm not so certain about this: in
> https://old.reddit.com/r/laptops/comments/1eqyau2/apuigpu_ti...
APUs/iGPUs are compared, and here Intel's integrated GPUs seem to be very competitive with AMD's APUs.
---
You of course have to compare dedicated graphics cards with each other, and similarly for integrated GPUs, so let's compare (Intel's) dedicated GPUs (Intel Arc), too:
When I look at
> https://www.tomshardware.com/reviews/gpu-hierarchy,4388.html
the current Intel Arc generation (Intel Arc B-series, "Battlemage") seems to be competitive with the entry-level GPUs from NVidia and AMD, i.e. you can get much more powerful GPUs from NVidia and AMD, but for a much higher price. I thus clearly would not say Intel's dedicated GPUs are so bad "at 3D that no one counts them in".
Ironically, a lot of the time these new APIs end up being slower in practice (something confirmed by gaming benchmarks), probably exactly because of the issues outlined in the article: having precompiled 'pipeline states' instead of the good ol' state machine has forced devs to precompile a truly staggering number of states, and even then compilation can still happen at runtime, leading to the well-known stutters.
The other issue is synchronization. As the article mentions, Vulkan synchronization is unnecessarily heavy, and devs aren't really experts in it, nor do they have the time to figure out when to use what kind of barrier, so they adopt a 'better safe than sorry' approach, leading to unnecessary flushes and pipeline stalls that can tank performance in real-life workloads.
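To make the "better safe than sorry" pattern concrete, here is a rough, illustrative Vulkan 1.x sketch of the maximally conservative barrier many codebases reach for, next to the precise one a compute-write / fragment-read hand-off actually needs:

    #include <vulkan/vulkan.h>

    // Wait for everything before anything continues: simple, safe, slow.
    void ConservativeBarrier(VkCommandBuffer cmd) {
        VkMemoryBarrier barrier{};
        barrier.sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
        barrier.srcAccessMask = VK_ACCESS_MEMORY_WRITE_BIT;
        barrier.dstAccessMask = VK_ACCESS_MEMORY_READ_BIT | VK_ACCESS_MEMORY_WRITE_BIT;
        vkCmdPipelineBarrier(cmd,
            VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,   // all prior work must finish
            VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,   // all later work must wait
            0, 1, &barrier, 0, nullptr, 0, nullptr);
    }

    // Only block the consuming stage on the producing stage.
    void PreciseBarrier(VkCommandBuffer cmd) {
        VkMemoryBarrier barrier{};
        barrier.sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
        barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
        barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
        vkCmdPipelineBarrier(cmd,
            VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,   // producer: compute writes
            VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,  // consumer: fragment reads
            0, 1, &barrier, 0, nullptr, 0, nullptr);
    }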
This is definitely a huge issue combined with the API complexity, leading many devs to use wrappers like the aforementioned SDL3, which is definitely very conservative when it comes to synchronization.
Old APIs with smart drivers could either figure this out better, or GPU driver devs looked at the workloads and patched up rendering manually on popular titles.
Additionally, by the early-to-mid 2010s, when these new APIs started getting released, crafty devs armed with new shader models and OpenGL extensions had made it possible to render tens of thousands of varied and interesting objects (essentially the whole scene's worth) in a single draw call. The most sophisticated and complex of these approaches was AZDO, which I'm not sure ever made it into a released game, but even with much less sophisticated approaches (combined with ideas like PBR materials and deferred rendering), you could pretty much draw anything.
This meant much of the perf bottleneck of the old APIs disappeared.
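For reference, the core trick behind those single-draw-call approaches was OpenGL's multi-draw indirect: per-object draw parameters live in a GPU buffer and one call submits the whole scene. A hedged sketch, assuming GLEW or a similar loader and a VAO/index buffer already bound:

    #include <GL/glew.h>

    // indirectBuffer holds one DrawElementsIndirectCommand per object
    // (count, instanceCount, firstIndex, baseVertex, baseInstance),
    // typically filled by a culling pass on the CPU or GPU.
    void DrawWholeScene(GLuint indirectBuffer, GLsizei objectCount) {
        glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBuffer);
        glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT,
                                    nullptr,      // read commands from the bound buffer
                                    objectCount,
                                    0);           // commands are tightly packed
    }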
There are some interesting GPU improvements coming down the pipeline, like a possible out-of-order part from AMD (if certain credible leaks are valid). However, it's crickets from Microsoft, and NVIDIA just wants vendor lock-in.
Yes, we need a vastly simpler API. I'd argue even simpler than the one proposed.
One of my biggest hopes for RT is that it will standardize like 80% of stuff to the point where it can be abstracted to libraries. It probably won't happen, but one can wish...
What does Microsoft then intend to use to replace the functionality that DirectX provides?
> Graphics APIs and shader languages have significantly increased in complexity over the past decade. It’s time to start discussing how to strip down the abstractions to simplify development, improve performance, and prepare for future GPU workloads.
Meaning ... SSDs initially reused IDE/SATA interfaces, which had inherent bottlenecks because those standards were designed for spinning disks.
To fully realize SSD performance, a new transport had to be built from the ground up, one that eliminated those legacy assumptions, constraints and complexities.
Unless the link of the article has changed since your comment?
A lot of this post went over my head, but I've struggled enough with GLSL for this to be triggering. Learning is brutal because there's no middle ground between reinventing every shader every time and using an engine that abstracts shaders away from the render pipeline. A lot of open-source projects that use shaders are either allergic to documenting them or proud of how obtuse the code is. Shadertoy is about as good as it gets, and that's not a compliment.
The only way I learned anything about shaders was from someone who already knew them well. They learned what they knew by spending a solid 7-8 years of their teenage/young adult years doing nearly nothing but GPU programming. There's probably something in between that doesn't involve giving up and using node-based tools, but in a couple decades of trying and failing to grasp it I've never found it.
https://lettier.github.io/3d-game-shaders-for-beginners/inde...
I agree on the other points. GPU graphics programming is hard in large part because of terrible or missing documentation.
I also think the way forward is to go back to software rendering, except this time around those algorithms and data structures are actually hardware accelerated, as he points out.
Note that this is already an ongoing trend in the VFX industry; about 5 years ago OTOY ported their OctaneRender to CUDA as the main rendering API.
> “Inbetween” is never written as one word. If you have seen it written in this way before, it is a simple typo or misspelling. You should not use it in this way because it is not grammatically correct as the noun phrase or the adjective form. https://grammarhow.com/in-between-in-between-or-inbetween/
Matthew 7:3 "And why beholdest thou the mote that is in thy brother's eye, but considerest not the beam that is in thine own eye?"
Also called a "back-formation". FWIW, I don't think the existence of corrupted words automatically justifies more corruptions, nor does the fact that something is a corruption automatically invalidate it. When language among a group evolves, everyone speaking that language is affected, which is why written language reads pretty differently looking back every 50 years or so, in both formal and informal writing. Therefore language changes should have buy-in from all users.
And linguists think it would be a bad idea to have one:
https://archive.nytimes.com/opinionator.blogs.nytimes.com/20...
If you intend for people to click the link, then you might just as well delete all the prose before it.
Games like the original Half-Life, Unreal Tournament 2004, etc. ran surprisingly well and at decent resolutions.
With the power of modern hardware, I guess you could do a decent FPS in pure software even with naively written code. Not having to deal with the APIs, and having the absolute creative freedom to say 'this pixel is green', would be liberating.
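A toy illustration of what "this pixel is green" looks like in pure software (just a sketch; the buffer still has to be presented through whatever minimal platform layer you pick):

    #include <cstdint>
    #include <vector>

    int main() {
        const int width = 1280, height = 720;
        // The whole "API": an array of pixels you own outright.
        std::vector<uint32_t> framebuffer(width * height, 0xFF000000);  // opaque black
        framebuffer[100 * width + 200] = 0xFF00FF00;  // this pixel is green (ARGB)
        // ...rasterize the rest of the frame, then blit the buffer to a window.
    }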
Fun fact: due to the divergent nature of the computation, many ray tracers targeting real-time performance were written for the CPU. Even when GPUs were quite powerful, software raytracers were quite good, until the hardware APIs started popping up.
Note that when the parent comment says "software rendering" they're referring to software (compute shaders) on the GPU.
Which is easier to debug.
Going with mesh shaders or GPU compute would be the next step.
Meanwhile GPU raytracing was a purely software affair until quite recently when fixed-function raytracing hardware arrived. It's fast but also opaque and inflexible, only exposed through high-level driver interfaces which hide most of the details, so you have to let Jensen take the wheel. There's nothing stopping someone from going back to software RT of course but the performance of hardware RT is hard to pass up for now, so that's mostly the way things are going even if it does have annoying limitations.
I think it's fair to say that for most gamers, Vulkan/DX12 hasn't really been a net positive: the PSO problem affected many popular games, and while Vulkan has been trying to improve, WebGPU is tricky as it has its roots in the first versions of Vulkan.
Perhaps it was a bad idea to go all in on a low-level API that exposes many details when the hardware underneath is evolving so fast. Maybe CUDA, as the post says in some places, with its more generic compute support, is the right way after all.
For example: https://github.com/StafaH/mujoco_warp/blob/render_context/mu...
(A new, simple raytracer that compiles to CUDA, used for robotics reinforcement learning; it renders at up to 1 million fps at low resolution, 64x64, with textures and shadows.)
Hot take, Metal is more sane than CUDA.
Casey argues for ISAs for hardware, including GPUs, instead of heavy drivers. TFA argues for a graphics API surface that is so lean precisely because it fundamentally boils down to a simple and small set of primitives (mapping memory, simple barriers, etc.) that are basically equivalent to a simple ISA.
If a stable ISA was a requirement, I believe we would have converged on these simpler capabilities ahead of time, as a matter of necessity. However, I am not a graphics programmer, so I just offer this as an intellectual provocation to drive conversation.
In such a world, most software would still probably use something like Vulkan/DX/WebGPU to abstract over such ISAs, the way we use Java/JavaScript/Python today to "abstract" over CPU ISAs. And we would also likely end up with an NVIDIA monopoly similar to x86.
That simple. Now to sell a GPU, the only way is to make an ISA so simple even third parties can make good drivers for it. And the first successful ISA will then force everyone else to implement the same ISA, so the same drivers will work for everyone.
Oh, one other thing that has to go away: patents must no longer apply to ISAs. That way, anyone who wants to make and sell x86, ARM, or whatever GPU ISA that emerges, legally can. No more discussion about which instruction set is open or not, they all just are.
Not that the US would ever want to submit Intel to such a brutal competition.
https://semiengineering.com/knowledge_centers/standards-laws...
There is a constant cycle between domain-specific, hardware-hardcoded algorithm design and programmable, flexible design.
When you look at the DirectX 12 documentation and best-practice guides, you’re constantly warned that certain techniques may perform well on one GPU but poorly on another, and vice versa. That alone shows how fragile this approach can be.
Which makes sense: GPU hardware keeps evolving and has become incredibly complex. Maybe graphics APIs should actually move further up the abstraction ladder again, to a point where you mainly upload models, textures, and a high-level description of what the scene and objects are supposed to do and how they relate to each other. The hardware (and its driver) could then decide what’s optimal and how to turn that into pixels on the screen.
Yes, game engines and (to some extent) RHIs already do this, but having such an approach as a standardized, optional graphics API would be interesting. It would allow GPU vendors to adapt their drivers closely to their hardware, because they arguably know best what their hardware can do and how to do it efficiently.
Because that control is only as good as you can master it, and not all game developers do well on that front. Just check out enhanced barriers in DX12 and all of the rules around them as an example. You almost need to train as a lawyer to digest that clusterfuck.
> The hardware (and its driver) could then decide what’s optimal and how to turn that into pixels on the screen.
We should go in the other direction: have a goddamn ISA you can target across architectures, like an x86 for GPUs (though ideally not as encumbered by licenses), and let people write code against it. Get rid of the whole proprietary driver stack while you're at it.
A GPU ISA wouldn’t fix that, it would push even more of those decisions onto the developer.
An ISA only really helps if the underlying execution and memory model is reasonably stable and uniform. That’s true for CPUs, which is why x86 works. GPUs are the opposite: different wave sizes, scheduling models, cache behavior, tiling, memory hierarchies, and those things change all the time. If a GPU ISA is abstract enough to survive that, it’s no longer a useful performance target. If it’s concrete enough to matter for performance, it becomes brittle and quickly outdated.
DX12 already moved the abstraction line downward. A GPU ISA would move it even further down. The issues being discussed here are largely a consequence of that shift, not something solved by continuing it.
What the blog post is really arguing for is the opposite direction: higher-level, more declarative APIs, where you describe what you want rendered and let the driver/hardware decide how to execute it efficiently on a given GPU. That’s exactly what drivers are good at, and it’s what made older APIs more robust across vendors in the first place.
So while a GPU ISA is an interesting idea in general, it doesn’t really address the problem being discussed here.
So I guess we're stuck with what exists today for a while.