Sony's Mark Cerny Has Worked on "Big Chunks of RDNA 5" with AMD

Posted by ZenithExtreme 3 days ago

Sony's Mark Cerny Has Worked on "Big Chunks of RDNA 5" with AMD (overclock3d.net)

103 points | 134 comments

phkahler 3 days ago|

Why not link to the original article here:

https://www.tomsguide.com/gaming/playstation/sonys-mark-cern...

rfl890 2 days ago|

This was published after TFA, how is it the original?

high_na_euv 2 days ago||

Wdym?

2 vs 1 days

rfl890 2 days ago||

Nevermind, I was confused.

DiabloD3 3 days ago||

There isn't an RDNA5 on the roadmap, though. It's been confirmed 4 is the last (and was really meant to be 3.5, but grew into what is assumed to be the PS5/XSX mid-gen refresh architecture).

Next is UDNA1, a converged architecture with its older sibling, CDNA (formerly GCN).

Like, the article actually states this, but runs an RDNA 5 headline anyways.

blasphemers 3 days ago||

Maybe read the article before commenting on it, it's not that long.

"Big chunks of RDNA 5, or whatever AMD ends up calling it, are coming out of engineering I am doing on the project"

greenknight 3 days ago|||

AMD does do semi-custom work.

What's to stop Sony from saying, "We don't want UDNA 1, we want an iteration of RDNA 4"?

For all we know, it IS RDNA 5... it just won't be available to the public.

Moto7451 3 days ago||

And their half-step/semi-custom work can find its way back to APUs. RDNA 3.5 (the version marketed as such) is in the Zen 5 APUs with mobile-oriented improvements. It wouldn’t surprise me if a future APU gets RDNA 5. GCN had this sort of APU/console relationship as well.

cma 3 days ago||

Also, the Steam Deck (before the OLED version) and the Magic Leap 2 shared a custom chip, with some vision processing parts fused off for the Steam Deck.

cubefox 3 days ago||

It's just a name. I'm sure this is all pretty iterative work.

dragontamer 3 days ago||

UDNA isn't a name but instead a big shift in strategy.

CDNA was for HPC / Supercomputers and Data center. GCN always was a better architecture than RDNA for that.

RDNA itself was trying to be more NVidia like. Fewer FLOPs but better latency.

Someone is getting the axe. Only one of these architectures will win out in the long run, and the teams will also converge, allowing AMD to consolidate engineers on improving the same architecture.

We don't know yet what the consolidated team will release. But it's a big organizational shift that surely will affect AMD's architectural decisions.

timschmidt 3 days ago||

My understanding was that CDNA and RDNA shared much if not most of their underlying architecture, and that the fundamental differences had more to do with CDNA supporting a greater variety of numeric representations to aid in scientific computing. Whereas RDNA really only needed fp32 for games.

DiabloD3 1 day ago|||

That's not entirely wrong.

https://gpuopen.com/download/RDNA_Architecture_public.pdf

I've been showing this one to people for a few years as a good introduction on how RDNA diverged from GCN->CDNA.

The main thing they did was change where wavefront steps (essentially, quasi-VLIW packets) execute: instead of being at the head of the pipeline (which owns 4x SIMD16 ALUs = 64 items) and requires executing 64 threads concurrently (thus, 64x registers/LDS/etc space), it issues non-blocking segments of the packet into per-ALU sub-pipelines, requiring far fewer concurrent threads to maintain peak performance (and, in many cases, far less concurrent registers used for intermediates that don't leave the packet).

GCN is optimized for low instruction parallelism but high parallelism workloads. Nvidia since the dawn of their current architecture family tree has been optimized for high instruction parallelism but not simple highly parallel workloads. RDNA is optimized to handle both GCN-optimal and NVidia-optimal cases.

RDNA, since this document was written, has also been removing the roadblocks to improving performance on this fundamental difference. RDNA4, the one that just came out, increased the packet processing queue so it can schedule more packets in parallel and more segments of the packets into their per-ALU slots. That is probably the most influential change: in software that performed badly on all GPUs (GCN, previous RDNA, anything Nvidia), a 9070XT can perform like a 7900XTX with 2/3rds the watts and 2/3rds the dollars.

While CDNA has been blow for blow against Nvidia's offerings since its name change, RDNA has eradicated the gap in gaming performance. Nvidia functionally doesn't have a desktop product below a 5090 now, and early series 60 rumors aren't spicy enough to make me think Nvidia has an answer in the future, either.

dragontamer 3 days ago||||

Who told ya that??

CDNA is 64 wide per work item. And CDNA1 I believe was even 16 lanes executed over 4 clock ticks repeatedly (ie: minimum latency of all operations, even add or xor, was 4 clock ticks). It looks like CDNA3 might not do that anymore but that's still a lot of differences...

RDNA actually executes 32-at-a-time and per clock tick. It's a grossly different architecture.

That doesn't even get to Infinity Cache, 64-bit support, AI instructions, Raytracing, or any of the other differences....
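A toy model of the issue-width difference described above, assuming a wavefront simply occupies one SIMD for ceil(wave size / SIMD width) cycles per instruction. The function name and the simplification are illustrative assumptions; real hardware has far more moving parts.

```python
import math

def cycles_per_instruction(wave_size: int, simd_width: int) -> int:
    """Cycles one SIMD needs to push a whole wavefront through one instruction."""
    return math.ceil(wave_size / simd_width)

# GCN/CDNA1-style: 64-wide wavefront executed on a 16-lane SIMD.
gcn_cycles = cycles_per_instruction(64, 16)

# RDNA-style: 32-wide wavefront executed on a 32-lane SIMD.
rdna_cycles = cycles_per_instruction(32, 32)

print(gcn_cycles, rdna_cycles)
```

Under this simplification the 64-wide case has a four-cycle floor on every operation, while the 32-wide case can issue every cycle.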

sharpneli 3 days ago|||

CDNA is based on the older GCN arch, so they share the same base as the pre-RDNA ones and the RDNA ones.

whatever1 3 days ago||

PS5 was almost twice as fast as the PS4 Pro, yet we did not see the generational leap we saw with the previous major releases.

It seems that we are at the stage where incremental improvements in graphics will require exponentially more computing capability.

Or the game engines have become super bloated.

Edit: I stand corrected; in previous cycles we had orders of magnitude improvement in FLOPS.

pjmlp 3 days ago||

One reason was backwards compatibility: studios were already putting lots of money into PS4 and Xbox One, so PS5 and Xbox Series X|S (two additional SKUs) were already too much.

Don't forget one reason that studios tend to favour consoles has been regular hardware, and that is no longer the case.

When middleware starts to be the option, it is relatively hard to have game features that are hardware specific.

throwaway48476 1 day ago||

Game budgets ballooned and it was no longer financially viable to make single-platform games.

cosmic_cheese 3 days ago|||

Less effort going into optimization also plays a factor. On average games are a lot less optimized than they used to be. The expectation seems to be that hardware advances will fix deficiencies in performance.

This doesn’t affect me too much since my backlog is long and by the time I play games, they’re old enough that current hardware trivializes them, but it’s disappointing nonetheless. It almost makes me wish for a good decade or so of performance stagnation to curb this behavior. Graphical fidelity is well past the point of diminishing returns at this point anyway.

martinald 3 days ago|||

We have had a decade of performance stagnation.

Compare PS1 with PS3 (just over 10 years apart).

PS1: 0.03 GFLOPS (approx., given it didn't really do FLOPS per se)
PS3: 230 GFLOPS

Nearly 1000x faster.

Now compare PS4 with PS5 Pro (also just over 10 years apart):

PS4: ~2 TFLOPS
PS5 Pro: ~33.5 TFLOPS

Bit over 10x faster. So the speed of improvement has fallen dramatically.

Arguably you could say the real drop in optimization happened in that PS1 -> PS3 era - everything went from hand optimized assembly code to running (generally) higher level languages and using abstracted graphics frameworks like DirectX and OpenGL. Just no one noticed because we had 1000x the compute to make up for it :)

Consoles/games got hit hard by first crypto and now AI needing GPUs. I suspect if it wasn't for that we'd have vastly cheaper and vastly faster gaming GPUs, but when you were making boatloads of cash off crypto miners and then AI I suspect the rate of progress fell dramatically for gaming at least (most of the innovation I suspect went more into high VRAM/memory controllers and datacentre scale interconnects).

SlowTao 3 days ago|||

It is not just GPU performance, it is that visually things are already very refined. A ten times leap in performance doesn't really show as ten times the visual spectacle like it used to.

Like all this path tracing/ray tracing stuff, yes it is very cool and can add to a scene but most people can barely tell it is there unless you show it side by side. And that takes a lot of compute to do.

We are polishing an already very polished rock.

martinald 3 days ago||

Yes but in the PS1 days we were doing a 1000x compute performance a decade.

I agree that 10x doesn't move much, but that's sort of my point - what could be done with 1000x?

cosmic_cheese 3 days ago||||

Yeah there’s been a drop off for sure. Clearly it hasn’t been steep enough for game studios to not lean on anyway, though.

One potential forcing factor may be the rise of iGPUs, which have become powerful enough to play many titles well while remaining dramatically more affordable than their discrete counterparts (and sometimes not carrying crippling VRAM limits to boot), as well as the growing sector of PC handhelds like the Steam Deck. It’s not difficult to imagine that iGPUs will come to dominate the PC gaming sphere, and if that happens it’ll be financial suicide to not make sure your game plays reasonably well on such hardware.

martinald 3 days ago||

I get the perhaps mistaken impression the biggest problem games developers have is making & managing absolutely enormous amounts of art assets at high resolution (textures, models, etc). Each time you increase resolution from 576p, to 720p to 1080p and now 4k+ you need a huge step up in visual fidelity of all your assets, otherwise it looks poor.

And given most of these assets are human made (well, until very recently) this requires more and more artists. So I wonder if games studios are more just art studios with a bit of programming bolted on, vs before with lower res graphics where you maybe had one artist for 10 programmers, now it is more flipped the other way. I feel that at some point over the past ~decade we hit an "organisational" wall with this and very very few studios can successfully manage teams of hundreds (thousands?) of artists effectively?

MindSpunk 3 days ago|||

This hits the nail pretty close to the head. I work on an in-house AAA engine used by a number of different games. It's very expensive to produce art assets at the quality expected now.

Many AAA engines' number one focus isn't "performance at all costs", it's "how do we most efficiently let artists build their vision". And efficiency isn't runtime performance, efficiency is how much time it takes for an artist to create something. Performance is only a goal insofar as to free artists from being limited by it.

> So I wonder if games studios are more just art studios with a bit of programming bolted on.

Not quite, but the ratio is very in favor of artists compared to 'the old days'. Programming is still a huge part of what we do. It's still a deeply technical field, but often "programming workflows" are lower priority than "artist workflows" in AAA engines because art time is more expensive than programmer time from the huge number of artists working on any one project compared to programmers.

Just go look at the credits for any recent AAA game. Look at how many artists positions there are compared to programmer positions and it becomes pretty clear.

kasool 2 days ago||

Just to add to this, from a former colleague of mine who currently works as a graphics programmer at a UE5 studio: most graphics programmers are essentially tech support for artists nowadays. In an age where much of AAA is about making the biggest, most cinematic, most beautiful game, your artists and game content designers are the center of your production pipeline.

It used to be that the technology tended to drive the art. Nowadays the art drives the tech. We only need to look at all the advertised features of UE5 to see that. Nanite allows artists to spend less time tweaking LODs and optimizing meshes as well as flattening the cost of small triangle rendering. Lumen gives us realtime global illumination everywhere so artists don’t have to spend a million hours baking multiple light maps. Megalights lifts restrictions on the number of dynamic lights and shadows a lighting artist can place in the scene. The new Nanite foliage shown off in the Witcher 4 allows foliage artists to go ham with modeling their trees.

cosmic_cheese 3 days ago||||

That depends a lot on art direction and stylization. Highly stylized games scale up to high resolutions shockingly well even with less detailed, lower resolution models and textures. Breath of the Wild is one good example that looks great by modern standards at high resolutions, and there’s many others that manage to look a lot less dated than they are with similarly cartoony styles.

If “realistic” graphics are the objective though, then yes, better displays pose serious problems. Personally I think it’s probably better to avoid art styles that age like milk, though, or to go for a pseudo-realistic direction that is reasonably true to life while mixing in just enough stylization to scale well and not look dated at record speeds. Japanese studios seem pretty good at this.

spookie 3 days ago|||

Yeah, it's flipped. Overall, it has meant studios are more and more dependent on third party software (and thus license fees), it led to game engine consolidation, and serious attrition when attempting to make something those game engines weren't built for (non-PBR pipelines come to mind).

It's no wonder nothing comes out in a playable state.

imtringued 2 days ago||||

>I suspect if it wasn't for that we'd have vastly cheaper and vastly faster gaming GPUs

This feels very out of touch since AMD's latest GPU series is specialized in gaming only, to the point where they sell variants with 8GB, which is becoming a bit tight if you want to play modern games.

martinald 2 days ago||

Yes but AMD also has an enterprise line of AI cards to protect. And regardless, if NVidia wasn't also making bank selling AI GPUs then we'd have seen them add more performance on gaming, which would have forced AMD to, etc.

PoshBreeze 1 day ago||||

> Arguably you could say the real drop in optimization happened in that PS1 -> PS3 era - everything went from hand optimized assembly code to running (generally) higher level languages and using abstracted graphics frameworks like DirectX and OpenGL. Just no one noticed because we had 1000x the compute to make up for it :)

Maybe / Kind of. Consoles in the PS1/N64 era were not running optimised assembly code. The 8-bit and 16-bit machines were.

As for DirectX / OpenGL / Glide, they actually massively improved performance over running stuff on the CPU. You only ran stuff with software rendering if you had a really low performance GPU. Just look at Quake running in software vs Glide. The framerate easily doubles on a Pentium-based system.

> Consoles/games got hit hard by first crypto and now AI needing GPUs. I suspect if it wasn't for that we'd have vastly cheaper and vastly faster gaming GPUs, but when you were making boatloads of cash off crypto miners and then AI I suspect the rate of progress fell dramatically for gaming at least (most of the innovation I suspect went more into high VRAM/memory controllers and datacentre scale interconnects).

The PC graphics card market got hit hard by those. Console markets were largely unaffected. There are many reasons why performance has stagnated. One of them, I would argue, is the use of the Unreal 4/5 engine. Every game that runs either of these engines has significant performance issues. Just look at Star Wars Jedi: Survivor and the previous game, Star Wars Jedi: Fallen Order. Both games run poorly even on a well spec'd PC, and they run poorly on my PS5 too. Doesn't really matter though, as Jedi Survivor sold well and I think Fallen Order also sold well.

The PS5 is basically a fixed PS4 (I've owned both). They put a lot of effort with the PS5 into reducing loading times. Loading times on the PS4 were painful and were far longer than on the PS3 (even for games loading from Blu-ray). This was something Sony was focusing on. Every presentation about the PS5 talked about the new NVMe drives and the external drive and the requirements for it.

The other reason is that the level of graphical fidelity achieved in the mid-2000s to early-2010s is good enough. A lot of the reason why some games age worse than others is the art style, rather than the graphical fidelity. Many of the high earning games don't have state of the art graphics, e.g. Fortnite prints cash and the graphics are pretty bad IMO.

Performance and Graphics just isn't the focus anymore. It doesn't really sell games like it used to.

Dylan16807 3 days ago|||

You divided 230 by .03 wrong, which would be 10000-ish, but you underestimated the PS1 by a lot anyway. The CPU does 30 MIPS, but also the geometry engine does another 60 MIPS and the GPU fills 30 or 60 million pixels per second with multiple calculations each.
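For anyone who wants to check the ratios being argued about, here is a quick sketch that uses only the approximate figures quoted upthread (0.03 and 230 GFLOPS, ~2 and ~33.5 TFLOPS). The variable names are just for illustration.

```python
# Throughput figures as quoted in the thread (all approximate).
ps1_gflops = 0.03
ps3_gflops = 230.0
ps4_tflops = 2.0
ps5_pro_tflops = 33.5

# Generational speedups implied by those figures.
ps1_to_ps3 = ps3_gflops / ps1_gflops
ps4_to_ps5_pro = ps5_pro_tflops / ps4_tflops

print(f"PS1 -> PS3:     about {ps1_to_ps3:,.0f}x")
print(f"PS4 -> PS5 Pro: about {ps4_to_ps5_pro:.2f}x")
```

With those inputs the first ratio comes out in the high thousands and the second in the mid teens, so the gap between the two decades is even wider than "1000x vs 10x" suggests.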

deaddodo 3 days ago||

Not to mention that few developers were doing hand optimized assembly by the time of PSX. They were certainly hand optimizing models and the 3D pipeline (with some assembler tuning), but C and SDKs were well in use by that point.

Even Naughty Dog went with their own LISP engine for optimization versus ASM.

dmbaggett 2 days ago|||

I don’t know about other developers at the time, but we had quite a lot of hand-written assembly code in the Crash games. The background and foreground renderers were all written in assembly by hand, as was the octree-based collision detection system. (Source: me; I wrote them.)

And this thread comes full circle: Mark Cerny actually significantly improved the performance of my original version of the Crash collision detection R3000 code. His work on this code finally made it fast enough, so it’s a really good thing he was around to help out. Getting the collision detection code correct and fast enough took over 9 months. It was very difficult on the PS1 hardware, and ended up requiring use of the weird 2K static RAM scratchpad Sony included in place of the (removed) floating point unit.

GOOL was mainly used for creature control logic and other stuff that didn’t have to be optimized so much to be feasible. Being able to use a lisp dialect for a bunch of the code in the game saved us a ton of time. The modern analogue would be writing most of the code in Python but incorporating C extensions when necessary for performance.

Andy made GOAL (the successor lisp to GOOL) much more low-level, and it indeed allowed coding essentially at the assembly level (albeit with lispy syntax). But GOOL wasn’t like this.

deaddodo 1 day ago||

I've never seen the Crash source code, so was making my statements based on second hand knowledge. So thanks for that clarification. I do think it's worth pointing out that Naughty Dog and Insomniac were two companies well known for making highly optimized software for the PSX; so probably not a standard most other companies matched.

Additionally, I have written my own PSX software as well as reviewed plenty of contemporaneous PSX software. While many have some bit of assembler, it's usually specifically around the graphics pipeline. About 90+% of all code is C. This is in line with interviews from developers at the time, as well.

The point wasn't that ASM wasn't used at all (in fact, I specifically acknowledged it in my original post), it was that the PSX was in an era past the time when entire codebases were hand massaged/tuned assembler (e.g. "the 16-bit era" and before).

dmbaggett 1 day ago||

Insomniac was down the hall from us when we wrote Crash 1 and yes, the Hastings brothers definitely wrote some very tight assembly code!

p_l 3 days ago|||

Naughty Dog's GOAL was PS2 specific and essentially chock full of what would be called intrinsics these days that let you interleave individual assembly instructions particularly for the crazy coprocessor setup of Emotion Engine.

My understanding is that the mental model of programming in the PS2 era was originally still very assembly-like outside of a few places (like Naughty Dog), and that GTA3 on PS2 made possibly its biggest impact by showing it's not necessary.

deaddodo 2 days ago||

If by "mental model" you mean "low-level" programming, sure. But you might as well conflate "religion" with "Southern Baptist protestantism" then. You're working with the same building blocks, but the programming style is drastically different.

The vast majority of PSX games were done completely in C, period. Some had small bits of asm here and there, but so do the occasional modern C/C++ apps.

To your last point, before there was GOAL there was GOOL (from the horse's mouth itself):

https://all-things-andy-gavin.com/tag/lisp-programming/

And it was used in all of Naughty Dog's PSX library.

p_l 2 days ago||

The quote I recall reading about long ago summarized the semi-official guidance as "write C like you write ASM".

Because outside of ports from PC, a large number of console game developers at the time had a lot of experience programming earlier consoles, which involved a lot more assembly-level coding. GTA3 proved that a "PC style" engine was good enough despite the Emotion Engine design.

Didn't help that PS2 was very much oriented towards assembly coding at pretty low level, because getting the most of the hardware involved writing code for the multiple coprocessors to work somewhat in-sync - which at least for GOAL was done by implementing special support for writing the assembly code in line with rest of the code (because IIRC not all assembly involved was executed from the same instruction stream)

As for GOOL, it was the way more classic approach (used by ND on PS3 and newer consoles too) of core engine in C and "scripting" language on top to drive gameplay.

deaddodo 2 days ago||

> The quote I recall reading about long ago summarized the semi-official guidance as "write C like you write ASM".

You could read that in pretty much any book about C, until the mid-00s. C was called "portable assembler" for the longest time because it went against the grain of ALGOL, Fortran, Pascal, etc. by encouraging use of pointers and being direct to the machine. That is why it only holds viability in embedded development these days.

I've written C on the PSX, using contemporaneous SDKs and tooling, and I've reviewed source code from games at the time. There's nothing assembler about it, at least not more so than any systems development done then or today. If you don't believe me, there are plenty of retail PSX games that accidentally released their own source code that you can review yourself:

https://www.retroreversing.com/source-code/retail-console-so...

You're just arguing for the sake of arguing at this point and, I feel, being intellectually dishonest. Believe what you'd like to believe, or massage the facts how you like; I'm not interested in chasing goal (heh) posts.

jayd16 3 days ago|||

By what metric can you say this with any confidence when game scope and fidelity have ballooned?

cosmic_cheese 3 days ago||

Because optimized games aren’t completely extinct and there are titles with similar levels of size, fidelity, and feature utilization with dramatically differing performance profiles.

rtpg 3 days ago||

Given the N64-PS1 era is filled with first party games that run at like 20 fps, I'm having a hard time saying things are worse now.

I am a bit uncomfortable with the performance/quality stuff that people have set up but I personally feel that the quality floor for perf is way higher than it used to be. Though there seem to be fewer people parking themselves at "60fps locked", which felt like a thing for a while.

ryao 3 days ago|||

This is the result of an industry wide problem where technology just is not moving forward as quickly as it used to move. Dennard scaling is dead. Moore’s law is also dead for SRAM and IO logic. It is barely clinging to life for compute logic, but the costs are skyrocketing as each die shrink happens. The result is that we are getting anemic improvements. This issue is visible in Nvidia’s graphics offerings too. They are not improving from generation to generation like they did in the past, despite Nvidia turning as many knobs as they could to higher values to keep the party going (e.g. power, die area, price, etcetera).

timschmidt 3 days ago||

Jim Keller disagrees: https://www.youtube.com/watch?v=oIG9ztQw2Gc

ryao 2 days ago|||

That talk predates the death of SRAM scaling. I will not bother wasting my time watching a video that is out of date.

That said, you should read that I did not say Moore’s Law was entirely dead. It is dead for SRAM and IO logic, but is still around for compute logic. However, pricing is shooting upward with each die shrink far faster than it did in the past.

timschmidt 1 day ago||

> I will not bother

Your loss.

pjmlp 2 days ago|||

Hardware improvements only matter to the extent software is actually able to make use of them.

timschmidt 2 days ago||

And? Software is getting more sophisticated and capable too. First time I switched an iter to a par_iter in Rust and saw the loop spawn as many threads as I have logical cores felt like magic. Writing multi-threaded code used to be challenging.
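A rough Python analogue of that one-line swap, for readers who don't write Rust. This is not rayon's `par_iter`: it uses the standard library's `concurrent.futures.ThreadPoolExecutor`, and on a standard CPython build the threads share the interpreter lock, so it only illustrates the shape of the change (sequential `map` to pool-backed `map`), not the CPU speedup. The `checksum` function is a hypothetical stand-in for a loop body.

```python
from concurrent.futures import ThreadPoolExecutor
import os

def checksum(n: int) -> int:
    """Hypothetical per-item work; stands in for the body of a loop."""
    return sum(i * i for i in range(n)) % 997

items = list(range(1, 201))

# Sequential version: one item at a time on the calling thread.
sequential = list(map(checksum, items))

# Pool-backed version: same call shape, with the work handed to a pool
# sized to the number of logical cores the OS reports.
workers = os.cpu_count() or 1
with ThreadPoolExecutor(max_workers=workers) as pool:
    parallel = list(pool.map(checksum, items))

print(parallel == sequential)
```

`pool.map` returns results in input order, so the two lists can be compared directly.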

pjmlp 2 days ago||

Now make that multi-threaded code exhaust a 32 core desktop system, all the time, not only at peak execution.

As brownie points, keep the GPU busy as well, beyond twirling its fingers while keeping the GUI desktop going.

Even more points if the CPU happens to have a NPU or integrated FPGA, and you manage to also keep them going alongside those 32 cores, and GPU.

timschmidt 2 days ago||

> Now make that multi-threaded code exhaust a 32 core desktop system

Switching an iter to par_iter does this. So long as there are enough iterations to work through, it'll exhaust 1024 cores or more.

> all the time, not only at peak execution.

What are you doing that keeps a desktop or phone at 100% utilization? That kind of workload exists in datacenters, but end user devices are inherently bursty. Idle when not in use, race to idle while in use.

> As brownie points, keep the GPU busy as well... Even more points if the CPU happens to have a NPU or integrated FPGA

In a recent project I serve a WASM binary from an ESP32 via Wifi / HTTP, which makes use of the GPU via WebGL to draw the GUI, perform CSG, calculate toolpaths, and drip feed motion control commands back to the ESP. This took about 12k lines of Rust including the multithreaded CAD library I wrote for the project, only a couple hundred lines of which are gated behind the "parallel" feature flag. It was way less work than the inferior C++ version I wrote as part of the RepRap project 20 years ago. Hence my stance that software has become increasingly sophisticated.

https://github.com/timschmidt/alumina-firmware

https://github.com/timschmidt/alumina-ui

https://github.com/timschmidt/csgrs

What's your point?

pjmlp 2 days ago||

The point being those are very niche cases that still don't keep the hardware busy as it should 24h around the clock.

Most consumer software even less, hence why anyone will hardly see a computer on the shopping mall with higher than 16 core count, and on average most shops will have something between 4 and 8.

Also a reason why systems with built-in FPGAs failed in the consumer market, specialised tools without consumer software to help sell them.

timschmidt 2 days ago||

> don't keep the hardware busy as it should 24h around the clock.

If your workload demands 24/7 100% CPU usage, Epyc and Xeon are for you. There you can have multiple sockets with 256 or more cores each.

> Most consumer software even less

And yet, even in consumer gear which is built to a minimum spec budget, core counts, memory capacity, pcie lanes, bus bandwidth, IPC, cache sizes, GPU shaders, NPU TOPS, all increasing year over year.

> systems with built-in FPGAs failed in the consumer market

Talk about niche. I've never met an end user with a use for an FPGA or the willingness to learn what one is. I'd say that has more to do with it. Write a killer app that regular folks want to use that requires one, and they'll become popular. Rooting for you.

pjmlp 1 day ago||

You have to root for those hardware designers to have software devs in quantities, actually using what they produce, at scale.

CoolGuySteve 3 days ago|||

The current generation has a massive leap in storage speed but games need to be architected to stream that much data into RAM.

Cyberpunk is a good example of a game that straddled the in-between; many of its performance problems on the PS4 were due to constrained serialization speed.

Nanite and games like FF16 and Death Stranding 2 do a good job of drawing complex geometry and textures that wouldn't be possible on the previous generation.

Vilian 3 days ago||

Nanite is actively hurting performance

teamonkey 2 days ago|||

Nanite has a performance overhead for simple scenes but will render large, complex scenes with high-quality models much more efficiently, providing a faster and more stable framerate.

It’s also completely optional in Unreal 5. You use it if it’s better. Many published UE5 games don’t use it.

CoolGuySteve 2 days ago|||

Well yeah, features that push the graphics card typically incur performance hits.

cwbriscoe 3 days ago|||

A lot of the difference went into FPS rather than improved graphics.

adamwk 3 days ago|||

And loading times. I think people already forgot how long you had to wait on loading screens or how much faked loading (moving through a brush while the next area loads) there was on PS4.

SlowTao 3 days ago|||

PS4 wasn't too terrible but jumping back to PS3... wow I completely forgot how memory starved that machine was. Working on it, we knew at the time but in retrospect it was just horrible.

With a small RAM space and the hard CPU/GPU split (so no reallocation), feeding off a slow HDD which is being fed by an even slower Blu-ray disc, you are sitting around for a while.

PoshBreeze 1 day ago||

PS3 loading times IME were better than the PS4.

Izikiel43 2 days ago||||

Bloodborne when it came out was around 1 minute between deaths.

ryao 3 days ago|||

Did you forget that on the N64, load times were near instantaneous?

derrasterpunkt 3 days ago||

The N64 was cartridge based.

MindSpunk 3 days ago||

If only we could just ship a 256GB NVMe SSD with every game and memory map the entire drive like you could with cartridges back then. Never have loading times again.
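The closest everyday analogue on a modern OS is memory-mapping a file, sketched below with Python's standard `mmap` module. The temporary file and its contents are made up for illustration. Unlike cartridge ROM, the first touch of each page still pays storage latency, because the OS has to fault it in.

```python
import mmap
import os
import tempfile

# Stand-in for a game's asset pack; the contents are hypothetical.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"HEADER--" + b"tile" * 1024)
    path = f.name

with open(path, "rb") as f:
    # Map the whole file read-only. Pages are brought in by the OS on first
    # access, so there is no explicit "load everything into RAM" step.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as assets:
        header = assets[:8]        # touches only the first page
        first_tile = assets[8:12]
        size = len(assets)

os.remove(path)
print(header, first_tile, size)
```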

cubefox 2 days ago||

Also: I think it got less common on the N64, but games on SNES and NES and other old home consoles routinely accessed static game data, like graphic tiles, directly from the cartridge ROM. Without loading it into system RAM at all.

So there literally were no "loading" times for these assets. This might not even be realistically possible with NAND flash based SSDs, e.g. because of considerations like latency.

Though directly accessing ROM memory would also prevent things like texture block compression I believe.

bentt 3 days ago||||

This is correct. Also, it speaks to what players actually value.

ThatMedicIsASpy 3 days ago|||

I have played through CP2077 with 40, 30 and 25 fps. A child doesn't care if Zelda runs with low FPS.

The only thing I value is a consistent stream of frames on a console.

adamwk 3 days ago|||

When given a choice, most users prefer performance over higher fidelity

teamonkey 3 days ago||

I would like to see the stats for that.

jayd16 3 days ago|||

> "When asked to decide on a mode, players typically choose performance mode about three-quarters of the time,"

From PS5 Pro reveal https://youtu.be/X24BzyzQQ-8?t=172

bzzzt 2 days ago||

Seems like an overgeneralization. I get it when FPS players want the best performance: players have FOMO of the best reaction time and the games are more built for fast action than contemplative scenery watching.

I wonder if players of single player action/adventure games make the same choice. Those games are played less (can be finished in 10-30 hours instead of endlessly) so the statistics might be skewed to favor performance mode.

theshackleford 2 days ago||

> I wonder if players of single player action/adventure games make the same choice.

Anecdotally, I do. Because modern displays are horrible blurry messes at lower framerates. I don't care about my input latency, I care about my image not being a smear every time the camera viewport moves.

cubefox 2 days ago||||

Yeah. Case in point: "Zelda: Ocarina of Time" was at the time and several years afterward often labeled as one of the best games ever made, despite the fact that it ran at 20 FPS on NTSC consoles and at 16.67 FPS on PAL machines.

I'm sure it would have been even more successful with modern 60 FPS, but that difference couldn't have been very large, because other 60 FPS games did exist back then as well, mostly without being nearly as popular.
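The two frame rates quoted above are consistent with showing a new frame on every third display refresh. That is an assumption about how the figures arise, not something the comment states. A small check of the arithmetic, with `effective_fps` as a hypothetical helper name:

```python
def effective_fps(refresh_hz, refreshes_per_frame):
    """Frame rate when a new frame is presented every N display refreshes."""
    return refresh_hz / refreshes_per_frame


# Assumed refresh rates: 60 Hz for NTSC, 50 Hz for PAL.
ntsc_fps = effective_fps(60, 3)
pal_fps = effective_fps(50, 3)
```

Under that assumption, the PAL version is slower by the same 50/60 ratio as the display refresh rate.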

jayd16 3 days ago|||

Children eat dirt. I'm not sure "children don't care" is a good benchmark.

LikesPwsh 3 days ago|||

Also FPS just requires throwing more compute at it.

Excessively high detail models require extra artist time too.

kridsdale1 3 days ago|||

Yes, the PS5 can output 120 Hz over HDMI. A perfect linear output to direct your extra compute at.

vrighter 3 days ago|||

twice as fast, but asked to render 4x the pixels. Do the math
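Taking the comment's figures at face value, the math can be written out as below. The concrete resolutions (1080p and 4K) are an assumption added for illustration, since the comment only says "4x the pixels", and the function names are hypothetical.

```python
def pixel_ratio(old_res, new_res):
    """How many times more pixels new_res has than old_res."""
    return (new_res[0] * new_res[1]) / (old_res[0] * old_res[1])


def per_pixel_budget(speedup, ratio):
    """Relative compute available per pixel after a hardware speedup and a resolution increase."""
    return speedup / ratio


# Assumed resolutions: 1920x1080 -> 3840x2160.
ratio = pixel_ratio((1920, 1080), (3840, 2160))
budget = per_pixel_budget(2.0, ratio)
```

With these numbers each pixel gets half the compute it had before, at the same frame rate.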

SlowTao 3 days ago||

Well you see... I got nothing.

The path nowadays is to use all kinds of upscaling and temporal detail junk that is actively recreating late 90s LCD blur. Cool. :(

silisili 3 days ago|||

AFAIK, this generation has been widely slammed as a failure due to lack of new blockbuster games. Most things that came out were either for PS4, or remasters of said games.

There have been a few decent sized games, but nothing at grand scale I can think of, until GTA6 next year.

jayd16 3 days ago||

There were the little details of a global pandemic and interest rates tearing through timelines and budgets.

Izikiel43 2 days ago|||

The big jump between 4 and 5 was the NVMe SSD and hardware decompression IMO. Load times on a regular PS5 are nonexistent compared to a PS4; that's the big generational jump.

For graphics, I agree it looks like diminishing returns.

teamonkey 3 days ago|||

This article shows how great a leap there was between previous console generations.

https://www.gamespot.com/gallery/console-gpu-power-compared-...

ErneX 3 days ago|||

GTA VI is going to be a showcase on these consoles.

treyd 3 days ago||

> Or the game engines have become super bloated.

"Bloated" might be the wrong word to describe it, but there's some reason to believe that the dominance of Unreal is holding performance back. I've seen several discussions about Unreal's default rendering pipeline being optimized for dynamic realtime photorealistic-ish lighting with complex moving scenes, since that's much of what Epic needs for Fortnite. But most games are not that and don't make remotely effective use of the compute available to them because Unreal hasn't been designed around those goals.

TAA (temporal anti-aliasing) is an example of the kind of postprocessing effect that gamedevs are relying on to recover performance lost in unoptimized rendering pipelines, at the cost of introducing ghosting and loss of visual fidelity.
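The ghosting mentioned above can be seen in a heavily simplified model of temporal accumulation, where each output value blends the current frame with a running history. This is only a sketch: real TAA implementations also reproject the history with motion vectors and clamp it against neighboring pixels, both omitted here, and `taa_accumulate` and `alpha` are illustrative names.

```python
def taa_accumulate(frames, alpha=0.1):
    """Exponentially blend each new frame value into a running history.

    frames: per-frame values of a single pixel (floats).
    alpha:  weight of the current frame; lower means smoother but more ghosting.
    """
    history = frames[0]
    outputs = [history]
    for current in frames[1:]:
        history = (1 - alpha) * history + alpha * current
        outputs.append(history)
    return outputs


# A bright object covers the pixel in frame 0, then moves away.
trail = taa_accumulate([1.0, 0.0, 0.0, 0.0], alpha=0.5)
```

The pixel should be black from the second frame on, but the accumulated history keeps a fading trace of the object for several frames.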

babypuncher 3 days ago|||

TAA isn't a crutch being used to hold up poor performance, it's an optimization to give games anti-aliasing that doesn't suck.

Your other options for AA are

* Supersampling. Rendering the game at a higher resolution than the display and downscaling it. This is incredibly expensive.

* MSAA. This samples ~~vertices~~surfaces more than once per pixel, smoothing over jaggies. This worked really well back before we started covering every surface with pixel shaders. Nowadays it just makes pushing triangles more expensive with very little visual benefit, because the pixel shaders are still done at 1x scale and thus still aliased.

* Post-process AA (FXAA, SMAA, etc.). These are post-process shaders applied to the whole screen after the scene has been fully rendered. They often just use a cheap edge detection algorithm and try to blur the edges it finds. I've never seen one that was actually effective at producing a clean image, as they rarely catch all the edges and do almost nothing to alleviate shimmering.

I've seen a lot of "tech" YouTubers try to claim TAA is a product of lazy developers, but not one of them has been able to demonstrate a viable alternative antialiasing solution that solves the same problem set with the same or better performance. Meanwhile TAA and its various derivatives like DLAA have only gotten better in the last 5 years, alleviating many of the problems TAA became notorious for in the latter '10s.

flohofwoe 3 days ago|||

Erm, your description of MSAA isn't quite correct: it has nothing to do with vertices and doesn't increase vertex processing cost.

It's more similar to supersampling, but without the higher pixel shader cost (the pixel shader still only runs once per "display pixel", not once per "sample" like in supersampling).

A pixel shader's output is written to multiple (typically 2, 4 or 8) samples, with a coverage mask deciding which samples are written (this coverage mask is all 1s inside a triangle and a combo of 1s and 0s along triangle edges). After rendering to the MSAA render target is complete, an MSAA resolve operation is performed which merges samples into pixels (and this gives you the smoothed triangle edges).
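The coverage-mask and resolve steps described above can be sketched in software for a single pixel. This is a toy model under stated assumptions: the four sample offsets are illustrative rather than any GPU's actual pattern, colors are plain RGB tuples, and all function names are hypothetical.

```python
# Illustrative 4x sample positions inside a pixel (not a real hardware pattern).
SAMPLE_OFFSETS = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def edge(a, b, p):
    """Signed area test: which side of edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def inside(tri, p):
    """True if p is inside the triangle, for either winding order."""
    w = [edge(tri[0], tri[1], p), edge(tri[1], tri[2], p), edge(tri[2], tri[0], p)]
    return all(x >= 0 for x in w) or all(x <= 0 for x in w)


def coverage_mask(tri, px, py):
    """One boolean per sample: is that sample position covered by the triangle?"""
    return [inside(tri, (px + ox, py + oy)) for ox, oy in SAMPLE_OFFSETS]


def write_samples(samples, mask, color):
    """The shader ran once for the pixel; its color goes only to covered samples."""
    return [color if covered else old for old, covered in zip(samples, mask)]


def resolve(samples):
    """MSAA resolve: average the samples into the final pixel color."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))


tri = [(0, 0), (4, 0), (0, 4)]
background = [(0.0, 0.0, 0.0)] * 4
interior_mask = coverage_mask(tri, 1, 1)   # pixel fully inside the triangle
edge_mask = coverage_mask(tri, 1, 2)       # pixel crossed by the diagonal edge
edge_pixel = resolve(write_samples(background, edge_mask, (1.0, 0.0, 0.0)))
```

The interior pixel gets a mask of all ones, while the edge pixel has only some samples covered and resolves to a blend of triangle and background color, which is the smoothed edge.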

cubefox 3 days ago||||

Yeah. Only problem is that overly aggressive TAA implementations blur the whole frame during camera rotation. The thing that is even better than standard TAA is a combination of TAA and temporal upscaling, called TSR in Unreal. Still better is the same system but performed by an ML model, e.g. DLSS. Though this requires special inference hardware inside the GPU.

In the past, MSAA worked reasonably well, but it was relatively expensive, doesn't apply to all forms of high frequency aliasing, and it doesn't work anymore with the modern rendering paradigm anyway.

Stevvo 2 days ago||||

ThreatInteractive is an anti-TAA developer/YouTuber. They make a compelling argument against TAA and present an alternative they are working on for Unreal.

wtallis 3 days ago|||

> solves the same problem set with the same or better performance

The games industry has spent the last decade adopting techniques that misleadingly inflate the simple, easily-quantified metrics of FPS and resolution, by sacrificing quality in ways that are harder to quantify. Until you have good metrics for quantifying the motion artifacts and blurring introduced by post-processing AA, upscaling, and temporal AA or frame generation, it's dishonest to claim that those techniques solve the same problem with better performance. They're giving you a worse image, and pointing to the FPS numbers as evidence that they're adequate is focusing on entirely the wrong side of the problem.

That's not to say those techniques aren't sometimes the best available tradeoff, but it's wrong to straight-up ignore the downsides because they're hard to measure.

kasool 1 day ago||

This has long been the aspect I've struggled with, having spent some time implementing various temporal solutions in the past 2 years. We can try and pull out all sorts of classic metrics like SSIM, but the truth is a lot of these effects are really hard to objectively evaluate using some sort of metric. Moreover the same technique can have vastly different outcomes depending on the content, and user perception is subjective as well. Many days were spent thinking I had solved a visual issue, only for another edge case to come up under specific conditions. Many of these techniques are fairly difficult to reason about from first-principles and adding ML into the mix only makes that harder.

gmueckl 3 days ago||||

This is a very one-sided perspective on things. Any precomputed solution to lighting comes with enormous drawbacks across the board. The game needs to ship the precomputed data when storage is usually already tight. The iteration cycle for artists and level designers sucks when lighting is precomputed: they almost never see accurate graphics for their work while they are iterating, because rebaking takes time away from their work. Game design becomes restricted by those limitations, too. You can't even think of having the player randomly rearrange big things in a level (e.g. building or tearing down a house) because the engine can't do it. Who knows what clever game mechanics are never thought of because of these types of limitations?

Fully dynamic interactive environments are liberating. Pursuing them is the right thing to do.

andrekandre 3 days ago||

  > Fully dynamic interactive environments are liberating. Pursuing them is the right thing to do.

great video from digital foundry that goes into that (for doom: the dark ages)

https://www.youtube.com/watch?v=Ed4vNNQwCDU

mikepurvis 3 days ago|||

In principle, Epic's priorities for Unreal should be aligned to a lot of what we've seen in the PS3/4/5 generation as far as over-the-shoulder 3rd person action adventure games.

I mean, look at Uncharted, Tomb Raider, Spider-Man, God of War, TLOU, HZD, Ghost of Tsushima, Control, Assassins Creed, Jedi Fallen Order / Survivor. Many of those games were not made in Unreal, but they're all stylistically well suited to what Unreal is doing.

kridsdale1 3 days ago|||

I agree. UE3 was made for Gears of War (pretty much) and as a result the components were there to make Mass Effect.

georgeecollins 2 days ago||

I had the pleasure of spending some time with Mark Cerny many years ago. He was honestly one of the most impressive people I have ever met. Down to earth and so, so smart. I also think it speaks volumes for Sony as a company that an American-born video game developer (an engineer, not an MBA) has such an influential position. They are not insular and respect the craft.

Voultapher 2 days ago|

There are colleagues of mine who have called me the smartest person they ever met, and I feel so stupid. How do you make the most of what you are given?

LorenDB 3 days ago||

If the PlayStation contributions are good enough, maybe RDNA4 -> RDNA5 will be just as good as RDNA3 -> RDNA4. As long as they get the pricing right, anyway.

basfo 1 day ago||

A few days ago there was a similar message from Xbox, saying that AMD will power its future hardware project, talking about a strategic alliance and so on.

So, is Mark Cerny contributing to the next Xbox? In the end, today all consoles are basically PCs with different frontends and storefronts (and that is also opening up, starting with Xbox, but PS will probably follow eventually).

erulabs 3 days ago||

Excited to see how the software support for UDNA1 works out. Very hopeful we'll see some real competition to Nvidia soon in the datacenter. Unfortunately I think the risk is quite high: if AMD burns developers again with poor drivers and poor support, it's hard to see how they'll be able to shake the current stigma.

martinald 3 days ago|

Take this with a pinch of salt, but the most recent ROCm release installed out of the box on my WSL2 machine and worked first time with llama.cpp. I even compiled llama.cpp from source with 0 issues. That has never happened ever in my 5+ years of having AMD GPUs. Every other time I've tried this it's either failed and required arcane workarounds, or just not worked entirely (including running on 'real' Linux).

I feel like finally they are turning the corner on software and drivers.

jakogut 2 days ago||

Llama.cpp also has a Vulkan backend that is portable and performant, you don't need to mess with ROCm at all.

martinald 2 days ago||

Oh yes I know, but "can i compile llama.cpp with rocm" has been my yardstick for how good AMD drivers are for some time.

monster_truck 3 days ago||

We've known this for a while. It's an extension of the upscaling and frame generation work AMD already did in conjunction with Sony for FSR 3 and, to a much greater extent, FSR 4. Previous articles have also highlighted their shared focus on BVH optimizations.

lofaszvanitt 3 days ago|

Yes, but what will use it when there are so few games on the platform in the current PS generation?
