Nvidia is proposing a beast of a CPU system for Windows PCs

Posted by tosh 1 day ago

Nvidia is proposing a beast of a CPU system for Windows PCs (twitter.com)

319 points | 518 comments | page 5

BoredPositron 1 day ago

MediaTek and Nvidia, the horsemen of abandoning hardware after a year. The Jetson family still leaves a bad taste in my mouth.

thewebguyd 1 day ago

Qualcomm is too. They mainlined the GPU firmware for the 2nd-gen X Elite but still have not done so for the 1st-gen X Elite, for which they promised full Linux support and failed to deliver. They have now moved on, pretending they never said that.

burnt-resistor 1 day ago

How dare you question the golden goose egg-laying algorithm for trillions in stock valuation!

effnorwood 5 hours ago

beast is right

oldnetguy 20 hours ago

SGI had unified memory back in 1996.

htk 22 hours ago

The M1 Max from 2021 has better memory bandwidth. The M3 Max can be specced to 128GB.

Nothing new here, apart from being able to use CUDA on a less power hungry system.

bigyabai 19 hours ago

The M1 Max has an unusably slow GPU for inference. TTFT on real-world contexts can be over 10 minutes.

> Nothing new here, apart from being able to use CUDA on a less power hungry system.

CUDA has been running on Arm SoCs since the Tegra K1, 12 years ago. Nvidia is not new to Arm, nor is CUDA.

cryo32 1 day ago

Yeah, when laptops are shipping with 8GB and Microsoft is suddenly interested in native apps, nope.

Tech companies have strangled their own market.

thewebguyd 1 day ago

Laptops shipping with less RAM is exactly the reason to be interested in native apps again. Every app being a Chrome/Edge WebView process is the problem.

npn 1 day ago

Is this somehow satire? This is just the DGX Spark with a keyboard and monitor in a convenient format. Since it has more stuff, I'm sure the price markup will increase too.

Up to $5000 because why not?

With that money you can build a real PC with an RTX 5090!

thewebguyd 1 day ago

Not with 128GB (less OS) available to the GPU you can't. The unified memory is the point with this machine (and the DGX Spark).
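A rough back-of-the-envelope sketch of why capacity matters here. The 70B parameter count and 8-bit weights are illustrative assumptions, not figures from the announcement, and the 32 GB value is the RTX 5090's VRAM. The estimate covers weights only and ignores KV cache, activations, and OS overhead.

```python
def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate size of the model weights in GB (decimal)."""
    # params_billion * 1e9 params * (bits / 8) bytes, divided by 1e9 bytes per GB
    return params_billion * bits_per_param / 8


def fits(params_billion: float, bits_per_param: int, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory pool."""
    return weights_gb(params_billion, bits_per_param) <= memory_gb


# Hypothetical 70B model with 8-bit weights: about 70 GB of weights.
model = (70, 8)
print(fits(*model, memory_gb=128))  # unified 128 GB pool
print(fits(*model, memory_gb=32))   # 32 GB of dedicated VRAM
```

Under those assumptions the weights fit in the unified pool with room to spare, but would have to be split or offloaded on the discrete card.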

snvzz 12 hours ago

It is not RISC-V.

We aren't so naive as to move from a locked IP ISA like x86 to another locked IP ISA such as ARM.

Right?

derefr 1 day ago

> The game changer is the unified 128 GB memory. That is the path Apple took years ago. Instead of separate memory for the CPU and GPU, everything shares a single pool. It is increasingly popular.

> The memory is not as fast as dedicated GPU memory, but it is cheap enough while delivering enough bandwidth to run AI models locally.

So, the reason "dedicated GPU memory" is fast isn't that it's "dedicated"; it's that the types of memory built into GPU cards (GDDR and HBM) are designed for throughput over latency.

Which is to say, GDDR and HBM memory could be shared with the CPU in UMA while still being "fast" (for GPU use-cases.) In fact, the PS4/5 and Xbox 360 / One X / Series consoles have UMA architectures that use GDDR memory as their main memory, with no regular DDR memory to be found.

What I don't understand: why don't we see UMA architectures where there's both regular DDR and GDDR/HBM memory mapped into the address space of the CPU+GPU? That seems like the best of both worlds: you'd have some memory that's "tuned" for random-access CPU usage (regular DDR), and some memory that's "tuned" for streaming GPU usage (GDDR/HBM), but either type of memory can still be put to the use it wasn't "tuned" for, just with slightly-worse performance.

I guess you'd need to do a bit of software work:

1. a bit of work in the OS kernel / malloc library to get CPU workloads to "prefer" allocating DDR memory over the GDDR/HBM memory until they've exhausted DDR memory (or maybe not, if you just tell the kernel the GDDR/HBM memory is something like a zswap thinpool);

2. and a bit of work in supported ML frameworks, to teach them about a hybrid strategy between UMA "allocate anywhere, it's all the same" and NUMA "keep assets in VRAM if possible; if you spill assets to RAM, then they must stream into VRAM on access" (i.e. "at allocation time, allocate as if the system were NUMA, VRAM first then spilling to RAM; but at execution time, use the UMA codepaths, no need to copy RAM into VRAM.")

...but once that's done, it's done.
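To make the two steps concrete, here is a toy sketch of the placement policy I have in mind. Everything in it (`Pool`, `HybridUMA`, the `consumer` argument) is a made-up name for illustration, not a real kernel or framework API: allocation picks a preferred pool and spills to the other one, while access never copies, because both pools sit in one address space.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """One physical memory type mapped into the shared address space."""
    name: str
    capacity: int  # bytes
    used: int = 0

    def free(self) -> int:
        return self.capacity - self.used


@dataclass
class HybridUMA:
    """Hypothetical allocator over a DDR pool and a GDDR/HBM pool."""
    ddr: Pool
    hbm: Pool
    placements: dict = field(default_factory=dict)

    def alloc(self, tag: str, size: int, consumer: str) -> str:
        # Allocation time behaves like NUMA: CPU work prefers low-latency
        # DDR, GPU work prefers high-bandwidth memory, and either spills
        # into the other pool once its preferred pool is full.
        order = [self.ddr, self.hbm] if consumer == "cpu" else [self.hbm, self.ddr]
        for pool in order:
            if pool.free() >= size:
                pool.used += size
                self.placements[tag] = pool.name
                return pool.name
        raise MemoryError(f"no pool can hold {size} bytes for {tag}")

    def access(self, tag: str) -> str:
        # Execution time behaves like UMA: the buffer is used in place,
        # wherever it landed, with no staging copy.
        return self.placements[tag]


GiB = 1 << 30
mem = HybridUMA(ddr=Pool("ddr", 64 * GiB), hbm=Pool("hbm", 32 * GiB))
print(mem.alloc("weights_a", 24 * GiB, "gpu"))  # lands in the fast pool
print(mem.alloc("weights_b", 24 * GiB, "gpu"))  # fast pool is full, spills
print(mem.alloc("heap", 8 * GiB, "cpu"))        # CPU prefers DDR
```

A real implementation would also need migration and fragmentation handling, which this sketch leaves out.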

Rohansi 14 hours ago

Theoretically, maybe? But they are completely different interfaces so it would surely get complicated. It's also approaching the current behavior in non-unified memory systems where you have two pools of memory with different performance characteristics. You'll realistically want the CPU to always use low latency memory and the GPU to use high bandwidth memory with very little moving between them.

sherazp995 1 day ago

Wait a minute!

Nvidia going from GPU to CPU now?

wmf 1 day ago

Nvidia has been making CPUs for 10-15 years.

userbinator 17 hours ago

Much longer than that...

https://theretroweb.com/chipsets/182

https://www.nvidia.com/en-us/drivers/uli-m6117c/

TiredOfLife 12 hours ago

But this one has a MediaTek CPU with off-the-shelf Arm cores.

buffer_overlord 21 hours ago

Can it run Ubuntu?
