Intel's make-or-break 18A process node debuts for data center with 288-core Xeon

Posted by vanburen 5 hours ago

Intel's make-or-break 18A process node debuts for data center with 288-core Xeon(www.tomshardware.com)

190 points | 146 commentspage 2

renewiltord 4 hours ago|

Core density plus power makes so many things worthwhile. Generally human cost of managing hardware scales with number of components under management. CPUs very reliable. So once you get lots of CPU and RAM on single machine you can run with very few.

But right pricing hardware is hard if you’re small shop. My mind is hard-locked onto Epyc processors without thought. 9755 on eBay is cheap as balls. Infinity cores!

Problem with hardware is lead time etc. cloud can spin up immediately. Great for experimentation. Organizationally useful. If your teams have to go through IT to provision machine and IT have to go through finance so that spend is reliable, everybody slows down too much. You can’t just spin up next product.

But if you’re small shop having some Kubernetes on rack is maybe $15k one time and $1.2k on going per month. Very cheap and you get lots and lots of compute!

Previously skillset was required. These days you plug Ethernet port, turn on Claude Code dangerously skip permissions “write a bash script that is idempotent that configures my Mikrotik CCR, it’s on IP $x on interface $y”. Hotspot on. Cold air blowing on face from overhead coolers. 5 minutes later run script without looking. Everything comes up.

Still, foolish to do on prem by default perhaps (now that I think about it): if you have cloud egress you’re dead, compliance story requires interconnect to be well designed. More complicated than just basics. You need to know a little before it makes sense.

Feel like reasoning LLM. I now have opposite position.

PunchyHamster 17 minutes ago|

> Previously skillset was required. These days you plug Ethernet port, turn on Claude Code dangerously skip permissions “write a bash script that is idempotent that configures my Mikrotik CCR, it’s on IP $x on interface $y”. Hotspot on. Cold air blowing on face from overhead coolers. 5 minutes later run script without looking. Everything comes up.

Last time I tried to do anything networking with Claude it set up route preference in opposite order (it thought lower number means more preferred, while it was opposite), fucking it up completely, and then invented config commands that do not exist in BIRD (routing software suite).

Then I looked at 2 different AIs and they both hallucinated same BIRD config commands that were nonexistent. And by same I mean they hallucinated existence of same feature.

> If your teams have to go through IT to provision machine and IT have to go through finance so that spend is reliable, everybody slows down too much. You can’t just spin up next product.

The time of having to order a bunch of servers for new project is long over. We just spun k8s cluster for devs to self-service themselves and the prod clusters just have a bit of accounting shim so adding new namespace have to be assigned to a certain project so we can bill client for it.

Also you're allowed to use cloud services while you have on-prem infrastructure. You get best of both, with some cognition cost involved.

benj111 4 hours ago||

Am I the only one disappointed they didn't settle for 286 cores?

soganess 3 hours ago||

During the 8th gen they made an i7-8086... Hopefully Intel hasn't fired that person.

boltzmann-brain 3 hours ago||

8086K, actually. I still run one inside one of my PCs!

kissiel 4 hours ago|||

At least you got the Intel® Core™ Ultra 9 Processor 386H :)

hedora 3 hours ago||

I wonder if they can bin out ones that have a dead core or two specifically for this purpose.

urthor 4 hours ago||

So TLDR is it competitive?

What are the dimensions and dynamics here vs EPYC?

aliljet 4 hours ago||

This is really what I want to understand. Where can we see real world performance benchmarks?

wmf 1 hour ago||

Phoronix should have them soon. Or if they don't it means the performance is bad.

user5994461 3 hours ago||

Not competitive at all. It's easily visible on the laptop lines, where the same GPU manufactured on TSMC has 3 times the power/performance ratio compared to the Intel one.

Putting more cores is just another desperate move to play the benchmark. Power is roughly quadratic with frequency, every time you fall behind competition, you can double the number of cores and reduce the frequency by 1.414 to compensate.

Repeat a few times and you get CPU with hundreds of cores, but each core is so slow it can hardly do any work.

icegreentea2 3 hours ago||

??? GPU vs CPU workloads are completely different. Comparing Panther Lake iGPU vs Ryzen iGPU is not going to tell you much about how high density server CPU performance will work out.

The Panther Lake vs Ryzen laptop performance comparisons show that Pather Lake does well, basically trading against top end Ryzen AI laptop chips in both absolute performance, and performance per watt.

user5994461 3 hours ago||

If you're not aware, Intel has released a lineup of laptops, with some models having the GPU made by them and some having the same GPU made by TSMC. That makes the comparison very direct. TSMC can deliver nearly 3 times the power/performance.

GPU and CPU manufacturing is the same thing, same node, same result. GPU is always maximizing perf/power ratio because it's embarrassingly parallel, leaving no room to game the benchmark. CPU can be gamed by having a single fast core, that drops performance in half as soon as you use another core.

iberator 4 hours ago||

Why do you needs so many cores for? Apache threads? Any old school wizard here?

toast0 2 hours ago||

I used to run many hosts with 28 cores per host. If performance scales, it's nicer to have a few 288 core hosts rather than a few hundred 28 core hosts.

Getting the performance to scale can be hard, of course. The less inter-core communication the better. Things that tend to work well are either stuff where a bunch of data comes in and a single thread works on it for a significant amount of time then ships the result or things where you can rely on the NIC(s) to split traffic and you can process the network queue for a connecrion on the same core that handles the userspace stuff (see Receive Side Scaling), but you need a fancy NIC to have 288 network queues.

whateverboat 3 hours ago|||

Host it in proxxmox, run 8 different services on it each with 32 cores.

Tepix 3 hours ago|||

Yeah, virtualization, many (small) containers / VMs.

jiggawatts 1 hour ago|||

These almost always run many smaller virtual machines on top of a hypervisor. The target market is large enterprise or hyperscalers like the public clouds, Meta, etc...

andriy_koval 2 hours ago||

data processing

Sweepi 3 hours ago||

if 18A is Intel's make-or-break, its a break. Their next node looks promising.

bigbuppo 3 hours ago||

Meanwhile, somebody put 8192 arm cores on a chip and ran a risc-v emulator on top of that which emulated a 6502 which then emulated a 288 core xeon and it used 0.01% of the power and outperformed the Intel chip in every other metric 10:1, probably.

jvanderbot 3 hours ago||

You know, a link would be great for this comment.

throwaway11456 2 hours ago|||

Well Linux was booted on an Intel 4004, emulating a MIPS R3000. Looks like it booted in 4.76 days. I don't believe this article was AI fabricated.

https://arstechnica.com/gadgets/2024/09/hacker-boots-linux-o...

DetroitThrow 2 hours ago||

Somehow, that still doesn't sound real, but it looks like it is. Wow. Though that one was written by their recently fired hallucination writer.

CoastalCoder 3 hours ago||||

Too risky.

CamperBob2 3 hours ago|||

https://theonion.com, probably

bigbuppo 2 hours ago||

Ah, nice to see a fellow lover of the finest news publication on the planet.

ilaksh 3 hours ago||

Only slightly related, but six years ago I was able to run 400 ZX Spectrum (Z80) emulator instances simultaneously on an AWS graphics workstation.

https://youtu.be/BjeVzEQW4C8?si=0I7UGU0Xz5WUT4ek

bigbuppo 2 hours ago||

I remember that. Neat stuff.

hedora 3 hours ago|

So, they're selling this as an AI accelerator, with drop in compatibility with existing boards, and no boost to RAM bandwidth.

As I understand things, it would be extremely unusual to ship a chip that was bound by floating point throughput, not uncached memory access, especially in the desktop/laptop space.

I haven't been following the Intel server space too carefully, so it's an honest question: Was the old thing compute and not bandwidth limited, or is this going to be running inference at the same throughput (though maybe with lower power consumption)?

Tepix 3 hours ago|

No, they're not selling this as an "AI accelerator":

Here is the quote:

"The company says operators deploying 5G Advanced and future 6G networks increasingly rely on server CPUs for virtualized RAN and edge AI inference, as they do not want to re-architect their data centers in a bid to accommodate AI accelerators."

Edge AI usually means very small models that run fine on CPUs.

hedora 3 hours ago||

A very small model is going to be, what, 8GB? That'll easily blow through the caches. You're going to end up bottlenecked on DRAM either way.

So, I wonder if this is going to be any faster than the previous generation for edge AI.