OpenAI unveils its first custom chip, built by Broadcom

Posted by jamdesk 1 day ago

OpenAI unveils its first custom chip, built by Broadcom (techcrunch.com)

Announcement: https://openai.com/index/openai-broadcom-jalapeno-inference-...

https://decrypt.co/371971/openai-broadcom-jalapeno-first-cus...

https://www.cnn.com/2026/06/24/tech/openai-broadcom-jalapeno...

809 points | 461 comments | page 2

digitaltrees 1 day ago|

We’ve entered the “if you care about software, build hardware” phase of AI

some-guy 1 day ago||

I have been eyeing what Taalas is doing [1] by making pure hardware models. The speed is absurd.

[1] https://taalas.com/products/

mikewarot 1 day ago|||

They talk about products, but they don't sell the hardware, thus they don't really have a product, just a service.

I know, it's nick picking, but when people can just reach in and take services away, like Fable/Mythos, hardware is the only thing worth buying.

LoganDark 1 day ago|||

I'm sure they'll have a product for you if you have millions to invest in a partnership with them.

arcanemachiner 1 day ago|||

"Nitpicking"

digitaltrees 1 day ago||

Underrated. Hits on multiple levels

jupr 1 day ago||||

Crazy product. Their test chatbot feels like a DB query.

https://chatjimmy.ai

digitaltrees 1 day ago|||

I have and it was wild. Paradoxically it made me realize that I actually like reading the stream as it's generating.

wmf 1 day ago|||

“People who are really serious about software should make their own hardware.” ― Alan Kay

zwarag 1 day ago||

What are the other phases? Or what are you referring to in general?

digitaltrees 1 day ago||

Mainframe punch card -> PC floppy disk -> cloud SaaS -> AI -> return-to-the-land agrarian

dadoum 1 day ago||

> May we scale smoothly, exponentially and uneventfully through A[SI]

That sentence sounds weird to me. I can't really put my finger on why: maybe the combination of adverbs, or just the fact of stating the desire to scale as a company so directly. It feels (to me) like openly declaring their selfish goals. Or maybe I am just misinterpreting and they are referring to the whole of humanity as "We" (but knowing what Broadcom and, to a lesser extent, OpenAI have been doing, I am not convinced).

kilroy123 1 day ago||

I hope to see something like this, but in a small form factor like the NVIDIA Spark.

I want a super-fast LLM with Opus 4.6+ level ability.

wmf 1 day ago||

Memory bandwidth is the bottleneck in the Spark. If you replace the SoC with an optimized ASIC but keep the same 256-bit LPDDR5 the performance will be the same. You can increase performance by using wider memory but that's also more expensive.
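A rough sketch of that argument, assuming decode is purely memory-bound: each generated token has to stream all active weights from memory once, so tokens/s is capped at bandwidth divided by weight size. The function name and the example figures (273 GB/s, a dense 70B model at 4 bits per weight) are illustrative assumptions, not numbers from the announcement.

```python
def decode_tokens_per_s_ceiling(
    bandwidth_gb_s: float,
    params_billion: float,
    bits_per_weight: float,
) -> float:
    """Upper bound on single-stream decode speed when every token reads all weights once."""
    # 1e9 params * (bits / 8) bytes per param = size in GB
    weight_gb = params_billion * bits_per_weight / 8
    return bandwidth_gb_s / weight_gb


# Hypothetical numbers: 273 GB/s of memory bandwidth, dense 70B model quantized to 4 bits.
ceiling = decode_tokens_per_s_ceiling(273, 70, 4)
print(f"{ceiling:.1f} tokens/s at most")
```

Swapping the compute die changes neither input, so the ceiling stays where it is; only a wider or faster memory interface (or a smaller model) moves it.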

phonon 1 day ago||

M3 Ultra has a 1024-bit memory bus (819 GB/s) and starts at $3,999 (96 GB of RAM). It can be done...
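The bus width and the GB/s figure are consistent with each other: peak bandwidth is the bus width in bytes times the transfer rate. A quick check, assuming a data rate of 6400 MT/s (the data rate is an assumption here; the comment only gives the bus width and the resulting bandwidth):

```python
def peak_bandwidth_gb_s(bus_width_bits: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak: bytes moved per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s / 1000  # MB/s -> GB/s


print(peak_bandwidth_gb_s(1024, 6400))  # 1024-bit bus
print(peak_bandwidth_gb_s(256, 6400))   # 256-bit bus at the same assumed rate
```

At the same data rate, a bus four times as wide gives four times the peak bandwidth, which is the "wider memory" lever mentioned upthread.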

bigyabai 1 day ago||

The tradeoff is that the M3 Ultra's GPU loses to laptop GPUs in compute benchmarks. All of that bandwidth is wasted idling for token prefill.

For inference workloads, it makes a lot more sense to optimize for prefill/TTFT (time to first token) before maxing out memory bandwidth.
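A back-of-the-envelope sketch of the prefill side, using the common approximation of about 2 FLOPs per parameter per token for a dense transformer forward pass (attention cost ignored). All names and figures are hypothetical, and the utilization factor in particular is a guess.

```python
def prefill_seconds(
    params_billion: float,
    prompt_tokens: int,
    peak_tflops: float,
    utilization: float = 0.5,
) -> float:
    """Approximate time to first token when prefill is compute-bound."""
    flops_needed = 2 * params_billion * 1e9 * prompt_tokens
    flops_per_s = peak_tflops * 1e12 * utilization
    return flops_needed / flops_per_s


# Hypothetical: dense 70B model, 8k-token prompt, two accelerators that differ only in compute.
for tflops in (30, 300):
    print(f"{tflops} TFLOPS -> {prefill_seconds(70, 8192, tflops):.1f} s")
```

Memory bandwidth does not appear in this estimate at all, which is why a bandwidth-rich but compute-poor part can still have a slow time to first token on long prompts.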

Schiendelman 1 day ago||

With the M6 theoretically coming later this year, Apple seems to be realizing they need to catch up with more lanes of GPU.

bigyabai 1 day ago||

Personally, I doubt it. Apple hamstrung themselves with unified SoC memory; there are cheap dGPUs that smoke the M5's prefill speeds and even have faster decode too. Apple is running up against the limitations of putting a mobile integrated chipset up against the desktop form factor. An SoC stops looking like a smart decision at that scale.

The software side is still pretty sketchy, too. Apple's ecosystem is fractured between NPU, MPS and Accelerate BLAS, with libraries like MLX and CoreML built precariously on top. Apple has to commit to a full rearchitecture of their GPU to challenge Nvidia, which fractures that ecosystem even further.

Schiendelman 1 day ago|||

I don't expect them to be AS fast as Nvidia anytime soon. Understood that they need architectural improvements to get there.

Apple's business model will be to pay Google for compute for now, and then as they get better on device, move more and more locally. So they're very well incentivized to get better. The thing they've been best at in the last 19 years has been spinning flywheels they already have, and this is exactly that.

bigyabai 1 day ago||

I'm just genuinely convinced that Apple's AI flywheel is going in reverse. They killed their golden goose with OpenCL, which had a genuine shot at dethroning CUDA if Apple had taken it seriously. It had industry-wide buy-in and multiple implementations before Apple threw in the towel. When they designed Apple Silicon, they could have used the lessons learned from that experience to create a CUDA-like ALU layer instead of focusing on raster efficiency for their GPUs. Nvidia had proven that it was possible with low-power ARM SoCs like Jetson and Tegra, which did deliver CUDA in handheld experiences. But Apple chose instead to delegate AI to the NPU, which is now dark silicon on devices that defer to MPS backends for most inference. The architecture is locked in to an expensive and suboptimal raster-first GPU design.

It's not hard to see why Apple made those mistakes, and many of them were made by the rest of the industry too. It's specifically tragic that Apple snatched defeat from the jaws of victory with GPGPU programming, and it makes me think that their future will be more subscription services and fewer half-assed technical efforts. Or they rip up the foundation and start from scratch; it's never too late to start work on Apple Silicon 2.

Schiendelman 1 day ago||

I think it's easy to understand why Apple wouldn't build low-level engineering solutions: they'd rather control the platform and just have developers call MLX. I'm not sure, if I were in their shoes, that I'd make the same call. But it's a call, and it's consistent with the rest of their ecosystem decisions.

wmf 1 day ago|||

I love those 128 GB dGPUs.

bigyabai 1 day ago||

Me too! The problem is that people don't love having 128 GB of DDR5 held back by a laptop-grade iGPU. It puts up strictly non-interactive speeds for LLMs of that size.

When you layer those same models across 128 GB of dGPUs, you can actually fill the KV cache in seconds instead of minutes. And you get higher memory bandwidth on most professional cards.

smith7018 1 day ago|||

Unfortunately Sam Altman won't be the one to deliver us at-home hardware that can run Opus-level models

blitzar 1 day ago||

I wonder what is happening with the OpenAI / Jony Ive crossover episode.

flyinglizard 1 day ago||

Forget about it. Datacenter class hardware is getting farther and farther from desktop use. It’s not PCIe GPUs anymore.

lifeisstillgood 1 day ago||

So I’ve been wondering about “one or two levels back” chip design. If I understand it, 28nm chips (pre-EUV) are just about suitable for running frontier models (inference only, not training).

And so if I were a mid-level state, would it be worthwhile to take my nascent chip industry and push it to build a 28nm foundry and supporting ecosystem?

The models will come, but the real challenge of the future is having enough compute power for everyone and every use. Even if LLMs don’t become AGI they will still be incredible tools, and as OpenAI seems to spend 8000 for each 200 monthly subscription, building one’s own data centres seems sensible.

paxys 1 day ago||

You are underestimating how difficult it is even for a large nation state to attract the kind of talent and investment it would take to set up a chip industry. It is out of reach for anyone outside of the 3-5 largest national economies and a few big American/Chinese multinational corporations.

wmf 1 day ago|||

> 28nm chips is just about suitable to run frontier models

I doubt it. 28 nm is 4-5 generations back so inferencing would need a large number of chips with very high power consumption. Maybe you're thinking more of 7 nm which is what Chinese fabs have; it seems to be OK for companies like Huawei.

> And so if I was a mid-level State would it be worth while to take my nascent chip industry and push it out to build a 28nm foundry and supporting eco-system.

It never reaches breakeven so you'd have to provide billions in subsidies per year forever. The sovereign chip stuff only makes sense for the US and China; even the EU probably isn't large enough to make it work. A single country definitely couldn't.

eggsome 1 day ago|||

But the energy requirements per token would be orders of magnitude worse than for chips made at 3nm. So it's probably better for your hypothetical state to just pay extra for more efficient chips so that they don't have (as much of) an energy problem.

mdp2021 1 day ago||

> Even if LLMs don’t become AGI they will still be incredible tools

(Mostly an aside, but: LLMs have paved the way, now the problem is there, it is a challenge and a geopolitically relevant race... AGI is a goal set: not-having-reached-it will be just a stage.)

bogdiyan 1 day ago||

I am not sure how much of the work is done by OpenAI, or whether it is basically a Broadcom chip specifically built for OpenAI models. It is a necessary step, but building a high-performance chip is not easy. Look at companies like Groq, Amazon, and Google.

u1hcw9nx 1 day ago|

Both Google and Amazon also co-design heavily with Broadcom (Amazon also with Marvell and Alchip).

Broadcom handles things like physical design, IP blocks, managing the manufacturing process with TSMC, packaging, and testing. Google and Amazon handle system architecture, performance targets, and requirements, with Broadcom acting as a consultant.

theowaway213456 1 day ago||

This seems like more competition for Cerebras? Am I understanding correctly?

HarHarVeryFunny 1 day ago|

This is just an uncut wafer - I don't think it's intended to be a wafer-scale chip.

Cerebras etch memory onto the wafer alongside the processing elements, but AFAIK OpenAI are going to be using HBM memory and a conventional chiplet design.

KeplerBoy 1 day ago||

Still competition for Cerebras. Seems quite unlikely they will get an OpenAI deal anytime soon.

smsx 1 day ago|||

They have an OpenAI deal right now. https://openai.com/index/cerebras-partnership/

HarHarVeryFunny 1 day ago|||

No - this is OpenAI trying to compete with Google (TPU) and Amazon/Anthropic (Trainium) on cost.

Cerebras are addressing very specific use cases, not general purpose LLM serving, and OpenAI does already partner with them.

groundzeros2015 1 day ago||

This is starting to sound like startup scope creep. Instead of making the AI model it’s now custom silicon, web browsers, and consumer electronics?

krick 21 hours ago||

But there never really was a moat in LLMs? I mean, I don't know where you stand, but my perception is that we all kinda knew that the whole time since 2017, and really knew it since DeepSeek. What they really care about is:

1. Customer acquisition.

2. Cheap(er) electricity/hardware.

So it's really surprising to me that them making their own chip surprises anyone at all. The electricity thing is already kinda being taken care of by earlier strategic alliances with some other evil people; the chip is a natural next step.

glaslong 1 day ago|||

Definitely has that smell... At the same time though, they NEED inference cost to drop substantially, and even better for them if it only happens for their models on their hardware.

I assume they're doing everything they can to make that happen model-side, but coming at it from the other end makes sense too if they can swing it.

guywithahat 1 day ago|||

Maybe, but they’re also a massive company. At some point Google stopped being a startup and became a massive company with margins to look after.

groundzeros2015 20 hours ago||

After they were wildly profitable.

guywithahat 12 hours ago||

OpenAI had around 8 billion in revenue in 2025, and is estimated to have 25-30 billion in revenue this year. I assume they're not profitable, but this is a massive company with margins to look after. I don't think we should treat them like a startup conceptually anymore.

brcmthrowaway 1 day ago||

Nearly all those initiatives have failed though

jeffybefffy519 12 hours ago||

I wonder if we will see common chiplets that hold the weights for layers of a model: not the full model, just a few common layers that are known for certain things.

jnaina 1 day ago||

Two turkeys don't make an eagle.

I don't have much confidence in either OpenAI/Sama or Broadcom, given past history. Again, this is just pre-IPO shenanigans.

As credible as the "Datacenter in Space" claim by Elmo, before the SPCX IPO.

MangoCoffee 1 day ago|

Cheap tokens are more important now than ever. Chinese open-weight models are getting pretty good. The real cost of AI adoption will come down to who (China or the US) can provide cheap tokens for consumers and companies. Microsoft considering DeepSeek for their cowork is one example, and now OpenAI has its own AI inference chip.

SV_BubbleTime 1 day ago|

I’m not understanding. If cost per token hits the floor that does not mean that you want a model that uses tokens.

If the Chinese are optimizing for token usage, that’s also speed.

Why use more token if few do trick?
