Posted by jamdesk 1 day ago

OpenAI unveils its first custom chip, built by Broadcom(techcrunch.com)
Announcement: https://openai.com/index/openai-broadcom-jalapeno-inference-...

https://decrypt.co/371971/openai-broadcom-jalapeno-first-cus...

https://www.cnn.com/2026/06/24/tech/openai-broadcom-jalapeno...

809 points | 461 comments | page 3
fennecbutt 1 day ago|
I mean I'd love to be able to buy something like the 17k tps taalas chip as a pcie or m.2.

Imagine when we can roar along at that speed, low power. Can just have the model reason for a while about anything and everything. It reminds me of the "race to idle" for mcus etc.

ipdashc 1 day ago||
> 17k tps taalas chip

It's odd to me that I haven't heard anything about this approach (baking LLMs/weights into silicon directly) since. It seems almost common-sense that we're going to end up there eventually. And it feels like that point is drawing ever closer now that model capabilities, if not quite plateauing out, are at least getting to a "good enough" point for a LOT of use cases.

I wonder if it's being worked on in secret, if there's something about it that makes it infeasible, or if companies are really too nervous to lock in one model like that because the next one down the line could be a huge improvement. Re. infeasibility, I have heard that the Taalas demonstration chip ran Llama 3.1 8B (a pretty horrible model) and that even that took a massive number of transistors / die area. So it might just be the case that the good models are too big to fit on silicon?

topspin 1 day ago|||
I have also been thinking about this a lot, and share your belief that this is inevitable.

Taalas has a running demo here: https://chatjimmy.ai/

It's eye opening: generated an AVX-512 optimized Mersenne Twister in C in 0.076s, 13,706 tok/s. Too fast for the tok/s to be terribly accurate.

mdp2021 1 day ago||||
> It's odd to me that I haven't heard anything about this approach ... I wonder if it's being worked on in secret, if there's something about it that makes it infeasible

The studies and efforts are ongoing and public, and there are technical hurdles to be faced - but the relevant works go back in time quite a lot and there is heightened interest in it now.

It seems that you simply took the "hyped headlines" for the whole of the work.

ipdashc 20 hours ago||
> It seems that you simply took the "hyped headlines" for the whole of the work.

Well, yeah, that's what I'm saying. It's odd that there haven't been any major headlines (customer interest, competitors' announcements, etc) other than their initial demo. Good to hear it's being worked on though!

mdp2021 19 hours ago||
Did we not play with MNIST and place some calculated bets on NNs well before Yann LeCun started the fire with the explosive success of the Convolutional NNs?

I'd say it pretty consistently starts in the underground.

The real revolution in the context is that it /could/ be done practically - overcoming the hurdles. But as far as interest in the matter is concerned, I'd say there almost cannot be a greater interest at this stage: making NNs efficient. This must be absolutely evident, as evident as it is that the separation of memory and processor is against the idea of NNs, as evident as it is that multiplication is achievable just physically.

Of course many have seen that and got on studying it. As soon as it will be optimally practical...

coder543 1 day ago||||
> It's odd to me that I haven't heard anything about this approach since.

It has only been four months since they unveiled their first prototype. I don't understand your confusion. Chip development does not happen overnight...?

Their initial blog post laid out a roadmap, so theoretically they should have another thing to demonstrate this summer.

ipdashc 20 hours ago|||
In the sense of interested customers, online discussion, other companies doing the same thing, etc. Of course it takes time to get actual results, but from an outsider's perspective it's surprising that it was basically just their initial demo and that's more or less it so far. Excited to see if they come out with something this summer though!
mdp2021 1 day ago|||
You are focusing on Taalas, but (specific) analogue computing, electronic NNs, compute-in-memory etc. - the field including the contextual approach - date back to Rosenblatt.
coder543 1 day ago||
Yes, I’m focused on the topic at hand that the person I replied to was also talking about.

The person I replied to was acting as if Taalas was ancient history. I was pointing out it has only been a few months.

mdp2021 1 day ago||
I'd say the original remark was more general («this approach (baking LLMs/weights into silicon directly) [... as if] worked on in secret») - which is salient, because when I investigated weeks ago, I found a large number of attempts at CIM and at a general departure from the Von Neumann architecture for the purpose of optimizing NN implementations in HW.

Universities are studying, startups are proposing - the «approach» is under the big headlines level but quite lively. Not just Taalas, not just their way - which remains remarkable in the scene as the HW is achieved, working, online, available... and amazing.

coder543 23 hours ago||
CIM does not bake the weights into silicon. The level of optimization that you can do down to the last transistor when the weights are fixed is on an entirely different level than CIM where you still need general purpose ALUs all over the place.
mdp2021 19 hours ago||
> CIM does not bake the weights into silicon

If that were the extent of the terms, then what could we call "baking the weights into silicon"? Setting parts of the circuits to determined values for multiplication is like printing a Read-Only Memory. (And you compute at it: Compute In Memory.)

> CIM where you still need general purpose ALUs all over the place

If that were so, then why do taxonomists present analogue computing as part of CIM? Ohm's Law does not constitute an "ALU" the way you intend it.

Simply, I used CIM, "Compute In Memory", for lack of a better term - for "store data there where you modify data", for "beyond Von Neumann's separation of data storage and processor".
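
To make the "Ohm's Law as multiplier" point concrete, here is a minimal idealized sketch of an analogue crossbar: each weight is a fixed conductance, each input is a voltage, each product is a current (I = G * V), and currents on a shared column wire add up. The function name and the example values are hypothetical, and the model ignores wire resistance, noise, and signed weights. It is not a description of Taalas' design.

```python
def crossbar_matvec(conductances, voltages):
    """Idealized analogue crossbar: column currents for given row voltages.

    conductances[i][j] is the fixed conductance (siemens) linking input
    row i to output column j; voltages[i] is the voltage on row i.
    """
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for i, v in enumerate(voltages):
        for j in range(n_cols):
            # Ohm's law gives the per-cell current; summing over a column
            # models the currents merging on the shared output wire.
            currents[j] += conductances[i][j] * v
    return currents


# Hypothetical 2x2 example: weights stored as conductances, inputs as volts.
weights = [[1.0, 2.0],
           [3.0, 4.0]]
inputs = [0.5, 0.25]
print(crossbar_matvec(weights, inputs))
```

The result is the same as an ordinary vector-matrix product; the point of the sketch is only that no ALU appears anywhere in the physical version, since the multiply and the add both come from circuit laws.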

coder543 19 hours ago||
EDIT: It's just not even worth arguing this point, so deleting my original, much longer comment. Abstract taxonomies can claim that Taalas is CIM, but this entirely and utterly misses the point, and misses what makes Taalas' approach special. If you told a room full of chip architects to go build "CIM for AI", they would not build a Taalas-like totally specialized chip, therefore it is not sufficient, and just muddies the conversation from my point of view. People have been doing "CIM" for decades and yet I've never seen anyone build a totally specialized chip at the scale of Taalas. And yes, you can (in theory) build an analog version of any computer, so of course you can build analog CIM, but "analog compute" is not inherently CIM, so conflating the two is just confusing.
mdp2021 18 hours ago||
I can't check everything right now, but for example, the explainer from Rakesh Kumar mentions "Analogue CIM".

And I do not get your rant about "analog computing", which has everything to do with NNs (otherwise, well, prove it): they started with that - they are basically that in fact. Analogue computing is a very great temptation since it would solve the issues of inefficiency in digital NNs. Unfortunately, it has drawbacks which are massive for big NNs. Taalas' approach seems to be the best compromise.

wmf 1 day ago|||
Good models will require multiple Taalas chips but Groq and Cerebras also require a lot of chips and that hasn't stopped them.
ipdashc 20 hours ago||
> Good models will require multiple Taalas chips

I guess that makes sense. Is this feasible, or does the added latency between chips kill any of the performance gains?

wmf 18 hours ago||
Using multiple chips seems to work fine for Cerebras and Groq so it should also work for Taalas. It does sound challenging to reach >10K tok/s but latency could be below 1 us which is a small part of the token budget.
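
A rough back-of-the-envelope check of the token budget, as a sketch: at 10K tok/s each token gets 100 us, so a single 1 us hop is about 1% of it. The function name and the hops_per_token parameter are assumptions for illustration; the real number of chip-to-chip hops per token is not stated in this thread.

```python
def interchip_latency_share(tokens_per_second, hop_latency_s, hops_per_token=1):
    """Fraction of the per-token time budget spent on chip-to-chip hops."""
    token_budget_s = 1.0 / tokens_per_second  # time available per token
    return hops_per_token * hop_latency_s / token_budget_s


# 10K tok/s with one 1 us hop per token (hop count is an assumption).
one_hop = interchip_latency_share(10_000, 1e-6)
# Same rate, but assuming a token crosses ten chips in sequence.
ten_hops = interchip_latency_share(10_000, 1e-6, hops_per_token=10)
print(f"{one_hop:.1%} {ten_hops:.1%}")
```

The share grows linearly with the number of sequential hops, so the "small part of the budget" claim depends on how many chips a token has to cross.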
MichaelNolan 1 day ago||
The current taalas chip is for a 3.1B param model. I hope so much that they can get that up to the 30B range. Just imagine Gemma 4 or Qwen 3.6 at 17k tps.
coder543 1 day ago||
Taalas' first chip is for a Llama 3.1 8B quant, not a 3.1B parameter model, to clarify.
_boffin_ 20 hours ago||
My question is: what will this do to Cerebras? It validates them, did they just have their lunch eaten?
OrvalWintermute 1 day ago||
Word of Advice for OpenAI:

Never underestimate Broadcom’s ability to shaft their own customers

- VMware

- CA Technologies

- Symantec Enterprise Security

- Brocade

- LSI Corporation

SV_BubbleTime 1 day ago||
I don’t know. I’m kind of glad that two of my least favorite companies are working together.
antonvs 1 day ago||
CA Technologies was much worse than Broadcom in its heyday.

Three of their top execs - CEO, CFO, and head of sales - went to federal prison on securities fraud, conspiracy, and other charges. The CEO, Sanjay Kumar, who was at least partly the fall guy for co-founder Charles Wang, served 10 years.

Being acquired by Broadcom could only have been an upgrade, as strange as that may sound.

mobile6test 1 day ago||
„ OpenAI says early results show significantly better performance-per-watt than current state-of-the-art alternatives“

would be very interesting to see any papers/data around this

olalonde 1 day ago||
Why even "unveil" it? Seems like giving away competitive intelligence for no reason at all... other than hyping the stock?
satvikpendem 1 day ago||
I'm assuming they used LLMs to (help humans) do custom circuit design. Even pre LLM there were various computer optimizations that didn't require humans like genetic algorithms. It'd be cool to see a paper on how they did it.
paxys 1 day ago||
Very interested to know the distribution of effort between the two companies. Is this truly a brainchild of OpenAI engineers or did they pay to white label and use a new Broadcom chip?
BLKNSLVR 1 day ago||
*requires VMware license.
Legend2440 1 day ago||
The only surprising thing about this is that they didn't do it three years ago.
GL26 1 day ago|
OpenAI is going to close the gap on the one thing it needs to be profitable: compute power. Love this website: https://isaiprofitable.com/, which shows who is winning the AI revolution. Nvidia wins because it has instant revenue; OpenAI is going to close that gap.