Posted by jamdesk 1 day ago
https://decrypt.co/371971/openai-broadcom-jalapeno-first-cus...
https://www.cnn.com/2026/06/24/tech/openai-broadcom-jalapeno...
I know, it's nitpicking, but when people can just reach in and take services away, like Fable/Mythos, hardware is the only thing worth buying.
That sentence sounds weird to me. I can't really put my finger on why; maybe it's the combination of adverbs, or just the fact of stating the desire to scale as a company so directly. It feels (to me) like openly claiming their selfish goals. Or maybe I am just misinterpreting and they are referring to the whole of humanity as "We" (but knowing Broadcom's and, to a lesser extent, OpenAI's doings, I am not convinced).
I want a super fast LLM that is Opus 4.6+, like, in ability.
For inference workloads, it makes a lot more sense to optimize for prefill/ttft before maxing out memory bandwidth.
The software side is still pretty sketchy, too. Apple's ecosystem is fractured between NPU, MPS and Accelerate BLAS, with libraries like MLX and CoreML built precariously on top. Apple has to commit to a full rearchitecture of their GPU to challenge Nvidia, which fractures that ecosystem even further.
Apple's business model will be to pay Google for compute for now, and then as they get better on device, move more and more locally. So they're very well incentivized to get better. The thing they've been best at in the last 19 years has been spinning flywheels they already have, and this is exactly that.
It's not hard to see why Apple made those mistakes, and many of them were made by the rest of the industry too. It's specifically tragic that Apple snatched defeat from the jaws of victory with GPGPU programming, and it makes me think that their future will be more subscription services and fewer half-assed technical efforts. Or they rip up the foundation and start from scratch; it's never too late to start work on Apple Silicon 2.
When you layer those same models across 128 GB of dGPUs, you can actually fill the KV cache in seconds instead of minutes. And you get higher memory bandwidth on most professional cards.
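To put rough numbers on the prefill-versus-bandwidth point, here is a back-of-the-envelope sketch. It assumes the common approximation of about 2 FLOPs per parameter per token for a dense forward pass (ignoring the attention term) and batch-size-1 decoding that streams all weights once per token. The function names, the 50% utilization figure, and every hardware number are illustrative assumptions, not measurements of any real device.

```python
def prefill_seconds(params, prompt_tokens, flops_per_s, utilization=0.5):
    """Rough time to process the prompt (fill the KV cache).

    Prefill is approximately compute-bound: ~2 FLOPs per parameter per
    prompt token. `utilization` is an assumed fraction of peak FLOP/s.
    """
    total_flops = 2 * params * prompt_tokens
    return total_flops / (flops_per_s * utilization)


def decode_tokens_per_second(params, bytes_per_param, mem_bandwidth_bytes_per_s):
    """Rough upper bound on single-stream decode speed.

    Decode is approximately bandwidth-bound: each new token reads every
    weight once (KV cache reads are ignored here).
    """
    return mem_bandwidth_bytes_per_s / (params * bytes_per_param)


if __name__ == "__main__":
    # Hypothetical 70B-parameter model, 4-bit weights, 32k-token prompt.
    params, bytes_per_param, prompt = 70e9, 0.5, 32_000

    # Two made-up devices: (peak FLOP/s, memory bandwidth in bytes/s).
    devices = {
        "unified-memory box (hypothetical)": (30e12, 500e9),
        "multi-dGPU rig (hypothetical)": (400e12, 1000e9),
    }
    for name, (flops, bandwidth) in devices.items():
        ttft = prefill_seconds(params, prompt, flops)
        tps = decode_tokens_per_second(params, bytes_per_param, bandwidth)
        print(f"{name}: prefill ~{ttft:.0f} s, decode ~{tps:.1f} tok/s")
```

Under these made-up inputs the decode rates differ by only the bandwidth ratio, while the prefill times differ by the compute ratio, which is why long prompts feel so much worse on compute-light hardware even when memory bandwidth looks respectable.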
And so if I were a mid-level state, would it be worthwhile to take my nascent chip industry and push it out to build a 28nm foundry and supporting ecosystem?
The models will come, but the real challenge of the future is having enough compute power for everyone and every use. Even if LLMs don’t become AGI they will still be incredible tools, and as OpenAI seems to spend $8,000 for each $200 monthly subscription, building one’s own data centres seems sensible.
I doubt it. 28 nm is 4-5 generations back so inferencing would need a large number of chips with very high power consumption. Maybe you're thinking more of 7 nm which is what Chinese fabs have; it seems to be OK for companies like Huawei.
And so if I were a mid-level state, would it be worthwhile to take my nascent chip industry and push it out to build a 28nm foundry and supporting ecosystem?
It never reaches breakeven so you'd have to provide billions in subsidies per year forever. The sovereign chip stuff only makes sense for the US and China; even the EU probably isn't large enough to make it work. A single country definitely couldn't.
(Mostly an aside, but: LLMs have paved the way, now the problem is there, it is a challenge and a geopolitically relevant race... AGI is a goal set: not-having-reached-it will be just a stage.)
Broadcom does things like physical design, providing IP blocks, managing the manufacturing process with TSMC, packaging, and testing. Google and Amazon handle system architecture, performance targets, and requirements, with Broadcom acting as a consultant.
Cerebras etch memory onto the wafer alongside the processing elements, but AFAIK OpenAI are going to be using HBM memory and a conventional chiplet design.
Cerebras are addressing very specific use cases, not general purpose LLM serving, and OpenAI does already partner with them.
1. Customer acquisition.
2. Cheap(er) electricity/hardware.
So it's really surprising to me that them making their own chip surprises anyone at all. The electricity thing is already kinda being taken care of by earlier strategic alliances with some other evil people, the chip is a natural next step.
I assume they're doing everything they can to make that happen model-side, but coming at it from the other end makes sense too if they can swing it.
I don't have much confidence in either OpenAI/Sama or Broadcom, given past history. Again, this is just pre-IPO shenanigans.
As credible as the "Datacenter in Space" claim by Elmo, before the SPCX IPO.
If the Chinese are optimizing for token usage, that’s also speed.
Why use more token if few do trick?