Posted by TORcicada 1 day ago
Isn’t this kind of approach feasible for something so purpose-built?
> CERN is using extremely small, custom large language models physically burned into silicon chips to perform real-time filtering of the enormous data generated by the Large Hadron Collider (LHC).
> This work represents a compelling real-world demonstration of “tiny AI” — highly specialised, minimal-footprint neural networks
FPGAs for Neural Networks have been a thing since before the LLM era.
> [ GENEVA, SWITZERLAND — March 28, 2026 ] — CERN is using extremely small, custom large language models physically burned into silicon chips to perform real-time filtering of the enormous data generated by the Large Hadron Collider (LHC).
Like (~9K) Jumbo Frames!
Like anything else, once you work with a system, it gives you ten ideas where to go next...
I hacked on it a while back and added Conv2DTranspose support to it.
Are you perhaps confusing Groq with the Etched approach? IIUC Etched is the company that "burned the transformer onto a chip". Groq uses LPUs that are more generalist (they can run many transformers and some other architectures) and their speed comes from using SRAM.