Posted by quesomaster9000 6 hours ago

Show HN: Z80-μLM, a 'Conversational AI' That Fits in 40KB (github.com)
How small can a language model be while still doing something useful? I wanted to find out, and had some spare time over the holidays.

Z80-μLM is a character-level language model with 2-bit quantized weights ({-2,-1,0,+1}) that runs on a Z80 with 64KB RAM. The entire thing (inference, weights, chat UI) fits in a 40KB .COM file that you can run in a CP/M emulator, and hopefully even on real hardware!
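
To make the 2-bit weight idea concrete, here's a minimal sketch (my illustration, not the project's actual Z80 code) of how four 2-bit weight codes can be packed per byte, decoded back to the {-2,-1,0,+1} grid, and accumulated with 16-bit integer math mimicking the Z80's accumulator width:

```python
import numpy as np

# Hypothetical sketch: pack four 2-bit weight codes per byte and run a
# 16-bit integer dot product. Codes 0..3 map onto the grid {-2,-1,0,+1}.
GRID = np.array([-2, -1, 0, 1], dtype=np.int16)

def pack_weights(codes):
    """Pack an array of 2-bit codes (values 0..3) into bytes, 4 per byte."""
    b = np.asarray(codes, dtype=np.uint8).reshape(-1, 4)
    return (b[:, 0] | (b[:, 1] << 2) | (b[:, 2] << 4) | (b[:, 3] << 6)).astype(np.uint8)

def dot_2bit(packed, activations):
    """Dot product of packed 2-bit weights with int16 activations,
    accumulated in 16 bits (wrapping like a Z80 16-bit add)."""
    acc = np.int16(0)
    for i, byte in enumerate(packed):
        for j in range(4):
            w = GRID[(byte >> (2 * j)) & 0b11]
            acc = np.int16(acc + w * activations[4 * i + j])
    return int(acc)

codes = [3, 0, 2, 1]                        # weights +1, -2, 0, -1
acts = np.array([10, 5, 7, 3], dtype=np.int16)
print(dot_2bit(pack_weights(codes), acts))  # 1*10 + (-2)*5 + 0*7 + (-1)*3 = -3
```

At four weights per byte, ~150k parameters fit in roughly 37KB, which is consistent with the 40KB .COM figure.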

It won't write your emails, but it can be trained to play a stripped-down version of 20 Questions, and can sometimes maintain the illusion of simple but terse conversation with a distinct personality.

--

The extreme constraints nerd-sniped me and forced interesting trade-offs: trigram hashing (typo-tolerant, but loses word order), 16-bit integer math, and careful massaging of the training data so I could keep the examples 'interesting'.
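
A sketch of what trigram-hashed input features might look like (the hash function, bucket count, and boundary markers here are my assumptions, not the project's actual code). Similar spellings share most trigrams, so a typo only perturbs a few buckets, while summing buckets over the whole input discards word order:

```python
# Hypothetical sketch of trigram hashing: break each word into character
# trigrams, hash each into a small bucket table, and sum counts.
N_BUCKETS = 256  # assumed table size, kept small for a 64KB machine

def trigrams(word):
    w = f"^{word.lower()}$"              # boundary markers
    return [w[i:i + 3] for i in range(len(w) - 2)]

def hash_trigram(tri):
    h = 0
    for ch in tri:                       # simple multiplicative hash (assumption)
        h = (h * 31 + ord(ch)) & 0xFFFF
    return h % N_BUCKETS

def featurize(sentence):
    buckets = [0] * N_BUCKETS
    for word in sentence.split():
        for tri in trigrams(word):
            buckets[hash_trigram(tri)] += 1
    return buckets

# A typo overlaps heavily with the correct spelling:
a, b = featurize("guess the animal"), featurize("guess the aminal")
shared = sum(min(x, y) for x, y in zip(a, b))
print(shared / sum(a))  # most of the feature mass survives the typo
```

The trade-off the post mentions is visible here: "the animal guess" featurizes identically to "guess the animal".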

The key was quantization-aware training that accurately models the limitations of the inference code. The training loop runs both float and integer-quantized forward passes in parallel, scoring the model on how well its knowledge survives quantization. The weights are progressively pushed toward the 2-bit grid using straight-through estimators, with overflow penalties matching the Z80's 16-bit accumulator limits. By the end of training the model has already adapted to its constraints, so there's no post-hoc quantization collapse.
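
A toy sketch of the straight-through estimator (STE) idea described above: the forward pass sees only weights snapped to the 2-bit grid, but the gradient is applied to the float "shadow" weights as if the rounding were the identity. The tiny two-weight model, squared-error loss, and learning rate are illustrative assumptions, and the overflow penalty is omitted:

```python
import numpy as np

# Toy STE sketch: forward with quantized weights, backward as if
# quantization were the identity, so float shadow weights keep learning.
GRID = np.array([-2.0, -1.0, 0.0, 1.0])

def quantize(w):
    """Snap each weight to the nearest point on the 2-bit grid."""
    idx = np.abs(w[:, None] - GRID[None, :]).argmin(axis=1)
    return GRID[idx]

w = np.array([0.3, -0.3])          # float shadow weights
x = np.array([1.0, 1.0])           # a fixed input
target = 1.0

for step in range(50):
    y = quantize(w) @ x            # forward pass sees only quantized weights
    grad_w = 2 * (y - target) * x  # grad of (y - target)^2; STE: d(quantize)/dw ≈ 1
    w -= 0.01 * grad_w             # update the float shadow weights

print(quantize(w) @ x)             # the quantized model now hits the target: 1.0
```

Without the STE, the gradient through the rounding step would be zero almost everywhere and the quantized model could never move off its starting grid point.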

Eventually I ended up spending a few dollars on the Claude API to generate 20 Questions data (see examples/guess/GUESS.COM). I hope Anthropic won't send me a C&D for distilling their model against the ToS ;P

But anyway, happy code-golf season everybody :)

131 points | 33 comments
Zee2 5 hours ago|
This is super cool. Would love to see a Z80 simulator set up with these examples to play with!
Imustaskforhelp 1 hour ago|
100% Please do this! I wish the same
pdyc 4 hours ago||
Interesting. I'm wondering how far it can go if we remove some of these limitations but try to solve an extremely specific problem, like generating regexes from user input? I know small models (270M range) can do that, but can it be done in, say, the <10MB range?
Waterluvian 3 hours ago|
Generate an LLM that is designed to solve one extremely specific problem: answering the ultimate question of life, the universe, and everything.

Even with modern supercomputing the computation would be outpaced by the heat death of the universe, so token output must be limited to a single integer.

dirkt 4 hours ago||
Eliza's granddaughter.
NooneAtAll3 2 hours ago||
did you measure token/s?
Zardoz84 3 hours ago||
Meanwhile, Eliza was ported to BASIC and was run on many home computers in the 80s.
jasonjmcghee 4 hours ago||
For future projects and/or for this project, there are many LLMs available more than good enough to generate that kind of synthetic data (20 Qs) with permissive terms of use. (So you don’t need to stress about breaking TOS / C&D etc)
alfiedotwtf 4 hours ago||
An LLM in a .com file? Haha made my day
teaearlgraycold 3 hours ago|
SLM
quesomaster9000 3 hours ago||
All the 'Small' language models and the 'TinyML' scene in general tend to bottom out at around a million parameters, hence I thought 'micro' was more apt at ~150k params.
codetiger 4 hours ago|
Imagine this working on a Game Boy back in those days. It would've sounded like magic.
numpad0 20 minutes ago||
Flip phones have had predictive text since forever. LLMs are just* supercharged predi[ctive text algorithms are computer algorithms that are]
Sharlin 4 hours ago|||
I don’t think this could beat an ELIZA-style bot in how magical it feels, given the extreme terseness of its replies.
lodovic 4 hours ago|||
I love these thought experiments. Looking at the code size, it would have been possible for someone to come up with this back in the day, similar to the idea of a million monkeys on typewriters eventually producing Shakespeare.
alfiedotwtf 4 hours ago||
And would have lasted 3 minutes.

Speaking of - I remember my first digital camera (Fujitsu 1Mb resolution using SmartMedia)… it used so much power that you could take 20-30 photos and then needed to replace all 4 batteries lol