Posted by armanified 23 hours ago

Show HN: I built a tiny LLM to demystify how language models work (github.com)
Built a ~9M param LLM from scratch to understand how they actually work. Vanilla transformer, 60K synthetic conversations, ~130 lines of PyTorch. Trains in 5 min on a free Colab T4. The fish thinks the meaning of life is food.

Fork it and swap the personality for your own character.
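For a rough sense of where "~9M params" comes from, here's a back-of-the-envelope count in plain Python. The dimensions below are hypothetical, not taken from the repo (the actual vocab size, width, and depth may differ); it just shows how a vanilla decoder-only transformer's parameter budget breaks down.

```python
# Hypothetical config that lands near ~9M parameters for a
# vanilla decoder-only transformer (biases and norms ignored).
vocab, dim, layers, ctx = 8192, 256, 6, 256

embed   = vocab * dim + ctx * dim   # token + learned positional embeddings
attn    = 4 * dim * dim             # Q, K, V, and output projections
ffn     = 2 * dim * (4 * dim)       # two linear layers with 4x expansion
lm_head = vocab * dim               # unembedding (assuming untied weights)

total = embed + layers * (attn + ffn) + lm_head
print(total)  # -> 8978432, i.e. roughly 9M
```

Note that with a small vocab like this, the embeddings and head alone account for nearly half the budget, which is why the per-layer transformer blocks can stay so tiny.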

830 points | 126 comments
kubrador 18 hours ago|
how's it handle longer context, or does it start hallucinating after like 2 sentences? curious where the ceiling is for 9M params
gnarlouse 20 hours ago||
I... wow, you made an LLM that can actually tell jokes?
murkt 17 hours ago|
With 9M params it just repeats a joke from the training dataset.
winter_blue 8 hours ago||
This is amazing work. Thank you.
ben8bit 15 hours ago||
This is really great! I've been wanting to do something similar for a while.
rclkrtrzckr 17 hours ago||
I could fork it and create TrumpLM. Not a big leap, I suppose.
search_facility 15 hours ago|
probably 8M params are too much even :)
danparsonson 13 hours ago|||
As long as you use the best parameters then it doesn't matter
wiseowise 13 hours ago|||
Grab her by the pointer.
rahen 10 hours ago||
I don't mean to be 'that guy', but after a quick review, this really feels like low-effort AI slop to me.

There is nothing wrong with using AI tools to write code, but nothing here seems to have taken more than a generic 'write me a small LLM in PyTorch' prompt, or any specific human understanding.

The bar for what constitutes an engineering feat on HN seems to have shifted significantly.

ananandreas 13 hours ago||
Great and simple way to bridge the gap between LLMs and users coming into the field!
cpldcpu 16 hours ago||
Love it! Great idea for the dataset.
monksy 17 hours ago||
Is this a reference from the Bobiverse?
nullbyte808 22 hours ago|
Adorable! Maybe a personality that speaks in emojis?
armanified 12 hours ago|
OMG! You just gave me the next idea...