Posted by armanified 1 day ago

Show HN: I built a tiny LLM to demystify how language models work (github.com)
Built a ~9M param LLM from scratch to understand how they actually work. Vanilla transformer, 60K synthetic conversations, ~130 lines of PyTorch. Trains in 5 min on a free Colab T4. The fish thinks the meaning of life is food.

Fork it and swap the personality for your own character.
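
For a rough sense of where a "~9M param" figure can come from in a vanilla transformer, here is a back-of-the-envelope parameter count. It is only a sketch: every hyperparameter below (vocabulary size, width, depth, feed-forward size, context length, tied output head) is an assumed value chosen for illustration and is not taken from the post or the repository, and `count_params` is a hypothetical helper.

```python
def count_params(vocab_size, d_model, n_layers, d_ff, context_len):
    """Count trainable parameters of a plain decoder-only transformer.

    Assumes learned positional embeddings, biases on all linear layers,
    two LayerNorms per block plus a final one, and an output head that
    shares weights with the token embedding (so it adds nothing).
    """
    # Q, K, V and output projections: four d_model x d_model matrices plus biases.
    attention = 4 * d_model * d_model + 4 * d_model
    # Two-layer feed-forward network with biases.
    feed_forward = 2 * d_model * d_ff + d_ff + d_model
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    per_layer = attention + feed_forward + layer_norms

    embeddings = vocab_size * d_model + context_len * d_model
    final_norm = 2 * d_model
    return n_layers * per_layer + embeddings + final_norm


# Hypothetical configuration, for illustration only.
total = count_params(vocab_size=4096, d_model=384, n_layers=6,
                     d_ff=768, context_len=128)
print(f"{total:,} parameters")  # 8,726,016 with these assumed values
```

With these assumed values most of the budget sits in the transformer blocks (about 7.1M) rather than the embeddings (about 1.6M), which is why changing depth or width moves the total far more than changing the context length.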

845 points | 126 comments
peifeng07 21 hours ago
[dead]
zephyrwhimsy 21 hours ago
[dead]
zephyrwhimsy 16 hours ago
[dead]
novachen 16 hours ago
[dead]
weiyong1024 1 day ago
[flagged]
techpulselab 20 hours ago
[dead]
aesopturtle 1 day ago
[flagged]
Morpheus_Matrix 19 hours ago
[dead]
solsafe_dev 13 hours ago
[dead]
george_belsky 20 hours ago
[dead]