Posted by armanified 23 hours ago

Show HN: I built a tiny LLM to demystify how language models work (github.com)
Built a ~9M param LLM from scratch to understand how they actually work. Vanilla transformer, 60K synthetic conversations, ~130 lines of PyTorch. Trains in 5 min on a free Colab T4. The fish thinks the meaning of life is food.

Fork it and swap the personality for your own character.
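For a rough sense of where "~9M params" comes from, here's a back-of-the-envelope count in plain Python. The dimensions below are hypothetical, not taken from the repo (the actual vocab size, width, and depth may differ); it just shows how a vanilla decoder-only transformer's parameter budget breaks down.

```python
# Hypothetical config that lands near ~9M parameters for a
# vanilla decoder-only transformer (biases and norms ignored).
vocab, dim, layers, ctx = 8192, 256, 6, 256

embed   = vocab * dim + ctx * dim   # token + learned positional embeddings
attn    = 4 * dim * dim             # Q, K, V, and output projections
ffn     = 2 * dim * (4 * dim)       # two linear layers with 4x expansion
lm_head = vocab * dim               # unembedding (assuming untied weights)

total = embed + layers * (attn + ffn) + lm_head
print(total)  # -> 8978432, i.e. roughly 9M
```

Note that with a small vocab like this, the embeddings and head alone account for nearly half the budget, which is why the per-layer transformer blocks can stay so tiny.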

830 points | 126 comments
kubrador 18 hours ago|
how's it handle longer context, or does it start hallucinating after like 2 sentences? curious where the ceiling is for 9M params
gnarlouse 20 hours ago||
I... wow, you made an LLM that can actually tell jokes?
murkt 17 hours ago|
With 9M params it just repeats a joke from the training dataset.
winter_blue 8 hours ago||
This is amazing work. Thank you.
ben8bit 15 hours ago||
This is really great! I've been wanting to do something similar for a while.
rclkrtrzckr 17 hours ago||
I could fork it and create TrumpLM. Not a big leap, I suppose.
search_facility 15 hours ago|
probably 8M params are too much even :)
danparsonson 13 hours ago|||
As long as you use the best parameters then it doesn't matter
wiseowise 13 hours ago|||
Grab her by the pointer.
rahen 10 hours ago||
I don't mean to be 'that guy', but after a quick review, this really feels like low-effort AI slop to me.

There is nothing wrong with using AI tools to write code, but nothing here seems to have taken more than a generic 'write me a small LLM in PyTorch' prompt, or any specific human understanding.

The bar for what constitutes an engineering feat on HN seems to have shifted significantly.

ananandreas 13 hours ago||
Great and simple way to bridge the gap between LLMs and users coming into the field!
cpldcpu 16 hours ago||
Love it! Great idea for the dataset.
monksy 17 hours ago||
Is this a reference from the Bobiverse?
nullbyte808 22 hours ago|
Adorable! Maybe a personality that speaks in emojis?
armanified 12 hours ago|
OMG! You just gave me the next idea...