
Posted by tambourine_man 1 day ago

Microgpt (karpathy.github.io)
1679 points | 293 comments | page 4
bytesandbits 16 hours ago|
sensei karpathy has done it again
borplk 14 hours ago||
Can anyone explain how you can "save the state" so it doesn't have to train from scratch on every run?
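One minimal sketch of an answer, assuming the weights live in plain Python lists of floats (microgpt is dependency-free; the parameter names and file path below are illustrative, not the script's actual variables): serialize the parameters to JSON after training and reload them on the next run.

```python
import json
import os

CKPT = "checkpoint.json"  # hypothetical checkpoint path

def save_state(params, path=CKPT):
    # Plain lists of floats are JSON-serializable as-is.
    with open(path, "w") as f:
        json.dump(params, f)

def load_state(path=CKPT):
    # Return saved weights if a checkpoint exists, else None.
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return None

params = load_state()
if params is None:
    # No checkpoint yet: initialize and train from scratch here.
    params = {"wte": [[0.0] * 4 for _ in range(8)]}  # illustrative shape
save_state(params)
```

On the next run `load_state()` returns the saved weights and the training loop can be skipped or resumed. For a real setup you would also save the optimizer state and step counter, or use `pickle` if the parameters aren't JSON-friendly.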
stuckkeys 16 hours ago||
That web interface someone linked in your GitHub comments was flawless.
dhruv3006 22 hours ago||
Karpathy with another gem!
charcircuit 21 hours ago|
[flagged]
mold_aid 15 hours ago||
"art" project?
coolThingsFirst 21 hours ago||
Incredibly fascinating. One thing is that it still seems very conceptual. What I'd be curious about is how good a micro LLM we could train with, say, 12 hours of training on a MacBook.
shevy-java 18 hours ago||
Microslop is alive!
ViktorRay 23 hours ago||
Which license is being used for this?
dilap 23 hours ago|
MIT (https://gist.github.com/karpathy/8627fe009c40f57531cb1836010...)
ViktorRay 23 hours ago||
Thank you
hackersk 21 hours ago|
[flagged]
lukan 19 hours ago||
"The math makes so much more sense when you implement it yourself vs reading papers."

Something I found to be universally true when dealing with math. My brain pretty much refuses to learn abstract math concepts in theory, but applying them to a practical problem is a very different experience for me (I wish school math had had a bigger focus on practical applications).

Sammi 16 hours ago||
It's like you learn math best with your hands. The mind catches up to your hands afterwards.
Jaxon_Varr 16 hours ago|||
[dead]
byang364 19 hours ago||
Imagine the people on here spraying their AI takes everywhere while being this oblivious: the code is more or less a standard assignment in any Deep Learning course. The "reasoning" is two matrix transformations based on how often words appear next to each other.
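The "two matrix transformations" reading can be made concrete with a toy attention step (this sketch is my own illustration, not code from the post; shapes and names are arbitrary): first a token-to-token affinity matrix from one matrix product, then a weighted mix of values from a second.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 4, 8  # 4 tokens, 8-dimensional head
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))

# Transformation 1: how strongly each token relates to each other token.
scores = Q @ K.T / np.sqrt(d)

# Row-wise softmax turns affinities into mixing weights.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

# Transformation 2: mix the value vectors by those weights.
out = weights @ V  # shape (T, d)
```

Whether two matmuls plus a softmax counts as "reasoning" is exactly the dispute in this subthread; the code only shows what the mechanism computes.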
harvey9 18 hours ago||
Quite a few people on here are neither math nor CS grads, and some of us don't work in tech for our day jobs either.
0xEF 16 hours ago||
Right. But HN, among other platforms, is full of users who will confidently run their mouths about something they don't fully understand while believing they do. I think the previous commenter was too restrained in pointing out that even exceptionally smart people sometimes forget the limits of their own knowledge, not to mention consider themselves immune to the propaganda surrounding the subject at hand.
byang364 9 hours ago|||
The Opus 4.6 thread was full of "very smart" and experienced SWEs likening model weights to neurons. And again, any DL curriculum worth its salt (see, e.g., Justin Johnson's course) will thoroughly debunk that comparison. In this day and age it seems the Darios and Altmans have successfully waged the most damaging propaganda campaign in modern times. Even the Pentagon is lining up to relegate its decision-making to black-box stochastic ML models. Tech as an industry is unfortunately extremely gullible, all the more so when pressured by the market, VCs, clueless PE analysts, and the tech blogger/grifter complex. Foundation model makers can get away with hiding training data while proclaiming they are building a "moral" neural network, and no one bats an eyelash.
famouswaffles 8 hours ago|||
>Right. But HN, among other platforms, is full of users who will confidently run their mouths about something they don't fully understand while believing they do.

This is honestly funny and kind of ironic.

If this:

'The "reasoning" is two matrix transformations based on how often words appear next to each other.'

is what byang364 has to say, then he's one of the people you mention.
