Posted by huseyinkeles 1 day ago

NanoChat – The best ChatGPT that $100 can buy (github.com)
https://x.com/karpathy/status/1977755427569111362
1466 points | 301 comments
jumski 1 day ago|
$100 to train a sort-of-talkable model in 4 hours? wow
saivishwak 1 day ago||
Very cool project! Hopefully it will propel SLM development
earthnail 1 day ago||
This is absolutely fantastic. I really can't wait for the final course to be live. It's in the "shut up and take my money" category. I had so much fun with the nanoGPT videos.
lostmsu 1 day ago||
This is going to be the single most powerful boost to my indie research efforts in years. Thank you, Andrej!
lebimas 1 day ago||
I see Karpathy, I click
spacecadet 1 day ago||
Built so many nano AIs over the last several years. I have played with nanoGPT; it's ok. Just hype for Karpathy... So many tiny LLMs out there now that run on cheap SoCs. Try SmolVLM512, runs fine on a sub-$100 Pi.
simonw 1 day ago|
You're misunderstanding the project. This isn't about an LLM that runs on $100 hardware. It's about a usable LLM that costs $100 to train from scratch.
tdhz77 1 day ago||
These are the type of community posts that are legendary.
cat_plus_plus 1 day ago||
End-to-end training is a different beast, but finetuning and inference of impressive LLMs like Qwen3 can be done on pretty run-of-the-mill hardware like Apple Silicon Macs and gaming PCs, if anyone wants a personalized assistant with character. Just ask AI how to finetune AI using Unsloth (if on NVIDIA) or MLX (if on Apple) and it will give you ready-to-run Python scripts.
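
For a sense of what such a script tends to look like, here is a minimal LoRA finetuning sketch using Unsloth plus TRL. The model name, dataset file, and hyperparameters are placeholder assumptions rather than recommendations, and the exact keyword arguments vary across library versions:

    # Minimal LoRA finetuning sketch with Unsloth + TRL.
    # Model/dataset names are illustrative placeholders.
    from unsloth import FastLanguageModel
    from datasets import load_dataset
    from trl import SFTTrainer
    from transformers import TrainingArguments

    # Load a 4-bit quantized base model so it fits on a single consumer GPU.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/llama-3-8b-bnb-4bit",  # assumption: any Unsloth-hosted model
        max_seq_length=2048,
        load_in_4bit=True,
    )

    # Attach LoRA adapters: only small adapter matrices get trained,
    # which is what makes this feasible on run-of-the-mill hardware.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        lora_alpha=16,
    )

    # Assumption: a JSONL file where each record has a pre-formatted "text" field.
    dataset = load_dataset("json", data_files="my_persona_chats.jsonl", split="train")

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        dataset_text_field="text",
        args=TrainingArguments(
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,
            max_steps=200,
            learning_rate=2e-4,
            output_dir="outputs",
        ),
    )
    trainer.train()

The key design point is LoRA: because only the adapter weights update, a 7B-to-8B-class model can be finetuned on a single gaming GPU or a Mac rather than a training cluster.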
oblio 1 day ago||
I wonder, if something like this were trained on Wikipedia, could it become a reliable local Wikipedia search engine, basically?
simonw 1 day ago|
I don't think so. Training on documents is not a great way of building a search engine for the information in those documents, because the training process mixes all of that information together in ways that detach the individual words from the source documents they came from.

As usual, if you want an LLM to be able to help search a corpus of text, the best way to achieve that is to teach it how to use a search tool against that text.

victor106 1 day ago||
> the best way to achieve that is to teach it how to use a search tool against that text.

Any examples of this?

simonw 1 day ago||
I've seen this called "agentic RAG" by some people. The easiest way to get a local demo is with Claude Code or Codex CLI. They know how to use grep, and you can set them loose on a folder full of text files and tell them to use grep to answer questions - it can work really well.

I just tried this in "claude --dangerously-skip-permissions":

> Use Python and AppleScript to find Apple Notes that mention UPS

... and fell down a rabbit hole of optimizations because my Notes collection is HUGE, but it got there in the end!
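
To make the grep-as-a-tool idea concrete, here is a minimal sketch of the kind of search function you could expose to a model via function calling. The folder layout and the function itself are illustrative assumptions, not anything Claude Code actually ships:

    # Hypothetical grep-style tool for "agentic RAG". The model issues a
    # query, gets back matching lines with file names, and answers (or
    # refines the query) from those snippets instead of from its weights.
    import re
    from pathlib import Path

    def search_notes(query: str, folder: str = "notes", max_hits: int = 20) -> list[str]:
        """Return 'filename: matching line' strings for a case-insensitive query."""
        pattern = re.compile(re.escape(query), re.IGNORECASE)
        hits = []
        for path in Path(folder).rglob("*.txt"):  # assumption: corpus is plain .txt files
            for line in path.read_text(errors="ignore").splitlines():
                if pattern.search(line):
                    hits.append(f"{path.name}: {line.strip()}")
                    if len(hits) >= max_hits:
                        return hits
        return hits

    # The model would be given this tool's schema and told to call it,
    # e.g. search_notes("UPS"), then answer using only the returned snippets.
    print(search_notes("UPS"))

The loop is the important part: the model decides what to search for, reads the returned snippets, and can refine the query or cite the source files, rather than relying on facts baked into its weights.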

yieldcrv 1 day ago|
> nanochat is designed to run on a single 8XH100 node