Posted by tosh 2 days ago
Thanks to chain of thought, having the LLM be explicit in its output actually improves quality.
Caveman only strips filler from what you see... the reasoning depth stays the same.
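The idea of stripping filler from the visible output while leaving the reasoning intact could be sketched like this (a toy illustration, not Caveman's actual implementation; the filler list is made up):

```python
import re

# Hypothetical sketch: remove common filler phrases from the text *shown* to
# the user. The model's underlying chain of thought is never touched.
FILLER = re.compile(
    r"\b(certainly|of course|it'?s worth noting that|basically)\b[,!]?\s*",
    re.IGNORECASE,
)

def strip_filler(visible_text: str) -> str:
    """Strip filler phrases from display text only."""
    return FILLER.sub("", visible_text).strip()

print(strip_filler("Certainly! It's worth noting that the cache is basically cold."))
# -> "the cache is cold."
```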
I found this visualisation pretty interesting - https://vectree.io/c/chain-of-thought-reasoning-how-llms-thi...
What if we started talking to LLMs in non-human-readable languages (programming languages are, after all, still human-readable)? Have a tiny model run locally that translates human input (code, files, etc.) into some LLM-understandable language; the big LLM takes that as input, skips a bunch of input/output layers, and returns its answer in the same non-human-readable language; the local model then translates it back into human language or code changes.
Yesterday or two days ago there was a post about using Apple Foundation Models; they have a really tiny context window, but I think they could be used as this translation layer (human -> LLM, LLM -> human) to talk to big models. Initially the LLMs would need to discover which "language" they want to speak, which feels doable with reinforcement learning. So: a cheap local LLM to talk to a big remote LLM.
Either this is done already, or it's a super fun project to do.
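The proposed pipeline is easy to sketch end-to-end. Everything here is hypothetical stand-ins (the abbreviation table plays the role of a learned dense code, and `remote_llm` is a mock), but the shape of the round trip is the point:

```python
# Toy sketch of the pipeline: local encode -> remote model -> local decode.
# A trivial abbreviation table stands in for a learned non-human-readable code.

def local_encode(human_text: str) -> str:
    """Stand-in for a tiny local model compressing human text."""
    table = {"please": "pls", "function": "fn", "refactor": "rf"}
    return " ".join(table.get(w, w) for w in human_text.lower().split())

def remote_llm(dense_prompt: str) -> str:
    """Stand-in for the big model that speaks only the dense code."""
    return dense_prompt + " -> ok"

def local_decode(dense_reply: str) -> str:
    """Expand the dense reply back into human-readable text."""
    table = {"pls": "please", "fn": "function", "rf": "refactor", "ok": "done"}
    return " ".join(table.get(w, w) for w in dense_reply.split())

reply = local_decode(remote_llm(local_encode("please refactor this function")))
print(reply)  # -> "please refactor this function -> done"
```

In the real version the encode/decode tables would be learned jointly by the two models rather than written by hand.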
But I think you're onto something; human languages just aren't optimal here. To actually see this product through to completion, though, you'd probably need $60 to 100 million. You would have to invent a completely new language and also invent new training methods on top of it.
I'm down if someone wants to raise a VC round.
I don't think humans should be involved in developing this AI-AI language beyond giving some guidance. Let two agents collaborate to invent the language, and just reward/punish them with RL methods.
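The "let two agents invent the language" idea is essentially a referential game from the emergent-communication literature. A minimal tabular sketch (all names and numbers made up; real work would use neural speakers/listeners):

```python
import random

random.seed(0)
# Toy referential game: a speaker picks a symbol for a concept, a listener
# guesses the concept, and both get reward only when communication succeeds.
# No human-designed vocabulary is involved.
CONCEPTS = ["cat", "dog", "car"]
SYMBOLS = ["A", "B", "C"]

# Tabular "policies": preference scores updated by reward.
speak = {c: {s: 0.0 for s in SYMBOLS} for c in CONCEPTS}
listen = {s: {c: 0.0 for c in CONCEPTS} for s in SYMBOLS}

def sample(prefs):
    # Epsilon-greedy choice over preference scores.
    if random.random() < 0.1:
        return random.choice(list(prefs))
    return max(prefs, key=prefs.get)

for _ in range(2000):
    concept = random.choice(CONCEPTS)
    symbol = sample(speak[concept])
    guess = sample(listen[symbol])
    reward = 1.0 if guess == concept else -0.1
    speak[concept][symbol] += reward
    listen[symbol][guess] += reward

# With luck, each concept converges to its own symbol.
protocol = {c: max(speak[c], key=speak[c].get) for c in CONCEPTS}
print(protocol)
```

Scaling this from three symbols to a full compressed language is exactly where the invent-new-training-methods money would go.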
OpenAI, looking at you: got an email some days ago saying "you're not using the OpenAI API that much recently, what changed?"
I imagine it's possible, it's just a matter of money.
It sort of reminds me of when Palm Pilots (circa late '90s/early 2000s) used shorthand gestures (Graffiti) for stylus-written characters. For a short while people's handwriting on whiteboards looked really bizarre. Except now we're talking about using weird language to conserve AI tokens.
Maybe it's better to accept a higher token burn rate until things get better? I'd rather not have to get used to AI jive-talk to get stuff done.
Maybe we could have a smaller LLM just for translating caveman back into redditor?
Now I full caveman.