Posted by tosh 2 days ago

Caveman: Why use many token when few token do trick (github.com)
871 points | 358 comments
VadimPR 2 days ago|
Wouldn't this negatively affect output quality?

Thanks to chain of thought, having the LLM be explicit in its output actually improves quality.

functional_dev 1 day ago|
Chain of thought happens in the <think> tags, not the visible output.

Caveman only strips filler from what you see... the reasoning depth stays the same.

I found this visualisation pretty interesting - https://vectree.io/c/chain-of-thought-reasoning-how-llms-thi...
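The filler-stripping idea is easy to sketch. The toy below is only an illustration of the concept, not the actual Caveman code; the phrase list and the `strip_filler` helper are invented for this example:

```python
import re

# Hypothetical filler phrases; the real Caveman rules are different.
FILLER_PATTERNS = [
    r"certainly[,!]?\s*",
    r"it'?s worth noting that\s*",
    r"i'?d be happy to\s+",
    r"in conclusion,?\s*",
]

def strip_filler(text: str) -> str:
    """Remove filler phrases, then collapse leftover whitespace."""
    for pattern in FILLER_PATTERNS:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).strip()

verbose = "Certainly! It's worth noting that the cache is stale."
print(strip_filler(verbose))  # the cache is stale.
```

Note that this only touches surface phrasing: the content words survive, which is why (per the comment above) reasoning depth inside the <think> tags is unaffected.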

alfanick 1 day ago||
Either this already exists, or someone is going to implement it (should I implement it?):

- Assumption: an LLM can input/output in any useful language.
- Human languages are not exactly an optimal way to talk with an LLM.
- Internally, LLMs keep knowledge as a whole bunch of weighted connections across multiple layers.
- They need to decode human-language input into tokens, then into something that is easy to digest by further layers, then get some output and translate it back into tokens and human language (or a programming language, same thing).
- This whole human language <-> tokens <-> input <-> LLM <-> output <-> tokens <-> language round trip is quite expensive.

What if we started talking to LLMs in non-human-readable languages (programming languages are still human-readable)? Have a tiny model run locally that translates human input, code, files, etc. into some LLM-understandable language; the big LLM gets this as input, skips a bunch of layers on input/output, and returns the same non-human-readable language; the local LLM translates it back into human language or code changes.

Yesterday or two days ago there was a post about using Apple's Foundation Models; they have a really tiny context window. But I think they could be used as this translation layer (human->LLM, LLM->human) to talk with big models. Initially those LLMs would need to discover which "language" they want to talk in, which feels doable with reinforcement learning. So: a cheap local LLM to talk to a big remote LLM.

Either this is done already, or it's a super fun project to do.
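The round trip described above is easy to mock without any models. In the sketch below a fixed codebook stands in for the learned AI-AI language; everything here (the codebook, `encode`, `decode`) is invented for illustration, and a real version would replace the codebook lookups with a small local model:

```python
# Toy stand-in for the proposed local-translator architecture.
# A fixed codebook plays the role of the learned AI-AI language,
# just to show the lossless round trip and the token savings.

CODEBOOK = {
    "please refactor the following function": "RF",
    "add unit tests for": "UT",
    "explain step by step": "EX",
}
REVERSE = {code: phrase for phrase, code in CODEBOOK.items()}

def encode(human: str) -> str:
    """Local 'frontend': human language -> compact wire language."""
    out = human.lower()
    for phrase, code in CODEBOOK.items():
        out = out.replace(phrase, code)
    return out

def decode(wire: str) -> str:
    """Local 'frontend' again: compact wire language -> human language."""
    out = wire
    for code, phrase in REVERSE.items():
        out = out.replace(code, phrase)
    return out

msg = "please refactor the following function and add unit tests for it"
wire = encode(msg)          # "RF and UT it"
assert decode(wire) == msg  # lossless round trip
print(len(msg.split()), "->", len(wire.split()), "words")
```

The interesting (and hard) part the sketch skips is exactly what the comment proposes: letting two agents discover the codebook themselves via RL instead of hand-writing it.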

999900000999 1 day ago|
My theory was that someone should design a specific LLM language and then spend a whole lot of money training models on it. Other commenters here have pointed out a few times that this would be really difficult.

But I think you're onto something: human languages just aren't optimal here. To actually see this product through to completion, though, you'd probably need 60 to 100 million. You would have to invent a completely new language and also invent new training methods on top of it.

I'm down if someone wants to raise a VC round.

alfanick 1 day ago||
I'm currently downloading Ollama and going to write a simple proof of concept with Qwen as the local "frontend" talking to OpenAI GPT as the "backend". I think the idea is sound, but it indeed needs retraining of GPT (hmm, like training a tiny local LLM in sync with a big remote LLM). It might not be a bad business venture in the end.

I don't think humans should be involved in developing this AI-AI language beyond giving some guidance; let two agents collaborate to invent the language, and reward/punish them with RL methods.

OpenAI, looking at you: got an email some days ago, "you're not using the OpenAI API that much recently, what changed?"

999900000999 1 day ago||
If you want to start a Git repo somewhere let me know and I'll do what I can to help.

I imagine it's possible; it's just a matter of money.

ajd555 1 day ago||
So, if this does help reduce the cost of tokens, why not go even further and shorten the syntax with specific keywords, symbols and patterns, to reduce the noise and only keep information, almost like...a programming language?

dr_kiszonka 1 day ago||
I appreciate the effort you put into addressing the feedback and updating the readme. I think the web design of your page and visual distractions in the readme go against the caveman's no-fluff spirit and may not appeal to the folks that would otherwise be into your software. I like the software.

crispyambulance 1 day ago||
I no like.

It sort of reminds me of when Palm Pilots (circa late '90s, early 2000s) used shorthand gestures for stylus-written characters. For a short while, people's handwriting on whiteboards looked really bizarre. Except now we're talking about using weird language to conserve AI tokens.

Maybe it's better to accept a higher token burn rate until things get better? I'd rather not get used to AI jive-talk to get stuff done.

chmod775 1 day ago||
I cannot wait for this to become the normal and expected way to interact with LLMs in the coming decades as humanity reaches the limits of compute capacity. Why waste three quarters of it?

Maybe we could have a smaller LLM just for translating caveman back into redditor?

benjaminoakes 1 day ago|
I was already part caveman in my messages to the LLM.

Now I full caveman.

anigbrowl 1 day ago||
Nothing against this project, but it's been the case since forever that you could get better-quality responses by simply telling your LLM to be brief and to the point, to ask salient questions rather than reflexively affirm, and to eschew clichés and faddish writing styles.

goldenarm 1 day ago||
That's a great idea, but has anyone benchmarked the performance difference?

norskeld 2 days ago|
APL for talking to LLMs when? Also, this reminded me of that episode of The Office where Kevin starts talking like a caveman to make communication more efficient.