I've also enjoyed playing with https://huggingface.co/HuggingFaceTB/nanowhale-100m-base (though it's early days for me in understanding this space).
But I found its tool calling more reliable than that of the other OSS models I've tried. I assume that's attributable to the interleaved thinking; its reasoning effort adjusts automatically to the query. I enjoy reading the reasoning traces from open models, since you can't see them from proprietary ones.
I would love to try DS4 so badly, but I don't have a machine for it, so I'll just stick to OpenRouter. I hope that in three years I can run a competitive OSS model on a 32GB machine.
You could try DS4 on that machine anyway and see how gracefully it degrades (assuming it runs at all and doesn't just OOM immediately). Experimenting with 36GB/48GB/64GB would also be interesting; at those sizes you might be able to claw some compute throughput back by batching multiple sessions together (though obviously at the expense of speed for any single session).
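If you want to see that trade-off concretely, here's a toy sketch using plain HF transformers (not a real serving stack, and I'm just reusing the small model from upthread as a stand-in; swap in whatever you actually run):

```python
# Toy illustration of batching: total throughput vs. per-session speed.
# Assumption: the nanowhale checkpoint from upthread stands in for a real
# local model; any small causal LM on the Hub works the same way.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "HuggingFaceTB/nanowhale-100m-base"  # model mentioned upthread
tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
tok.pad_token = tok.pad_token or tok.eos_token
tok.padding_side = "left"  # left-pad so all sessions decode in lockstep

def total_tokens_per_sec(prompts, new_tokens=64):
    batch = tok(prompts, return_tensors="pt", padding=True)
    start = time.perf_counter()
    with torch.no_grad():
        model.generate(**batch, max_new_tokens=new_tokens,
                       min_new_tokens=new_tokens,  # fixed length for a fair count
                       do_sample=False, pad_token_id=tok.pad_token_id)
    return len(prompts) * new_tokens / (time.perf_counter() - start)

solo = total_tokens_per_sec(["Explain KV caches."])
batched = total_tokens_per_sec(["Explain KV caches."] * 8)
print(f"1 session:  {solo:6.1f} tok/s total")
print(f"8 sessions: {batched:6.1f} tok/s total, but each one decodes slower")
```

On most hardware the batched run wins on total tokens/sec, because decoding is memory-bandwidth-bound and the weight reads get amortized across sessions, which is exactly why any single session gets slower.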
FYI, this to me points to an inference bug, bad sampling, or a non-native quant. OpenRouter is known to route requests to absolutely terrible, borked implementations. A model like DeepSeek V4 Flash shouldn't be making syntax errors like this.
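If you want to rule the routing out before blaming the model, you can pin the provider and quantization instead of letting OpenRouter auto-route. Rough sketch below; the model slug and provider name are my guesses, but the `provider` routing fields (`order`, `allow_fallbacks`, `quantizations`) are documented OpenRouter request options:

```python
# Sketch: pin OpenRouter to one provider so a borked host can't sneak in.
# "deepseek/deepseek-v4-flash" and the provider name are assumptions.
import os
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "deepseek/deepseek-v4-flash",   # assumed slug
        "messages": [{"role": "user", "content": "write a fizzbuzz in Rust"}],
        "provider": {
            "order": ["deepseek"],      # try the first-party host first
            "allow_fallbacks": False,   # fail loudly instead of silently rerouting
            "quantizations": ["fp8"],   # skip hosts running heavier quants
        },
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```

If the syntax errors disappear once you pin the provider, that's your answer.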
It's so hard to predict what size the open-weight models will be, even in six months' time. Will a 96GB machine turn out to be a complete waste of money? Who knows.
> Starting from MacBooks with 96GB of RAM.
... oh. And I thought I bought a lot with 48 GB.
Here's one of the top hits: https://forums.developer.nvidia.com/t/fully-custom-cuda-nati...
Bizarre comment; sounds like "How do you know Porsches are fast? Did you drive one?"
I just find it really funny that people are willing to write things like "empirically speaking, X is obvious" without actually testing it themselves.
I've seen mixed reviews, and the most honest sounding ones have said it has latency issues.
I don't really care that much what the average LLM power user says at this point; they're impressed by anything an LLM does. They're like toddlers entertained by the sound their Velcro shoes make.
You LLM people are going to be like my mom: once she got a maps app she completely gave up on navigating anywhere with her own brain, and now she's lost without her phone.
Except for you LLM people, it's going to be reading, writing, problem-solving, and thinking in general. You'll be completely reliant on an LLM to get anything done. Have fun with that. You're cooked, bro.
"You LLM people". Has it occurred to you that individuals have variation within groups?
Apologies. Where did I form my opinions?