Work? I don't want it local at all. I want it all on a cloud agent.
The moment we see standardized, batteries-included pathways to integrate search (ideally at no additional cost) into things like LM Studio, combined with better tool calling in local models, you'll quickly see local model performance catch up.
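Rough sketch of what that "batteries-included" wiring could look like: local servers such as LM Studio expose an OpenAI-compatible tool-calling API, so a search tool is just a function schema plus a dispatcher on the client side. Everything here is hypothetical; `web_search` is a stub standing in for a real search backend.

```python
import json

# Hypothetical tool definition in the OpenAI-style function-calling schema,
# which OpenAI-compatible local servers (e.g. LM Studio) accept.
SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web and return top result snippets.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
            },
            "required": ["query"],
        },
    },
}

def web_search(query: str) -> str:
    # Stub backend; a real integration would call an actual search API here.
    return f"results for: {query}"

def dispatch_tool_call(tool_call: dict) -> str:
    """Route a model-emitted tool call to the matching local function."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    if name == "web_search":
        return web_search(**args)
    raise ValueError(f"unknown tool: {name}")

# Example tool call, shaped like an assistant message's tool_calls entry:
call = {"function": {"name": "web_search",
                     "arguments": json.dumps({"query": "local LLM benchmarks"})}}
print(dispatch_tool_call(call))  # results for: local LLM benchmarks
```

The point is that none of this is hard individually; it just isn't standardized or bundled with the local runtimes yet.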
If you are simply measuring watts per token, you are missing the mark drastically. You have to measure quality of output per watt.
This sounds reasonably difficult to benchmark, though; maybe I'm wrong.
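The metric itself is trivial once you pick a quality score; the hard part is agreeing on the score and measuring energy honestly. A minimal sketch, with made-up numbers (the eval scores and watt-hour figures below are purely illustrative assumptions, not real measurements):

```python
def quality_per_watt_hour(quality_score: float, energy_wh: float) -> float:
    """Benchmark-quality points earned per watt-hour of energy consumed."""
    if energy_wh <= 0:
        raise ValueError("energy must be positive")
    return quality_score / energy_wh

# Hypothetical comparison: a small local model scoring 62.0 on some eval
# suite while the run drew 15 Wh, versus a larger model scoring 80.0 at
# 120 Wh. The bigger model wins on raw quality but loses on quality/Wh.
local = quality_per_watt_hour(62.0, 15.0)
big = quality_per_watt_hour(80.0, 120.0)
print(local > big)  # True
```

The real benchmarking difficulty is upstream of this arithmetic: defining a quality score that transfers across tasks, and attributing wall-socket energy to a single inference run.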
If we could even get something like GPT 5.5 running locally, that would be quite useful.