Every tool is just a tool. No tool is a solution. Until and unless we hit AGI, only the human brain is that.
Instead we're stuck talking about if the lie machine can fucking code. God.
I've been allowing LLMs to do more "background" work for me. It gives me room to experiment with stuff: I can kick something off, come back in 10-15 minutes, and see what it's done.
The key thing I've come to is that the task HAS to be fairly limited. Giving it something big like refactoring a code base won't work. Giving it an example can help dramatically. And if you haven't "trained" it by giving it context or adding your CLAUDE.md file, you'll end up finding it doing things you don't want it to do.
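To make that concrete, here's a rough sketch of the kind of guardrails I mean in a CLAUDE.md. Everything in it is made up for illustration (the package names, commands, and rules aren't from a real project); the point is just pairing context with explicit boundaries:

```markdown
# CLAUDE.md (illustrative sketch; project names and commands are made up)

## Project context
- TypeScript monorepo; packages live under packages/*.
- Run `npm test` from the repo root before considering a task done.

## Conventions
- Prefer small, focused diffs; do not reformat files you did not touch.
- Follow the existing error-handling style of the package you are editing.

## Boundaries
- Do not modify anything under packages/legacy/ without asking first.
- Do not add new dependencies; mention the need in your summary instead.
```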
Another great task I've been giving it while I'm working on other things is generating docs for existing features and modules. It is surprisingly good at following events through the code to see where they go, and at generating diagrams and the like.
But it's also not crazy to think that, with LLMs getting smarter (and considerable resources put into making them better at coding), future versions would clean up and refactor code written by past versions. Correct?
And I don't really see any reason to declare we've hit the limit of what can be done with those kinds of techniques.
But, fundamentally, LLMs lack a theory of the program in the sense intended in this comment: https://news.ycombinator.com/item?id=44443109#44444904 . Hence, they can never reach the promised land being talked about - unless there are innovations beyond next-token prediction.
In other words, it would be wrong of me to assume that the only way I can think of to solve a problem is the only way to do it.
Maybe quite a few pounds, if the cure in question hasn't been invented yet and may turn out to be vaporware.
The chatbot portion of the software is useless.
Chat mode on the other hand follows my rules really well.
I mostly use o3 - it seems to be the only model that has "common sense", in my experience.
It's really powerful to see different options laid out, especially ones based on your own codebase.
> I wouldn't give them a big feature again. I'll do very small things like refactoring or a very small-scoped feature.
That really resonates with me. Anything larger often ends badly, and I can feel the "tech debt" building in my head with each minute Copilot is running. I do like the feeling, though, when you already understand a problem, write a detailed prompt to nudge the AI in the right direction, and it executes just like you wanted. After all, problem solving is why I’m here, and writing code is just the vehicle for it.
Somehow, even if I take the best models and agents, most hard coding benchmarks are still below 50%, and even SWE-bench Verified is at maybe 75-80%. Not 95. Assuming agents just solve most problems is incorrect, despite them being really good at first prototypes.
Also, in my experience, agents are great up to a point and then fall off a cliff. Not gradually. The types of errors you get past that point are so diverse, one cannot even explain them.
I take the message, provide the surrounding code, and it gives me a few approaches to solve it. More than half the time, the resolution is there and I can copy the relevant bit verbatim. (The other times it's garbage, but at least I can see that this is going to require some AI: Actual Intelligence.)