Scaling LLMs to Larger Codebases

Posted by kierangill 12/22/2025

Scaling LLMs to Larger Codebases(blog.kierangill.xyz)

307 points | 119 commentspage 3

throw-12-16 12/23/2025|

There is no way this is economical.

Burn through your token limit in agent mode just to thrash around a few more times trying to identify where the agent "misunderstood" the prompt.

The only time LLM's work as coding agents for me is tightly scoped prompts with a small isolated context.

Just throwing an entire codebase into an LLM in an agentic loop seems like a fools errand.

smallerize 12/22/2025||

This highlights a missing feature of LLM tooling, which is asking questions of the user. I've been experimenting with Gemini in VS Code, and it just fills in missing information by guessing and then runs off writing paragraphs of design and a bunch of code changes that could have been avoided by asking for clarification at the beginning.

skolos 12/22/2025||

Claude code regularly asks me questions - I like how anthropic implemented this

rockbruno 12/22/2025|||

Yeah I experienced this yesterday and it was really cool. It really only happened once though.

hobofan 12/22/2025|||

So does Cursor in the Plan mode.

tharkun__ 12/22/2025|||

So like most junior to mid level devs ;)

Claude does have this specific interface for asking questions now. I've only had it choose to ask me questions on its own a very few times though. But I did have it ask clarifying questions before that interface was even a thing, when I specifically asked it to ask me clarifying questions.

Again, like a junior dev. And like a junior dev, it can also help to ask it to ask / check what its doing "mid-way", i.e. watch what it's doing and stop it, when it's running down some rabbit hole you know is not gonna yield results.

CPLX 12/22/2025|||

You'd have to make it do that. Here's a cut and paste I keep open on my desktop, I just paste it back in every time things seem to drift:

> Before you proceed, read the local and global Claude.md files and make sure you understand how we work together. Make sure you never proceed beyond your own understanding.

> Always consult the user anytime you reach a judgment call rather than just proceeding. Anytime you encounter unexpected behavior or errors, always pause and consider the situation. Rather than going in circles, ask the user for help; they are always there and available.

> And always work from understanding; never make assumptions or guess. Never come up with field names, method names, or framework ideas without just going and doing the research. Always look at the code first, search online for documentation, and find the answer to things. Never skip that step and guess when you do not know the answer for certain.

And then the Claude.md file has a much more clearly written out explanation of how we work together and how it's a consultative process where every major judgment call should be prompted to the user, and every single completed task should be tested and also asked for user confirmation that it's doing what it's supposed to do. It tends to work pretty well so far.

pteetor 12/22/2025|||

For complicated prompts, I always add this:

"Before you start, please ask me any questions you have about this so I can give you more context. Be extremely comprehensive."

(I got the idea from a Medium article[1].) The LLM will, indeed, stop and ask good questions. It often notices what I've overlooked. Works very well for me!

[1] https://medium.com/@jordan_gibbs/the-most-important-chatgpt-...

zvorygin 12/22/2025||

Append “First ask clarifying questions” to your prompt.

Ayanonymous 12/23/2025||

I'm still learning about how LLMs can be used in coding, but this article helped me understand the importance of giving clear instructions and not relying too much on automation. The point about developers still needing to guide the model really makes sense. Thanks for sharing this!

tschellenbach 12/22/2025||

I wrote this forever ago in AI terms :) https://getstream.io/blog/cursor-ai-large-projects/

But the summary here is that with the right guidance, AI currently crushes it on large codebases.

avree 12/22/2025||

Why do none of these ever touch on token optimization? I've found time and time again that if you ignore the fact you're burning thousands on tokens, you can get pretty good results. Things like prompt libraries and context.md files tend to just burn more tokens per call.

Simplita 12/23/2025||

One thing that helped us as codebases grew was separating decision-making from execution. Let the model reason about intent and scope, but keep execution deterministic and constrained. It reduced drift and made failures much easier to debug once context got large.

jukkat 12/23/2025||

Put every detail into CLAUDE.md and after a while CC starts to forget/ignore what it’s been told.

I’d like to see dynamic task-specific context building. Write a prompt and the model starts to collect relevant instructions.

Also a review loop to check that instructions were followed.

eddywebs 12/23/2025||

How about adding MCP support to large code bases to provide RAG based context to LLMs, ive been playing with this idea with some good results.

uoaei 12/22/2025||

What is the current state of LCMs (large code models)? I.e. models that operate on the AST and not on text tokens.

rootnod3 12/22/2025|

Or why you shouldn't....