Posted by rbanffy 1 day ago
Do you have resources you can point to / mind sharing your setup? What were the biggest problems / delights doing this?
Coding is mostly "agentic" now, so I'm a bit puzzled.
I feel that if you want to build a coding agent / harness, the first thing you should do is build an evaluation framework to track coding performance, with your own internal metrics and task benchmarks. Instead, I see most coding agents just fiddling with adding features that don't improve the core ability of the agent.
I considered creating a PR for that, but found that creating new agents instead worked fine for me.
Now, I only started looking into OpenCode yesterday, but it seems you can override the system prompts by overloading the templates, e.g. `~/.opencode/agents/build.md`, which would then be used instead of the default "Build" system prompt.
At least that's what I gathered from skimming the docs earlier; it might not actually work in practice, or might not override all of it, but that seems to be how it works.
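For illustration, such an override file might look something like this. The frontmatter field here is my guess from skimming the docs; the real schema (and whether frontmatter is supported at all) may differ:

```markdown
---
description: Custom build agent
---
You are a build agent. Prefer minimal diffs, run the test suite after
every change, and never install new dependencies without asking.
```

If it works as described, the body of this file replaces the default "Build" system prompt wholesale.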
The changes I've made locally are:
- Added a discuss mode with almost no tools except read file, the ask tool, and heuristics-based web search only, plus the ability to switch from discuss to plan mode.
Experiments:
- hashline: it doesn't bring that much benefit over the default with gpt-5.4.
- tried scribe [0]: it seems worth it since it saves context space, but in worst-case scenarios it fails by reading the whole file. Probably worth it, but I'd need to experiment more with it and probably rewrite some parts.
The nice thing about OpenCode is that it uses SQLite, so you can run experiments and then go through past conversations in code, replay them, and compare.
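As a minimal sketch of that replay-and-compare workflow: the table and column names below (`messages`, `session_id`, `role`, `content`) are assumptions for illustration, not OpenCode's actual schema; inspect the real database with `.schema` in the `sqlite3` shell first.

```python
import sqlite3

# Toy stand-in for OpenCode's database. The schema here is assumed,
# not taken from OpenCode -- check the real file before querying it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (session_id TEXT, role TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        ("s1", "user", "fix the failing test"),
        ("s1", "assistant", "reading test file..."),
        ("s2", "user", "add a discuss mode"),
    ],
)

# Replay one past conversation: pull its messages back out in order.
rows = conn.execute(
    "SELECT role, content FROM messages WHERE session_id = ?", ("s1",)
).fetchall()
for role, content in rows:
    print(f"{role}: {content}")
```

The same pattern extends to comparisons, e.g. grouping by session and diffing tool-call counts across two experiment runs.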
I'm actually moving to containerised isolation. I realised the agents waste too much time trying to correctly install dependencies, not unlike a normal NixOS user.
> we see occasional complaints about memory issues in opencode
> if you have this can you press ctrl+p and then "Write heap snapshot"
> Upload here: https://romulus.warg-snake.ts.net/upload

Original post: https://x.com/i/status/2035333823173447885
There are probably IDE plugins that feed prompts or context in based on your interaction with the editor.
I'm guessing that a model which only covers a single language might be more compact and efficient vs a model trained across many languages and non-programming data.
Give it a look, maybe it could inspire you: https://github.com/fulgidus/zignet
Bottom line: fine-tuning looks like the best option atm.
Now I’m using that to generate synthetic sets and clean them up, but man, I’m struggling hah. Fun though.
If you want it to stick to better practices you have to write skills, provide references (example code it can read), and provide it with harnessing tools (linters, debuggers, etc) so the agent can iterate on its own output.
I used Claude with a paid subscription, and Codex as well, and settled on OpenCode with free models.