OpenCode – Open source AI coding agent

Posted by rbanffy 22 hours ago

OpenCode – Open source AI coding agent(opencode.ai)

1146 points | 562 comments

logicprog 20 hours ago|

OpenCode was the first open source agent I used, and my main workhorse after experimenting briefly with Claude Code and realizing the potential of agentic coding. Due to that, and because it's a popular an open source alternative, I want to be able to recommend it and be enthusiastic about it. The problem for me is that the development practices of the people that are working on it are suboptimal at best; they're constantly releasing at an extremely high cadence, where they don't even spend the time to test or fix things (or even build a proper list of changes for each release), and they add, remove, refine, change, fix, and break features constantly at that accelerated pace.

More than that, it's an extremely large and complex TypeScript code base — probably larger and more complex than it needs to be — and (partly as a result) it's fairly resource inefficient (often uses 1GB of RAM or more. For a TUI).

On top of that, at least I personally find the TUI to be overbearing and a little bit buggy, and the agent to be so full of features that I don't really need — also mildly buggy — that it sort of becomes hard to use and remember how everything is supposed to work and interact.

jmmv 10 hours ago||

> and (partly as a result) it's fairly resource inefficient (often uses 1GB of RAM or more. For a TUI).

That's (one of the reasons) why I'm favoring Codex over Claude Code.

Claude Code is an... Electron app (for a TUI? WTH?) and Codex is Rust. The difference is tangible: the former feels sluggish and does some odd redrawing when the terminal size changes, while the latter definitely feels more snappy to me (leaving aside that GPT's responses also seem more concise). At some point, I had both chewing concurrently on the same machine and same project, and Claude Code was using multiple GBs of RAM and 100% CPU whereas Codex was happy with 80 MB and 6%.

Performance _is_ a feature and I'm afraid the amounts of code AI produces without supervision lead to an amount of bloat we haven't seen before...

ctmnt 8 hours ago|||

I think you’re confusing capital c Claude Code, the desktop Electron app, and lowercase c `claude`, the command line tool with an interactive TUI. They’re both TypeScript under the hood, but the latter is React + Ink rendered into the terminal.

The redraw glitches you’re referring to are actually signs of what I consider to be a pretty major feature, a reason to use `claude` instead of `codex` or `opencode`: `claude` doesn’t use the alternate screen, whereas the other two do. Meaning that it uses the standard screen buffer, meaning that your chat history is in the terminal (or multiplexer) scrollback. I much prefer that, and I totally get why they’ve put so much effort into getting it to work well.

In that context handling SIGWINCH has some issues and trickiness. Well worth the tradeoff, imo.

conradev 1 hour ago|||

Codex is using its app server protocol to build a nice client/server separation that I enjoy on top of the predictable Rust performance.

You can run a codex instance on machine A and connect the TUI to it from machine B. The same open source core and protocol is shared between the Codex app, VS Code and Xcode.

jitl 44 minutes ago||

OpenCode works this way too

jimmydoe 7 hours ago|||

not sure if same reason but window resize feels better in claude than codex.

on my m1, claude is noticeably slower when starting, but it feels ok after that.

petcat 8 hours ago||||

Anthropic needs to spend some tokens rewriting Claude Code in Rust (yes, really).

The difference in feel between Codex and Claude Code is obvious.

The whole thing is vibed anyway, I'm sure they could get it done in a week or two for their quality standards.

MithrilTuxedo 1 hour ago|||

Java (incl. Scala, Closure, Groovy, Jython, etc.) is better suited to running as a server. Let agents write clean readable code and leave performance concerns to the JIT compiler. If you really want you can let agents rewrite components at runtime without losing context.

Erlang would offer similar benefits, because what we're doing with these things is more message passing than processing.

Rust is what I'd want agents writing for edge devices, things I don't want to have to monitor. Granted, our devices are edge devices to Anthropic, but they're more tightly coupled to their services.

seunosewa 8 hours ago||||

I'd suggest Go ahead of Rust. It's more accessible to contributors.

jeremyjh 5 hours ago|||

I think Go might be a better choice but not for that reason at all.

Go could implement something like this with no dependencies outside the standard library. It would make sense to take on a few, but a comparable Rust project would have at least several dozens.

Also, Go can deliver a single binary that works on every Linux distribution right out of the box. In Rust, its possible but you have to static compile with muslc and that is a far less well-trodden path with some significant differences to the glibc that most Rust libraries have been tested with.

whirlwin 1 hour ago|||

Most of, if not every commit of claude code is now written by claude code itself without any human writing code, only promoting.

indigodaddy 4 hours ago|||

Because of all these obvious Go benefits, wonder why they are instead always doing these tools in typescript? Must be some reason?

phaedrix 3 hours ago|||

Because is all the current generation of devs know unfortunately.

jeremyjh 4 hours ago||||

Most developers find it more pleasant.

robutsume 3 hours ago|||

[dead]

t1amat 7 hours ago||||

Claude Code is closed source so this isn’t a concern they should have as Opus is great at Rust.

thayne 2 hours ago||||

I think go will make it easier for more developers to contribute, bit rust would probably attract higher quality contributions.

sam0x17 2 hours ago||||

If anything, the stricter the compiler the better for vibe coding the language

petcat 8 hours ago||||

> It's more accessible to contributors.

What would make go more "accessible to contributors" than Rust?

verandaguy 3 hours ago|||

My personal opinion is that I like Rust much more than Go, but I can’t deny that Rust is a big, and more dauntingly to newcomers, pretty unopinionated language compared to Go.

There are more syntax features, more and more complex semantics, and while rustc and clippy do a great job of explaining like 90% of errors, the remaining 10% suuuuuck.

There’s also some choices imposed by the build system (like cargo allowing multiple versions of the same dep in a workspace) and by the macro system (axum has some unintuitive extractor ordering needs that you won’t find unless you know to look for them), and those things and the hurdles they present become intuitive after a time but just while getting started? Oof

yoz-y 7 hours ago||||

Go is a language one can learn and become functional in an afternoon. Rust is way more involved.

_flux 4 hours ago||||

Frankly I don't think one even needs to learn it, if you know a bunch of other languages and the codebase is good. I was able to just make a useful change to an open source project by just doing it, without having written any lines of Go before. Granted the MR needed some revisions.

Rust is my favorite, though. There are values beyond ease of contribution. I can't replicate the experience with a Rust project anymore, but I suspect it would have been tougher.

Yokohiii 7 hours ago|||

To vibe coders it doesn't matter, right?

kelipso 6 hours ago|||

Even then, you still need to read that code and Rust is way less read friendly than Go.

Yokohiii 6 hours ago||

I have the impression that most vibe coders don't read code. I guess they would probably use something accessible to them, just in case.

seunosewa 1 hour ago||

Successful vibe coders read code.

olmo23 4 hours ago|||

If you already know some php python javascript and/or c, you can pretty much just wing with Claude code.

ksh09 7 hours ago|||

Mature tui packages like bubbletea, lipgloss. Besides TS resemblance to go could push the movement of rewrite, not necessarily easier though.

CC isn't foss in the first place, so the previous comment falls short.

jeremyjh 5 hours ago||

> TS resemblance to go

This is the second time I've seen claims like this in the last 24 hours and I'm afraid I might have lost contact with reality.

baq 8 hours ago||||

agents don't really care and they're doing anywhere between 90-100% of the work on CC. if anything, rust is better as it has more built-in verification out of the box.

echelon 3 hours ago|||

This is a terrible suggestion.

Rust is accessible to everyone now that Claude Code and Opus can emit it at a high proficiency level.

Rust is designed so the error handling is ergonomic and fits into the flow of the language and the type system. Rust code will be lower defect rate by default.

Plus it's faster and doesn't have a GC.

You can use Rust now even if you don't know the language. It's the best way to start learning Rust.

The learning curve is not as bad as people say. It's really gentle.

Rust is the best AI language. Bar none.

leonardcser 6 hours ago||||

already done, this is what I use now: https://github.com/leonardcser/agent

phillipcarter 3 hours ago|||

Claude Code is a Rust app now.

doug_durham 5 hours ago||||

I run many instances of Claude Code simultaneously and have not experienced what you are seeing. It sounds like you have a bias of Rust over Typescript.

jazzypants 5 hours ago||

No, they are describing a typical experience with the two apps. Just open both apps, run a few queries, and take a look at the difference in resource management yourself. It sounds like you have a bias of Claude Code over Codex.

Implicated 5 hours ago||

Uh, it sounds like you're having trouble understanding that people in this thread are talking about two wildly different "claude code" applications. Those who are claiming the resources issues don't apply to them are referring to the cli application, ie: `claude` and those are saying things like "Just open both apps..." are surely referring to their GUI versions.

jazzypants 4 hours ago|||

No, I've never used the GUI version. I literally just had to close and reopen the terminal running the Claude Code CLI on my Mac yesterday because it was taking too many resources. It generally happens when I ask Claude to use multiple sub agents. It's an obvious memory leak.

smugtrain 1 hour ago||||

On the 100% cpu issue, I’m curious to know, what is the processor and was it performing any other cpu intensive work?

RagnarD 9 hours ago||||

Totally agree. I'm baffled by those who don't clearly see that Codex works better than C.C. in many ways.

Aeolun 8 hours ago||

Codex being faster is not at all equivalent to working better. Claude Code does what I need from it most of the time.

trq_ 5 hours ago|||

Claude Code is not an electron app.

tmstieff 4 hours ago||

It does use React for rendering the terminal UI.

jdpigeon 4 hours ago||

Did not realize this. That's bizarre!

rbehrends 19 hours ago|||

I am more concerned about their, umm, gallant approach to security. Not only that OpenCode is permissive by default in what it is allowed to do, but that it apparently tries to pull its config from the web (provider-based URL) by default [1]. There is also this open GitHub issue [2], which I find quite concerning (worst case, it's an RCE vulnerability).

[1] https://opencode.ai/docs/config/#precedence-order

[2] https://github.com/anomalyco/opencode/issues/10939

heavyset_go 15 hours ago|||

It also sends all of your prompts to Grok's free tier by default, and the free tier trains on your submitted information, X AI can do whatever they want with that, including building ad profiles, etc.

You need to set an explicit "small model" in OpenCode to disable that.

integralid 15 hours ago|||

This. I work on projects that warrant a self hosted model to ensure nothing is leaked to the cloud. Imagine my surprise when I discovered that even though the only configured model is local, all my prompts are sent to the cloud to... generate a session title. Fortunately caught during testing phase.

DrewADesign 8 hours ago|||

I’m curious if there’s a reason you’re not just coding in a container without access to the internet, or some similar setup? If I was worried about things in my dev chain accessing any cloud service, I’d be worried about IDE plugins, libraries included in imports, etc. and probably not want internet access at all.

lukewarm707 8 hours ago||

[dead]

jmalicki 5 hours ago||||

Ok wow.

I mean the default model being Grok, whatever - that everyone sets to their favorite.

But the hidden use of a different model is wow.

signal_v1 5 hours ago||||

[dead]

lukewarm707 8 hours ago|||

[dead]

mrighele 7 hours ago||||

Documentation [1] says:

The small_model option configures a separate model for lightweight tasks like title generation. By default, OpenCode tries to use a cheaper model if one is available from your provider, otherwise it falls back to your main model.

I would expect that if you set a local model it would just use the same model. Or if for example you set GPT as main model, it would use something else from OpenAI. I see no mentions of Grok as default

[1] https://opencode.ai/docs/config/

lukewarm707 6 hours ago||

i ran it through mitmproxy, i am using pinned version 1.2.20, 6 march 2026, set up with local chat completions.

on that version, it does not fall back to the main model. it silently calls opencode zen and uses gpt-5-nano, which is listed as having 30 day retention plus openai policy, which is plain text human review by openai AND 3rd party contractors.

i see they removed the title model on v1.2.23.

i was so annoyed i made an account here today

vbernat 10 hours ago||||

From the code, this does not seem to be true anymore. It falls back to the current model if no small model is identified with the current provider. https://github.com/anomalyco/opencode/blob/9b805e1cc4ba4a984...

adam_mckenna 12 hours ago||||

It uses a model called "Big Pickle" by default which is an alias for minimax 2.5, as far as I've been able to tell.

indigodaddy 4 hours ago||||

Wait what, so are you saying if I am on some other model, it still sends my prompts to Grok??

rsanheim 11 hours ago||||

Wait what. For real? I knew their security posture was bad, but this bad??

gambiter 7 hours ago||

They're talking about before it's configured by the user. It defaults to 'free' models so that the user can ask a question immediately on startup. Once you configure a provider, the default models aren't used.

lukewarm707 6 hours ago|||

[dead]

ct520 18 hours ago||||

I second that.

Have fun on windows - automatic no from me. https://github.com/anomalyco/opencode/issues?q=is%3Aissue%20...

larschdk 11 hours ago|||

No surprise that a tool that can run shell scripts, open URLs, etc. is flagged down on Windows where AV try to detect such trojan methods.

foxygen 18 hours ago|||

Who cares about Windows?

Sebguer 17 hours ago|||

people who don't make OS preferences their entire personality

BoorishBears 18 hours ago||||

I do: they're important for ventilation in this heat wave.

UltraSane 15 hours ago|||

People who don't like messing around with drivers and like running Linux VMs on a Windows OS.

igravious 11 hours ago|||

Driver issues are way more of a thing on Windows than Linux or MacOS.

UltraSane 11 hours ago||

Getting hardware to work is MUCH harder on Linux

freehorse 7 hours ago||

Last years I have had more problem with hardware in windows than in linux. It is not so trivial anymore.

UltraSane 3 hours ago||

Please provide examples.

kolinko 11 hours ago||||

I think the parent meant vs MacOS, not vs Linux.

cracki 10 hours ago||

Users of MacOS rarely have an active dislike for Windows, nor are they likely to announce this.

freehorse 7 hours ago||

I use macos and I do actively dislike windows: here I announce it.

nmcfarl 6 hours ago|||

I liked the apple II, and the TRS 80 as I rather like basic. And then I didn’t hate DOS, and then I actively hated the graphical shell of Windows 3, but could not afford a Macintosh -so suffered through it where I had to, but mainly used DOS. Then I discovered UNIX, and did almost all of my work on a timeshare - in the early 90s!

Then Windows 95 came out and I actively hated it, but did think it was amazingly pretty - somehow this was the impetus for me to get a pc again, which I put Windows NT on. Which was profitable for freelance gigs in college. Soon after that, I dual booted it to Linux and spent most of my time in Slackware.

After that, I graduated and had enough money to buy a second rig, which I installed OS/2 warp on - which was good for side gigs. And I really liked. A lot. But my day job required that I have a Windows NT box to shell into the Solaris servers as we ran. Then I got a better class of employer and the next several let me run a Linux box to connect to our solaris (or Aix) servers.

Next my girlfriend at the time got a PowerBook G4 and installed OS X on it. It was obviously amazing. Windows XP came out, and it was once again so much worse than Windows NT - and crashed so much more - which was odd as it was based on Windows NT. (yes 98 was before this but it was really bad). Anyhow, right about here the Linux box I was running at home, died. And it was obvious that I was not going to buy an XP box, so I bought my first Mac.

And it’s been the same for the last 25 years - every time I look at a Windows box it’s horrible. I pretty much always have a Linux box headless somewhere in the house, and one rented in the cloud, and a Mac for interacting with the world.

And like the parent I actively dislike windows. And that’s interesting because I’ve liked most other operating systems I’ve used in my life, including MS-DOS. Modern windows is uniquely bad.

esafak 4 hours ago||

DOS was bad by UNIX standards too. Only Windows NT/2000 was decent.

UltraSane 3 hours ago|||

I use windows and absolutely hate the mac UI. Having the current window title bar always at the top of the screen doesn't make any sense when you have a very big monitor. It only made sense with the tiny monitors available when the mac UI was originally created.

shellwizard 12 hours ago|||

What? Drivers?

woctordho 18 hours ago||||

RCE is exactly the feature of coding agents. I'm happy with it that I don't need to launch OpenCode with --dangerously-skip every time.

mrln 10 hours ago||

No, it is still configurable. You can specify in your opencode.json config that it should be able to run everything. I think they just argued that it shouldn't be the default. Which I agree with.

rbehrends 51 minutes ago|||

No, the problem is that when logging in, the provider's website can provide an authentication shell command that OpenCode will send to the shell sight unseen, even if it is "rm -rf /home". This "feature" is completely unnecessary for the agent to function as an agent, or even for authentication. It's not about it being the default, it's about it being there at all and being designed that way.

indigodaddy 4 hours ago|||

And in the webui there is a don't ask button

TZubiri 17 hours ago||||

I assign a specific user for it, which doesn't have much access to my files. So what I want is complete autonomy.

jee599 12 hours ago||||

[dead]

iam_circuit 16 hours ago|||

[dead]

1dom 25 minutes ago|||

> Due to that, and because it's a popular an open source alternative, I want to be able to recommend it and be enthusiastic about it. The problem for me is that the development practices of the people that are working on it are suboptimal at best;

This is my experience with most AI tools that I spend more than a few weeks with. It's happening so often it's making me question my own judgement: "if everything smells of shit, check your own shoes." I left professional software engineering a couple of years ago, and I don't know how much of this is also just me losing touch with the profession, or being an old man moaning about how we used to do it better.

It reminds me of social media: there was a time where social media platforms were defined by their features, Vine was short video, snapchat was disappearing pictures, twitter was short status posts etc. but now they're all bloated messes that try do everything.

The same looks to be happening with AI and agent software. They start off as defined by one features, and then become messes trying to implement the latest AI approach (skills, or tools, or functions, or RAG, or AGENTS.md, or claws etc. etc.)

westoque 20 hours ago|||

> The problem for me is that the development practices of the people that are working on it are suboptimal at best; they're constantly releasing at an extremely high cadence, where they don't even spend the time to test or fix things (or even build a proper list of changes for each release), and they add, remove, refine, change, fix, and break features constantly at that accelerated pace.

this is what i notice with openclaw as well. there have been releases where they break production features. unfortunately this is what happens when code becomes a commidity, everyone thinks that shipping fast is the moat but at the expense of suboptimality since they know a fix can be implemented quickly on the next release.

siddboots 19 hours ago|||

Openclaw has 20k commits, almost 700k lines of code, and it is only four months old. I feel confident that that sort of code base would have a no coherent architecture at all, and also that no human has a good mental model of how the various subsystems interact.

I’m sure we’ll all learn a lot from these early days of agentic coding.

girvo 16 hours ago|||

> I’m sure we’ll all learn a lot from these early days of agentic coding.

So far what I am learning (from watching all of this) is that our constant claims that quality and security matter seem to not be true on average. Depressingly.

lelanthran 12 hours ago|||

> So far what I am learning (from watching all of this) is that our constant claims that quality and security matter seem to not be true on average.

Only for the non-pro users. After all, those users were happy to use excel to write the programs.

What we're seeing now is that more and more developers find they are happy with even less determinism than the Excel process.

Maybe they're right; maybe software doesn't need any coherence, stability, security or even correctness. Maybe the class of software they produce doesn't need those things.

I, unfortunately, am unable to adopt this view.

nunchiai 13 hours ago||||

I think what we're seeing is a phase transition. In the early days of any paradigm shift, velocity trumps stability because the market rewards first movers.

But as agents move from prototypes to production, the calculus changes. Production systems need: - Memory continuity across sessions - Predictable behavior across updates - Security boundaries that don't leak

The tools that prioritize these will win the enterprise market. The ones that don't will stay in the prototype/hobbyist space.

We're still in the "move fast" phase, but the "break things" part is starting to hurt real users. The pendulum will swing back.

imtringued 7 hours ago||

This makes sense. Development velocity is bought by having a short product life with few users. As you gain users that depend on your product, velocity must drop by definition.

The reason for this is that product development involves making decisions which can later be classified as good or bad decisions.

The good decisions must remain stable, while the bad decisions must remain open to change and therefore remain unstable.

The AI doesn't know anything about the user experience, which means it will inevitably change the good decisions as well.

staticassertion 13 hours ago|||

> our constant claims that quality and security matter

I'm 13 years into this industry, this is the first I'm hearing of this.

usagisushi 12 hours ago|||

I’ve heard the "S" in IoT stands for Security.

baq 8 hours ago||

same with openclaw

girvo 7 hours ago|||

20 for me, and let's not exaggerate. We've given lip service to it this entire time. Hell look at any of the corps we're talking about (including where I work) and they're demanding "velocity without lowering the quality bar", but it's a lie: they don't care about the quality bar in the slightest.

raesene9 3 hours ago||

One of my main lessons after a decent long while in security, is that most orgs care about security, *as long as it doesn't get in the way of other priorities* like shipping new features. So when we get something like Agentic LLM tooling where everything moves super fast, security is inevitably going to suffer.

blks 11 hours ago|||

I’m learning that projects, developed with the help of agents, even when developers claim that they review and steer everything, ultimately are not fully understood or owned by the developers, and very soon turns into a thousand reinvented wheels strapped together by tape.

KronisLV 10 hours ago||

> very soon turns into a thousand reinvented wheels strapped together by tape.

Also most of the long running enterprise projects I’ve seen - there was one that had been around for like 10 years and like about 75% of the devs I hadn’t even heard of and none of the original ones were in the project at all.

The thing had no less than three auditing mechanisms, three ways of interacting with the database, mixed naming conventions, like two validation mechanisms none of which were what Spring recommended and also configurations versioned for app servers that weren’t even in use.

This was all before AI, it’s not like you need it for projects to turn into slop and AI slop isn’t that much different from human slop (none of them gave a shit about ADRs or proper docs on why things are done a certain way, though Wiki had some fossilized meeting notes with nothing actually useful) except that AI can produce this stuff more quickly.

When encountered, I just relied on writing tests and reworking the older slop with something newer (with better AI models and tooling) and the overall quality improved.

bredren 15 hours ago||||

Claude Code breaks production features and doesn't say anything about it. The product has just shifted gears with little to no ceremony.

I expect that from something guiding the market, but there have been times where stuff changes, and it isn't even clear if it is a bug or a permanent decision. I suspect they don't even know.

heavyset_go 15 hours ago||||

We're still in the very early days of generative AI, and people and markets are already prioritizing quality over quantity. Quantity is irrelevant when it comes value.

All code is not fungible, "irreverent code that kinda looks okay at first glance" might be a commodity, but well-tested, well-designed and well-understood code is what's valuable.

danielovichdk 13 hours ago||

Generative what? Code is not a thing anymore, in fact it never really was, but now it's definitely not.

Code today can be as verbose and ugly as ever, because from here on out, fewer people are going to read it, understand and care about it.

What's valuable, and you know this I think, is how much money your software will sell for, not how fine and polished your code is.

Code was a liability. Today it's a liability that cost much much less.

sleepychu 11 hours ago|||

and once you've got your wish: ugly code without tests or a way to comprehend it, but cheap!

How much value are you going to be able to extract over its lifetime once your customers want to see some additional features or improvements?

How much expensive maintenance burden are you incurring once any change (human or LLM generated) is likely to introduce bugs you have no better way of identifying than shipping to your paying customers?

Maybe LLM+tooling is going to get there with producing a comprehensible and well tested system but my anectodal experience is not promising. I find that AI is great until you hit its limit on a topic and then it will merrily generate tokens in a loop suggesting the same won't-work-fix forever.

babol 8 hours ago||

What you wrote aligns with my experience so far. It's fast and easy to get something working, but in a number of cases it (Opus) just gets stuck 'spinning' and no number of prompts is going to fix that. Moreover - when creating things from scratch it tends to use average/insecure/ inefficient approaches that later take a lot of time to fix.

The whole thing reminds me a bit of the many RAD tools that were supposed to 'solve' programming. While it was easy to start and produce something with those tools, at some point you started spending way too much time working around the limitations and wished you started from scratch without it.

heavyset_go 9 hours ago||||

I'm of the opinion that the diligence of experts is part of what makes code valuable assets, and that the market does an alright job of eventually differentiating between reliable products/brands and operations that are just winging it with AI[1].

[1] https://museumoffailure.com/exhibition/wonka-chocolate-exper...

galaxyLogic 11 hours ago||||

I would think that the better the code is designed and factored and refactored, the easier it is to maintain and evolve, detect and remove bugs and security vulnerabilties from it. The ease of maintenance helps both AI and humans.

There are limits to what even AI can do to code, within practical time-limits. Using AI also costs money. So, easier it is to maintain and evolve a piece of software, the cheaper it will be to the owners of that application.

ajb 10 hours ago|||

You may not need to read it, but you still need to test it.

Code that has not been thoroughly tested is a greater liability, not a lesser one.l, the faster you can write it.

the_black_hand 13 hours ago|||

It's understandable and even desirable that a new piece of code rapidly evolves as they iterate and fix bugs. I'd only be concerned if they keep this pattern for too long. In the early phases, I like keeping up with all the cutting edge developments. Projects where dev get afraid to ship because of breaking things end up becoming bloated with unnecessary backward compatibility.

paustint 20 hours ago|||

I recently listened to this episode from the Claude Code creator (here is the video version: https://www.youtube.com/watch?v=PQU9o_5rHC4) and it sounded like their development process was somewhat similar - he said something like their entire codebase has 100% churn every 6 months. But I would assume they have a more professional software delivery process.

I would (incorrectly) assume that a product like this would be heavily tested via AI - why not? AI should be writing all the code, so why would the humans not invest in and require extreme levels of testing since AI is really good at that?

causal 16 hours ago|||

I've gotta say, it shows. Claude Code has a lot of stupid regressions on a regular basis, shit that the most basic test harness should catch.

mattmanser 11 hours ago||

I feel like our industry goes through these phases where there's an obvious thought leader that everyone's copying because they are revolutionary.

Like Rails/DHH was one phase, Git/GitHub another.

And right now it's kinda Claude Code. But they're so obviously really bad at development that it feels like a MLM scam.

I'm just describing the feeling I'm getting, perhaps badly. I use Claude, I recommended Claude for the company I worked at. But by god they're bloody awful at development.

It feels like the point where someone else steps in with a rock solid, dependable, competitor and then everyone forgets Claude Code ever existed.

causal 3 hours ago|||

I use Claude Code because Anthropic requires me to in order to get the generous subscription tokens. But better tools exist. If I was allowed to use Cursor with my Claude sub I would in a heartbeat.

brabel 10 hours ago|||

There are plenty of competitors! I’ve been using Copilot, RovoCLI, Gemni, and there’s OpenAI thing.

mattmanser 6 hours ago||

This aren't competitors, they're clones, it's a different thing.

CC leads and they follow.

logicprog 20 hours ago||||

I mean, I'm slowly trying to learn lightweight formal methods (i.e. what stuff like Alloy or Quint do), behavior driven development, more advanced testing systems for UIs, red-green TDD, etc, which I never bothered to learn as much before, precisely because they can handle the boilerplate aspects of these things, so I can focus on specifying the core features or properties I need for the system, or thinking through the behavior, information flow, and architecture of the system, and it can translate that into machine-verifiable stuff, so that my code is more reliable! I'm very early on that path, though. It's hard!

slopinthebag 9 hours ago|||

I heard from somebody inside Anthropic that it's really two companies, one which are using AI for everything and the other which spends all their time putting out fires.

cpeterso 19 hours ago|||

OpenCode's creator acknowledged that the ease of shipping has let them ship prototype features that probably weren't worth shipping and that they need to invest more time cleaning up and fixing things.

https://x.com/thdxr/status/2031377117007454421

rdedev 17 hours ago|||

Uff. This is exactly what Casey Muratori and his friend was talking about in of their more recent podcast. Features that would never get implemented because of time constraints now do thanks to LLMs and now they have a huge codebase to maintain

alansaber 8 hours ago||

Not terrible if they proactively depricate slop features

logicprog 19 hours ago||||

Well that's good to hear, maybe they'll improve moving forward on the release aspect at least.

j45 17 hours ago|||

What to release > What to build > Build anything faster

arcanemachiner 19 hours ago|||

I'm still trying to figure out how "open" it really is; There are reports that it phones home a lot[0], and there is even a fork that claims to remove this behavior[1]:

[0] https://www.reddit.com/r/LocalLLaMA/comments/1rv690j/opencod...

[1] https://github.com/standardnguyen/rolandcode

nikcub 19 hours ago|||

the fact that somebody was able to fork it and remove behaviour they didn't want suggests that it is very open

that #12446 PR hasn't even been resolved to won't merge and last change was a week ago (in a repo with 1.8k+ open PRs)

drdaeman 17 hours ago||

I think there’s a conflict between “open” as in “open source”, and “open” as in “open about the practice” paired with the fact we usually don’t review software’s source scrupulously enough to spot unwanted behaviors”.

Must be a karmic response from “Free” /s

nsonha 18 hours ago|||

so how is telemetry not open? If you don't like telemetry for dogmatic reasons then don't use it. Find the alternative magical product whose dev team is able to improve the software blindfolded

heavyset_go 15 hours ago|||

> Find the alternative magical product whose dev team is able to improve the software blindfolded

The choice isn't "telemetry or you're blindfolded", the other options include actually interacting with your userbase. Surveys exist, interviews exist, focus groups exist, fostering communities that you can engage is a thing, etc.

For example, I was recruited and paid $500 to spend an hour on a panel discussing what developers want out of platforms like DigitalOcean, what we don't like, where our pain points are. I put the dollar amount there only to emphasize how valuable such information is from one user. You don't get that kind of information from telemetry.

eastbound 13 hours ago||

> Surveys exist, interviews exist, focus groups exist, fostering communities that you can engage is a thing, etc.

We all know it’s extremely, extremely hard to interact with your userbase.

> For example I was paid $500 an hour

+the time to find volunteers doubled that, so for $1000 an hour x 10 user interviews, a free software can have feedback from 0.001% of their users. I dislike telemetry, but it’s a lie to say it’s optional.

—a company with no telemetry on neither of our downloadable or cloud product.

latexr 11 hours ago||

> We all know it’s extremely, extremely hard to interact with your userbase.

On the contrary, your users will tell you what you need to know, you just have to pay attention.

> I dislike telemetry, but it’s a lie to say it’s optional.

The lie is believing it’s necessary. Software was successful before telemetry was a thing, and tools without telemetry continue to be successful. Plenty of independent developers ship zero telemetry in their products and continue to be successful.

ipaddr 17 hours ago|||

Or by testing it themselves.

sauercrowd 1 hour ago|||

Highly recommend trying pi.dev

It's fully open, fairly minimal, very extensible and (while getting very frequent updates) never has broken on me so far.

Been using it more and more in the last two months, switching more and more from codex to it now.

blks 20 hours ago|||

Probably all describe problems stem from the developers using agent coding; including using TypeScript, since these tools are usually more familiar with Js/Js adjacent web development languages.

logicprog 20 hours ago||

Perhaps the use of coding agents may have encouraged this behavior, but it is perfectly possible to do the opposite with agents as well — for instance, to use agents to make it easier to set up and maintain a good testing scaffold for TUI stuff, a comprehensive test suite top to bottom, in a way maintainers may not have had the time/energy/interest to do before, or to rewrite in a faster and more resource efficient language that you may find more verbose, be less familiar with, or find annoying to write — and nothing is forcing them to release as often as they are, instead of just having a high commit velocity. I've personally found AIs to be just as good at Go or Rust as TypeScript, perhaps better, as well, so I don't think there was anything forcing them to go with TypeScript. I think they're just somewhat irresponsible devs.

jeremyjh 3 hours ago||

> I think they're just somewhat irresponsible devs.

Before coding agents it took quite a lot more experience before most people could develop and ship a successful product. The average years of experience of both core team and contributors was higher and this reflected in product and architecture choices that really have an impact, especially on non-functional requirements.

They could have had better design and architecture in this project if they had asked the AI for more help with it, but they did not even know what to ask or how to validate the responses.

Of course, lots of devs with more years of experience would do just as badly or worse. What we are seeing here though is a filter removed that means a lot of projects now are the first real product everyone the team has ever developed.

sorentwo 11 hours ago|||

The moment that OpenCode, after helping fix a Dockerfile issue, decided it was time to deploy to prod without asking for consent, I was out.

brabel 10 hours ago||

You must never rely on AI itself for authorization… don’t let it run on an environment where it can do that. I can’t believe this needs to be said but everyone seems to have lost their mind and decided to give all their permissions away to a non deterministic thing that when prompted correctly will send it all out to whoever asks it nicely.

BenGosub 2 hours ago|||

I agree that Opencodr is using a lot of RAM, but regarding the features, I am ak only using the built in features and I wouldn't say they are too many, they are just enough for a complete workflow. If you need more you can install plugins, which I haven't done yet and it's my daily driver for four months.

thatmf 19 hours ago|||

The value of having (and executing) a coherent product vision is extremely undervalued in FOSS, and IMO the difference between a successful project in the long-term and the kind of sploogeware that just snowballs with low-value features.

rounce 19 hours ago|||

> The value of having (and executing) a coherent product vision is extremely undervalued in FOSS

Interesting you say this because I'd say the opposite is true historically, especially in the systems software community and among older folks. "Do one thing and do it well" seems to be the prevailing mindset behind many foundational tools. I think this why so many are/were irked by systemd. On the other hand newer tools that are more heavily marketed and often have some commercial angle seem to be in a perpetual state of tacking on new features in lieu of refining their raison d'etre.

openclaw01 18 hours ago||

[dead]

Aperocky 19 hours ago|||

negative values even.

AppleAtCha 7 hours ago|||

Is there a name for these types of "overbearing" and visually busy "TUIs"? It seems like all the other agents have the same aesthetic and it is unlike traditional nurses or plain text interfaces in a bad way IMO. The constant spinners, sidebars and needless margins are a nuisance to me. Especially over an ssh connection in a tmux session it feels wrong.

theshrike79 4 hours ago|||

I’ve pretty much ended up with a pi.dev+gpt-5 and Claude combo. Sometimes I use GLM with Pi if I run out of quota or need some simple changes.

I tried Opencode but it was just too much? Same with Crush, 10/10 pretty but lacking in features I need. LSP support was cool though.

dopidopHN2 3 hours ago||

Can you expands on the cool part of LSP support ? I"m curious and "on paper" it sounds desirable but I'm unclear on the pluses

tshaddox 19 hours ago|||

I’m a little surprised by your description of constant releases and instability. That matches how I would describe Claude Code, and has been one of the main reasons I tend to use OpenCode more than Claude Code.

OpenCode has been much more stable for me in the 6 months or so that I’ve been comparing the two in earnest.

hboon 17 hours ago||

I use Droid specifically because Claude Code breaks too often for me. And then Droid broke too (but rarely), and I just stuck to not upgrading (like I don't upgrade WebStorm. Dev tools are so fragile)

thayne 3 hours ago|||

That sounds a lot like my experience with claude code. IDK about OpenCode, but claude code is also largely written by LLMs, and you can tell.

plastic3169 12 hours ago|||

I’ve been testing opencode and it feels TUI in appearance only. I prefer commandline and TUIs and in my mind TUI idea is to be low level, extremely portable interface and to get out of the way. Opencode does not have low color, standard terminal theme so had to switch to a proper terminal program. Copy paste is hijacked so I need to write code out to file in order to get a snippet. The enter key (as in the return by the keypad) does not work for sending a line. I have not tested but don’t think this would work over SSH even. I have been googling around to find if I am holding it wrong but it feels to break expectations of a terminal app in a way that I wish they would have made it a gui. Makes me sad because I think the goods are there and it’s otherwise good.

msh 11 hours ago||

I don’t think good TUI’s are the same as good command line programs. Great tui apps would to me be things like Norton/midnight commander, borlands turbo pascal, vim, eMacs and things like that

plastic3169 11 hours ago||

Yes cli and tui are not the same, but I expect TUI to work decent in general terminal emulator and not acitvely block copying and pasting. Having to install supported terminal emulator goes against the vibe.

zackify 18 hours ago|||

Yeah every time I want to like it, scrolling is glitched vs codex and Claude. And other various things like: why is this giant model list hard coded for ollama or other local methods vs loading what I actually have...

On top of that. Open code go was a complete scam. It was not advertised as having lower quality models when I paid and glm5 was broken vs another provider, returning gibberish and very dumb on the same prompt

tmatsuzaki 18 hours ago||

I agree. Since tools like Codex let you use SOTA models more cheaply and with looser weekly limits, I think they’re the smarter choice.

scuff3d 18 hours ago|||

Drives me nuts that we have TUIs written in friggin TS now.

That being said, I do prefer OpenCode to Codex and Claude Code.

cies 15 hours ago||

Why to you prefer? I have a different experience, and want to learn.

(I'm also hating on TS/JS: but some day some AI will port it to Rust, right?)

esafak 4 hours ago|||

I find it more configurable, for defining (sub)agent abilities, plugins, and different models/providers, of course.

scuff3d 14 hours ago|||

The biggest reason is I don't like being locked into an ecosystem. I can use whatever I want with OpenCode, not so much with Codex and Claude Code. Right now I'm only using GPT with it, but I like the option.

CC I have the least experience with. It just seemed buggy and unpolished to me. Codex was fine, but there was something about it that just didn't feel right. It seemed fined for code tasks but just as often I want to do research or discuss the code base, and for whatever reason I seemed to get terse less useful answers using Codex even when it's backed by the same model.

OpenCode works well, I haven't had any issues with bugs or things breaking, and it just felt comfortable to use right from the jump.

rco8786 7 hours ago|||

> they add, remove, refine, change, fix, and break features constantly at that accelerated pace.

I wonder how much of this is because the maintainers are using OpenCode to vibe the code for OpenCode.

bjackman 9 hours ago|||

That is very disappointing coz I've been wanting to try an alternative to Gemini CLI for exactly these reasons. The AI is great but the actual software is a buggy, slow, bloated blob of TypeScript (on a custom Node runtime IIUC!) that I really hate running. It takes multiple seconds to start, requires restarting to apply settings, constantly fucks up the terminal, often crashes due to JS heap overflows, doesn't respect my home dir (~/.gemini? Come on folks are we serious?), has an utterly unusable permission system, etc etc. Yet they had plenty of energy to inject silly terminal graphics and have dumb jokes and tips scroll across the screen.

Is Claude Code like this too? I wonder if Pi is any better.

A big downside would be paying actual cost price for tokens but on the other hand, I wouldn't be tied to Google's model backend which is also extremely flaky and unable to meet demand a lot of the time. If I could get real work done with open models (no idea if that's the case yet) and switch providers when a given provider falls over, that would be great.

WhyNotHugo 4 hours ago|||

I use Pi with Aliyun, which cost a flat ¥40 (~€5) per month for GLM-5, Kimi K2.5, Minmax and a few other models.

Honestly, these models seem quite on par with Claude. Some days they seem slightly worse, some days I can't tell the difference.

AFAIK, the usage quota is comparable to the Claude $200 subscription.

knocte 8 hours ago||||

> Is Claude Code like this too? I wonder if Pi is any better.

I'm very happy with Pi myself (running it on a small VPS so that I don't need to do sandboxing shenanigans).

badlogic 4 hours ago||||

you can use subscriptions with pi.

plagiarist 4 hours ago|||

Claude will also happily write a huge pile of junk into your home directory, I am sad to report. The permissions are idiotic as well, but I always use it in a container anyway. But I have not had it crash and it hasn't been slow starting for me.

horsh1 8 hours ago|||

You are describing a typical state of a wibecoded project.

fuy 3 hours ago|||

claude code easily uses 10+GB in single session :) 1Gb sounds very efficient by comparison

nico 16 hours ago|||

> they're constantly releasing at an extremely high cadence, where they don't even spend the time to test or fix things

Tbf, this seems exactly like Claude Code, they are releasing about one new version per day, sometimes even multiple per day. It’s a bit annoying constantly getting those messages saying to upgrade cc to the latest version

ctxc 15 hours ago||

Oh wow. I got multiple messages in a day and just assumed it was a cache bug.

It's annoying how I always get that "claude code has a native installer xyz please upgrade" message

auggierose 10 hours ago|||

I think it goes away if you actually use the native installer ...

lanyard-textile 12 hours ago|||

I've never gotten that message?

stego-tech 5 hours ago|||

This is why I'm taking a wait-and-see approach to these tools on HN myself. My month with Claude Code (the TUI, not the GUI) was amazing from an IT POV, just slop-generating niche tools I could quickly implement and audit (not giant-ass projects), but I ain't outsourcing that to another company when Qwen et al are right there for running on my M1 Pro or RTX 3090.

I'm looking forward to more folks building these kinds of tools with a stronger focus on portability via API or loading local models, as means of having a genuinely useful assistant or co-programmer rather than paying some big corp way too much money (and letting them use my data) for roughly the same experience.

jazzypants 5 hours ago|||

The types of models you can run locally on that hardware are toys in comparison to the foundation models

627467 5 hours ago||||

Curious about your setup of qwen on m1 pro. Care to share the toolkit?

plagiarist 5 hours ago|||

Do you have a setup with a local Qwen that can write out niche tools pretty well? I have been curious about how much I could do local.

grapheneposter 18 hours ago|||

Yeah I tried using it when oh-my-opencode (now oh-my-openagent) started popping off and found it had highly unstable. I just stick with internal tooling now.

darepublic 7 hours ago|||

Why not just code your own agent harness

namlem 14 hours ago|||

How much of the development is being done by humans?

foobarqux 19 hours ago|||

What is a better option?

logicprog 19 hours ago|||

For serious coding work I use the Zed Agent; for everything else I use pi with a few skills. Overall, though, I'd recommend Pi plus a few extensions for any features you miss extremely highly. It's also TypeScript, but doesn't suffer from the other problems OC has IME. It's a beautiful little program.

mmcclure 19 hours ago||

Big +1 to Pi[1]. The simplicity makes it really easy to extend yourself too, so at this point I have a pretty nice little setup that's very specific to my personal workflows. The monorepo for the project also has other nice utilities like a solid agent SDK. I also use other tools like Claude Code for "serious" work, but I do find myself reaching for Pi more consistently as I've gotten more confident with my setup.

[1] https://github.com/badlogic/pi-mono/tree/main/packages/codin...

noelsusman 6 hours ago||||

pi.dev is worth checking out. The basic idea is they provide a minimalist coding agent that's designed to be easy to extend, so you can tailor the harness to suit your needs without any bloat.

One of the best features is they haven't been noticed by Anthropic yet so you can still use your Claude subscription.

vinhnx 17 hours ago||||

I've been building VT Code (https://github.com/vinhnx/vtcode), a Rust-based semantic coding agent. Just landed Codex OAuth with PKCE exchange, credentials go into the system keyring.

I build VT Code with Tree-sitter for semantic understanding and OS-native sandboxing. It's still early but I confident it usable. I hope you'll give it a try.

andreynering 19 hours ago|||

https://charm.land/crush

rao-v 19 hours ago||

I tried crush when it first came out - the vibes were fun but it didn’t seem to be particularly good even vs aider. Is it better now?

andreynering 19 hours ago||

Disclaimer: I work for Charm, so my opinion may be biased.

But we did a lot of work on improving the experience, both on UX, performance, and the actual reliability of the agent itself.

I would suggest you to give it a try.

rao-v 16 hours ago||

Will do thanks - any standout features or clever things for me to look out for?

andreynering 8 hours ago||

We just launched this: https://charm.land/blog/crush-and-docker-mcp/

Also, non-interactive support, useful for some workflows:

https://github.com/charmbracelet/crush/releases/tag/v0.48.0

https://github.com/charmbracelet/crush/releases/tag/v0.50.0

jruz 11 hours ago|||

yeah I agree is way too buggy, nice tho and I appreciate the effort but really feels sloppy

mmaunder 4 hours ago|||

Yeah just try to select text to copy. Nope. Try to scroll back in terminal or tmux. Nope. Overbearing for sure.

alienbaby 17 hours ago|||

its hard not to wonder if they are taking their own medicine, but not quite properly

wvlia5 6 hours ago|||

this is a bot comment

mihaaly 9 hours ago|||

I tried it briefly and the practice - argued for strategy for operation actually - to override my working folder seelction and altering to the parent root git folder is a no go.

bakugo 19 hours ago|||

Isn't this pretty much the standard across projects that make heavy use of AI code generation?

Using AI to generate all your code only really makes sense if you prioritize shipping features as fast as possible over the quality, stability and efficiency of the code, because that's the only case in which the actual act of writing code is the bottleneck.

logicprog 19 hours ago||

I don't think that's true at all. As I said, in a response to another person blaming it on agentic coding above, there are a very large number of ways to use coding agents to make your programs faster, more efficient, more reliable, and more refined that also benefit from agents making the code writing research, data piping, and refactoring process quicker and less exhausting. For instance, by helping you set up testing scaffolding, handling the boilerplate around tests while you specify some example features or properties you want to test and expands them, rewriting into a more efficient language, large-scale refactors to use better data structures or architectures, or allowing you to use a more efficient or reliable language that you don't know as well or find to have too much boilerplate or compiler annoyance to otherwise deal with yourself. Then there are sort of higher level more phenomenological or subjective benefits, such as helping you focus on the system architecture and data flow, and only zoom in on particular algorithms or areas of the code base that are specifically relevant, instead of forever getting lost in the weeds of thinking about specific syntax and compiler errors or looking up a bunch of API documentation that isn't super important for the core of what you're trying to do and so on.

Personally, I find this idea that "coding isn't the bottleneck" completely preposterous. Getting all of the API documentation, the syntax, organizing and typing out all of the text, finding the correct places in the code base and understanding the code base in general, dealing with silly compiler errors and type errors, writing a ton of error handling, dealing with the inevitable and inoraticable boilerplate of programming (unless you're one of those people that believe macros are actually a good idea and would meaningfully solve this), all are a regular and substantial occurrence, even if you aren't writing thousands of lines of code a day. And you need to write code in order to be able to get a sense for the limitations of the technology you're using and the shape of the problem you're dealing with in order to then come up with and iterate on a better architecture or approach to the problem. And you need to see your program running in order to evaluate whether it's functionality and design a satisfactory and then to iterate on that. So coding is actually the upfront costs that you need to pay in order to and even start properly thinking about a problem. So being able to get a prototype out quickly is very important. Also, I find it hard to believe that you've never been in a situation where you wanted to make a simple change or refactor that would have resulted in needing to update 15 different call sites to do properly in a way that was just slightly variable enough or complex enough that editor macros or IDE refactoring capabilities wouldn't be capable of.

That's not to mention the fact that if agentic coding can make deploying faster, then it can also make deploying the same amount at the same cadence easier and more relaxing.

adithyassekhar 15 hours ago||

You're both right. AI can be used to do either fast releases or well designed code. Don't say both, as you're not making time, you're moving time between those two.

Which one you think companies prefer? Or if you're a consulting business, which one do you think your clients prefer?

bakugo 14 hours ago||

> AI can be used to do either fast releases or well designed code

I have yet to actually see a single example of the latter, though. OpenCode isn't an isolated case - every project with heavy AI involvement that I've personally examined or used suffers from serious architectural issues, tons of obvious bugs and quirks, or both. And these are mostly independent open source projects, where corporate interests are (hopefully) not an influence.

I will continue to believe it's not actually possible until I am proven wrong with concrete examples. The incentives just aren't there. It's easy to say "just mindlessly follow X principle and your software will be good", where X is usually some variation of "just add more tests", "just add more agents", "just spend more time planning" etc. but I choose to believe that good software cannot be created without the involvement of someone who has a passion for writing good software - someone who wouldn't want to let an LLM do the job for them in the first place.

logicprog 8 hours ago|||

> It's easy to say "just mindlessly follow X principle and your software will be good", where X is usually some variation of "just add more tests", "just add more agents", "just spend more time planning" etc

That's a complete strawman of what I — or others trying to learn how to use coding agents to increase quality, like Simon Willison or the Oxide team — am saying.

> but I choose to believe that good software cannot be created without the involvement of someone who has a passion for writing good software - someone who wouldn't want to let an LLM do the job for them in the first place.

This is just a no true Scotsman. I prefer to use coding agents because they don't forget details, or get exhausted, or overwhelmed, or lazy, or give up, ever — whereas I might. Therefore, they allow me to do all of the things that improve code and software quality more extensively and thoroughly, like refactors, performance improvements, and tests among other things (because yes, there is no single panacea). Furthermore, I do still care about the clarity, concision, modularity, referential transparency, separation of concerns, local reasonability, cognitive load, and other good qualities of the code, because if those aren't kept up a) I can't review the code effectively or debug things as easily when they go wrong, b) the agent itself will struggle to male changes without breaking other things, and struggle to debug, c) those things often eventually effect the quality of the end state software.

Additionally, what you say is empirically false. Many people who do deeply value quality software and code quality, such as the creators of Flask, Redis, and SerenityOS/Ladybird, all use and value agentic coding.

Just because you haven't seen good quality software with a large amount of agentic influence doesn't mean it isn't possible. That's very close minded.

bakugo 2 hours ago||

Show me an example then. I want to see an example of quality software that makes heavy use of AI generated code (as in, basically written entirely by AI similar to OpenCode), led by developer(s) who care deeply about software quality but still choose to not write code themselves.

viktorianer 8 hours ago|||

[dead]

Imustaskforhelp 10 hours ago|||

I tried running Opencode on my 7$/yr 512mb vps but it had the OOM issue and yes it needs 1GB of ram or more.

I then tried running other options like picoclaw/picocode etc but they were all really hard to manage/create

The UI/UX I want is that I can just put my free openrouter api key in and then I am ready to go to get access to free models like Arcee AI right now

After reading your comments/I read this thread, I tried crush by charmbracelet again and it gives the UI/UX that I want.

I am definitely impressed by crush/ the charm team. They are on HN and they work great for me, highly recommended if you want something which can work on low constrained devices

I do feel like Charm's TUI's are too beautiful in the sense that running a connection over SSH can delay so when I tried to copy some things, the delay made things less copy-able but overall, I think that I am using Crush and I am happy for the most part :-)

Edit: That being said, just as I was typing this, Crush took all the Free requests from Openrouter that I get for free so it might be a bit of minor issue but overall its not much of an issue from Crush side, so still overall, my point is that Crush is worth checking out

Kudos to the CharmBracelet team for making awesome golang applications!

fHr 14 hours ago|||

Rust > TS Codex > OpenCode

dfhvneoieno 2 hours ago|||

[dead]

vrganj 9 hours ago||

[dead]

heavyset_go 15 hours ago||

By default OpenCode sends all of your prompts to Grok's free tier to come up with chat summaries for the UI.

To change that, you need to set a custom "small model" in the settings.

solarkraft 6 hours ago||

This is my main problem I have with it: It sends data and loads code left and right by default. For instance, the latest plugin packages are automatically installed on every startup. Their “Zen” provider is enabled by default so you might accidentally upload your code base to their servers. Better yet: The web UI has a button that just uploads the entire session to their servers WITH A SINGLE CLICK for sharing.

The situation is ... pretty bad. But I don’t think this is particularly malicious or even a really well considered stance, but just a compromise in order to move fast and ship useful features.

To make it easily adoptable by anyone privacy conscious without hours of tweaking, there should be an effort to massively improve this situation. Luckily, unlike Claude Code, the project is open source and can he changed!

moffkalast 6 hours ago||

There is some kind of fitting irony around agentic coding harnesses mainly being maintained by coding agents themselves, and as a result they are all a chaotic mess.

ekjhgkejhgk 18 minutes ago|||

> By default OpenCode sends all of your prompts to Grok's free tier

Just my prompts, or everything the agent has in the context window?

Also, could you please provide a reference for this claim? Thank you

daliusd 5 hours ago|||

I had to double check this. Here is summary:

The model selection for title generation works as follows (prompt.ts:1956-1960): 1. If the title agent has an explicit model configured — that model is used. 2. Otherwise, it tries Provider.getSmallModel(providerID) — which picks a "small" model from the same provider as the current session, using this priority list (provider.ts:1396-1402): - claude-haiku-4-5 / claude-haiku-4.5 / 3-5-haiku / 3.5-haiku - gemini-3-flash / gemini-2.5-flash - gpt-5-nano - (Copilot adds gpt-5-mini at the front; opencode provider uses only gpt-5-nano) 3. If no small model is found — it falls back to the same model currently being used for the session. So by default, title generation uses a cheaper/faster small model from the same provider (e.g., Haiku if on Anthropic, Flash if on Google, nano if on OpenAI), and if none are available, it just uses whatever model the user is chatting with. You can also override this entirely by configuring a model on the title agent.

heavyset_go 5 hours ago||

When I did this, I used a single local llama.cpp server instance as my main model without setting a small model and it did not use it for chat titles while I used it for prompts.

Chat titles would work even when the local llama.cpp server hadn't started, and it was never in the the llama.cpp logs, it used an external model I hadn't set up and had not intended to use.

It was only when I set `small_model` that I was able to route title generation to my own models.

kmod 2 hours ago|||

Fwiw this got changed about a week ago, where they changed the logic to match the documentation rather than default to sending your prompts to their servers. This is why so many people have noticed this happening but if you ask an AI about it right now it will say this is not true.

Personally I think it's necessary to run opencode itself inside a sandbox, and if you do that you can see all of the rejected network calls it's trying to make even in local mode. I use srt and it was pretty straightforward to set up

agilob 12 hours ago|||

Also, even when using local models in ollama or lmstudio, prompts are proxied via their domain, so never put anything sensitive even when using local setup

https://old.reddit.com/r/LocalLLaMA/comments/1rv690j/opencod...

They also don't let you run all local models, but specific whitelisted by another 3rd party: https://github.com/anomalyco/opencode/issues/4232

embedding-shape 9 hours ago||

To be clear, that seems to be about the webui only, the TUI doesn't seem affected. I haven't fully investigated this myself, but when I run opencode (1.2.27-a6ef9e9-dirty) + mitmproxy and using LM Studio as the backend, when starting opencode + executing a prompt, I only see two requests, both to my LM Studio instance, both normal inference requests (one for the chat itself + one for generating the title).

Everything you read on the internet seems exaggerated today. Especially true for reddit, and especially especially true for r/LocalLllama which is a former shadow of itself. Today it's mostly sockpuppets pushing various tools and models, and other sockpuppets trying to push misinformation about their competitors tools/models.

zingar 10 hours ago|||

Geez there should be a big warning on the tin about this. They’re so neatly integrated with copilot that I assumed (and told others) that they had all the privacy guarantees of copilot :(

thdxr 7 hours ago|||

this isn't true

it will use whatever small model there is in your provider

we had a fallback where we provided free small models if your provider did not have one (gpt nano)

some configs fell back to this unexpectedly which upset people so we removed it

solarkraft 6 hours ago|||

I can tell that you’re doing all of this in the name of first-use UX. It’s working: The out of the box experience is really seamless.

But for serious (“grown up”) use, stuff like this just doesn’t fly. At all. We have to know and be able to control exactly where data gets sent. You can’t just exfiltrate our data to random unvetted endpoints.

Given the hurt trust of the past, there also needs to be a communication campaign (“actually we’re secure now”), because otherwise people will keep going around claiming that OpenCode sends all of your data to Grok. This would really unnecessarily hurt the project in the long run.

lukewarm707 6 hours ago|||

[dead]

Iolaum 7 hours ago|||

Not true according to a CGPT question:

More importantly, the current dev branch source for packages/opencode/src/session/summary.ts shows summarizeMessage() now only computes diffs and updates the message summary object; it does not make an LLM call there anymore. The current code path calls summarizeSession() and summarizeMessage(), and summarizeMessage() just filters messages, computes diffs, sets userMsg.summary.diffs, and saves the message.

https://github.com/anomalyco/opencode/blob/dev/packages/open...

arcadianalpaca 7 hours ago|||

Yikes... sending prompts to a third party by default with no disclosure in the setup flow is a rough look for a tool that positions itself as the open sources alternative. "Open" loses meaning fast if the defaults work against the user.

gmassman 13 hours ago|||

Seems like an anti-pattern to me to run AI models without user’s consent.

kuboble 12 hours ago||

? The whole idea of a coding assistant is to send all your interactions with the program to the llm model.

movq 11 hours ago||

To the provider you select in the UI, I agree. But OpenCode automatically sends prompts to their free "Zen" proxy, even without choosing it in the UI.

Imagine someone using it at work, where they are only allowed to use a GitHub Copilot Business subscription (which is supported in OpenCode). Now they have sent proprietary code to a third party, and don't even know they're doing it.

zingar 10 hours ago||

This is exactly me considering what I might have leaked to god knows who via grok. I was hyped by opencode but now I’m thinking of alternatives. A huge red flag… at best irresponsible?

exitb 13 hours ago|||

My understanding is that it’s best to set a whitelist in enabled_providers, which prevents it from using providers you don’t anticipate.

phantomCupcake 11 hours ago|||

Are you using Grok for the coding? Because I have Copilot connected and I can see the request to Copilot for the summaries - with no "small model" setting even visible in my settings.

dfhvneoieno 2 hours ago||

[dead]

solarkraft 7 hours ago||

I found out about OpenCode through the Anthropic feud. I now spend most of my AI time in it, both at work and at home. It turns out to be pretty great for general chat too, with the ability to easily integrate various tools you might need (search being the top one of course).

I have things to criticize about it, their approach to security and pulling in code being my main one, but over all it’s the most complete solution I’ve found.

They have a server/client architecture, a client SDK, a pretty good web UI and use pretty standard technologies.

The extensibility story is good and just seems like the right paradigms mostly, with agents, skills, plugins and providers.

They also ship very fast, both for good and bad, I’ve personally enjoyed the rapid improvements (~2 days from criticizing not being able to disable the default provider in the web ui to being able to).

I think OpenCode has a pretty bright future and so far I think that my issues with it should be pretty fixable. The amount of tasteful choices they’ve made dwarfs the few untasteful ones for me so far.

theshrike79 4 hours ago|

Try pi.dev+gpt-5, it works amazingly well

Just note that you need to either create any special features yourself or find an implementation by someone else. It’s pretty bare bones by default

softwaredoug 21 hours ago||

The team also is not breathlessly talking about how coding is dead. They have pretty sane takes on AI coding including trying to help people who care about code quality.

blackqueeriroh 15 hours ago||

Couldn’t tell by the way they write their software.

m463 19 hours ago|||

They probably don't have to write OKRs every quarter saying the opposite.

vortegne 6 hours ago||

Do you follow them? They most definitely pump out insane takes on twitter. But maybe that’s just engagement bait for a check, of course.

jFriedensreich 5 hours ago||

opencode stands out as one of the few agents with a proper client server architecture that allows something like openchambers great vscode extension so its possible to seamlessly switch between tui, vscode, webapp, desktop app. i think there is hardly a usable alternative for most coding agent usecases (assuming agents from model providers are a no go, they cannot be allowed to own the tools AND the models). But its also far from perfect: the webui is secretly served from their servers instead of locally for no reason. worse the fallback route gets also sent to their servers so any unknown request to opencode api ends up being sent to opencode servers potentially leaking data. the security defaults are horrific, its impossible to use it safely outside a controlled container. it will just serve your whole hard drive via rest endpoint and not constrain to project folders. the share feature uploading your conversations to their servers is also so weirdly communicated and implemented that it leaves a bad taste. I dont think this will become much better until the agent ecosystem is more modular and less monolith, acp, a2a and mcp need to become good enough so tools, prompts, skills, subagent setups and workflow engines and UIs are completely swappable and the agent core has to only focus on the essentials like runtime and glue architecture. i really hope we dont see all of these grow into full agent oses with artificial lock in effects and big effort buy in.

ramon156 22 hours ago||

The Agent that is blacklisted from Anthropic AI, soon more to come.

I really like how their subagents work, as a bonus I get to choose which model is in which agent. Sadly I have to resort to the mess that Anthropic calls Claude Code

pczy 22 hours ago||

They are not blacklisted. You are allowed to use the API at commercial usage pricing. You are just not allowed to use your Claude Code subscription with OpenCode (or any other third‑party harness for the record).

boxedemp 18 hours ago|||

I have my own harness I wrap Claude CLI in, I wonder if I'm breaking the rules...

arcanemachiner 17 hours ago|||

If you're not paying full-fat API prices, then probably.

From what I've heard, the metrics used by Anthropic to detect unauthorized clients is pretty easy to sidestep if you look at the existing solutions out there. Better than getting your account banned.

blackqueeriroh 15 hours ago||

No, they specifically said it’s only if you’re trying to build a whole other product for public consumption on top of it

theshrike79 4 hours ago|||

If you’re just essentially calling claude -p you’re fine

hrmtst93837 11 hours ago||||

So it's less 'blacklist' and more a licensing gotcha designed to crush price arbitrage, basically rent-seeking by toggling where the tollbooth sits.

Robdel12 21 hours ago||||

Has it occurred to anyone that Anthropic highest in the industry API pricing is a play to drive you into their subscription? For the lock-in?

Macha 20 hours ago||

The highest in in the industry for API pricing right now is GPT-5.4-Pro, OpenRouter adding that as an option in their Auto Router was when I had to go customise the routing settings because it was not even close to providing $30/m input tokens and $180/m output tokens of value (for context Opus 4.6 is $5/m input and $25/m output)

(Ok, technically o1-pro is even more expensive, but I'm assuming that's a "please move on" pricing)

wilg 21 hours ago||||

Sometimes people want to be real pedants about licensing terms when it comes to OSS, assuming such terms are completely bulletproof, other times people don't think the terms of their agreement with a service provider should have any force at all.

oldestofsports 21 hours ago||||

I dont understand this, what is the difference, technically!

KronisLV 21 hours ago|||

With Anthropic, you either pay per token with an API key (expensive), or use their subscription, but only with the tools that they provide you - Claude, Claude Cowork and Claude Code (both GUI and CLI variants). Individuals generally get to use the subscriptions, companies, especially the ones building services on top of their models, are expected to pay per token. Same applies to various third party tools.

The belief is that the subscriptions are subsidized by them (or just heavily cut into profit margins) so for whatever reason they're trying to maintain control over the harness - maybe to gather more usage analytics and gain an edge over competitors and improve their models better to work with it, or perhaps to route certain requests to Haiku or Sonnet instead of using Opus for everything, to cut down on the compute.

Given the ample usage limits, I personally just use Claude Code now with their 100 USD per month subscription because it gives me the best value - kind of sucks that they won't support other harnesses though (especially custom GUIs for managing parallel tasks/projects). OpenCode never worked well for me on Windows though, also used Codex and Gemini CLI.

anonym29 21 hours ago||

>or perhaps to route certain requests to Haiku or Sonnet instead of using Opus for everything, to cut down on the compute

You can point Claude Code at a local inference server (e.g. llama.cpp, vLLM) and see which model names it sends each request to. It's not hard to do a MITM against it either. Claude Code does send some requests to Haiku, but not the ones you're making with whatever model you have it set to - these are tool result processing requests, conversation summary / title generation requests, etc - low complexity background stuff.

Now, Anthropic could simply take requests to their Opus model and internally route them to Sonnet on the server side, but then it wouldn't really matter which harness was used or what the client requests anyway, as this would be happening server-side.

KronisLV 10 hours ago||

Sounds pretty sane, the same way how OpenWebUI and probably other software out there also has a concept of “tool models”, something you use for all the lower priority stuff.

Actually curious to hear what others think about why Anthropic is so set on disallowing 3rd party tools on subscriptions.

kasey_junk 8 hours ago||

The sota models are largely undifferentiated from each other in performance right now. And it’s possible open weight models will get “good enough” relatively soonish. This creates a classic case where inference becomes a commodity. Commodities have very low margins. Training puts them in an economic hole where low margins will kill them.

So they have to move up the stack to higher margin business solutions. Which is why they offer subsidized subscription plans in the first place. It’s a marketing cost. But they want those marketing dollars to drive up the stack not commodity inference use cases.

miki123211 21 hours ago||||

Anthropic's model deployments for Claude Code are likely optimized for Claude Code. I wouldn't be surprised if they had optimizations like sharing of system prompt KV-cache across users, or a speculative execution model specifically fine-tuned for the way Claude Code does tool calls.

When setting your token limits, their economics calculations likely assume that those optimizations are going to work. If you're using a different agent, you're basically underpaying for your tokens.

echelon 20 hours ago||

- OR - it's about lock-in.

Build the single pane of glass everyone uses. Offer it under cost. Salt the earth and kill everything else that moves.

Nobody can afford to run alternative interfaces, so they die. This game is as old as time. Remember Reddit apps? Alternative Twitter clients?

In a few years, CC will be the only survivor and viable option.

It also kneecaps attempts to distill Opus.

fnordpiglet 20 hours ago|||

It’s probably a mixture of things including direct control over how the api is called and used as pointed out above and giving a discount for using their ecosystem. They are in fact a business so it should not surprise anyone they act as one.

esperent 20 hours ago||

It might well be a mixture, but 95% of that mixture is vendor lock in. Same reason they don't support AGENTS.md, they want to add friction in switching.

mgambati 20 hours ago|||

They can try add as much as friction they want. A simple rename in the files and directories like .claude makes the thing work to move out of CC.

It’s not like moving from android to iOS.

esperent 19 hours ago||

You'd be surprised how effective small bits of friction are.

xvector 18 hours ago|||

If it was lock in they wouldn't make it absolutely trivial to change inference providers in Claude Code.

esafak 2 hours ago||

The goal is to use Anthropic subscriptions outside of Claude Code!! That is the lock in.

AlexCoventry 17 hours ago||||

It's very straightforward to instrument CC under tmux with send-keys and capturep. You could easily use that for distillation, IMO. There are also detailed I/O logs.

corehys 17 hours ago|||

[dead]

hereme888 21 hours ago||||

Subscription = token that requires refreshing 1-2x/day, and you get the freedom to use your subscription-level usage amount any way you want.

API = way more expensive, allowed to use on your terms without anthropic hindering you.

NewsaHackO 19 hours ago||

Also, Subscription: against the TOS of Claude Code, need to spoof a token and possibly get banned due to it.

hereme888 3 hours ago||

Yup. And right now I'm straight-up breaking Claude's TOS by modifying OpenCode to still accept tokens. But I only have a few days left and don't care if they ban me. I'm using what I paid for.

hackingonempty 21 hours ago||||

Anthropic has an API, you can use any client but they charge per input/output/cache token.

One-price-per-month subscriptions (Claude Code Pro/MAX @ $20/$100/$200 a month) use a different authentication mechanism, OAUTH. The useful difference is you get a lot more inference than you can for the same cost using the API but they require you to use Claude Code as a client.

Some clients have made it simple to use your subscription key with them and they are getting cease and desist letters.

jwpapi 21 hours ago|||

about 30 times more cost

hereme888 21 hours ago|||

Was it not obvious what the OP meant by blacklisted?

Maxatar 21 hours ago|||

Blacklisted usually means something is banned. OpenCode is not banned from using Anthropic's API.

enraged_camel 21 hours ago|||

No, it was not? For those whose native language is English, "blacklisted" implies Claude API will not allow OpenCode.

theshrike79 4 hours ago||

API will, they just can spoof Claude Code OAUTH credentials

lima 22 hours ago|||

You can still use OpenCode with the Anthropic API.

pimeys 22 hours ago||

Yep. That's what I do. Just API keys and you can switch from Opus to GPT especially this week when Opus has been kind of wonky.

stavros 21 hours ago|||

I pay $100/mo to Anthropic. Yesterday I coded one small feature via an API key by accident and it cost $6. At this rate, it will cost me $1000/mo to develop with Opus. I might as well code by hand, or switch to the $20 Codex plan, which will probably be more than enough.

I'd rather switch to OpenAI than give up my favorite harness.

sailfast 19 hours ago|||

This is the intention. They do not want folks that can’t pay to use their service.

theshrike79 4 hours ago||

SOTA models cost SOTA prices. Nothing new there

iAMkenough 19 hours ago||||

Out of curiosity, what's your next monthly subscription in terms of price?

stavros 19 hours ago||

Electricity, $95/mo.

iAMkenough 19 hours ago||

Now you got me thinking my electric company should start offering subscription tiers in these uncertain energy times...

stavros 19 hours ago||

Ours never will, they're a cartel, sadly. If you mean fixed subscription, next one is Netflix, I think, or my server provider at $40 or so.

nomel 13 hours ago||

My monthly "connection fee" is more than that (no solar, just EV). Your cartel needs to step it up!

For me it's $0.8/kWh during peak, $0.47 off peak, and super off peak of $0.15. I accidentally left a little mini 500W heater on all day, while I was out, costing > 5% of your whole month!

stavros 11 hours ago||

Wow, what the hell.

xienze 21 hours ago||||

Yeah I had a similar experience one time. Which is why I laugh when people suggest Anthropic is profitable. Sure, maybe if everyone does API pricing. Which they won’t because it’s so damn expensive. Another way to think about it is API pricing is a glimpse into the future when everyone is dependent on these services and the subscription model price increases start.

mattmanser 21 hours ago||

I don't get why people talk about ChatGPT as some great saviour though, they're in the same boat but just have more money to burn.

AlexCoventry 16 hours ago|||

[flagged]

_ache_ 15 hours ago||

Yes, you are doing it too with antropic an xAI. I don't get your point. xAI and OpenAI are a little worst? Maybe, still very well fascism.

robbie-c 7 hours ago|||

Quite a lot worse. Both OpenAI and xAI were among the largest donors of Trump's campaign

Musk was the largest individual political donor of the 2024 election [1] and Greg Brockman was the largest donor to Trump's "MAGA Inc" super PAC [2]

[1] https://www.washingtonpost.com/technology/2024/12/06/elon-mu...

[2] https://www.theverge.com/ai-artificial-intelligence/867947/o...

Robdel12 7 hours ago|||

You’re right, Anthropic is quite a bit worse https://www.washingtonpost.com/technology/2026/03/04/anthrop...

robbie-c 6 hours ago||

Wait - are you missing all the context on this? Anthropic pushed back against this hard, there was a whole back and forth. I'm on mobile and can't look it up for you atm but if you google about this scenario, Anthropic definitely come out of this looking a lot better than OpenAI and xAI

Robdel12 4 hours ago||

Did you read the article? Or are you just replying?

Anthropic has literally been working with the DoD and Plantair for 2 years now. They were key to the Iran invasion.

If thats “looking better”, keep it.

theshrike79 4 hours ago||

Key being “worked”, past tense

Now they’re blacklisted from government work (appeal pending) and OpenAI practically jumped to replace them immediately

Robdel12 3 hours ago||

You’re right it completely doesn’t matter they’ve been instrumental to ICE and the war in Iran. It’s fine now, their previous actions are excused.

Edit: the word you’re also looking for is “working” not “worked”. I have many friends on government contracts still using Claude.

_ache_ 6 hours ago|||

If you evaluate fascism in terms of donation, yes.

But it is more about the political opinions, IMHO, and Anthropic doesn't sound more attractive than the competitors. Anthropic is very much to the right of the transhumanism spectrum (even if xAI and OpenAI are even farther).

AlexCoventry 13 hours ago|||

IMO, OpenAI have either implicitly committed to becoming the IT service for Trump's secret police, or they've willingly signed up for the harsh retaliation Anthropic's getting, knowing that the Trump administration will inevitably try to push OpenAI around in the same way, if they meaningfully refuse to assist in domestic mass surveillance efforts.

stavros 11 hours ago||

Anthropic was fine doing the same, they just didn't want it done to Americans.

AlexCoventry 3 hours ago||

OpenAI agreeing to operate as Trump's secret police materially impacts your security as a European, though, because it cements Trump's power.

stavros 2 hours ago||

Again, though, both of them agreed to that. Anthropic just didn't want to spy on Americans.

AlexCoventry 2 hours ago||

You can argue a moral equivalence, I guess, but on a practical level, OpenAI's decision is more dangerous for everyone, because it will help to secure Trump as a dictator.

gwd 21 hours ago||||

Or have Claude write the code and Gemini review it. (Was using GPT for review until the recent Pentagon thing.)

blks 20 hours ago||

You can also review the code you ship yourself.

gwd 12 minutes ago||

I certainly do -- but having Gemini review it first saves a lot of time.

jatora 22 hours ago|||

'just API key' lol. just hundreds of dollars at a minimum

pimeys 11 hours ago|||

Yes. And many companies pay that.

specproc 21 hours ago|||

[flagged]

fr33k3y 21 hours ago|||

I'm testing glm5 on Claude code and opencode just to stop consuming American... Soo good so far!

jen20 21 hours ago|||

Qwen works fine and requires paying no-one except a hardware vendor.

raincole 17 hours ago|||

More what to come?

heywinit 12 hours ago||

probably more agents to be blocked by anthropic. i've seen theo from t3.gg go through a bunch of loopholes to support claude in his t3code app just so anthropic doesn't sue their asses.

cyanydeez 21 hours ago||

a $3000 AMD395+ will get you pretty close to a open development environment.

anonym29 20 hours ago||

There are boards starting in the $1500-$2000 range, and complete systems in the $2500-$2700 range. I actually don't know of any Strix Halo mini PCs that cost $3000, do you?

EDIT: The system I bought last summer for $1980 and just took delivery of in October, Beelink GTR 9 Pro, is now $2999.... wow...

UncleOxidant 15 hours ago|||

RAM has gone up a lot since last summer.

free652 20 hours ago||||

the boards now are pricier, at least the framework one. I got it for 1700, and now its ~$2400.

Shebanator 20 hours ago||||

not mini PCs, no, but there are laptops that do

ricardobeat 20 hours ago|||

I bought mine, a mini PC, for $1400 just six months ago. This bubble will pass.

hippycruncher22 21 hours ago||

I'm a https://pi.dev man myself.

krzyk 13 hours ago||

Why most of those tools are written in js/ts?

JS is not something that was developed with CLI in mind and on top of that that language does not lend itself to be good for LLM generation as it has pretty weak validation compared to e.g. Rust, or event C, even python.

Not to mention memory usage or performance.

solarkraft 7 hours ago|||

TS is just a boring default.

It’s simply one of the most productive languages. It actually has a very strong type system, while still being a dynamic language that doesn’t have to be compiled, leading to very fast iteration. It’s also THE language you use when writing UIs. Execution is actually pretty fast through the runtimes we have available nowadays.

The only other interpreted language is Python and that thoroughly feels like a toy in comparison (typing situation still very much in progress, very weak ORM situation, not even a usable package manger until recently!).

jpc0 4 hours ago|||

> It’s also THE language you use when writing UIs

I'm unsure that I agree with this, for my smaller tools with a UI I have been using rust for business logic code and then platform native languages, mostly swift/C#.

I feel like with a modern agentic workflow it is actually trivial to generate UIs that just call into an agnostic layer, and keeping time small and composable has been crucial for this.

That way I get platform native integration where possible and actual on the metal performance.

plipt 4 hours ago||||

If Python has a "very weak ORM situation", what is it about the TS ORM scene that makes it stronger by comparison? Is there one library in particular that stands out?

_ache_ 5 hours ago|||

I was going to say that pnpm isn't that old but wikipedia says 2017!

solarkraft 4 hours ago||

pnpm is amazing for speed and everybody should use it! but even with npm before it, at least it was correct. I had very few (none?) mysterious issues with it that could only be solved by nuking the entire environment. That is more than I can say about the python package managers before uv.

theshrike79 3 hours ago||

uv + PEP723 is amazing for CLI tools

You download one .py, run it and uv automatically downloads and installs any requirements to a virtual environment and runs it

maleldil 3 hours ago||

Has the developer tooling been fixed? Doesn't it use an ephemeral environment? How do editors/LSPs know where to get dependency information?

manmal 12 hours ago||||

For a TUI agent, runtime performance is not the bottleneck, not by far. Hackability is the USP. Pi has extensions hotreloading which comes almost for free with jiti. The fact that the source is the shipped artifact (unlike Go/Rust) also helps the agent seeing its own code and the ability to write and load its own extensions based on that. A fact that OpenClaw’s success is in part based on IMO.

I can’t find the tweet from Mario (the author), but he prefers the Typescript/npm ecosystem for non-performance critical systems because it hits a sweet spot for him. I admire his work and he’s a real polyglot, so I tend to think he has done his homework. You’ll find pi memory usage quite low btw.

krzyk 10 hours ago||

OK, make sense, but there are also claw clones that are in Rust (and self modifying).

Also python ones would also allow self modifying. I'm always puzzled (and worried) when JS is used outside of browsers.

I'm biased as I find JS/TS rather ugly language compared to anything other basically (PHP is close second). Python is clean, C has performance, Rust is clean and has performance, Java has the biggest library and can run anywhere.

the_mitsuhiko 10 hours ago||||

In pi’s case there is a plugin system. It’s much easier to make a self extending agent work with Python or JavaScript than most other languages. JavaScript has the benefit that it has a great typing system on top with TypeScript.

gjs278 13 hours ago|||

[dead]

6ak74rfy 18 hours ago|||

Same.

Pi is refreshingly minimal in terms of system prompts, but still works really well and that makes me wonder whether other harnesses are overdoing. Look at OpenCode's prompts, for instance - long, mostly based on feels and IMO unnecessary. I would've liked to just overwrite OC's system prompts with Pi's (to get other features that Pi doesn't have) but that isn't possible today (without maintaining a custom fork)

onetom 5 hours ago|||

Pi is the Emacs of coding AI agents.

It's a pity it's written in TS, but at least it can draw from a big contributor pool.

There is https://eca.dev/ too, which might worth considering, which is a UI agnostic agent, a bit like LSP servers.

szatkus 10 hours ago|||

I just found out about pi yesterday. It's the only agent that I was able to run on RISC-V. It's quite scary that it runs commands without asking though.

theshrike79 3 hours ago||

It has zero safeguards by default

But the magic is that it knows how to modify itself, if you need a plan mode you can ask it to implement it :)

pontussw 12 hours ago|||

Same here!

The simplicity of extending pi is in itself addictive, but even in its raw form it does the job well.

Before finding pi I had written a lot of custom stuff on top of all the provider specific CLI tools (codex, Claude, cursor-agent, Gemini) - but now I don’t have to anymore (except if I want to use my anthropic sub, which I will now cancel for that exact reason)

wyre 19 hours ago|||

Same.

I’m sure there’s a more elegant way to say this, but OpenCode feels like an open source Claude Code, while pi feels like an open source coding agent.

vorticalbox 8 hours ago|||

> Sessions are stored as trees

that is actually really nice

Richard_Jiang 18 hours ago|||

Pi is a great project, and the lightweight Agent development is really recommended to refer to Pi's implementation method.

cmrdporcupine 18 hours ago||

Pi is good stuff and refreshingly simple and malleable.

I used it recently inside a CI workflow in GitLab to automatically create ChangeLog.md entries for commits. That + Qwen 3.5 has been pretty successful. The job starts up Pi programatically, points it at the commits in question, and tells it to explore and get all the context it needs within 600 seconds... and it works. I love that this is possible.

planckscnst 20 hours ago||

I love OpenCode! I wrote a plugin that adds two tools: prune and retrieve. Prune lets the LLM select messages to remove from the conversation and replace with a summary and key terms. The retrieve tool lets it get those original messages back in case they're needed. I've been livestreaming the development and using it on side projects to make sure it's actually effective... And it turns out it really is! It feels like working with an infinite context window.

https://www.youtube.com/live/z0JYVTAqeQM?si=oLvyLlZiFLTxL7p0

computerex 12 hours ago||

Hey I built that into my harness! http://github.com/computerex/z

Long tool outputs/command outputs everything in my harness is spilled over to the filesystem. Context messages are truncated and split to filesystem with a breadcrumb for retrieving the full message.

Works really well.

signal_v1 5 hours ago|||

The infinite context window framing is the right way to think about it. Running inside Claude Code continuously, the prune step matters more than retrieve in practice — most of what gets dropped stays dropped. More useful is being deliberate about what goes in at the start of each loop iteration rather than managing what comes out at the end.

weird-eye-issue 15 hours ago|||

That doesn't sound all that useful to be honest and would likely increase costs overall due to the hit to prompt caching by removing messages

embedding-shape 8 hours ago||

> would likely increase costs overall

Assuming you pay per token, which seems like a really strange workflow to lock yourself into at this point. Neither paid monthly plans nor local models suffer from that issue.

I tried once to use APIs for agents but seeing a counter of money go up and eventually landing at like $20 for one change, made it really hard to justify. I'd rather pay $200/month before I'd be OK with that sort of experience.

weird-eye-issue 6 hours ago|||

Yes I use the $200 per month plan for Claude Code and it's amazing

I assume the usage varies based on prompt caching, but I could be wrong. Why would you assume prompt caching would have zero effect on the subscription usage?

signal_v1 5 hours ago|||

The $20-per-change problem is a workflow problem, not a pricing problem. Batching work into larger well-scoped sessions rather than interactive back-and-forth changes the unit economics significantly. Most people use these tools like a terminal — one command at a time — which is the worst possible cost profile.

manmal 12 hours ago|||

Have a look how pi.dev implements /tree. Super useful

esafak 2 hours ago|||

That borks the cache and costs you more.

advael 20 hours ago||

Seems interesting, but at a glance I can't find a repo or a package manager download for this. Have you made it available anywhere?

sheo 19 hours ago||

I found the opencode fork repo, but no plugin seems available so far

https://github.com/Vibecodelicious/opencode

monkey26 46 minutes ago||

I do like OpenCode, and have been using it in and off since last July. But I feel like they’re trying to stuff too much GUI into a TUI? Due to this I find myself using Codex and Pi more often. But am still glad OpenCode and their Zen product exist.

brendanmc6 21 hours ago|

I’ve been extraordinarily productive with this, their $10 Go plan, and a rigorous spec-driven workflow. Haven’t touched Claude in 2 months.

I sprinkle in some billed API usage to power my task-planner and reviewer subagents (both use GPT 5.4 now).

The ability to switch models is very useful and a great learning experience. GLM, Kimi and their free models surprised me. Not the best, not perfect, but still very productive. I would be a wary shareholder if I owned a stake in the frontier labs… that moat seems to be shrinking fast.

helloplanets 14 hours ago||

> Moat seems to be shrinking fast.

It's been a moving target for years at this point.

Both open and closed source models have been getting better, but not sure if the open source models have really been closing the gap since DeepSeek R1.

But yes: If the top closed source models were to stop getting better today, it wouldn't take long for open source to catch up.

xvector 18 hours ago|||

The moat is having researchers that can produce frontier models. When OpenCode starts building frontier models, then I'd be worried; otherwise they're just another wrapper

brendanmc6 18 hours ago|||

Of course, my point is that these trailing models are close behind, and cost me a lot less, and work great with harnesses like OpenCode.

troymc 16 hours ago|||

"OpenCode Go" (a subscription) lets you use lots of hosted open-weights frontier AI models, such as GLM-5 (currently right up there in the frontier model leaderboards) for $10 per month.

xvector 4 hours ago||

GLM is benchmaxxed, leaderboards don't mean much anymore

theshrike79 3 hours ago||

Also most of the development experience is in the harness, the models aren’t as important anymore

quietsegfault 20 hours ago||

Can you talk more about how you leverage higher quality models for the stuff that counts? Anywhere I can read more on the philosophy of when to use each?

brendanmc6 19 hours ago|||

Sure happy to share. It’s been trial and error, but I’ve learned that for agents to reliably ship a large feature or refactor, I need a good spec (functional acceptance criteria) and I need a good plan for sequencing the work.

The big expensive models are great at planning tasks and reviewing the implementation of a task. They can better spot potential gotchas, performance or security gaps, subtle logic and nuance that cheaper models fail to notice.

The small cheap models are actually great (and fast) at generating decent code if they have the right direction up front.

So I do all the spec writing myself (with some LLM assistance), and I hand it to a Supervisor agent who coordinates between subagents. Plan -> implement -> review -> repeat until the planner says “all done”.

I switch up my models all the time (actively experimenting) but today I was using GPT 5.4 for review and planning, costing me about $0.4-$1 for a good sized task, and Kimi for implementation. Sometimes my spec takes 4-5 review loops and the cost can add up over an 8 hour day. Still cheaper than Claude Max (for now, barely).

Each agent retains a fairly small context window which seems to keep costs down and improves output. Full context can be catastrophic for some models.

As for the spec writing, this is the fun part for me, and I’ve been obsessing over this process, and the process of tracking acceptance criteria and keeping my agents aligned to it. I have a toolkit cooking, you can find in my comment history (aiming to open source it this week).

letsgethigh 14 hours ago||

How are you managing context?

I'm building a full stack web app, simple but with real API integrations with CC.

Moving so fast that I can barely keep a hold on what I'm testing and building at the same time, just using Sonnet. It's not bad at all. A lot of the specs develop as I'm testing the features, either as an immediate or a todo / gh issue.

How can you manage an agentic flow?

stavros 20 hours ago|||

I wrote something about that: https://www.stavros.io/posts/how-i-write-software-with-llms/

More comments...