I started looking at the commits, and it's basically solving the ,,tests not pass'' problem by changing the tests themselves. The real work of making it working on programs that are already deployed will be just starting now.
The only silver lining I see is that the server side JS community for some reason is already used to breakages all the time.
Not sure if these decisions were made by the LLM, but I've always felt that Claude is more prone to doing "shady stuff" like modifying tests than finding correct solutions to problems.
GPT/Codex is more honest in this regard.
Having said that, after looking at some of the test changes, they seem to be minor things, like changing timeouts, not changing the actual intended semantics of the tests. But it's too much code to review everything, so I might be completely wrong about that, and in real-world usage, even minor changes like these will cause issues.
https://github.com/oven-sh/bun/pull/30412/changes/68a34bf8ed...
This is great! Just add a random sleep(1) to a test, don't worry about it, it's going to be fine!
Strange test though either way.
Wow, This is definitely quite something for sure.
Can jarred comment about if he has read the commits or not too or respond to your comment, this has basically made me lose the small faith I had in what bun is doing if it turns out to be correct.
I'm happy it's not a project I'm depending on, but a large enough project had to try this at some point so that we all can learn from how it goes.
I think this is why Antropic bought bun, so that they can sell big code translation as a feature for all the banks with COBOL code that they want to get rid of for a long time.
Still, those banks / enterprises won't appreciate the number of unit test changes.
And I agree with another comment that Codex xhigh is much better for these kinds of tasks, but still hard on this kind of scale.
I'd want to be pretty damn confident it won't cause any regressions before sunsetting the original codebase in favor of this one.
There’s probably loads(?) of observable behaviors that people rely on, consciously or not. Even _if_ the new thing is 100% spec compliant, it might still be breaking or otherwise problematic for heavy users.
That said, I’d love to be proven wrong. I use Bun from time to time on small stuff and I enjoy it, so I wish them well (:
No, you are perfectly normal.
The people who in one week decided to replace the whole codebase for a widely used tool with code no human has seen are the crazy ones.
There’s no way they can know that for sure. A change of this magnitude cannot go from experiment to success in such a short time frame. Even if all the code were 100% correct, you can’t call it a success until it’s battle tested in real world scenarios for a while, and that is impossible without time. Same way you can’t cook properly by throwing food into a vulcano. It’s not just about the temperature.
Either the “experiment” claim was a lie or they are being irresponsible.
Frustration moves mountains, I don't think this rewrite was done lightly.
Citation needed. Couldn't it just as easily have been one person being as suspicious of the task as everyone else seemed to be?
For where this is coming from, skim the bugfixes in the Bun v1.3.14 and earlier release notes. Rust won’t catch all of these - leaks from holding references too long and anything that re-enters across the JS boundary are still on us. But a large % of that list is use-after-free, double-free, and forgot-to-free-on-error-path, and those become compile errors or automatic cleanup.
Impressive to rewrite 1MLOC in a week yes, but this is more of a job of a million monkey programmers crammed in a datacenter than a bunch geniuses. And I would know, since I'm a monkey programmer who is in danger now... Or maybe the Zig team is in a greater danger, since their brains hold the genius juice the clankers are missing and they should have it by 2027...
Even if the translation was free and into ideal idiomatic Rust (and it's obviously not - it's now Zig with Rust syntax) then this would be churn for the sake of churn.
At some project scale the language really stops being any limiting factor, and you're instead mostly dealing with working past past architectural decisions, integration of large changes, deep optimization, steering the codebase into alignment with project roadmaps and long-term goals, regression testing as features get introduced, maintenance of multiple release trains... Experienced software engineers mostly stop caring about simple things like the programming language choice at that point, because whatever issues come from that choice have already been resolved. What matters is stability, careful orchestration of large changes and a stable and comprehensive test suite.