Posted by david927 20 hours ago
Ask HN: What are you working on? (May 2026)
Now I'm working on expanding the work into more parameters and improving performance. I just finished an extremely harsh test of a Nemotron-flavored RVW that consisted of stretches of a random assortment of domains interspersed with long runs of single domains. Across all of it the model didn't forget (and actually improved on some of the more challenging domains). PPL on SmolTalk is still in the ~18 range, which I'd like to get lower, but this is all with only 4B params.
Currently, I'm training a Llama 3.2-flavored RVW with only about 2B params to see how that turns out. Depending on results of that, I may take it to Gemma 4 next.
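For anyone unsure how to read the PPL figure above: perplexity is usually computed as the exponential of the mean negative log-likelihood per token. Here is a minimal sketch, assuming you already have per-token natural-log probabilities from the model. The `perplexity` function name and its input format are illustrative, not taken from the RVW code.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood per token), natural log."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)

# A model that assigns each token probability 1/18 has a perplexity of about 18.
example = perplexity([math.log(1 / 18)] * 5)
```

Roughly speaking, a PPL around 18 means the model is about as uncertain as if it were choosing uniformly among 18 tokens at each step.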
The conclusion is that permission reviews with LLMs, like Claude’s auto mode or Codex auto review, are like using a data center to flip a light switch: overkill.
The main benefit is that your agent’s autonomy can be governed deterministically through policies that can be stored at the user and repo level. The bonus is that you save tokens vs using auto modes.
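To make the idea of deterministic, policy-based governance concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the rule format (glob pattern plus decision), the `decide` function, the choice to check repo rules before user rules, and the fallback to asking. It is not the actual tool's policy format.

```python
from fnmatch import fnmatch

# Hypothetical rule format: (glob pattern over the command string, decision).
USER_POLICY = [
    ("git status*", "allow"),
    ("rm -rf *", "deny"),
]
REPO_POLICY = [
    ("npm test*", "allow"),
    ("git push*", "ask"),
]

def decide(command, repo_policy=REPO_POLICY, user_policy=USER_POLICY):
    """Return 'allow', 'deny', or 'ask' for a command, with no model call."""
    # Assumption: repo rules are checked before user rules; first match wins.
    for pattern, decision in [*repo_policy, *user_policy]:
        if fnmatch(command, pattern):
            return decision
    # Assumption: anything unmatched falls back to asking the user.
    return "ask"
```

The same input always yields the same decision, and no tokens are spent on the check.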
Native application, no web UI, built using Rust + iced.rs, minimal dependencies. NO AI.
I am making my best effort to make it performant. The target audience is users who want the simplicity of notepad [the non-sloppy one], but still with some bells and whistles for taking notes without worrying about managing the metadata manually.
I think with scripting there will be infinite possibilities to play with linear notes, and I want to make that happen.
The ongoing challenges while implementing features are:
1. Loading instantly
2. Keeping it extremely simple to use
3. Keeping the interface minimal
4. Still giving the user ways to find the features easily.
Will have a demo version ready soon.
We are building this because we use such a library in our core business, and a lot of other engineers seem to need it too. We have contributors showing up with bug reports and fixes, and real interest from people building apps around .docx docs.
My previous Show HN post (https://news.ycombinator.com/item?id=46947229) got a lot of skepticism because we're developing heavily with AI, but with active community feedback and proper AI oversight (mostly me), I'm super proud of what we have now.
This has some interesting implications. If you make a mistake, you can always backtrack and try again. If you have a crocheted piece, at least in principle you could find the loose end, free it, and work back stitch by stitch to reverse engineer it. (In practice people don't seem to do a stitch-for-stitch reverse engineering, just like you probably wouldn't bother reimplementing something line by line without a compelling reason; you figure out what's going on in the challenging places just by look and feel and improvise from there.)
I'm oversimplifying somewhat: there are some forms of crochet that include irreversible stitches, yarn can be felted together (entangled, like a cotton ball) to create irreversible bonds between adjacent strands, and often several panels/pieces are joined together irreversibly to create a larger piece.
In addition to these tools, I'm also building automation that will port the tools from the reference implementation (OpenCode) to other harnesses (Claude Code, Cline, Pi, Gemini, Kilo, Codex, others to come?), as well as automation that will either cherry-pick or re-implement commits onto the latest head from upstream.
[1]: https://github.com/Vibecodelicious/context-bonsai-agents#con...
[2]: https://blog.vibecodelicio.us/posts/how-i-fixed-context-wind...
The existing ones were quite expensive, especially when I started out. A friend had the idea to get a cheap/non-functioning lawnmower secondhand, and tear out the circuit board. We're in the process of coding up a new ROS2-based stack that will roam the lawn using GPS, with RTK in the charging station. My friend does most of the electronics stuff, and I focus on the software.
I'm at the point where I will start testing a simple bounding box soon and just have it drive around until it "hits the edge" and then randomly pick a new direction.
It's fun to see the software I build "in real life" instead of as a website, as is the case for my daily job.
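The bounding-box test described above can be tried out in simulation before anything drives on real grass. This is a rough sketch in plain Python, not ROS2 code: the `simulate` function, the box dimensions, the step length, and the "stay put and re-roll the heading" behavior at the edge are all assumptions made for illustration.

```python
import math
import random

def simulate(steps, width=10.0, height=10.0, step_len=0.5, seed=0):
    """Drive straight inside a box; on hitting an edge, pick a new random heading."""
    rng = random.Random(seed)
    x, y = width / 2, height / 2          # start in the middle of the box
    heading = rng.uniform(0, 2 * math.pi)
    path = [(x, y)]
    for _ in range(steps):
        nx = x + step_len * math.cos(heading)
        ny = y + step_len * math.sin(heading)
        if 0 <= nx <= width and 0 <= ny <= height:
            x, y = nx, ny                 # still inside: keep going straight
        else:
            # "Hit the edge": stay where we are and choose a new direction.
            heading = rng.uniform(0, 2 * math.pi)
        path.append((x, y))
    return path

path = simulate(1000)
```

Plotting `path` gives a quick feel for how evenly a purely random strategy covers the area.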
It’s a hobby project in a very early state where it technically works but it’s missing several things I think it needs before I’d use it for anything serious. As of right now it isn’t even complete enough to dogfood a minimal container for itself without an intermediate base image because it can’t target a platform compatible with the distroless uv container image.