Posted by david927 1 day ago
Ask HN: What are you working on? (June 2026)
I'm also looking into coding harness self-improvement [2]. An inner LLM (raw LLM request) plus a harness solves coding tasks, and an outer agent like Claude or Codex proposes harness changes. My experiments over the past few months made me realize that the self-improvement everyone is talking about is really an experiment design problem. I wrote about it here [3]. I'm continuing to improve the infra around the self-improvement loop to increase the signal-to-noise ratio per experiment. I'm also generalizing the infra to expand beyond terminal bench tasks and to collect data across different models (harness-bound vs model-bound).
[1] https://github.com/workofart/ml-by-hand
[2] https://github.com/workofart/harness-experiment
[3] https://www.henrypan.com/blog/2026-05-25-self-improvement-ha...
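To make the "experiment design" framing above concrete, here is a minimal sketch of an outer loop that only keeps a harness change when the improvement stands out from run-to-run noise. Everything in it is an assumption for illustration: `propose` and `evaluate` are hypothetical callables standing in for the outer agent and the inner LLM + harness run, and the two-standard-error threshold is an arbitrary choice, not something taken from the linked repo or post.

```python
import math
import statistics


def accept_change(baseline, candidate, min_effect_in_se=2.0):
    """Accept only if the mean gain exceeds a multiple of its standard error.

    `baseline` and `candidate` are lists of scores from repeated runs
    (at least two runs each, so a sample variance exists).
    """
    diff = statistics.mean(candidate) - statistics.mean(baseline)
    se = math.sqrt(
        statistics.variance(baseline) / len(baseline)
        + statistics.variance(candidate) / len(candidate)
    )
    if se == 0.0:
        # No measured noise at all: any positive gain counts.
        return diff > 0
    return diff > min_effect_in_se * se


def self_improve(harness, propose, evaluate, rounds=5, trials=8):
    """Hypothetical outer loop: propose a change, re-run, keep it if it clears the noise."""
    baseline = [evaluate(harness) for _ in range(trials)]
    for _ in range(rounds):
        candidate = propose(harness)  # outer agent suggests a harness change
        scores = [evaluate(candidate) for _ in range(trials)]  # repeated runs
        if accept_change(baseline, scores):
            harness, baseline = candidate, scores
    return harness
```

More trials per candidate shrink the standard error, which is one way to read "signal-to-noise ratio per experiment": the same gain becomes easier to tell apart from noise.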
I've primarily been testing it by building out my AI tool chaz into an Eidetica-native AI Agent framework for decentralized Agent sessions. It's working surprisingly well: it maps pretty well onto the storage model, and it's uncovering issues with Eidetica I need to fix (which was always my primary reason for building it anyway). https://github.com/arcuru/chaz
Separately I'm building OptiMap, a SIMD-accelerated hashmap repo that explores the design space for hashmaps and benchmarks different approaches. This is mostly for my own learning but I'll eventually turn it into a blog post. https://github.com/arcuru/optimap
This has been in the works for many years! The project originally started as web forms driving After Effects templates on a Windows server, and has now evolved to a point where the web technology landscape has matured enough to build a full-on motion graphics editor right in the browser, using WebGPU and WebCodecs.
It's a project I have been working on for quite a long time and I released it on TestFlight about a week ago. It was really nice to work on something end-to-end, from creating a wrapper around llama.cpp with support for prompt caching/forking and automatic model loading and unloading based on device memory constraints, to the custom agentic harness the app runs on. I have also spent quite a lot of time on agent execution modes that I hope will make it easier to reason about agent security with respect to prompt injection attacks.
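For readers wondering what "automatic model loading and unloading based on device memory constraints" could look like, here is a minimal sketch using least-recently-used eviction under a fixed byte budget. This is an assumed policy for illustration only: the `ModelCache` class, its method names, and the LRU choice are hypothetical and are not taken from the app or from llama.cpp.

```python
from collections import OrderedDict


class ModelCache:
    """Tracks which models are resident and evicts the least recently used."""

    def __init__(self, budget_bytes):
        self.budget_bytes = budget_bytes
        self._loaded = OrderedDict()  # model name -> size in bytes, oldest first

    def used_bytes(self):
        return sum(self._loaded.values())

    def acquire(self, name, size_bytes):
        """Mark `name` as in use; return the names evicted to make room."""
        if size_bytes > self.budget_bytes:
            raise MemoryError(f"{name} does not fit in the memory budget")
        if name in self._loaded:
            self._loaded.move_to_end(name)  # refresh recency, nothing to load
            return []
        evicted = []
        while self.used_bytes() + size_bytes > self.budget_bytes:
            oldest, _ = self._loaded.popitem(last=False)
            evicted.append(oldest)  # a real wrapper would free the weights here
        self._loaded[name] = size_bytes
        return evicted
```

A real implementation would also need to query the device for the memory actually available, which varies by platform and is left out here.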
What I'm really hoping for now is to get actual feedback, to know if users end up having real use cases where the app is truly useful / interesting for them, to understand what should most urgently be improved etc.
Right now your LP reads like a technical doc rather than a product page.
My starting hypothesis is power users and devs, people who want to experiment with local and cloud LLMs, build their own custom agents, and try experiences they wouldn't usually find in consumer AI mobile apps. As the app is now closer to release, I think it has reached a level where it is likely complete enough that there are some viable combinations of its features that can actually solve concrete user problems. I could see the app being used to create agents that serve as small shortcuts tailored to the users' needs, with all the flexibility it enables. A bit like a more iOS-native OpenClaw with opinionated takes on tooling and security. I personally used it to create a food tracker that has a good understanding of my daily routine and also TL;DRs of various sources (including HN) surfaced as suggestions on the home page.
I don't yet know the exact words those users would use to describe their problem, so surfacing that is part of why I'm putting it in front of testers first.
I am close to buying a Claude sub but the thought of it going haywire and costing me extra money in tokens is still too scary. Not to mention how much provider LLMs (not sure on the correct terminology for them as opposed to local) could hamper your reverse engineering efforts (looking at you Fable).
I really want to solve this scooter and have my own app for it. There's a firmware update feature in the app, so maybe I could at least dump the firmware and that would help. Anyone have suggestions on what language model would be best for this task (analyzing a decompiled and obfuscated Android app / analyzing dumped firmware)? I have a 96gb macbook and would prefer a local one (I guess to make myself feel better for having spent money on it?) but something through OpenRouter or whatever would do just fine as well.
96gb macbook is crazy lol. Good luck!
When the limit is hit, you just can't use Claude anymore until the new month / week / day starts, correct? We have Claude at work but we are just told to use it as much as needed and not worry about it, so each of us easily amasses around 100eur in usage every month on the enterprise seats plan.
Thank you for the answer
The most challenging part was getting MVTs to fly, but it is already very fast, even on mobile. The fun part is tarring the solver solves correctly :) No public version though, but I can upload a screen grab somewhere should anyone be interested.
So I've been working on https://fringeflypost.com/, an event tracker with maps, search and filter, scheduling, and sharing with friends that's offline first. It syncs down a locally stored sqlite database and caches assets pretty aggressively.
(You don't actually need to sign up, and you can just jump into the list of shows directly here https://fringeflypost.com/shows).
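As a rough illustration of the "syncs down a locally stored sqlite database" idea above, here is a minimal sketch of a snapshot sync that skips the download when the local copy is current and swaps the file atomically otherwise. The details are assumptions, not how the site works: `remote_version` and the `download` callable are hypothetical, and using SQLite's `user_version` pragma as the version stamp is just one convenient option.

```python
import os
import sqlite3
import tempfile


def local_version(db_path):
    """Return the snapshot's user_version, or None if there is no local copy."""
    if not os.path.exists(db_path):
        return None
    con = sqlite3.connect(db_path)
    try:
        return con.execute("PRAGMA user_version").fetchone()[0]
    finally:
        con.close()


def sync_snapshot(db_path, remote_version, download):
    """Replace the local database if its version differs from the remote one.

    `download` is a hypothetical callable returning the snapshot file as bytes.
    Returns True if a new snapshot was installed, False if already current.
    """
    if local_version(db_path) == remote_version:
        return False  # offline-friendly: nothing to fetch
    directory = os.path.dirname(os.path.abspath(db_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(download())
    os.replace(tmp_path, db_path)  # atomic swap, readers never see a partial file
    return True
```

If the download fails, the old file is still in place, so the app keeps working from the last good snapshot.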
- SophAI (https://www.sophai.app/): an app to connect the dots across domains. As a CTO, I read across multiple domains (tech, design, business, e-com) and often have to connect the dots. I am building this primarily for myself. It is basically an RSS parser with a big AI prompt to connect the dots across the blog posts. As I type this, I'm working on adding podcasts to the app.
- CTO field notes (https://www.ctofieldnotes.com/): collection of essays growing out of my 30 years in IT services. One essay every Tuesday.
Side effect: KYC gets stupid cheap. Cryptographic credential verification vs. traditional document checks is not even close on cost (≈90% cheaper). Qualified Electronic Signatures and EUDI Wallet payment systems are also coming in the next few years.