Posted by samwillis 17 hours ago
* A small statically generated Hugo website, but with some clever linking/taxonomy stuff. This was a fairly self-contained project that is now 'finished', but it wouldn't have taken me more than a few days to code up from scratch.
* A scientific simulation package, attempting a clean refresh of an existing one which I can point at for implementation details, but which has some technical problems I would like to reduce/remove.
Claude Code absolutely smashed the first one - no issues at all. With the second, no matter what I tried, it just made lots of mistakes, even when I told it simply to copy the problematic parts and transpose them into the new structure. It got to a point where it wasn't correct and couldn't escape a bit of a 'doom loop'; no matter how much prompting and how many hints I gave it, it required manual intervention.
I signed up for Claude Code myself this week too, given the $10/month promo. I have experience with AI from using AWS Kiro at work and prompting Claude Opus directly for convos. After just 2 days and ~5-6 vibe coding sessions in total, I got a working Life-OS app created for my needs:
- Clone of Todoist with the features that I actually use/want: projects, tags, due dates, and quick adding with a Todoist-like text-aware input (e.g. !p1, Today, etc.)
- A Fantastical-like calendar; again, the ~80% of its features I actually used
- A Habit Tracker
- A Goal Tracker (Quarterly / Yearly)
- A dashboard page showing today's summary with single-click edit/complete marking
- User authentication and sharing of various features (e.g. tasks)
- Docker deployment which will eventually run on my NAS
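For the curious, the Todoist-style text-aware quick-add input mentioned above (e.g. !p1, Today) boils down to a small token scanner. Here's a minimal sketch in Rust; the names, struct, and syntax rules are my own assumptions for illustration, not the app's actual code:

```rust
// Hypothetical sketch of a Todoist-style quick-add parser: "!pN" tokens
// set priority and bare words like "today" set the due date; everything
// else becomes the task title.
#[derive(Debug, Default, PartialEq)]
struct QuickAdd {
    title: String,
    priority: Option<u8>,
    due: Option<String>,
}

fn parse_quick_add(input: &str) -> QuickAdd {
    let mut task = QuickAdd::default();
    let mut words = Vec::new();
    for tok in input.split_whitespace() {
        // "!p1".."!p4"-style priority markers
        if let Some(p) = tok.strip_prefix("!p") {
            if let Ok(n) = p.parse::<u8>() {
                task.priority = Some(n);
                continue;
            }
        }
        // Bare date keywords (a real parser would handle far more)
        if tok.eq_ignore_ascii_case("today") || tok.eq_ignore_ascii_case("tomorrow") {
            task.due = Some(tok.to_ascii_lowercase());
            continue;
        }
        words.push(tok);
    }
    task.title = words.join(" ");
    task
}

fn main() {
    let t = parse_quick_add("Buy milk !p1 today");
    assert_eq!(t.title, "Buy milk");
    assert_eq!(t.priority, Some(1));
    assert_eq!(t.due.as_deref(), Some("today"));
    println!("{t:?}");
}
```

The appeal of this shape is that unrecognized tokens degrade gracefully into the title, so the input never "fails" to parse.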
I'm going to add a few more things and cancel quite a few subscriptions. It one-shots all tasks within minutes. It's wild. I can code but didn't bother looking at the code myself, because ... why.
Even though I do not earn US tech money, I am tempted to buy the Max subscription for a month or two, although the price is still hard to swallow.
Claude and vibe coding are wild. If I can clone Todoist within a few vibe coding sessions and then implement any additional/new feature I want within minutes, instead of proposing, praying, and then waiting for months, why would I pay $$$...
What is `FrameState::render_placeholder`?
```
pub fn render_placeholder(&self, frame_id: FrameId) -> Result<FrameBuffer, String> {
    let (width, height) = self.viewport_css;
    let len = (width as usize)
        .checked_mul(height as usize)
        .and_then(|px| px.checked_mul(4))
        .ok_or_else(|| "viewport size overflow".to_string())?;

    if len > MAX_FRAME_BYTES {
        return Err(format!(
            "requested frame buffer too large: {width}x{height} => {len} bytes"
        ));
    }

    // Deterministic per-frame fill color to help catch cross-talk in tests/debugging.
    let id = frame_id.0;
    let url_hash = match self.navigation.as_ref() {
        Some(IframeNavigation::Url(url)) => Self::url_hash(url),
        Some(IframeNavigation::AboutBlank) => Self::url_hash("about:blank"),
        Some(IframeNavigation::Srcdoc { content_hash }) => {
            let folded = (*content_hash as u32) ^ ((*content_hash >> 32) as u32);
            Self::url_hash("about:srcdoc") ^ folded
        }
        None => 0,
    };

    let r = (id as u8) ^ (url_hash as u8);
    let g = ((id >> 8) as u8) ^ ((url_hash >> 8) as u8);
    let b = ((id >> 16) as u8) ^ ((url_hash >> 16) as u8);
    let a = 0xFF;

    let mut rgba8 = vec![0u8; len];
    for px in rgba8.chunks_exact_mut(4) {
        px[0] = r;
        px[1] = g;
        px[2] = b;
        px[3] = a;
    }

    Ok(FrameBuffer {
        width,
        height,
        rgba8,
    })
}
}
```

What is it doing in these diffs?
https://github.com/wilsonzlin/fastrender/commit/f4a0974594e3...
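In short, the function renders a solid-color frame buffer whose color is a deterministic function of the frame id and a hash of the navigation URL, so different frames get visibly different placeholder colors. The color derivation reduces to a small pure function; here's a standalone sketch of it (names are mine, and this is not the repo's actual helper):

```rust
// Illustrative sketch of render_placeholder's color derivation: XOR the
// frame id's low three bytes with the URL hash's low three bytes, one
// byte per RGB channel, so each frame gets a distinct, stable color.
// That makes cross-frame pixel "cross-talk" easy to spot in tests.
fn fill_color(id: u32, url_hash: u32) -> [u8; 4] {
    let r = (id as u8) ^ (url_hash as u8);
    let g = ((id >> 8) as u8) ^ ((url_hash >> 8) as u8);
    let b = ((id >> 16) as u8) ^ ((url_hash >> 16) as u8);
    [r, g, b, 0xFF] // alpha is always fully opaque
}

fn main() {
    // With a zero hash, the color is just the frame id's low three bytes.
    assert_eq!(fill_color(0x0003_0201, 0), [0x01, 0x02, 0x03, 0xFF]);
    // XOR-ing in a hash perturbs all three channels.
    assert_eq!(fill_color(0, 0x00FF_FFFF), [0xFF, 0xFF, 0xFF, 0xFF]);
    println!("ok");
}
```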
I'd be really curious to see the amount of work/rework over time, and the token/time cost for each additional actual completed test case.
But, if I'm being fair, a full working browser from scratch is just as good.
I'm very bullish on LLMs building software, but this doesn't mean the death of software products any more than 3D printers meant the death of factories.
The hype may be similar; if that's your point, then I agree. But the weakness of 3D printing is the range of materials and the conditions needed to work with them (titanium is merely extremely difficult, but no sane government will let the general public buy tetrafluoroethylene as a feedstock), while the weakness of machine learning (even more broadly than LLMs) is the number of examples it requires in order to learn anything.
I really dislike this as a measure. An LLM on a CPU is also long-running, because it's slow. I get what it's meant to convey, but time is a terrible measure of anything if tk/s (tokens per second) isn't constant.
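To put a number on that point: with a fixed token budget, wall-clock time is just tokens divided by throughput, so the same job looks 100x "longer-running" on hardware with 1/100th the tk/s. A toy sketch with made-up numbers:

```rust
// Toy illustration (all numbers made up): identical work, wildly
// different wall-clock time, purely because of throughput (tk/s).
fn wall_time_secs(total_tokens: u64, tokens_per_sec: f64) -> f64 {
    total_tokens as f64 / tokens_per_sec
}

fn main() {
    let tokens = 1_000_000; // same token budget for both runs
    let gpu = wall_time_secs(tokens, 2_000.0); // fast inference
    let cpu = wall_time_secs(tokens, 20.0);    // slow inference
    assert_eq!(gpu, 500.0);
    assert_eq!(cpu, 50_000.0);
    println!("GPU: {gpu:.0}s, CPU: {cpu:.0}s");
}
```

Reporting token counts (or cost) alongside elapsed time removes the hardware-speed confound.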
I can create a web browser in under a minute in Copilot if I ask it to build a WinForms project that embeds the WebView2 "Edge" component and just adds an address bar and a back button.
If one vulnerability exists in those crates, well, that's that.
It kinda blows my mind that this is possible: to build a browser engine that approximates a somewhat-working website renderer, even if we take the most pessimistic interpretation of events (heavy human steering, reliance on existing libraries, sloppy code quality in places, not all versions compiling, etc.).
The positive views mostly come from people who point out that what matters in the end is what the code does, not what it looks like: users don't see the code, nor do they care about it. And even for businesses that do care, LLMs may be the ones who have to pay down any technical debt that builds up.
* Anyone in a field where mistakes are expensive. In one project, I asked the LLM to code-review itself and it found security vulnerabilities in its own solutions. It's probably still got more I don't know about.
** In the original sense of just letting the LLM do whatever it wanted in response to the prompt, never reading or code reviewing the result myself until the end.
Boys are trying to single-shot a browser when even a moderately complex task can derail a repo. There's not much info, which might be deliberate, but from what I can pick out, their value-add was "distributed computing and organisational design", though they simplified that too. I agree that simplicity is always the first option, but a flat filesystem structure without standards will not work. Period.
If AI could reach the point where we actually trusted the output, then we might stop checking it.
It's a very real issue; people just seem to assume their code is wrong rather than the compiler. I've personally reported 12 GCC bugs over the last 2 years, and there are 1239 open wrong-code bugs currently.
Here's an example of a simple one in the C frontend that has existed since GCC 4.7: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105180
AI coding agents are still a huge force-multiplier if you take this approach, though.
It would be walking the motorcycle.
All code interactions happen through agents.
I suppose the question is whether the agents only produce Swiss-cheese solutions at scale, with no way to fill in those gaps (at scale). If so, then yeah, fully agentic coding is probably a pipe dream.
On the other hand, if you can stand up a code-generation machine where it's watts + GPUs + time => software products, then well... it's only a matter of time until app stores entirely disappear or get really weird. It's hard to fathom the change that's coming to our profession in that world.