Posted by samrolken 3 days ago

Show HN: Why write code if the LLM can just do the thing? (web app experiment) (github.com)
I spent a few hours last weekend testing whether AI can replace code by executing directly. Built a contact manager where every HTTP request goes to an LLM with three tools: database (SQLite), webResponse (HTML/JSON/JS), and updateMemory (feedback). No routes, no controllers, no business logic. The AI designs schemas on first request, generates UIs from paths alone, and evolves based on natural language feedback. It works—forms submit, data persists, APIs return JSON—but it's catastrophically slow (30-60s per request), absurdly expensive ($0.05/request), and has zero UI consistency between requests. The capability exists; performance is the problem. When inference gets 10x faster, maybe the question shifts from "how do we generate better code?" to "why generate code at all?"
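The architecture described above, where every request goes straight to the model through its tools, can be sketched roughly like this. The tool names (database, webResponse) come from the post; the dispatch loop and the stubbed model call are illustrative assumptions, not the repo's actual implementation:

```python
# Sketch of the "no routes, no controllers" request loop from the post.
# The model call is stubbed out; the real version would hit an LLM API
# and let the model decide which tools to invoke from the path alone.
import sqlite3

conn = sqlite3.connect(":memory:")

def tool_database(sql: str):
    """Tool 1: run whatever SQL the model writes (schema design included)."""
    cur = conn.execute(sql)
    conn.commit()
    return cur.fetchall()

def tool_web_response(body: str, content_type: str = "text/html"):
    """Tool 2: whatever the model emits becomes the HTTP response."""
    return {"content_type": content_type, "body": body}

def handle_request(path: str, llm=None):
    """Every request is handed to the model; there is no application code.
    `llm` is a callable (prompt -> list of tool-call dicts), stubbed here."""
    prompt = f"You are a contact manager. Serve: {path}"
    calls = llm(prompt) if llm else [
        {"tool": "database",
         "args": {"sql": "CREATE TABLE IF NOT EXISTS contacts (name TEXT)"}},
        {"tool": "webResponse", "args": {"body": "<h1>Contacts</h1>"}},
    ]
    response = None
    for call in calls:
        if call["tool"] == "database":
            tool_database(**call["args"])
        elif call["tool"] == "webResponse":
            response = tool_web_response(**call["args"])
    return response
```

The 30-60s latency and per-request cost the post mentions come from the fact that the `llm` callable runs on every single request, with no cached routes or compiled logic in between.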
430 points | 317 comments
ManuelKiessling 2 days ago|
I think there might be a middle ground that could be worth exploring.

On the one hand, there's "classical" software that is developed here and deployed there: if you need a change, you have to go to the developers, ask for a change and a deployment, and only then does the change reach your hands. The developers' work might be LLM-assisted, but that doesn't change the principle.

The other extreme is what has been described here, where the LLM provides the software "on the fly".

What I'm imagining is software deployed on a system and provided in the usual way: say, a web application for managing inventory.

Now, you use this software as usual.

However, you can also "meta-use" the software, as in: you click a special button, which opens a chat interface to an LLM.

But the trick is, you don't use the LLM to support your use case (as in "Dear LLM, please summarize the inventory").

Instead, you ask the LLM to extend the software itself, as in: "Dear LLM, please add a function that allows me to export my inventory as CSV".

The critical part is what happens behind the scenes: the LLM modifies the code, runs quality checks and tests, snapshots the database, applies migrations, and then switches you to a "preview" of the new feature on a fresh, dedicated instance with a copy of all your data.

Once you are happy with the new feature (maybe after some more iterations), you can activate/deploy it for good.
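The behind-the-scenes step can be sketched as a small pipeline. All function names here are hypothetical; in practice each step would wrap real infrastructure (version control, a test runner, migration tooling):

```python
# A minimal sketch of the "meta-use" preview pipeline: copy the code and
# data, apply the LLM-authored change, and only expose the result as a
# preview once checks pass. Production is never touched.
import pathlib
import shutil
import tempfile

def build_preview(app_dir: str, db_path: str, llm_patch: str) -> dict:
    """Stage an LLM-authored change on a copy of the app and its data."""
    preview = pathlib.Path(tempfile.mkdtemp(prefix="preview-"))
    shutil.copytree(app_dir, preview / "app")   # snapshot the code
    shutil.copy(db_path, preview / "data.db")   # snapshot the database
    (preview / "app" / "feature.patch").write_text(llm_patch)
    # In a real system: apply the patch, run the test suite and the
    # migrations against the copied database, and refuse to expose the
    # preview instance unless everything passes.
    checks_passed = True  # placeholder for e.g. subprocess.run(["pytest"])
    return {"path": str(preview), "ready": checks_passed}
```

Promotion ("activate/deploy it for good") would then be a separate step that swaps the preview in for production, which is where the parallel-edit problem mentioned below gets hard.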

I imagine this could be a promising strategy for turning users into power users, but there is certainly quite some complexity involved in getting it right. For example, what if the application has multiple users, and two users want to change the application in parallel?

Nevertheless, shipping software together with an embedded virtual developer might be useful.

ohadpr 2 days ago||
You’d be surprised to know this works even without the tools, with just the context window as a persistence layer.

I did a POC for this in July - https://www.ohad.com/2025/07/10/voidware/

mrbluecoat 2 days ago||
> It works.

CEO stops reading, signs a contract, and fires all developers.

> It's just catastrophically slow, absurdly expensive, and has the memory of a goldfish.

Reality sinks in two months later.

taylorlunt 2 days ago||
This reminds me of the recent Claude Imagine, which slipped under most people's radar, but let you create web interfaces of any kind on the fly. No JS code was generated; instead, any time the user clicked a button, the AI itself would update the page accordingly. It was also slow and terrible, but a fun idea.
chrisss395 2 days ago|
You may not have been the target user. I found it intriguing as a way to quickly bring a concept to life for dumb users. I think LLMs significantly lower the cost (barriers) of making something quick and dirty. I'm loving Claude Code in an LXC sandbox for taking my half-thought-out ideas and making me something. Most of it is throwaway, but it helps me evolve whatever problem is in my head that I'm trying to solve, and that I find valuable.
ed 2 days ago||
Like a lot of people in this thread I prototyped something similar. One experiment just connected GPT to a socket and gave it some bindings to SQLite.

With a system prompt like "You're an HTTP server for a twitter clone called Gwitter," you can interact directly with the LLM from a browser.

Of course, it was painfully slow, quickly went off the rails, and revealed that LLMs are bad at business logic.
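That setup can be sketched as a one-connection toy server. The system prompt is from the comment; everything else (function names, the shape of the model callable) is assumed for illustration:

```python
# Sketch of "GPT on a socket": forward each raw HTTP request to a model
# and write back whatever it returns, verbatim. The model is a callable
# here; the real version would call an LLM API with the system prompt.
import socket
import threading

SYSTEM_PROMPT = "You're an HTTP server for a twitter clone called Gwitter."

def serve_once(llm, host="127.0.0.1", port=0):
    """Accept one connection and let the model write the entire response."""
    srv = socket.socket()
    srv.bind((host, port))
    srv.listen(1)
    actual_port = srv.getsockname()[1]

    def run():
        conn, _ = srv.accept()
        request = conn.recv(65536).decode(errors="replace")
        reply = llm(SYSTEM_PROMPT, request)  # model output IS the response
        conn.sendall(reply.encode())
        conn.close()
        srv.close()

    threading.Thread(target=run, daemon=True).start()
    return actual_port
```

Nothing parses or validates the request, and nothing constrains the response to be valid HTTP, which is exactly why it goes off the rails so quickly.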

But something like this might be the future. And on a longer time horizon, mentioned by OP and separately by sama, it may be possible to render interactive apps as streaming video and bypass the browser stack entirely.

So I think we're at the Mother of All Demos stage of things. These ideas are in the water but not really practical today. As with MoaD, it may take another 25 years for them to come to fruition.

imiric 2 days ago||
"The Mother of All Demos" was a showcase of an early generation of technology that existed and had a practical purpose. There was never a question if and how the technology would improve. It was only a matter of time and solid engineering.

On the other hand, improvements to "AI" of similar scales are very much uncertain. We have seen moderate improvements from brute force alone, i.e. by throwing more data and compute at the problem, but this strategy has reached diminishing returns, and we have been at a plateau for about a year now. We've seen improvements by applying better engineering (MCP, "agents", "skills", etc.), but have otherwise seen the same tech demos in search of a problem, with a bit more polish at every iteration.

There's no doubt that statistical models are a very useful technology with many applications, some of which we haven't discovered yet. But given the technology we have today, the claim that something like it could be used to generate interactive video which could be used instead of traditional software is absurd. This is not a matter of gradual iterations to get there—it would require foundational breakthroughs to work even remotely reliably, which is as uncertain as LLMs were 10 years ago.

In any case, whatever sama and his ilk have to say about this topic is hardly relevant. These people would say anything to keep the hype-driven valuation pump going.

amrocha 2 days ago||
There is no world where the technology that exists as we understand it leads to your techno-dystopia in any way.

These models can’t even do continuous learning yet. There’s no evidence that the current tech will ever evolve beyond what it is today.

Not to mention that nobody is asking for any of this.

hathawsh 2 days ago||
Very insightful and very weird. It's impractical now, but it's a glimpse into some significant part of our future. I can imagine an app called The Last Game, which morphs itself into any game you might want to play. "Let's play 3-D chess like Star Trek: TNG. You should play as Counselor Troi."

(I also just thought of that episode about Moriarty, a Holodeck character, taking over the ship by tricking the crew. It doesn't seem quite so far-fetched anymore!)

asim 2 days ago||
This is the future I had always envisioned but could never execute on: the idea that the visual format is dynamic. The data is there, and we have the logic and APIs, but we need transformation into visual formats based on some input. Ultimately this is the answer. You'll get some pregenerated "cards", embeds, and widgets, but larger flows will also be generated and then saved to be used over and over. We're really in the early innings of it all.

What it also means is that how we consume content will change. The web page is going to get broken down into snippets, because why do we need the web page, or a website at all? We don't. It's specific actions we want to perform, and so we'll get the output of those actions. It also means that, in the long term, how data is stored and accessed will change to reflect a more efficient format for LLMs, e.g. the vector database for RAG is only the beginning.
silasdavis 2 days ago||
I love the idea of it shifting from one nondescript design system to another on every other page change. How disorientating. Weird and boring at the same time.
tmsbrg 2 days ago||
So the AI basically hallucinates a webapp?

I guess any user can just request something like /api/getdatabase/dumppasswords and it will hand them the passwords?

or /webapp?html=<script>alert()</script> and run arbitrary JS?

I'm surprised nobody mentioned that security is a big reason not to do anything like this.

psadri 3 days ago|
Awesome experiment!!

I did a version of this where the AI writes tools on the fly but gets to reuse them on future calls, trying to address the cost / performance issues. Migrations are challenging because they require some notion of an atomic update across the db and the tools.

This is a nice model of organically building software on the fly and even letting end users customize it on the fly.
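The reuse idea from the comment above can be sketched as a cache keyed by task, so the expensive generation call happens only on a miss. All names here are hypothetical; `generate_tool_source` stands in for the LLM call:

```python
# Sketch of "write tools on the fly, reuse them on future calls": the
# model generates tool source once per task, and subsequent calls hit
# the cache instead of paying the inference cost again.
def make_tool_cache(generate_tool_source):
    cache = {}

    def get_tool(task: str):
        if task not in cache:
            source = generate_tool_source(task)  # slow LLM call, once per task
            namespace = {}
            exec(source, namespace)              # compile the generated tool
            cache[task] = namespace["tool"]
        return cache[task]                       # fast path on reuse

    return get_tool
```

The migration problem the comment raises shows up here too: once a generated tool is cached, a schema change has to invalidate or rewrite it atomically with the database, or the cached tool silently targets the old schema.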
