Posted by samrolken 11/1/2025

Show HN: Why write code if the LLM can just do the thing? (web app experiment) (github.com)
I spent a few hours last weekend testing whether AI can replace code by executing directly. Built a contact manager where every HTTP request goes to an LLM with three tools: database (SQLite), webResponse (HTML/JSON/JS), and updateMemory (feedback). No routes, no controllers, no business logic. The AI designs schemas on first request, generates UIs from paths alone, and evolves based on natural language feedback. It works—forms submit, data persists, APIs return JSON—but it's catastrophically slow (30-60s per request), absurdly expensive ($0.05/request), and has zero UI consistency between requests. The capability exists; performance is the problem. When inference gets 10x faster, maybe the question shifts from "how do we generate better code?" to "why generate code at all?"
436 points | 324 comments
samrolken 11/2/2025|
Wow, thanks everyone. First HN post ever and it’s this intentionally terrible experiment that I thought was the dumbest weekend project I ever did, and it hit the front page. Perfect.

I’ve been reading through all the comments and the range of responses is really great, and I'm so thankful to everyone for taking the time to comment... from “this is completely impractical” to “but what if we cached the generated code?” to “why would anyone want non-deterministic behavior?” All valid! Though I think some folks are critiquing this as if I was trying to build something production-ready, when really I was trying to build something that would break in instructive ways.

Like, the whole point was to eliminate ALL the normal architectural layers... routes, controllers, business logic, everything, and see what happens. What happens is: it’s slow, expensive, and inconsistent. But it also works, which is the weird part. The LLM designed reasonable database schemas on first request, generated working forms from nothing but URL paths, returned proper JSON from API endpoints. It just took forever to do it. I kept the implementation pure on purpose because I wanted to see the raw capabilities and limitations without any optimizations hiding the problems.
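
For anyone curious, the whole server is basically the shape below (simplified sketch, not the literal repo code; runLLM stands in for the tool-calling loop with the three tools):

  import http from "node:http";

  // Placeholder for the real thing: call the model, let it use its three
  // tools (database/webResponse/updateMemory), loop until it produces a response.
  async function runLLM(prompt: string): Promise<{ status: number; html: string }> {
    throw new Error("model loop goes here");
  }

  http.createServer(async (req, res) => {
    // Every single request goes straight to the model: no routes, no controllers
    const { status, html } = await runLLM(
      `Handle this HTTP request: ${req.method} ${req.url}`
    );
    res.writeHead(status, { "Content-Type": "text/html" });
    res.end(html); // ...30-60 seconds and ~$0.05 later
  }).listen(3000);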

And honestly? I came away thinking this is closer to viable than it should be. Not viable TODAY. Today it’s ridiculous. But the trajectory is interesting. I think we’re going to look back at this moment and realize we were closer to a real shift than we thought. Or maybe not! Maybe code wins forever. Either way, it was a fun weekend. If anyone wants to discuss this or work on projects that respond faster than 30 seconds per request, I’m available for full stack staff engineer or tech co-founder work: sam@samrolken.org or x.com/samrolken

jasonthorsness 11/1/2025||
I tried this as well at https://github.com/jasonthorsness/ginprov (hosted at https://ginprov.com). After a while it sort of starts to all look the same though.
kesor 11/2/2025||
This is just like vibe coding. In vibe coding, you snapshot the results of the LLM's implementation into files that you reuse later.

This project could use something like that. Perhaps ask the LLM to implement a way to store/cache the snapshots of its previous answers. That way, the more you use it, the faster it becomes.
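
Something along these lines (rough sketch; runLLM is a stand-in for however the project invokes the model):

  // Assumed helper, not a real API: one call to the model, HTML back
  declare function runLLM(prompt: string): Promise<string>;

  const snapshots = new Map<string, string>();

  async function handle(method: string, path: string): Promise<string> {
    const key = `${method} ${path}`;
    const cached = snapshots.get(key);
    if (cached) return cached;     // fast after the first hit
    const html = await runLLM(`Handle this HTTP request: ${key}`);
    snapshots.set(key, html);      // snapshot for next time
    return html;
  }

Only safe for idempotent reads, of course; writes would still have to go through the model (or through code it generated).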

ozim 11/2/2025||
The $0.05/request calculation is valid only as long as AI companies continue to burn money in their grab-the-market phase.

Once the dust settles, prices will go up. Even if running models gets cheaper, they will need to earn back all the burned cash.

I’d much rather vibe code an app and get the code to run on some server.

jokethrowaway 11/2/2025|
Not necessarily; hardware and software gains will make tokens cheaper, so we'll see where we are once the VC money runs out (or the entire US economy does; there's a chance AI will pop the tech bubble of the last 20 years: I think tech company valuations are insanely inflated compared to the value they provide).

I can get GPT-3 levels of quality with Qwen 8B, even Qwen 4B in some cases.

kazinator 11/6/2025||
Why generate code at all?

Because there are times when you use code in order to generate content. For instance, a complicated document in a content-creation application. (Anything: graphics, music, corporate documents, ...)

Suppose that, on the spot, AI writes you a software suite in which you create a document.

Do you dare throw that suite away, hoping that AI will write a compatible one tomorrow which can still open and correctly handle all details of that complex document?

crazygringo 11/1/2025||
This is incredibly interesting.

Now what if you ask it to optimize itself? Instead of just:

  prompt: `Handle this HTTP request: ${method} ${path}`,
Append some simple generic instructions to the prompt that it should create a code path for the request if it doesn't already exist, and list all existing functions it's already created along with the total number of times each one has been called, or something like that.

Even better, have it create HTTP routings automatically to bypass the LLM entirely once they exist. Or, do exponential backoff -- the first few times an HTTP request is called where a routing exists, still have the LLM verify that the results are correct, but decrease the frequency as long as verifications continue to pass.

I think something like this would allow you to create a version that might then be performant after a while...?
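
Roughly this shape (all names hypothetical):

  // Assumed helpers: the slow path also emits a handler; verify spot-checks one
  declare function llmHandle(key: string, req: Request): Promise<Response>;
  declare function llmVerify(key: string, res: Response): Promise<boolean>;

  const routes = new Map<string, { handler: (req: Request) => Promise<Response>; calls: number }>();

  async function handle(method: string, path: string, req: Request): Promise<Response> {
    const key = `${method} ${path}`;
    const route = routes.get(key);
    if (!route) return llmHandle(key, req);   // no code path yet: the LLM does the thing
    route.calls++;
    const res = await route.handler(req);
    // verify on calls 1, 2, 4, 8, 16... i.e. exponential backoff
    if (Number.isInteger(Math.log2(route.calls)) && !(await llmVerify(key, res.clone()))) {
      routes.delete(key);                     // demote: next request regenerates the route
    }
    return res;
  }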

sixdimensional 11/1/2025|
This brings a whole new meaning to "memoizing", if we just let the LLM be a function.

In fact, this thought has been percolating in the back of my mind but I don't know how to process it:

If LLMs were perfectly deterministic - i.e. for the same input we get the same output - and we actually started memoizing results for input sets by materializing them - what would that start to resemble?

I feel as though such a thing might start to resemble the source information the model was trained on. The fact that the model compresses all the possibilities into a limited space is exactly what makes it more valuable - instead of having to memoize every input, function body, and output an LLM could generate, it just stores the model.

But this blows my mind somehow because if we DID store all the "working" pathways, what would that knowledgebase effectively represent and how would intellectual property work anymore in that case?

Thinking about functional programming, I keep coming back to the LLM as the "anything" function, where a deterministic seed and input always produce the same output, backed by a knowledgebase of pregenerated outputs to speed up retrieval of acceptable results for a given seed and set of inputs... I can't put my finger on it.. is it basically just a search engine then?

Let me try another way...

If I ask an LLM to generate a function for "what color is the fruit @fruit?", where fruit is the variable, and I memoize that @fruit = banana + seed 3 is "yellow", then the set of prompt, input "@fruit" = banana, seed = 3, output = "yellow" is now a fact that I could just memoize.

Would retrieving the memoized result be faster than calculating the result via the LLM?

And, what do we do with the thought that that set of information is "always true" with regards to intellectual property?

I honestly don't know yet.
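
The mechanical half is at least easy to picture (pure sketch; llm stands in for a seeded, deterministic model call):

  declare function llm(prompt: string, input: string, seed: number): Promise<string>;

  const memo = new Map<string, string>();

  // The LLM as the "anything" function, memoized on (prompt, input, seed)
  async function anything(prompt: string, input: string, seed: number): Promise<string> {
    const key = JSON.stringify([prompt, input, seed]);
    const hit = memo.get(key);
    if (hit !== undefined) return hit;          // O(1) lookup vs. seconds of inference
    const out = await llm(prompt, input, seed); // assumes determinism at a fixed seed
    memo.set(key, out);
    return out;
  }

  // anything("what color is the fruit @fruit?", "banana", 3) -> "yellow",
  // and from then on it's just a fact sitting in a table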

hoppp 11/2/2025||
Because LLMs have a big chance of screwing things up. They can't take responsibility. A person can take responsibility for code, but can they do the same for tool calling? Not really, because it's probabilistic. A web service shouldn't be probabilistic.
CoderLim110 11/2/2025||
I’ve been thinking about similar questions myself:

1. If code generation eventually works without human intervention, and every Google search could theoretically produce a real-time, custom-generated page, does that mean we no longer need people to build websites at all? At that point, “web development” becomes intent-shaping rather than coding.

2. I’m also not convinced that chat is the ideal interface for users. Natural language feels flexible, but it can also be slow, ambiguous, and cognitively heavier than clicking a button. Maybe LLM-driven systems will need new UI models that blend conversation with more structured interaction, instead of assuming chat = the future.

Curious how others here think about those two points.

_heimdall 11/2/2025|
If (1) is true, there is no use for the web at all really.

The only value of an LLM generating a realistic HTML page as an answer is to make it appear as though the answer was found on a preexisting page, lending the answer some level of validity.

If users really are fine with the LLM just generating the answer on the fly, doing so in HTML is completely unnecessary. Just give the user answers in text form.

CoderLim110 11/2/2025||
True. Most users just want their problem solved — they don’t care how it’s solved.
whatpeoplewant 11/1/2025||
Cool demo—running everything through a single LLM per request surfaces the real bottlenecks. A practical tweak is an agentic/multi‑agent pattern: have a planner synthesize a stable schema+UI spec (IR) once and cache it, then use small executor agents to call tools deterministically with constrained decoding; run validation/rendering in parallel, stream partial UI, and use a local model for cheap routing. That distributed, parallel agentic AI setup slashes tokens and latency while stabilizing UI across requests. You still avoid hand‑written code, but the system converges on reusable plans instead of re‑deriving them each time.
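
In sketch form (all names assumed, not a real library):

  type PlanIR = { schema: string; uiTemplate: string };
  declare function plannerLLM(path: string): Promise<PlanIR>;                  // big model, runs once per path
  declare function executeTools(plan: PlanIR, req: Request): Promise<unknown>; // small model or plain code
  declare function render(template: string, data: unknown): Response;

  const plans = new Map<string, PlanIR>();

  async function handle(path: string, req: Request): Promise<Response> {
    let plan = plans.get(path);
    if (!plan) {
      plan = await plannerLLM(path);  // derive schema + UI spec once...
      plans.set(path, plan);          // ...then reuse it, which also keeps the UI consistent
    }
    const data = await executeTools(plan, req);
    return render(plan.uiTemplate, data);  // no big model in the hot path
  }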
maderalabs 11/1/2025|
This is awesome, and proves that code, really, is a hack. People don’t want code. It sucks, it’s hard to maintain, it has bugs, it has to be updated all the time. Gross.

What people want isn’t code - they want computers to do stuff for them. It just happens that right now, code is the best way you can do it.

The paradigm WILL change. It’s really just a matter of when. I think the point you make that these are problems of DEGREE, not problems of KIND is very important. It’s plausible, now it’s just optimization, and we know how that goes and have plenty of history to prove we consistently underestimate the degree to which computation can get faster and cheaper.

Really cool experiment!

losteric 11/1/2025||
Code is a hack in the same way that gears and wheels and levers are hacks. People don’t want mechanical components, they just want machines to do stuff for them.
reeredfdfdf 11/2/2025||
Most web applications are CRUD apps - they store information in a database and allow modifying & retrieving it later. People generally expect these systems to be deterministic, so that the data you submit will later be available in the same format.

I certainly wouldn't want a patient healthcare system that might return slightly different results, or store the data in different format each time you make a request. Code is and will continue to be the best way to build deterministic computer information systems, regardless of whether it's generated by humans or AI.
