Posted by vbtechguy 5 days ago
I worry that because we can now instantly produce a bunch of JS to do X thing, we will be incentivized not to change the underlying tools (because one, only AIs are using them, and two, AIs won't know how to use the new thing).
I worry this will stall progress.
It's not really that different from taking your 2022 car to a shop to adjust your camless engine, and assuming everything's fine, but not having a clue what they did to it or how to fix it if the engine explodes the next day. You can't even prove it had something to do with what the shop did, or if they actually did anything at all. They probably don't even know what they did.
It won't stall progress for clever people who actually want to figure things out and know what they're doing. But it will certainly produce lots more garbage.
That said, the game is impressive for something stitched together by an LLM.
You still need paradigm shifts in architecture to deliver scale and quality from a smaller amount of material, and AI hasn't made a dent there yet.
The standard for new frameworks won't be "does this make humans more productive using new concepts". It will be "can I get an LLM to generate code that uses this framework".
Gemini in particular is really good at this
I also find it interesting that AI code submissions like this one are generally vague about the process.
This seems to have been created using Cline in VS Code, prompting Gemini 2.5 Pro via OpenRouter.
The commit history implies that a crude version was created by the LLM from an initial prompt and then gradually improved with features, fixes, etc., presumably through ongoing conversations with the AI agent.
All the code is in a single index.html file, which might not be great for human coders, but who cares, to be fair.
All in all, a prompt history would be really educational if anyone thinks about doing something similar.
As we start to develop AI-first programs, I believe we will need to start connecting LLM conversations to code, not only for educational purposes but for maintenance as well. I'm currently experimenting with what I call Block-UUIDs, which are designed to make it easy to trace LLM-generated code. You can see what I mean in the link below, which contains a simple hello world example.
https://app.gitsense.com/?chat=7d09a63f-d684-4e2c-97b2-71aa1...
Something worth noting is, you can't expect the LLM to properly generate a UUID. If you ask the LLM, it'll say it can, but I don't trust it to do it correctly all the time. Since I can't trust the LLM, I instruct it to emit a template string instead, which I replace on the server side. I've also found LLMs don't always follow instructions and will generate UUIDs anyway, so when the LLM stops streaming, I validate and, if needed, fix any invalid UUIDs.
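To make that concrete, here's a minimal sketch of the idea (the placeholder name, regexes, and function are mine for illustration, not the actual GitSense implementation): the LLM is told to emit a fixed placeholder instead of a UUID, and once streaming finishes the server swaps placeholders for real UUIDs and re-mints anything the LLM invented on its own.

```typescript
import { randomUUID } from "node:crypto";

// Placeholder the LLM is instructed to emit instead of inventing a UUID.
const PLACEHOLDER = "{{BLOCK_UUID}}";

// Loose match for anything that *looks* like a UUID the LLM generated anyway.
const UUID_LIKE =
  /\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b/g;
// Strict check for a well-formed v4 UUID (what randomUUID produces).
const UUID_V4 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;

// Called after the LLM stops streaming: replace placeholders with real UUIDs
// and re-mint any UUID-looking string that isn't actually valid.
function assignBlockUuids(generated: string): string {
  const withPlaceholdersFilled = generated.replaceAll(PLACEHOLDER, () => randomUUID());
  return withPlaceholdersFilled.replace(UUID_LIKE, (candidate) =>
    UUID_V4.test(candidate) ? candidate : randomUUID()
  );
}
```

The point is that the IDs are minted deterministically on the server, so the Block-UUIDs in the committed code stay trustworthy even when the model freelances.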
How I see things playing out in the future is, we will always link to LLM conversations, which will give us the Block-UUIDs generated, and by looking at the code, we can see which Block-UUID was used and how it came about.
Full Disclosure: This is my tool
Additionally, you get more directly usable text out of a 'git blame'
This might be something I would do. My only concern is, the conversations can be quite long, and I mean very long. My "Chat Flow" right now is to discuss what needs to be done, produce the code, which can span multiple chats, and then have the LLM summarize things.
What I think might make sense in the future is to include detailed chat summaries in commit messages and PRs. Given that we get a lot of text diarrhea from LLMs, I think putting the raw conversations in as-is may do more harm than good.
Ultimately the code needs to stand alone, but if you discover that a specific version of an LLM produced vulnerable code, you have no recourse but to try again and read the generated code more carefully. And reading code carefully is the opposite of vibe-coding.
I would say AI has generated about 98% of my code for my chat app in the last 3 months and it was definitely not vibe coding. Every function and feature was discussed in detail and some conversations took over a week.
My reasoning for building my chat app wasn't to support vibe coding, but rather to 10x senior developers. Once you know how things work, the biggest bottleneck to a senior developer's productivity is typing and documentation. The speed at which LLMs can produce code and documentation cannot be matched by humans.
The only downside is, LLMs don't necessarily produce pretty or readable code. More readable code is something I would like to tackle in the future, as I believe post-processing tools can make LLM code much more readable.
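As a rough illustration of the kind of post-processing I mean (using Prettier purely as an example, not what my tool actually does), formatting is the easy layer you can already automate; renaming and restructuring would need smarter tooling on top:

```typescript
import * as prettier from "prettier";

// Run LLM output through a formatter before it ever hits the repo.
// (Prettier 3.x returns a Promise from format().)
async function tidyLlmOutput(code: string): Promise<string> {
  return prettier.format(code, { parser: "typescript" });
}
```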
I wonder if there's any tooling for that yet.
thanks for sharing!
The AI won't cope well as that file gets larger; you'd better hope that experimental diff feature is working well. I find things really break if you don't refactor the file down.
i.e. who cares when the LLM starts giving up when the file is too big and confusing?
i.e. who cares when the LLM introduces bugs / flaws it isn't aware of?
If the AI fucks up then it's a lost cause. And maybe you're better off creating a new version from scratch instead of trying to maintain one when the LLM starts to fail. Just ask Gemini 3.5 this time around.
The AI can write obfuscated code. Name all variables from a to z. Even emit binaries directly. Who cares if it works?
So how does it save time? Do I also tell customers we're blowing things up and redoing it every few months with potential risks like data loss?
I personally do not think any of this is a good idea but here we are.
And I was making fun of AI images with the weird fingers and shit just a year ago; now it's hard to identify AI-generated images. The code gen can now create a single-file Space Invaders, which is impressive but shitty according to all coding metrics.
They are getting better. At some point the single-file shit stew will be good enough because the context windows and capabilities of LLMs will be able to handle those files. That's when nobody's gonna care, I guess.
IF...
> weird fingers and shit just a year ago, now it's hard to identify AI generated images
It's still easy and it's never been about the fingers. Things like lighting are still way off (on the latest models).
> At some point the single file shit stew will be good enough
At some point we'll be living on Mars too.
They don't "reason" - they're just AutoComplete on steroids
“I had a long flight today, so in the airport I coded something”
we can start the discussion now, i.e.: is using Cursor still coding?
is a senior team lead coding when the only thing he does is code reviews and steering the team?
You didn’t even mention AI.
Happy to give more details if there's any way to get in touch outside this thread.
I had something like $500,000. I bought up the entire inventory (at least, until the Buy buttons stopped working - there were still items available).
It then became a 'click as fast as you can and don't care about strategy' game, so I stopped.
Again, the first time I went to the store I could max out with everything, so there wasn't the accumulative build-up, or decision to prioritize one purchase strategy over another.
There wasn't any sign there would be new things to buy which I couldn't yet afford.
There was no hint that future game play would be anything other than clicking madly, with no need to conserve or focus resource use.
Part of Missile Command is to wait until just the right point so a single missile can do a chain reaction to take out several incoming warheads; to watch if your strategy was effective, not simply twitch-fire; and even the slow dread of watching an incoming warhead come towards a city after you've depleted your missile supply.
Here's an example from the novelization of WarGames, from https://archive.org/details/wargames00davi/page/n45/mode/2up... :
> David Lightman hovered over the controls of the Atari Missile Command tucked neatly between the Frogger machine and the Zaxxon game. ...
> Goddamned Smart Bombs! he thought as a white buzzing blip snuck through his latest volley of shots and headed for one of his six cities at the bottom of the screen. He spun the control ball, stitched a neat three-X line just below the descending bomb with the cursor, and watched with immense satisfaction as his missiles streaked white lines to their targets, blowing the bomb right out of the phosphor-dot sky.
I didn't get the same sense of satisfaction playing this version.
Are the game state updates and input coupled with the frame rate, as in most beginner game development tutorials out there, or did the "AI" do the right thing and decouple them?
What was the idea behind this choice?
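For context, the usual way to decouple them (not necessarily what this game does) is a fixed-timestep simulation with rendering driven by requestAnimationFrame; a minimal sketch:

```typescript
// Minimal sketch of a decoupled game loop: the simulation advances at a
// fixed timestep, rendering runs at whatever rate requestAnimationFrame gives.
const STEP_MS = 1000 / 60; // fixed simulation step (60 Hz)
let accumulator = 0;
let last = performance.now();

function update(dtMs: number): void {
  // advance game state by exactly dtMs; input is sampled here, not per frame
}

function render(alpha: number): void {
  // draw the current state; alpha can interpolate between simulation steps
}

function frame(now: number): void {
  accumulator += now - last;
  last = now;

  // run as many fixed steps as the elapsed time requires
  while (accumulator >= STEP_MS) {
    update(STEP_MS);
    accumulator -= STEP_MS;
  }

  render(accumulator / STEP_MS);
  requestAnimationFrame(frame);
}

requestAnimationFrame(frame);
```

With the coupled "beginner tutorial" version, game speed changes with the display's refresh rate; the decoupled version keeps gameplay identical at 60 Hz and 144 Hz.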