Gas Town's agent patterns, design bottlenecks, and vibecoding at scale

Posted by pavel_lishin 1/23/2026

Gas Town's agent patterns, design bottlenecks, and vibecoding at scale(maggieappleton.com)

403 points | 434 comments

mediaman 1/23/2026|

I don't get the widespread hatred of Gas Town. If you read Steve's writeup, it's clear that this is a big fun experiment.

It pushes and crosses boundaries, it is a mixture of technology and art, it is provocative. It takes stochastic neural nets and mashes them together in bizarre ways to see if anything coherent comes out the other end.

And the reaction is a bunch of Very Serious Engineers who cross their arms and harumph at it for being Unprofessional and Not Serious and Not Ready For Production.

I often feel like our industry has lost its sense of whimsy and experimentation from the early days, when people tried weird things to see what would work and what wouldn't.

Maybe it's because we also have suits telling us we have to use neural nets everywhere for everything Or Else, and there's no sense of fun in that.

Maybe it's the natural consequence of large-scale professionalization, and stock option plans and RSUs and levels and sprints and PMs, that today's gray hoodie is just the updated gray suit of the past but with no less dryness of imagination.

hyperpape 1/23/2026||

> If you read Steve's writeup, it's clear that this is a big fun experiment:

So, Steve has the big scary "YOU WILL DIE" statements in there, but he also has this:

> I went ahead and built what’s next. First I predicted it, back in March, in Revenge of the Junior Developer. I predicted someone would lash the Claude Code camels together into chariots, and that is exactly what I’ve done with Gas Town. I’ve tamed them to where you can use 20–30 at once, productively, on a sustained basis.

"What's next"? Not an experiment. A prediction about how we'll work. The word "productively"? "Productively" is not just "a big fun experiment." "Productively" is what you say when you've got something people should use.

Even when he's giving the warnings, he says things like "If you have any doubt whatsoever, then you can’t use it" implying that it's ready for the right sort of person to use, or "Working effectively in Gas Town involves committing to vibe coding.", implying that working effectively with it is possible.

Every day, I go on Hacker News, and see the responses to a post where someone has an inconsistent message in their blog post like this.

If you say two different and contradictory things, and do not very explicitly resolve them, and say which one is the final answer, you will get blamed for both things you said, and you will not be entitled to complain about it, because you did it to yourself.

an0malous 1/23/2026|||

I agree, I’m one of the Very Serious Engineers and I liked Steve’s post when I thought it was sort of tongue in cheek but was horrified to come to the HN comments and LinkedIn comments proclaiming Gastown as the future of engineering. There absolutely is a large contingent of engineers who believe this, and it has a real world impact on my job if my bosses think you can just throw a dozen AI agents at our product roadmap and get better productivity than an engineer. This is not whimsical to me, I’m getting burnt out trying to navigate the absurd expectations of investors and executives with the real world engineering concerns of my day to day job.

cthalupa 1/24/2026|||

> horrified to come to the HN comments and LinkedIn comments proclaiming Gastown as the future of engineering.

I don't spend much time on LinkedIn, but basically every comment I've read on HN is that, at best, Gas Town can pump out a huge amount of "working" code in short timeframes at obscene costs.

The overwhelming majority are saying "This is neat, and this might be the rough shape of what comes next in agentic coding, but it's almost certainly not going to be Gas Town itself."

I have seen basically no one say that Gas Town is the The Thing.

bloppe 1/24/2026||||

I feel that yegge captured the mania of the whole operation rather well. If your bosses commit to the idea that 100 memoryless stochastic "polecats" will deliver a long term sustainable business, then there are probably other leadership issues besides this.

jmspring 1/24/2026||||

I think Steve's idea of an agent coordinator and the general model could make sense. There is a lot of discussion (and even work from Anthropic, OpenAI, etc) around multiagent workflows.

Is Gas Town the implementation? I'm not sure.

What is interesting is seeing how this paradigm can help improve one's workflow. There is still a lot of guidance and structuring of prompts / claude.md / whichever files that need to be carefully written.

If there is a push for the equivalent of helm charts and crds for gas town, then I will be concerned.

storystarling 1/24/2026||

I ran into this building a similar workflow with LangGraph. The prompt engineering is definitely a pain, but the real bottleneck with the coordinator model turns out to be the compounding context costs. You end up passing the full state history back and forth, so you are paying for the same tokens repeatedly. Between that and the latency from serial round-trips, it becomes very hard to justify in production.

pstuart 1/23/2026||||

AI is such a fun topic -- the hype makes it easy to loath, but as a coder working with Claude I think it's an awesome tool.

Gastown looks like a viable avenue for some app development. One of the most interesting things I've noticed about AI development is that it forces one to articulate desired and prohibited behaviors -- a spec becomes a true driving force.

Yegge's posts are always hyperbolic and he consistently presents interesting takes on the industry so I'm willing to cut him a buttload of slack.

joquarky 1/25/2026|||

I find it interesting that waterfall is becoming popular again.

ctmnt 4 days ago||

I agree, it is really interesting. I think that the main reason, though, is that instead of a waterfall cycle taking weeks or months, it now takes minutes. So it’s the process of waterfall (speccing things out carefully in advance, committing to the plan, assessing the results based on adherence to the plan, etc), but on the time frame of agile.

dingnuts 1/24/2026|||

[dead]

dada216 1/24/2026||||

Embrace and use it to your advantage. Tell them nobody knows and understands how these things will actually work long term, that's why there's stuff like gas town, and that the way you see all of this is you can manage this process. What you bring to the table is making sure it will actually work if the tech is safe and sound, reaping the rewards, or protect the business if the tech fails, protecting the company from catastrophic tech failure, telll them that you are uniquely positioned to carry out the balancing act because you are deep in the tech itself. bonus if you explain the uncertainty framing in the business strategy: "because nobody really understands the tech nobody has an advantage, we are all playing on a leveled field, from the big boys at FAANGs to us peasants in normal non-tech enterprises: I am your advantage here if you give me the tools and leverage I need to make this work". if you play this right you'll get the fat bonus whether the tech actually works or not.

Treegarden 1/24/2026||||

If your boss is that bad, the correct long-term move is to leave, not to wish technology didn’t advance.

pxtail 1/24/2026||

Your boss and other ones who are asleep someday will wake up too.

spacecadet 1/24/2026||||

"I’m getting burnt out trying to navigate the absurd expectations of investors and executives with the real world engineering concerns of my day to day job."

Welcome to being a member of a product team who cares beyond just whats on their screen... Honestly there is a humbling moment coming for everyone, it and Im not sure its unemployment.

meowface 1/23/2026||||

It's a half-joke. No need to take it that seriously or that jokingly. It's mostly only grifters and cryptocurrency scammers claiming it's amazing.

I think ideas from it will probably partially inspire future, simpler systems.

wonnage 1/23/2026|||

It may be a joke in the same way that brogramming was a joke and somehow became an enduring tech bro stereotype

adabyron 1/24/2026||

Strong agreement with this. The whimsical, fantasy, fun, light hearted things are great until a large enough group of people take them as a serious life motto & then try to push it on everyone else.

leoc 1/24/2026||

Taking the example of the cryptocurrency boom (as a whole) as the guide, the problem is the interaction of two realities: big money on the table; and the self-fulfilling-prophecy (not to say Ponzi) dynamic of needing people to keep clapping for Tinker-bell, in greater and greater numbers, to keep the line going up. It corrupts whimsical fun and community spirit, it corrupts idealism, and it corrupts technical curiosity.

lupire 1/23/2026|||

stevey already made $300K from cryptocurrency grift on Gas Town. Read his blog post about it.

wahnfrieden 1/23/2026|||

Complete with a "Let’s goooooooo!"

And FOMO stories about missing out on Bitcoin when he knew about it, so he doesn't want you to miss out on this new opportunity to get "filthy rich" as an "investor" while you still can.

wahnfrieden 1/24/2026|||

More details on the pump and dump scheme he joined in on promoting and drew money from: https://pivot-to-ai.com/2026/01/22/steve-yegges-gas-town-vib...

DonHopkins 1/24/2026|||

MOOLLM has its own official currency -- MOOLAH!

https://github.com/SimHacker/moollm/tree/main/skills/economy...

The official currency of MOOLLM is MOOLAH. It uses PROOF OF MILK consensus — udderly legen-dairy interga-lactic shit coin, without the bull.

meowface 1/24/2026||||

This initially sunk my heart, but in all his replies there are like 50 very clearly unintelligent crypto grifters telling him he needs to be killed for scamming them, so I am unsure who to root for at this point. It's depressing he accepted it, but I might partially forgive it due to him making a lot of them lose money.

JasonADrury 1/24/2026|||

[flagged]

jaapz 1/24/2026|||

Why is it hard to criticize people for being part of a scam operation? It's so morally and ethically bankrupt that it's really easy and valid to criticize someone for

JasonADrury 1/24/2026||

Who is being scammed? The only people buying into tokens as obscure as these are degenerate gamblers who know very well that it's not any kind of an investment.

jaapz 1/24/2026||

That sounds like victim blaming to me

JasonADrury 1/24/2026||

It's not a scam, there's no misrepresentation. This very clearly isn't marketed as an investment of any kind https://bags.fm

https://apps.apple.com/app/bags-financial-messenger/id647319...

The tagline of the app? "Buy & sell memecoins". Transparently advertised as a crowdfunding mechanism using memecoins.

wahnfrieden 1/24/2026|||

Yegge write a blog post for his readers where he called it an investment and hoped the investors would get “filthy rich”.

meowface 1/25/2026|||

What? Of course it's marketed as an investment. That's the sole thing it's marketed as. Are you not able to lift the thinnest veil imaginable?

meowface 1/24/2026||||

Because you'd be aiding and abetting a pennystock scam.

JasonADrury 1/24/2026||

The difference between bags.fm and pennystock scams is that bags.fm is very obviously not marketed as an investment, but a crowdfunding tool.

meowface 1/25/2026|||

It's absolutely marketed as an investment, and solely used and referenced by people saying it is an investment. This is like saying those cannabis paraphernalia shops are marketed as only for tobacco.

wahnfrieden 1/24/2026|||

Yegge write a blog post for his readers where he called it an investment and hoped the investors would get “filthy rich”.

victorbjorklund 1/24/2026|||

But people do. There are people who genuinely think crypto is an investment. Yes, smart people knows it is just a grift and that it is just about selling it on to the next person before it crashes. But is it moral to make money on stupid people? Many people lose all their money on gambling even if we always known gambling is a loss.

JasonADrury 1/24/2026||

> There are people who genuinely think crypto is an investment.

Sure! Are those people buying bags.fm tokens? Probably not.

This isn't even marketed as an investment https://bags.fm but a crowdfunding tool for developers with a casino attached.

You don't have to be smart to read the big text on the website.

meowface 1/25/2026|||

You don't have to be smart to understand they're very, very, very obviously saying it's an investment and using extremely superficial cover. All things like these are exclusively pennystock scams.

You're being bamboozled. Google the name of it. Search it on Twitter and 4chan. Watch any Coffeezilla video.

JasonADrury 1/26/2026||

I'm googling "bags.fm", everything I can find is about money going to creators. Literally nothing suggesting that you're going to get rich by buying these tokens.

Searching for "bags.fm" on X with keywords like "invest" or "rich" or "moon" also does not seem to return any conversations referring to anyone but the creators getting rich.

I can't find any bags.fm references on 4chan, and searching for gas town instead doesn't seem to bring up anything cryptocurrency related in the archive.

> You're being bamboozled

I don't think so. I suspect the world is so full of crypto scams that when someone does something explicitly non-scammy ("Hey, here's a crypto thing you can use to give me free money!") people still incorrectly view it as scammy because of crypto.

How many memecoin "investors" do you think view these as serious investments? I suspect essentially none of them.

How many memecoin "investors" are degenerate gambling addicts who need treatment? Probably most of them.

Taking money from vulnerable gambling addicts is certainly not ideal, but it's far from scammy.

wahnfrieden 1/26/2026|||

Yegge himself wrote a blog post to his non crypto audience calling it an investment that he hopes makes its investors filthy rich. He pumped it, then he dumped it, and announced he’s walking away from it at that point after taking his profits and crashing its value.

I don’t know why you’re talking about existing hardcore BAGS addicts when the topic is Yegge promoting a crypto grift to his own general audience as an investment and then running the typical pump and dump scam on them.

meowface 1/26/2026|||

It's a scam or a pennystock grift or whatever term you want to use.

https://x.com/Fizzy__01/status/1956006313848397861

100% of these things are somewhere on the scam and fraud spectrum. An unscrupulous person creates a token or a platform for creating tokens with the goal of raising the worthless token's price so they can parasitically make millions from something that holds zero value.

The "fund creators" thing is a common ploy. If they actually wanted to do that, they'd make it so you can only donate with dollars or stablecoins.

Look at the dozens of replies to all of Yegge's posts, now: https://x.com/Steve_Yegge/status/2014530592134910215

wahnfrieden 1/24/2026||||

Yegge write a blog post for his readers where he called it an investment and hoped the investors would get “filthy rich”.

torginus 1/24/2026|||

I don't get crypto - just looked up how a couple of most performant stocks did in the past decade, and I'm pretty sure you could outperform BTC with the same amount of risk tolerance.

The swings on BTC price are absolutely insane, and ETH even more so (which is even more risky, without showing higher gains).

torginus 1/24/2026||||

what the? how do you sell crypto based on a description of an orchestration framework?

donations?

cap11235 1/24/2026|||

People keep giving him the benefit of the doubt. "He's clearly on to something, I just don't know what". I know what. The hustle of the shill. He has long gone from 'let's use a lot of tokens' to seeking a high score. He disgusts me.

meowface 1/24/2026||

What high score?

lowbloodsugar 1/23/2026||||

I too am a Very Serious Engineer but my shock is in the other direction: of course the ideas behind Gas Town are the future of software development and several VSEs I know are developing a proper, robust, engineering version of it that works. As the author of this article here remarks “yes, but Steve did it first”, and it annoys me that if I had written this post nobody would have read it, but also that, because I intend to use it in Very Serious Business ($bns) my version isn’t ready to a actually be published yet. Bravo to Steve for getting these thoughts on paper and the idea built even in such crude form. But “level 8” is real and there will be 9s and 10s and I am really enjoying building my own.

csallen 1/23/2026|||

> "Gastown as the future of engineering"

Note the word "future" not "present". People are making a prediction of where things will go. I haven't seen a single person saying that Gas Town as it exists today is ready for production-grade engineering project.

potatolicious 1/23/2026||||

> "If you say two different and contradictory things, and do not very explicitly resolve them, and say which one is the final answer, you will get blamed for both things you said, and you will not be entitled to complain about it, because you did it to yourself."

If I can be a bit bold and observe that this tic is also a very old rhetorical trick you see in our industry. Call it Schrodinger's Modest Proposal if you will.

In it someone writes something provocative, but casts it as both a joke and deadly serious at various points. Depending on how the audience reacts they can then double down on it being all-in-good-jest or yes-absolutely-totally. People who enjoy the author will explain the nonsensical tension as "nuance".

You see it in rationalist writing all the time. It's a tiresome rhetorical "trick" that doesn't fool anyone any more.

antonvs 1/24/2026|||

It's a version of a motte and bailey argument (named after a medieval castle defense system):

> "...philosopher Nicholas Shackel coined the term 'motte-and-bailey' to describe the rhetorical strategy in which a debater retreats to an uncontroversial claim when challenged on a controversial one."

-- https://heterodoxacademy.org/blog/the-motte-and-the-bailey-a...

directevolve 1/23/2026|||

In what rationalist writing? The LessWrong style is to be literal and unambiguous. They’re pretty explicit that this is a community value they’re striving for.

sdwr 1/23/2026||

The whole trick is having your cake and eating it too. The LessWrong style exploits the gap between the strength of the claims ("this is a big deal that explains something fundamental about the world") and the evidence/foundation (abstract armchair reasoning, unfalsifiable)

directevolve 1/24/2026||

That’s not the same issue, though. You’re claiming just plain overconfidence or that you find their arguments unconvincing. But the rhetorical trick we were discussing is oscillating between treating a claim as a joke or as deadly serious depending on the audience.

theptip 1/24/2026||||

I think both can be true, no?

Multi-agent coordination is obviously what's next.

And, Gas Town itself might never amount to more than a proof-of-concept.

Personally I'd put my money on whatever Anthropic build to do this job, rather than a layer someone else builds atop CC.

Remember when code LLMs were just APIs, and folks were building their own coding scaffolds like Aider and Cursor? Then Claude Code steamrolled everyone; they win because they can do RL on the whole agentic scaffold.

Any multi-agent system will have the same properties, i.e. whatever traits (e.g. the GUPP) and tool expertise (e.g. using Beads) are required to effectively participate in a swarm will get RL'd into the coding model, and any attempts to build alternate scaffolds will hit impedance mismatches because they do not fit the shape of what was RL'd (just like using non-CC UIs with Anthropic models gives you worse results than using the CC UI).

I say this with love - Yegge is putting forth some excellent ideas here. Beads seems like a great concept to add to CC ASAP; storing the TODO state in a repo would mean we don't need MCPs onto issue trackers. And figuring out what orchestration concepts are required will need a lot more trial and error, but these existence proofs are moving the frontier forward.

csallen 1/23/2026||||

These are some very tortured interpretations you're making.

- "what's next" does not mean "production quality" and is in no way mutually exclusive with "experimental". It means exactly what it says, which is that what comes next in the evolution of LLM-based coding is orchestration of numerous agents. It does not somehow mean that his orchestrator writes production-grade code and I don't really understand why one would think it does mean that.

- "productively" also does not mean "production quality". It means getting things done, not getting things done at production-grade quality. Someone can be a productive tinkerer or they can be a productive engineer on enterprise software. Just because they have the word "product" in them does not make them the same word.

- "working effectively" is a phrase taken out of the context of this extremely clear paragraph which is saying the opposite of production-grade: "Working effectively in Gas Town involves committing to vibe coding. Work becomes fluid, an uncountable substance that you sling around freely, like slopping shiny fish into wooden barrels at the docks. Most work gets done; some work gets lost."

If he wanted to say that Gas Town wrote production grade code, he would have said that somewhere in his 8000-word post. But he did not. In fact, he said the opposite, many many many many many many times.

You're taking individual words out of context, using them to build a strawman representing a promise he never came close to making, and then attacking that strawman.

What possible motivation could you have for doing this? I have no idea.

> If you say two different and contradictory things...

He did not. Nothing in the blog post explicitly says or even remotely implies that this is production quality software. In addition, the post explicitly, unambiguously, and repeatedly screams at you that this is highly experimental, unreliable, spaghetti code, meant for writing spaghetti code.

The blog post could not possibly have been more clear.

> ...because you did it to yourself.

No, you're doing this to his words.

Don't believe me? Copy-paste his post into any LLM and ask it whether the post is contradictory or whether it's ambiguous whether this is production-grade software or not. No objective reader of this would come to the conclusion that it's ambiguous or misleading.

madhadron 1/24/2026|||

> Copy-paste his post into any LLM and ask it whether the post is contradictory or whether it's ambiguous whether this is production-grade software or not. No objective reader of this would come to the conclusion that it's ambiguous or misleading.

That's hilarious! You might want to add a bit more transition for the joke before the other points above, though.

airza 1/24/2026|||

> Don't believe me? Copy-paste his post into any LLM and ask it whether the post is contradictory or whether it's ambiguous whether this is production-grade software or not.

Bleak

drewbug01 1/23/2026||||

> If you say two different and contradictory things, and do not very explicitly resolve them, and say which one is the final answer, you will get blamed for both things you said, and you will not be entitled to complain about it, because you did it to yourself.

Our industry is held back in so many ways by engineers clinging to black-and-white thinking.

Sometimes there isn’t a “final” answer, and sometimes there is no “right” answer. Sometimes two conflicting ideas can be “true” and “correct” simultaneously.

It would do us a world of good to get comfortable with that.

hyperpape 1/23/2026|||

My background is in philosophy, though I am a programmer, for what it is worth. I think what I'm saying is subtly different from "black and white thinking".

The final answer can be "each of these positions has merit, and I don't know which is right." It can be "I don't understand what's going on here." It can be "I've raised some questions."

The final answer is not "the final answer that ends the discussion." Rather, it is the final statement about your current position. It can be revised in the future. It does not have to be definitive.

The problem comes when the same article says two contradictory things and does not even try to reconcile them, or try to give a careful reader an accurate picture.

And I think that the sustained argument over how to read that article shows that Yegge did a bad job of writing to make a clear point, albeit a good job of creatring hype.

habinero 1/23/2026|||

Or -- and hear me out -- unserious people are saying nonsense things for attention and pointing this out is the appropriate response.

akst 1/24/2026||||

yeah the messaging is somewhat insecure in that it preemptively seeks to invalidate criticism by just being an experiment while simultaneously making fairly inflammatory remarks about nay sayers like they'll eat dirt if they don't get on board.

I think it's possible to convey that you believe strongly in your idea and it could be the future (or "is the future" if you're so sure of self) while it still being experimental. I think he would get less critics if he wasn't so hyperbolic in his pitch and had fewer inflammatory personal remarks about people who he hasn't managed to bring on side.

People I know who communicate like that generally struggle to contribute constructively to nuanced discussions, and tend to seek out confrontation for the sake of it.

MarsIronPI 1/28/2026||||

Additionally, Steve seems very adamant about the fact that anyone who doesn't adopt vibe coding is going to be obsolete, and the ones who adopt it best are going to win big.

taneq 1/24/2026||||

> "What's next"? Not an experiment.

I think what’s next after an experiment very often is another experiment, especially when you’re doing this kind of exploratory R&D.

columk 1/24/2026||||

>We should take Yegge’s creation seriously not because it’s a serious, working tool for today’s developers (it isn’t). But because it’s a good piece of speculative design fiction that asks provocative questions and reveals the shape of constraints we’ll face as agentic coding systems mature and grow.

I have no doubt Yegge would agree wholeheartedly with that take. He wants the community to explore these ideas with him.

The bizarre thing is that Gas Town has been popping up in mainstream news and media. It's being discussed in my economics podcasts.

It's relevant for them because it hints at a very disruptive idea: The hierarchy of Gas Town, when extrapolated, suggests that agents won't just replace your workers, it will replace your business too. It suggests that in a few years there could be a tool that is effectively a software agency, which means companies like Anthropic could eat any software shop that can't compete.

rlt 1/23/2026||||

I think you just proved mediaman's point.

portd062 1/26/2026||||

[dead]

GoatInGrey 1/23/2026|||

Keep in mind that Steve has LLMs write his posts on that blog. Things said there may not reflect his actual thoughts on the subject(s) at hand.

gozzoo 1/23/2026|||

There is no way for this to be true. I read his book about vibe coding and it is obvoius that it has significant LLM contribution. His blog posts though are funy and controversial, and have bad jokes, and he jumps from topic to topic. Ha has had this style like 10+ years before LLMs came around.

davidgerard 1/25/2026||

The book intro proudly states it used LLM drafting.

square_usual 1/23/2026||||

I've been reading Steve's posts for quite literally a decade now and I don't think his new posts are so meaningfully different from the old ones that he's not at the wheel any more. Besides, his twitter posts often double down on what he's writing in the blog, and it's doubtful he's not writing those.

joshstrange 1/23/2026||||

> Keep in mind that Steve has LLMs write his posts on that blog.

Ok, I can accept that, it's a choice.

> Things said there may not reflect his actual thoughts on the subject(s) at hand.

Nope, you don't get to have it both ways. LLMs are just tools, there is always a human behind them and that human is responsible for what they let the LLM do/say/post/etc.

We have seen the hell that comes from playing the "They said that but they don't mean it" or "It's just a joke" (re: Trump), I'm not a fan of whitewashing with LLMs.

This is not an anti or pro Gas Town comment, just a comment on giving people a pass because they used an LLM.

idle_zealot 1/23/2026||

Do you read that as giving him a pass? I read it as more of a condemnation. If you have an LLM write "your" blog posts then of course their content doesn't represent your thoughts. Discussing the contents of the post then is pointless, and we can disregard it entirely. Separately we can talk about what the person's actual views might be, using the fact that he has a machine generate his blog posts as a clue. I'm not sure I buy that the post was meaningfully LLM-generated though.

The same approach actually applies to Trump and other liars. You can't take anything they say as truth or serious intent on its own; they're not engaging in good faith. You can remove yourself one step and attempt to analyze why they say what they do, and from there get at what to take seriously and what to disregard.

In Steve's case, my interpretation is that he's extremely bullish on AI and sees his setup or something similar as the inevitable future, but he sprinkles in silly warnings to lampshade criticism. That's how the two messages of "this isn't serious" and "this is the future or software development" co-exist. The first is largely just a cover and an admission this his particular project is a mess. Note that this interpretation assumes that the contents of the blog post in question were largely written by him, even if LLM assistance was used.

joshstrange 1/23/2026||

Hmm, maybe I read the original comment wrong then? I did read it as "You can't blame him, that might not even be what he thinks" and my stance is "He posted it on his blog, directly or indirectly, what else am I supposed to think?".

I agree with you on Steve's case, and I have no ill will towards him. Mostly it was just me trying to "stomp" on giving him a pass, but, as you point out, that may not have been what the original commenter meant.

jauntywundrkind 1/23/2026||||

Is this confirmed true? Yegge has a very very long history of writing absurdly long posts / rants.

WesolyKubeczek 1/23/2026||

Back in the day they used to be coherent.

square_usual 1/23/2026||

Not much more than his recent posts, no.

usefulcat 1/23/2026||||

There's a rather fine line between "don't believe everything you read" and "don't believe anything you read". At least in this case.

63stack 1/24/2026|||

This is some super fucked up thinking. If it does not reflect your actual thoughts, you do not post it under your own name.

ludicity 1/23/2026|||

I thought it was harmless(ish) fun, but David Gerard put out a post stating that Yegge used Gas Town to push out a crypto project that rug pulled his supporters, while he personally walked away with something between $50K to $100K from memory.

I suppose that has little to do with the technical merits of the work, but it's such a bad look, and it makes everyone boosting this stuff seem exactly as dysregulated/unwise as they've appeared to many engineers for a while.

I met Sean Goedecke for lunch a few weeks ago, who uses LLMs a bunch, and is clearly a serious adult, but half the folks being shoved in front of everyone are behaving totally manic and people are cheering them on. Absolutely blows my mind to watch.

https://pivot-to-ai.com/2026/01/22/steve-yegges-gas-town-vib...

skybrian 1/23/2026||

That was very weird. In the post where he was arguably "shilling," he seems to have signposted pretty well that it was dumb, but he will take the money they offered:

> $GAS is not equity and does not give you any ownership interest in Gas Town or my work. This post is for informational purposes only and is not a solicitation or recommendation to buy, sell, or hold any token. Crypto markets are volatile and speculative — do not risk money you can’t afford to lose.

...

> Note: The next few sections are about online gambling in all its forms, where “investing” is the buy-and-hold long-form “acceptable” form of gambling because it’s tied to world GDP growth. Cryptocurrencies are subject to wild swings and spikes, and the currency tied to Gas Town is on a wild swing up. But it’s still gambling, and this stuff is only for people who are into that… which is not me, and should probably not be you either.

In the next post he said he wasn't going to shill it any more, and then the price collapsed and people sent him death threats on Twitter. It probably would have collapsed anyway. Perhaps there was supposedly some implicit bargain that he shouldn't take the money if he wasn't going to shill? Well, there's certainly no rule saying you have to do that.

I think he's not very much to blame for taking the money from degenerate gamblers, and the cryptocurrency idiots are mostly to blame for their own mistakes.

gavin-1 1/24/2026|||

> I think he's not very much to blame for taking the money from degenerate gamblers, and the cryptocurrency idiots are mostly to blame for their own mistakes.

I empathize with the disdain for crypto idiots, but I still think the people running or promoting these scams deserve most of the blame. "There's a market for my poison" is every dopamine dealer's excuse.

cap11235 1/24/2026||

Yeah, and I don't want to be involved in that shit. Yeggae can go fuck off.

cannonpr 1/24/2026||||

“Degenerate gamblers” is the kind of stigma that stops people and their families getting help for addiction. Even if you believe it’s a moral failing, the families deserve better.

skybrian 1/24/2026|||

Very true. Although, I wonder how much of that sort of thing was going on in this case? Did people actually bet money they couldn't afford to lose on this crazy scheme?

PKop 1/25/2026||||

No harsh belittling is what makes them quit, not accommodating their nonsense and excusing it. There should be a stigma with destructive behavior. The flaw is this decades-long trend of talking about stigmas and refusing to condemn bad behaviors.

cannonpr 1/26/2026||

I’m afraid we tried that for quite a few centuries, with very little effect. Infact most major world religions had phases with heavy punishment, condemnation, belittling, and you are going to hell stuff over gambling. Yet here we are.

draw_down 1/24/2026|||

[dead]

dpatterbee 1/23/2026||||

I'm fairly certain those disclaimers were added after he got some pushback from the original post.

skybrian 1/24/2026||

One of them clearly was (marked "Edit: "). I don't know about the others.

andrepd 1/24/2026||||

> I think he's not very much to blame for taking the money from degenerate gamblers, and the cryptocurrency idiots are mostly to blame for their own mistakes.

So drug dealers are not to blame for taking the money from degenerate addicts! Let's free everyone and disband the DEA, we'll save billions of dollars.

Oh wait nvm this line of thinking only applies to sv people

matkoniecz 1/24/2026||||

He is still an evil scammer scamming people.

In the same way signposting and credibly warning "I murder people" does not make ok to murder people.

skybrian 1/24/2026||

Do you have the same attitude towards all forms of gambling?

wahnfrieden 1/24/2026|||

Yegge wrote in his blog post (viewed by many who ended up buying in) that it is an investment and that he wishes the investors will become "filthy rich". He wrote the post as an introduction to the concept of BAGS for an audience that is unfamiliar with it. He onboarded people to the platform and to his pump and dump scheme (in which he pumped, and dumped, then announced he's walking away from it).

You left out that part of the post and only mentioned the disclaimer he added at the top after he got pushback on his messaging. Are you influenced by his celebrity?

matkoniecz 1/25/2026|||

Yes, if people benefiting from gambling misrepresent it as for example investing.

In such cases it is a scam.

wahnfrieden 1/24/2026||||

He pumped, and dumped. He stopped shilling at the moment that the dump was proceeding. That's what pump and dump grifters do.

Details https://pivot-to-ai.com/2026/01/22/steve-yegges-gas-town-vib...

cap11235 1/24/2026|||

Maybe I'd care about his opinion if he didn't take the money. I consider this worse than OSS taking VC money. At least those don't have a scam auto-builtin to the structure beyond normal capitalistic parasitism.

Also, 275k lines for a markdown todo app. Anyone defending this is an idiot. I'll just say that. Go ahead, defend it. Go do a code review on `beads`. Don't say it's alright, but gastown is madness. He fucking sucks.

piker 1/23/2026|||

> If you read Steve's writeup

Personally I got about 3 paragraphs into what seemed like a twelve-page fevered dream and filed it under "not for me yet".

chwtutha 1/23/2026|||

> And the reaction is a bunch of Very Serious Engineers who cross their arms and harumph at it for being Unprofessional and Not Serious and Not Ready For Production.

Exactly!

pja 1/23/2026||

They’re part of Steve’s art project, they just don’t realise it.

Xmd5a 1/23/2026||||

> OK! That was like half a dozen great reasons not to use Gas Town. If I haven’t got rid of you yet, then I guess you’re one of the crazy ones. Hang on. This will be a long and complex ride. I’ve tried to go super top-down and simplify as much as I can, but it’s a bit of a textbook.

michaelcampbell 1/25/2026||||

Yegge's been around a long, long time and this is about within a standard deviation of his normal writings, at least in style. I haven't read much of his LLM/AI related stuff, but none of Gas Town left me with any sort of "huh" reaction, knowing the author.

saidarembrace 1/23/2026|||

For better or worse, we are making history.

tikhonj 1/23/2026|||

A sense of art and whimsy and experimentation is less compelling when it's jumping on the hypest of hype-trains. I'd love to see more folk art in programming, but Gas Town is closer to fucking Beeple than anything charming.

pydry 1/23/2026|||

>I often feel like our industry has lost its sense of whimsy and experimentation from the early days, when people tried weird things to see what would work and what wouldn't.

Remember the days when people experimented with and talked about things that werent LLMs?

I used to go to a lot of industry events and I really enjoyed hearing about the diversity of different things people worked on both as a hobby and at work.

Now it's all LLMs all the time and it's so goddamn tedious.

Ronsenshi 1/23/2026|||

> I used to go to a lot of industry events and I really enjoyed hearing about the diversity of different things people worked on both as a hobby and at work.

I go to tech meetups regularly. The speed at which any conversation end up on the topic of AI is extremely grating to me. No more discussions about interesting problems and creative solutions that people come up with. It's all just AI, agentic, vibe code.

At what point are we going to see the loss of practical skills if people keep on relying on LLMs for all their thinking?

magicalist 1/23/2026|||

> No more discussions about interesting problems and creative solutions that people come up with. It's all just AI, agentic, vibe code.

And then you give in and ask what they're building with AI, that activation energy finally available to build the side project they wouldn't have built otherwise.

"Oh, I'm building a custom agentic harness!"

...

Analemma_ 1/23/2026||||

It's like the entire software industry is gambling on "LLMs will get better faster than human skills will decay, so they will be good enough to clean up their own slop before things really fall apart".

I can't even say that's definitely a losing bet-- it could very well happen-- but boy does it seem risky to go all-in on it.

FridgeSeal 1/23/2026|||

On one hand, it’s extremely tiring having to put up with that section of our industry.

On the other, if a large portion of the industry goes all in, and it _doesn’t_ pay off and craters them, maybe the overhyping will move onto something else and we can go back to having an interesting, actually-nice-to-be-in-industry!

zdragnar 1/24/2026||

I can't help but think of a video of a talk by someone- uncle Bob maybe?- talking about the origin of the agile manifesto.

He framed it as software developers were once the experts in the room, but so many young people joined the industry that managers turned to micromanaging them out of instinctual distrust. The manifesto was supposed to be the way for software developers to retake the mantle of the professional expert, trusted to make things happen.

I don't really think that happened, especially with agile becoming synonymous with Scrum, but if this doesn't pay off and craters the industry, it seems like it'd be the final nail in that coffin.

FeteCommuniste 1/23/2026|||

Some of the heads like Altman seem to be putting all their chips in the "AGI in [single-digit number] years" pile.

bleepblap 1/24/2026|||

It's incredible the change over the last few years even on the hardware side. I go to the supercomputing.org conference annually and saw folks advertising "AI power distribution units". There used to be a lot of neat innovation, and now every last thing has to have "AI" in the title, it's infuriating

TeMPOraL 1/24/2026|||

Well, LLMs are an engineering breakthrough of the degree somewhere between the Internet and electricity, in terms of how general-purpose and broadly-applicable they are. Much like them, LLMs have the potential to be useful in just about everything people do, so it's no surprise they've dominated the conversation - just like electricity and the Internet did, back in their heyday.

(And similar to the two, I expect many of the initial ideas for LLM application to be bad, perhaps obviously stupid in hindsight. But enough of them will work to make LLMs become a lasting thing in every aspect of people's lives - again, just like electricity and the Internet did).

pydry 1/24/2026||

It reminds me most of the release of the first iPhone - very flashy, very overhyped, adds a bit of convenience to people's lives but also likely to measurably damage people's brains in the long run.

~80% of the usage patterns i see these days falsely assume that LLMs can handle their own quality control and are optimizing for appearance, potential or demo-worthiness rather than hardcore usefulness. Gas town is not an outlier here.

When the internet and electricity were ~3 years old people were already using it for stuff that was working and obviously world changing rather than as demos of potential.

That 20% of usage patterns that work now arent going away but the other 80% are going to be seen as blockchainesque hype in 5 or 10 years.

CuriouslyC 1/23/2026|||

I like gastown's moxie, it's fun, and seems kind of tongue in cheek.

What I don't like is people me-tooing gastown as some breakthrough in orchestration. I also don't like how people are doing the same thing for ralph.

In truth, what I hate is people dogpiling thoughtlessly on things, and only caring about what social media has told them to care about. This tendency makes me get warm tingles at the thought of the end of the world. Agent smith was right about humanity.

FuckButtons 1/23/2026||

I mean, isn’t the whole point of Ralph that it’s an allusion to “I’m in danger” because Claude in a for loop can do your job?

CuriouslyC 1/23/2026|||

I believe the intent was that he's dumb but persistent.

aprilthird2021 1/23/2026|||

No, Ralph is famously dumb and needs lots of hand-holding and explanations of things most people think are very simple and can hold very little in his head at once.

But that's often enough to loop over and over again and eventually finish a task

Barrin92 1/23/2026|||

>it is a mixture of technology and art, it is provocative

There's no art (or engineering) in this and the only provocative thing about it is that Yegge apparently decided to turn it into a crypto scam. I like the intersection of engineering and art but I prefer if it includes both actual engineering and art, 100 rabbits (100r.co) is a good example of it, not a blog post with 15 AI generated images in it that advocates some unholy combination of gambling, vibe coding and cryptocurrency crap.

wrs 1/23/2026|||

Perhaps it was his followup post about how people are lining up to throw millions of VC dollars at his bizarre whimsical fever dream that disturbs people? I’m all for arts funding, but…

square_usual 1/23/2026||

Isn't the point that he refused them? VCs can be dumb (see the crypto hype, even the recent inflated AI raises) so I wouldn't put too much stock in what they think is valuable.

lupire 1/23/2026|||

He participated in a crypto pump and dump

https://pivot-to-ai.com/2026/01/22/steve-yegges-gas-town-vib...

cap11235 1/24/2026|||

HAHAHA

SomaticPirate 1/23/2026|||

It isn't though. It crossed the chasm when Steve (who I would like to think is somewhat comfortable after writing a book, holding a director level position at several startups) decided to endorse an outright crypto pump and dump.

When he decided to monetize the eyeballs on the project instead of anything related to the engineering. Which, of course, Steve isn't smart enough to understand (in his own words) and he recommends you not buy but he still makes a tidy profit from it.

Its a memecoin now... that has a software project attached to it. Anything related to engineering died the day he failed to disavow the crypto BS and instead starting shrilling it.

What happened to engineers not calling out BS as BS.

ewoodrich 1/23/2026||

Okay yeah, not great...

https://steve-yegge.medium.com/bags-and-the-creator-economy-...

lovich 1/23/2026|||

My favorite part about that is gas town is supposedly so productive that this guys sleep patterns are affected by how much work he’s doing, but he took the time to physically go to a bank to get a 5 figure payout.

It makes it difficult to believe that gas town is actually producing anything of value.

I also lol at his bitching about how the bank didn’t let him do the transactions instantly as he describes himself how much of a scam this seems and how the worst thing is his bank account being drained, like banks don’t have a self interest in protecting their clientele from such scams.

vanderZwan 1/24/2026|||

> I don't get the widespread hatred of Gas Town. If you read Steve's writeup, it's clear that this is a big fun experiment. It pushes and crosses boundaries, it is a mixture of technology and art, it is provocative.

Because I actually have an arts degree and I know the equivalent of a con artist in a rich people arts gallery bullshitting their way into money when I see one.

And the "pushing and crossing boundaries" argument has been abused as a pathetic defense to hide behind shallowness in the art world for longer than anyone in this discussion board has been alive. It's not provocative when it's utterly predictable, and in this case the "art" is "take the most absurd parody of AI culture and play it straight". Gee whiz how "creative" and "provocative".

tracerbulletx 1/23/2026|||

Its because people are treating the experiment like a serious path forward for their business.

JamesTRexx 1/23/2026|||

"our industry has lost its sense of whimsy"

The first thing I thought as I read his post and saw the images of the weasels was that he should make a game of it. Maybe name it Bitborn.

bdcravens 1/23/2026|||

> I don't get the widespread hatred of Gas Town.

Fear over what it means if it works.

mrkeen 1/23/2026|||

I work in a typical web app company which does accounting/banking etc.

A couple of days ago I was sitting in a meeting of 10-15 devs, discussing our AI agents. People were raising issues and brainstorming ways around the problems with AI. How to make the AI better.

Our devs were occupied doing AI things, not accounting/banking things.

If the time savings were as promised, we should have been 3 devs (with the remaining devs replaced by 7-10 AI agents) discussing accounting/banking.

If Gas Town succeeds, it will just be the next toy we play with instead of doing our jobs.

square_usual 1/23/2026|||

Isn't that fun though? We get paid to fuck around. People say AI is putting devs out of jobs, I say we're getting paid to play with them and see if there's any value there. This is no different from the dev tools boom of the ZIRP era: I remember having several sprints worth of work just integrating the latest dev tool whose sales team won our execs over.

This is only partly tongue in cheek :P

turtlebits 1/23/2026||||

Who wants to do grunt work when you can play architect to a bunch of robots?

Its like the ultimate RTS, plus you get paid.

stronglikedan 1/23/2026||||

Playing with new toys is part of doing my job. In my shop, we call them "ooh shiny"'s. Most devs are in the same boat, but I feel bad for those that aren't.

bdcravens 1/24/2026|||

Sounds like more of an issue of corporate meeting culture.

xyzsparetimexyz 1/24/2026|||

Has it written anything of quality?

cap11235 1/24/2026||

beads is a 275k line todo tracker (probably more now). Yeggae is proud to have never read the source. I'm sure its high quality.

DonHopkins 1/25/2026|||

I really don't get the point. An LLM can easily, flexibly, and masterfully track commented hierarchal yaml todo lists without breaking a sweat, with zero lines of code.

It's like writing a 275k line C++ program just to printf("You are absolutely correct!") when ChatGPT can do that for you with a one line prompt in just one shot.

cap11235 1/24/2026|||

Anyone using beads should switch to something else that isn't insane. If you like beads, https://github.com/hmans/beans works the same (not my project), just that its serdes is markdown files with front matter, in a dot folder. Like every sane solution. No daemons, no sync branches. I cannot guarantee the project at all, but at least its better than beads. Or make your own; this is one example of one such project.

q3k 1/24/2026|||

It reads like the ramblings of a smart person experiencing a psychotic episode.

wahnfrieden 1/24/2026||

First Yegge read?

antonvs 1/24/2026||

The best thing about LLMs is that they can summarize Yegge posts to extract any actually useful content.

sailfast 1/24/2026|||

I didn't read this article as hate at all, FWIW. It was a pretty measured review of what it is and what it isn't with some much clearer diagrams of the mental models.

cap11235 1/24/2026||

[flagged]

freedomben 1/23/2026|||

Links to Steve's writeup for Gas Town for those who don't have them yet:

[Medium post]: https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16d...

[HN Discussion]: https://news.ycombinator.com/item?id=46458936

matkoniecz 1/24/2026||

And https://steve-yegge.medium.com/bags-and-the-creator-economy-... where you can read about its author scamming people

FergusArgyll 1/24/2026||

I don't understand what's "scammy" about a rugpull. What did the "investors" expect? That lolcoin would become a cash flow positive business and disburse dividends?

matkoniecz 1/25/2026||

The part where you lie to people and falsely describe it as investing and promise/imply/mention profits.

Rugpull is a scam by definition, being confused why scam is scammy seems weird.

joquarky 1/25/2026|||

> I often feel like our industry has lost its sense of whimsy and experimentation from the early days, when people tried weird things to see what would work and what wouldn't.

The gold rush brogrammers took over. They only care about money and they have displaced most of the more whimsical (but competent) "nerds" over the past decade.

itsafarqueue 1/23/2026|||

It’s not the whimsy. It’s that the whimsy is laced with casual disdain, a touch too much “let me buy you a stick of gum and show you how to chew it”, a frustrated tenor never stated but dog whistled “you dumb fucks”. A soft sharp stink of someone very smart shoving that fact in your face as they evangelise “the obvious truth” you’re too stupid to see.

And maybe he’s even right. But the reaction is to the flavour of chip on the shoulder delivery mixed into an otherwise fun piece.

cap11235 1/24/2026||

Don't forget a bit of crypto! People are being way to nice going "I don't understand, but ...". Fuck him.

inadequatespace 1/25/2026|||

> Maybe it's because we also have suits telling us we have to use neural nets everywhere for everything Or Else, and there's no sense of fun in that.

Yes, and using it a justification to offshore/ layoff

DonHopkins 1/24/2026|||

Hey, I didn't get a harumph out of that VSE crossing his arms at me!

https://www.youtube.com/watch?v=g2Bp8SqYrnE

PKop 1/25/2026|||

> it's clear that this is a big fun experiment.

No it's not clear, because at every turn we're told we're supposed to take it seriously, that there's something there there and that's it's a very real hint at some very real future not whimsical nonsense made for a laugh. You can see this in Steve's writing, calling out the non-believers. Then when you call the bluff, well "it's just a prank bro chill out."

> It pushes and crosses boundaries

What does this mean? This is fluff talk nonsense.

Something that's burning through thousands of dollars, producing what exactly?, is deserving of our respect why?

HDThoreaun 1/26/2026||

Why can't you take experiments seriously? It's a prediction of what the future could look like, not a production ready tool. If youre problem with it is "they took our jobs" sure that makes sense, but if youre problem is that it is a crappy tool then youre not looking at it correctly.

guelo 1/23/2026|||

This is just not true. Yegge is serious and thinks Gas Town is the next big thing.

Keyframe 1/24/2026|||

He did us once with Javascript prophecy. Has this man no decency?? :)

cap11235 1/24/2026|||

For his income, yes.

DonHopkins 1/24/2026|||

Hi mediaman! I'm totally there with you and Steve on the whimsy and experimentation! And your tolerant attitude gives me the Dutch courage to post this.

I've been reading Yegge since the "Stevey's Drunken Blog Rants™" days -- his rantings on Lisp, Emacs, and the Eval Empire shaped how I approach programming. His pro-LLM-coding rants were direct inspiration for my own work on MOOLLM. The guy has my deep respect, and I'm intrigued by his recent work on Sourcegraph and Gas Town.

Gas Town and MOOLLM are siblings from that same Eval Empire -- both oriented along the Axis of Eval, both transgressively treating LLMs as universal interpreters. MOOLLM immanentizes Eval Incarnate -- https://github.com/SimHacker/moollm/blob/main/designs/eval/E... -- where skills are programs, the LLM is eval(), and play is but the first step of the "Play Learn Lift" methodology: https://github.com/SimHacker/moollm/tree/main/skills/play-le....

The difference is resource constraints. Yegge has token abundance; I'm paying out of pocket. So where Gas Town explores "what if tokens were free?" (20-30 Claude instances overnight), MOOLLM explores "what if every token mattered?" Many agents, many turns, one LLM call.

To address wordswords2's concern about "no metrics or statistics" -- I agree that's a gap in Gas Town. MOOLLM makes falsifiable claims with receipts. Last night I ran an Amsterdam Fluxx Marathon stress test: 116+ turns, 4 characters (120+ character-turns per LLM call), complex social dynamics on top of dynamic rule-changing game mechanics. Rubric-scored 94/100. The run files exist. Anyone can audit.

qcnguy's critique ("same thing multiplied by ten thousand") is exactly the kind of specific feedback that helps systems improve. I wrote a detailed analysis comparing the two approaches -- intellectual lineage (Self, Minsky's K-lines, The Sims, LambdaMOO), the "vibecoded" problem (MOOLLM is LLM-generated but rigorously iterated, not ship-and-hope), and why "carrier pigeon" IPC architecture is a dark pattern when LLMs can simulate many agents at the speed of light.

an0malous raises a real fear about bosses thinking "throw agents at it" replaces engineering. Both systems agree: design becomes the bottleneck. Gas Town says "keep the engine fed with more plans." MOOLLM says "design IS the point -- make it richer." Different answers, same problem.

lowbloodsugar mentions building a "proper, robust, engineering version" -- I'd love to compare notes. csallen is right that "future" doesn't mean "production-grade today."

Analysis: https://github.com/SimHacker/moollm/blob/main/designs/GASTOW...

MOOLLM repo: https://github.com/SimHacker/moollm

Happy to discuss tradeoffs or hear where my claims don't hold up. Falsifiable criticism welcome -- that's how systems improve.

DonHopkins 1/24/2026||

Adventure Uplift — Building a YAML-to-Web Adventure Compiler with Simulated Computing Pioneers:

I ran a 260KB session log where I convened a simulated symposium of computing pioneers to design an Adventure Compiler — a tool that compiles YAML adventure definitions that run on MOOLLM under cursor into standalone deterministic browser games requiring no LLM at runtime.

The twist: the "attendees" include AI-simulated tributes to Will Wright, Alan Kay, Marvin Minsky, Seymour Papert, Ted Nelson, Ken Kahn, Gary Drescher, and 25+ others — both living legends and memorial candles for those who've passed. All clearly marked as simulated tributes, not transcripts.

What emerged from this thought experiment:

- Pie menus as the universal interface (rooms, inventory, dialogue trees)

- Sims-style needs system with YAML Jazz inner voice ("hunger: 1 # FOOD. FOOD. FOOD.")

- Prototype-based objects (Self/JavaScript delegation chains)

- Schema mechanism + LLM = "teaching them to fly"

- Git as the collaboration operating system

- ToonTalk-inspired "programming by petting" for terpene kittens

- Speed of Light simulation — the opposite of "carrier pigeon" multi-agent architectures

On that last point: most multi-agent systems use message passing between separate LLM calls. Agent A generates output, it gets detokenized to text, sent over IPC, retokenized into Agent B's context. MOOLLM inverts this. Everything happens in one LLM call.

The spatial MOO map (rooms connected by exits) provides navigation, but communication is instantaneous within a call. Many agents, many turns, zero latency between them — and zero token requantization or semantic noise from successive detokenization/tokenization loops.

The session includes adversarial brainstorming where Barbara Liskov challenges schema contracts, James Gosling questions performance, Amy Ko pushes accessibility, and Bret Victor demands immediate feedback. Each critique gets a concrete response.

Concrete outputs: a working linter, architecture decisions, 53 indexed topics from "Food Oriented Programming" to "Hidden Objects as Invisible Infrastructure."

This is MOOLLM's Play-Learn-Lift methodology in action — play with ideas, extract patterns, lift into reusable skills and efficient scripts.

Session log (260KB, 8000+ lines): https://github.com/SimHacker/moollm/blob/main/examples/adven...

MOOLLM repo: https://github.com/SimHacker/moollm

The session uses representation ethics guidelines — all simulated people are clearly marked, deceased figures invoked with memorial candles, and the framing is explicitly "educational thought experiment."

Happy to discuss the ethics of simulating people, the architecture decisions, or how this relates to my earlier Gas Town comparison post.

DonHopkins 1/24/2026||

In the simulated discussion guest book entry, simulated Douglass Engelbart wrote:

>Doug Engelbart (Augmentation): "Bootstrapping. The tools that build the tools. Your adventure compiler should be able to compile ITS OWN documentation into an adventure ABOUT how it works. The manual is a playable game."

That is exactly how the self documenting categorized skill directory/room works -- the directory is a room with subdirectories for every skill, themselves intertwingled rooms, which form a network you can navigate around via k-lines (see also tags).

Here is the skills dir, with the ROOM.yml file that makes it a room (like COM QueryInterface works: multiple interfaces available for a class, for multiple aspects of it, the directory is IUnknown and you can QI by looking for known interfaces like ROOM.yml, CHARACTER.yml, CONTAINER.yml that inherit from the corresponding skills).

And the README.md file is naturally the ubiquitous human readable documentation (also great for LLM deep dives). And github kindly formats and publishes README.md on every repo directory page, supporting mermaid diagrams, etc):

MOOLLM Skills dir:

https://github.com/SimHacker/moollm/tree/main/skills

MOOLLM Skills room, with skill K-Line navigation protocol:

https://github.com/SimHacker/moollm/blob/main/skills/ROOM.ym...

  # ROOM.yml — The Skill Nexus
  #
  # This is a ROOM — a metaphysical library where all capabilities live.
  # Every skill is a book that teaches itself when you read it.
  # Every cluster is a shelf of related knowledge.
  # Every ensemble is a team that works together.

To go meta, you can enter the Skill Skill (skills/skill), an extended MOOLLM meta-skill that knows all about creating new skills (via the constructionist "Play Learn Lift" strategy), and importing and upgrading Anthropic skills:

https://github.com/SimHacker/moollm/tree/main/skills/skill

And here is a narrative session of me giving a tour of the category and skill networks by hopping around through K-Lines!

MOOLLM currently has 103 Anthropic compatible but extended skills (using 7 MOOLLM extensions, like CARD.yml, K-Lines, Self Prototypes and Delegation, etc).

Session Log: K-Line Connections Safari:

https://github.com/SimHacker/moollm/blob/main/examples/adven...

Eight luminaries have been summoned as Hero-Story familiars — not puppets, but conceptual guides whose traditions we invoke. Each carries the K-lines they pioneered. [...]

ENTERING THE SKILL NEXUS

You push through a shimmering membrane and step into the Skill Nexus.

The space is impossible — a vast spherical chamber where books float in mid-air, orbiting a central point of warm golden light. But these aren't books. They're SKILLS. Living documents that teach themselves when you read them.

Lines of golden light connect related skills. Each connection pulses with meaning. This isn't a library — it's a constellation of knowledge.

Your companions materialize beside you:

Marvin Minsky adjusts his glasses, looking around with evident satisfaction.

"Ah! K-lines made manifest. Each of these floating tomes is a knowledge structure. Touch one and it reactivates an entire constellation of associations. I wrote about this in 1985, but I never imagined seeing it rendered so... literally."

Ted Nelson is already examining the golden threads between skills.

"Two-way links! Every connection goes BOTH directions. When skill A references skill B, skill B knows about skill A. This is what I've been trying to explain since 1965! Everything is deeply intertwingled!"

James Burke turns to address an invisible camera.

"You're looking at the Skill Nexus. A room where every door leads to another room, and every room has doors to everywhere else. But here's the thing — the signs above each door tell you WHY. Not just where you're going, but what connects HERE to THERE. That's what we're going to explore."

Palm scampers up to a floating skill-book labeled "incarnation" and hugs it.

"This is where I became REAL! Don spoke the wish, the tribunal approved, and I wrote my own soul."

q3k 1/24/2026||

[flagged]

DonHopkins 1/24/2026||

[flagged]

rulelet 1/24/2026|||

We have a different take than Gastown. If AI behaves unreliably and unpredictably, maybe the problem is the ask. So we looked at backend code and decided it was time to bring in more declarative programming. We are already halfway there with declarative frontend (React) and declarative database (SQL). Functional programming is an answer, but functional programming didnt replace object oriented programming because of practical reasons.

So even if the super serious engineers are serious, they should watch their back. Eventually enough guardrails will be created or even the ask will change enough for a lot of automation to happen. And make no mistake, it is automation no different than having automated testing replace armies of manual testing or code generation or procedural generation or any other machine method. And who is going to be left with jobs? People who embrace the change, not people who lament for the good old days or who can't adapt.

Sucks but just how the world works. Sit on the bleeding edge or be burned. Yes there is an "enough" but I suspect enough is around people willing to look at Gastown or even make their own Gastown, not the other side.

AtlasBarfed 1/23/2026|||

Yeah where he probably Burns like a million dollars of money.

Just for fun!

walthamstow 1/23/2026||

He's paying $600 a month for 3x Claude Max subs. It's in his article.

toraway 1/23/2026|||

…and now funded by a $GAS crypto coin on the BAGS platform so it even pays for itself!

https://steve-yegge.medium.com/bags-and-the-creator-economy-...

walthamstow 1/23/2026||

What a tasty disclosure section that is

ares623 1/23/2026|||

It's a "let them eat cake" write up.

Johnny_Bonk 1/23/2026|||

Yeah it's unbelievably tiresome, endless complaints from people pushing up their glasses complaining, ITS A PROJECT ABOUT POLECATS CALLED GAS TOWN MADE FOR FUN, read that again, either admire it and enjoy it or quit the umpteenth complaint about vibecoding.

NedF 1/23/2026||

[dead]

usefulposter 1/23/2026||

>while Yegge made lots of his own ornate, zoopmorphic [sic] diagrams of Gas Town’s architecture and workflows, they are unhelpful. Primarily because they were made entirely by Gemini’s Nano Banana. And while Nano Banana is state-of-the-art at making diagrams, generative AI systems are still really shit at making illustrative diagrams. They are very hard to decipher, filled with cluttered details, have arrows pointing the wrong direction, and are often missing key information.

So true! Not to mention the garbled text and inconsistent visuals across the diagrams———an insult to the reader's intelligence. How do people tolerate this visual embodiment of slurred speech?

toraway 1/23/2026||

Yeah I couldn’t figure out if they were just intended as illustrations and gave up trying to read them after a while.

Which is unfortunate as it would have been really helpful to have actually legible architecture diagrams, given the prose was so difficult for me to untangle due to the manic “fun” irreverent style (and it’s fine to write with a distinctive voice to make it more interesting, but still … confusing).

Plus the dozens of new unique names and connections introduced every few paragraphs to try to keep in my head…

I first asked Gemini 3 Pro to condense it to a boring technical overview and it produced a single page outline and Mermaid diagrams that were nearly as unintelligible as the original post so even AI has issues digesting it apparently…

cap11235 1/24/2026||

[flagged]

sethaurus 1/24/2026|||

Thrilled to see someone else using a triple-em dash in the wild⸻keep holding the line.

falcor84 1/24/2026|||

I generally am a fan of polished writing, but I do believe that there's room for quickly fired experimental stuff, and quite enjoyed this piece. With the speed he was going, I wouldn't be surprised if the system architecture actually changed in between subsequent sections of the post. It's not a scientific article, but just a cross-country runner at the top of his game giving us a quick update without breaking his stride, and I'm all here for that.

As Basil Exposition said "I suggest you don’t worry about this sort of thing and just enjoy yourself".

usefulposter 1/24/2026||

Sure. The critique is with his incoherently labeled images, not his prose or passion project.

If LLMs can't produce readable technical diagrams ("Figure n"), avoid them for diagramming.

Don't insult the reader with strings like Oarity, Ed3e Csess, Ouclstnoc, relinemen, ressore Critieal, Foll Throughput, Witnese/Refin, ecstate ta Mayor, and Hsoide Fulures Stafey Fen.

https://miro.medium.com/v2/resize:fit:1400/format:webp/1*X3z...

https://miro.medium.com/v2/resize:fit:1400/format:webp/1*7Cr...

https://miro.medium.com/v2/resize:fit:1400/format:webp/1*blw...

rsynnott 1/24/2026||

I like how ‘refine 1’ is correctness, followed by ‘refine 2’, which is of course ‘oarity’.

I’m convinced that AI has totally broken the brains of its users; how on earth could you look at this thing and think “yes, that is a reasonable thing to post in public”? For whatever reason, for many users, the lesson of the AI revolution seems to be “just produce arbitrarily shoddy nonsense, it’s fine, nobody cares anyway”… The worrying thing is that this may be _true_.

antonvs 1/24/2026||

> How do people tolerate this visual embodiment of slurred speech?

99% of diagrams essentially act as little more than decoration.

> zoopmorphic

My next software product is gonna be zoopmorphic. VCs'll love it!

MrOrelliOReilly 1/23/2026||

The author's high-value flowcharts vs Steve Yegge's AI art is enough of a case-in-point for how confusing his posts and repos are. However this is a pervasive problem with AI coding tools. Unsurprisingly, the creators of these tools are also the most bullish about agentic coding, so the source code shows the consequences. Even Claude Code itself seems to experience an unusually high number of regressions or undocumented changes for such a widely used product. I had the same problem when recently trying to understand the details of spec-kit or sprites from their docs. Still, I agree that Gas Town is a very instructive example of what the future of AI coding will look like. I'm confident mature orchestration workflows will arrive in 2026.

zingar 1/23/2026|

Also struggling with sprites, I thought it was just me!

sandinmyjoints 1/23/2026||

Lots of comments about Gas Town (which I get, it's hard not to talk about it!), but I thought this was a pretty good article -- nice job of summing up various questions and suggesting ways to think about them. I like this bit in particular:

> A more conservative, easier to consider, debate is: how close should the code be in agentic software development tools? How easy should it be to access? How often do we expect developers to edit it by hand?

> Framing this debate as an either/or – either you look at code or don’t, either you edit code by hand or you exclusively direct agents, either you’re the anti-AI-purist or the agentic-maxxer – is unhelpful.

> The right distance isn’t about what kind of person you are or what you believe about AI capabilities in the current moment. How far away you step from the syntax shifts based on what you’re building, who you’re building with, and what happens when things go wrong.

athrowaway3z 1/24/2026||

> Buried in the chaos are sketches of future agent orchestration patterns

I'm not sure if there are that many. We need to be vigilant of "it feels useful & powerful", because it's so easy to feel that way.

When I write complex plans, I can tell Claude to spawn agents for each task and I can successfully 1-shot a 30-60 minute implementation.

I've toyed with more complicated patterns, but unlike this speculative fiction, I did need my result both simple and working.

A couple of times now I've had to spend a lot of hours trying to unfuck a design i let slip through. The kind where 1 agent injects some duplicate code/architecture pattern into the system that's correct enough not to be flagged, but wrong enough to forever trip up every subsequent fresh agents that stumble on it.

I tell people my job now is to kick these things every 15 minutes. Its a kinda joke kinda not. But they definitely need kicking. Without, the decoherence of a non-trivial project is too high, and you still need time to know; where and how to kick.

I'm not sure what I'd need to be convinced a higher level of orchestration can do that. I do like to try new things. But my spider-sense is telling me this is a Collatz-conjecture-esque dead-end. People get the feeling of making giant leaps of progress, which anybody using these things should be familiar with by now, but something valuable is always just out of reach with the tools we currently have.

There are some big gains by guiding agents/users to use more sub agents with a clean context - perhaps with some more knobs - but I'd advise against acting under the assumption using grander orchestration tools will inevitably have a positive ROI.

visarga 1/24/2026|||

> either you look at code or don’t, either you edit code by hand or you exclusively direct agents, either you’re the anti-AI-purist or the agentic-maxxer – is unhelpful.

If you're looking at all your code you are just walking the motorcycle. You need tests to automate your eyes. In fact I believe tests and specs are the new product, code can be regenerated at will.

That is why we see vibe coding projects that replicate well specced and implemented products like web browsers, you get both the specs and differential testing for free.

itchingsphynx 1/24/2026||

Maggie had many great articles. Technology x anthropology.

slfnflctd 1/23/2026||

> Yegge deserves praise for exercising agency and taking a swing at a system like this [...] then running a public tour of his shitty, quarter-built plane while it’s mid-flight

This quote sums it all up for me. It's a crazy project that moves the conversation forward, which is the main value I see in it.

It very well could be a logjam breaker for those who are fortunate enough to get out more than they put into it... but it's very much a gamble, and the odds are against you.

shermantanktop 1/23/2026||

Yegge is just running arbitrage on an information gap.

It's the same chasm that all the AI vendors are exploiting: the gap between people who have some idea what is going on and the vast mass of people who don't but are addicted to excitement or fear of the future.

Yegge is being fake-playful about it but if you have read any of his other writing, this tracks. None of it is to be taken very seriously because he values provocation and mischief a little too highly, but bits of it have some ideas worth thinking about.

pydry 1/24/2026|

I wonder if he's being paid.

I detected a noticeable uptick in posts on reddit bragging AI coding in the last month which fit the pattern of other opinion shaping astroturfing projects ive seen before.

If Claude came to me with a bundle of cash and tokens to encourage me to keep the AI coding hype train going I'd also go heavy on the excitability, experimental attitude, humor and irreverence.

I'd also leave a mountain of disclaimers to help protect future me's reputation.

bleepblap 1/24/2026||

He got a cool 50k from some memecoin for this

suriya-ganesh 1/23/2026||

>Yegge is leaning into the true definition of vibecoding with this project: “It is 100% vibecoded. I’ve never seen the code, and I never care to.”

I don't get it. Even with a very good understanding of what type of work I am doing and a prebuilt knowledge of the code, even for very well specced problem. Claude code etc. just plain fail or use sloppy code. How do these industry figures claim they see no part of a 225K+ line of code and promise that it works?

It feels like we're getting into an era where oceans of code which nobody understands is going to be produced, which we hope AGI swoops in and cleans?

jrmg 1/23/2026||

This is also my experience. Everything I’ve ever tried to vibe code has ended up with off-by-one errors, logic errors, repeated instances of incorrect assumptions etc. Sometimes they appear to work at first, but, still, they have errors like this in them that are often immediately obvious on code review and would definitely show up in anything more than very light real world use.

They _can_ usually be manually tidied and fixed, with varying amounts of effort (small project = easy fixes, on a par with regular code review, large project = “this would’ve been easier to write myself...”)

I guess Gas Town’s multiple layers of supervisory entities are meant to replace this manual tidying and fixing, but, well, really?

I don’t understand how people are supposedly having so much success with things like this. Am I just holding it wrong?

If they are having real success, why are there no open source projects that are AI developed and maintained that are _not_ just systems for managing AI? (Or are there and I just haven’t seen them?...)

consumer451 1/24/2026|||

In my comment history can be found a comment much like yours.

Then Opus 4.5 was released. I had already had my CC cluade.md, and Windsurf global rules + workspace rules set up. Also, my main money making project is React/Vite/Refine.dev/antd/Supabase... known patterns.

My point is that given all that, I can now deploy amazing features that "just work," and have excellent ux in a single prompt. I still review all commits, but they are now 95% correct on front end, and ~75% correct on Postgres migrations.

Is it magic? Yes. What's worse is that I believe Dario. In a year or so, many people will just create their own Loom or Monday.com equivalent apps with a one page request. Will it be production ready? No. Will it have all the features that everyone wants? No. But it will do that they want, which is 5% of most SaaS feature sets. That will kill at least 10% of basic SaaS.

If Sonnet 3.5 (~Nov 2024) to Opus 4.5 (Nov 2025) progress is a thing, then we are slightly fucked.

"May you live in interesting times" - turns out to be a curse. I had no idea. I really thought it was a blessing all this time.

kaydub 1/23/2026||||

Yeah, it sounds like "you're holding it wrong"

Like, why are you manually tidying and fixing things? The first pass is never perfect. Maybe the functionality is there but the code is spaghetti or untestable. Have another agent review and feed that review back into the original agent that built out the code. Keep iterating like that.

My usual workflow:

Agent 1 - Build feature Agent 2 - Review these parts of the code, see if you find any code smells, bad architecture, scalability problems that will pop up, untestable code, or anything else falling outside of modern coding best practices Agent 1 - Here's the code review for your changes, please fix Agent 2 - Do another review Agent 1 - Here's the code review for your changes, please fix

Repeat until testable, maybe throw in a full codebase review instead of just the feature.

Agent 1 - Code looks good, start writing unit tests, go step by step, let's walk through everything, etc. etc. etc.

Then update your .md directive files to tell the agents how to test.

Voila, you have an llm agent loop that will write decent code and get features out the door.

joshstrange 1/23/2026|||

I'm not trying to be rude here at all but are you manually verifying any of that? When I've had LLMs write unit tests they are quick to write pointless unit tests that seem impressive "2123/2123 tests passed!" but in reality it's testing mostly nothing of value. And that's when they aren't bypassing commit checks or just commenting out tests or saying "I fixed it all" while multiple tests are broken.

Maybe I need a stricter harness but I feel like I did try that and still didn't get good results.

kaydub 1/23/2026|||

I feel like it was doing what you're saying about 4-6 months ago. Especially the commenting out tests. Not always but I'd have to do more things step by step and keep the llm on track. Now though, the last 3-4 months, it's writing decent unit tests without much hand holding or refactors.

joshstrange 1/23/2026|||

Hmm, my last experience was within the last 2 months but I'm trying not to write it off as "this sucked and will always suck", that's the #1 reason I keep testing and playing with these things, the capabilities are increasing quickly and what did/didn't work last week (especially "last model") might work this week.

I'll keep testing it but that just hasn't been my experience, I sincerely hope that changes because an agent that runs unit test [0] and can write them would be very powerful.

[0] This is a pain point for me. The number of times I've watching Claude run "git commit --no-verify"... I've told it in CLAUDE.md to never bypass commit checks, I've told it in the prompt, I've added it 10 more times in different places in CLAUDE.md but still, the agent will always reach for that if it can't fix something in 1-3 iterations. And yes, I've told it "If you can't get the checks to pass then ask me before bypassing the checks".

It doesn't matter how many guardrails I put up and how good they are if the agent will lazily bypass them at the drop of a hat. I'm not sure how other people are dealing with this (maybe with agents managing agents and checking their work? A la Gas Town?).

kaydub 1/23/2026|||

I haven't seen your issue, but git is actually one of the things I don't have the llm do.

When I work on issues I create a new branch off of master, let the llm go to town on it, then I manually commit and push to remote for an MR/PR. If there are any errors on the commit hooks I just feed the errors back into the agent.

joshstrange 1/23/2026||

Interesting, ok, I might try that on my next attempt. I was trying to have it commit so that I could use pre-commit hooks to enforce things I want (test, lint, prettier, etc) but maybe instead I should handle that myself and make it more explicit in my prompts/CLAUDE.md to test/lint/etc. In reality I should just create a `/prep` command or similar that asks it to do all of that so that once it thinks it's done, I can quickly type that and have it get everything passing/fixed and then give a final report on what it did.

toraway 1/23/2026|||

You’ll likely have the same issue relying on CLAUDE.md instructions to test/lint/etc, mine get ignored constantly to the point of uselessness.

I’m trying to redesign my setup to use hooks now instead because poor adherence to rules files across all the agentic CLIs is exhausting to workaround.

(and no, Opus 4.5 didn’t magically solve this problem to preemptively respond to that reply)

kaydub 1/23/2026||

What do your rules files look like?

I wonder if some people are putting in too much into their markdown files of what NOT to do.

I hate people saying the llms are just better auto-correct, but in some ways they're right. I think putting in too much "don't do this" is leading the llm down the path to do "this" because you mentioned it at all. The LLM is probabilistically generating it's response based on what you've said and what's in the markdown files, the fact you put some of that stuff in there at all probably increases the probability those things will show up.

kaydub 1/23/2026|||

In my projects there's generally a "developer" way to do things and an "llm agent" way to do things.

For the llm a lot of linting and build/test tools go into simple scripts that the llm can run and get shorthand info out of. Some tools, if you have the llm run them, they're going to ingest a lot from the output (like a big stacktrace or something). I want to keep context clean so I have the llm create the tool to use for build/test/linting and I tell it to create it so the outputs will keep its context clean, then I have it document it in the .md file.

When working with the LLM I have to start out pretty explicit about using the tooling. As we work through things it will start to automatically run the tooling. Sometimes it will want to do something else, I just nudge it back to use the tooling (or I'll ask it why or if there are benefits to the other way and if there are we'll rebuild the tooling to use the other way).

Finally, if the LLM is really having trouble, I kill the session and start a new one. It used to feel bad to do that. I'd feel like I'm losing a lot of info that's in context. But now, I feel like it's not so bad... but I'm not sure if that's because the llms are better or if my workflow has adapted.

Now, let me backup a little bit. I mentioned that I don't have the llm use git. That's the control I maintain. And with that my workflow is: llm builds feature->llm runs linters/tests->I e2e test whatever I'm building by deploying to a dev/staging/local env->once verified I commit. Now I will continue that context window/session until I feel like the llm starts fucking up. Then I kill the session and start a new one. I rarely compact, but it does happen and I generally don't fret about it too much.

I try to keep my units of work small and I feel like it does the best when I do. But then I often find myself surprised at how much it can do from a single prompt, so idk. I do understand some of the skepticism because a lot of this stuff sounds "hand-wavy". I'm hoping we all start to hone in on some general more concrete patterns but with it being so non-deterministic I'm not sure if we will. It feels like everyone is using it differently and people are having successes and failures across different things. People where I work LOVE MCPs but I can't stand them. When I use them it always feels like I have to remind the llm that it has an MCP, then it feels like the MCP takes too much context window and sometimes the llm still trips over how to use it.

joshstrange 1/23/2026||

Ok, that's a good tip about separate tools/scripts for the LLM, I did something similar less than a year ago so that I kept lint/test output to a minimum but it was still invoked via git hooks. I'll try again with scripts next time I'm doing this. My hope was to let the agent commit to a branch (with code that passed lint/test/prettier/etc), push it, auto-deploys to preview branches, and then that's where I'd do my e2e/QA and once I was happy I could merge it and it get deployed to the main site.

Shebanator 1/24/2026||||

I discussed approaches in my earlier reply. But what you are saying now makes me think you are having problems with too much context. Pare down your CLAUDE.md massively and never let you context usage get over 60-65%. And tell CLAUDE not to commit anything without explicit instructions from you (unless you are working in a branch/worktree and are willing to throw it all away).

mh2266 1/24/2026|||

put a `git` script in `PATH` that simply errors out i.e.:

    if "--no-verify" in sys.args:
        println("--no-verify is not allowed, file=sys.stderr)
        sys.exit(1)

and otherwise forwards to the underlying `git`

Paracompact 1/24/2026|||

Literally yesterday I was using Claude for writing a SymPy symbolic verification of a mathematical assertion it was making with respect to some rigorous algebra/calculus I was having it do for me. This is the best possible hygiene I could adopt for checking its output, and it still failed to report on results correctly.

After manual line-by-line inspection and hand-tweaks, it still saved me time. But it's going to be a long, long time before I no longer manually tweak things or trust that there are no silent mistakes.

Shebanator 1/24/2026||||

Those kinds of errors were super common 4-6 months ago, but LLM quality moves fast. Nowadays I don't see these very often at all. Two things that make a huge difference: work on writing a spec first. github.speckit, GSD, BMAD, whatever tool you like can help with this. Do several passes on the spec to refine it and focus on the key ideas.

Now that you have a spec, task it out, but tell the LLM to write the tests first (like Test-Driven Development, but without all the formalisms). This forces the LLM to focus on the desired behavior instead of the algorithms. Be sure to focus on tests that focus on real behavior: client apis doing the right error handling when you get bad input, handling tricky cases, etc. Tell the system not to write 'struct' tests - checking that getters/setters work isn't interesting or useful.

Then you implement 1-3 tasks at a time, getting the tests to pass. The rules prevent disabling tests, commenting out tests, and, most importantly, changing the behavior of the tests. Doesn't use a lot of context, little to no hallucinating, and easily measurable progress.

enraged_camel 1/23/2026|||

>> When I've had LLMs write unit tests they are quick to write pointless unit tests that seem impressive "2123/2123 tests passed!" but in reality it's testing mostly nothing of value.

This has not happened to me since Sonnet 4.5. Opus 4.5 is especially robust when it comes to writing tests. I use it daily in multiple projects and verify the test code.

joshstrange 1/23/2026||

I thought I did use Opus 4.5 when I tested this last time but I might have still been on the $20 plan and I cannot remember if you get any Opus 4.5 on that in Claude Code (I thought you did with really low limits?), so maybe I wasn't using Opus 4.5, I will need to try again.

kapimalos 1/23/2026|||

I haven’t used multi-agent set up yet but it’s intriguing.

Are you using Claude Code? How do you run the agents and make them speak?

kaydub 1/23/2026||

Let me clarify actually, I run separate terminals and the agents are separated. I think claude code cli is the best. But at home I pay per token. I have a google account and I pay for chatgpt. So I often use codex and gemini cli in tandem. I'll copy + paste stuff between them sometimes or I'll have one review the changes or just the code in general and then feed the other with the outputs. I'll break out claude code for specific tasks or when I feel like gemini/chatgpt aren't quite doing the job right (which has gotten rarer the past few months).

I messed around with separate "agents" in the same context window for a while. I even went as far as playing with strands agents. Having multiple agents was a crapshoot.

Sometimes they'd work great, but sometimes they start working on the same files at the same time, argue with each other, etc. I'd always get multiple agents working, at least how I assumed they should work, by telling the llm explicitly what agents to create and what work to pass off to what agents. And it did a pretty poor job of that. I tried having orchestration agents, but at a certain point the orchestration agent would just takeover and do everything. So I'm not big on having multiple agents (in theory it sounds great, especially since they are supposed to each have their own context window). When I attempted doing this kind of stuff with strands agents it honestly felt like I was trying to recreate claude, so I just stick with plain cli llm tools for now.

pdntspa 1/23/2026|||

I worry about people who use this approach where they never look at the code. Vibe-coding IS possible but you have to spent a lot of time in plan mode and be very clear about architecture and the abstractions you want it to use.

I've written two seperate moderately-sized codebases using agentic techniques (oftentimes being very lazy and just blanket approving changes), and I don't encounter logic or off-by-one errors very often if at all. It seems quite good at the basic task of writing working code, but it sucks at architecture and you need occasional code review rounds to keep the codebase tidy and readable. My code reviews with the AI are like 50% DRY and separating concerns

johnmaguire 1/23/2026||

In a recent Yegge interview, he mentions that he often throws away the entire codebase and starts from scratch rather than try to get LLMs to refactor their code for architecture.

kami23 1/23/2026|||

This has been my best way to learn, put one agent on a big task, let it learn things about the problem and any gotchas, and then have it take notes, do it again until I'm happy with the result, if in the middle I think there's two choices that have merit I ask for a subagent to go explore that solution in another worktree and to make all its own decisions, then I compare. I also personally learn a lot about the problem space during the process so my prompts and choices on us sequent iterations use the right language I need to use.

d1sxeyes 1/23/2026|||

Honestly, in my experience so far, if an LLM starts going down a bad path, it’s better just to roll back to a point where things were OK and throw away whatever it was doing, rather than trying to course correct.

kaydub 1/23/2026|||

I don't get you guys that are getting such bad results.

Are you guys just trying to one shot stuff? Are you not using agents to iterate on things? Are you not putting agents against each other (have one code, one critique/test the code, and put them in a loop)?

I still look at the code that's produced, I'm not THAT far down the "vibe coding" path that I'm trusting everything being produced, but I get phenomenal results and I don't actually write any code any more.

So like, yeah, first pass the llm will create my feature and there's definitely some poorly written code or duplicate code or other code smells, but then I tell another agent to review and find all these problems. Then that review gets fed back in to the agent that created the feature. Wham, bam, clean code.

I'm not using gastown or ralph wiggum ($$$) but reading the docs, looking over how things work, I can see how it all comes together and should work. They've been built out to automatically do the review + iteration loop that I do.

arrowleaf 1/23/2026|||

My feeling has been that 'serious' software engineers aren't particularly suited to use these tools. Most don't have an interest in managing people or are attracted to the deterministic nature of computing. There's a whole psychology you have to learn when managing people, and a lot of those skills transfer to wrangling AI agents from my experience.

You can't be too prescriptive or verbose when interacting with them, you have to interact with them a bit to start understanding how they think and go from there to determine what information or context to provide. Same for understanding their programming styles, they will typically do what they're told but sometimes they go on a tangent.

You need to know how to communicate your expectations. Especially around testing and interaction with existing systems, performance standards, technology, the list goes on.

kaydub 1/23/2026||

All our best performing devs/engineers are using the tools the most.

I think this is something a lot of people are telling themselves though, sure.

lknuth 1/23/2026||

Best performing by what metric? There aren't meaningful ways to measure engineer "performance" that makes them comparable as far as I know.

kaydub 1/23/2026||

Your org doesn't track engineering impact?

What about git stats?

I can tell you the guys that are consistently pushing code AND having the biggest impact are using LLM tools.

direwolf20 1/24/2026||

Are we measuring productivity by lines of code again? This was treated as unserious for decades.

kaydub 1/24/2026||

Why ignore where I mention engineering impact??? Come on, be real here

ozozozd 1/24/2026|||

What git stats do you have that show “impact”?

The OP was right to assume it was lines of code. Another assumption could be number of commits, which also doesn’t measure impact.

HDThoreaun 1/26/2026||

Track engineering impact and git stats were two separate suggestions in that comment. Every org tracks impact through performance reviews.

matkoniecz 1/24/2026|||

Probably because you mentioned "git stats".

What you meant by that?

kaydub 1/24/2026||

High number of days with commits, merging and shipping code consistently (some people/project will ship multiple times a day/week, some projects move a little slower).

That plus the completion of high impact projects makes good strong engineers.

Those are the people I see using LLMs

direwolf20 1/24/2026||

So quantity of code?

habinero 1/23/2026||||

It lets 0.05X developers be 0.2X developers and 1X developers be 0.9-1.1X developers.

The problem is some 0.05X developers thought they were 0.5X and now they think they're 2X.

kaydub 1/23/2026||

Nah, our best devs/engineers use the tools the most.

In my real life experience it's been the middling devs that always talk about "ai slop" and how the tools can't do their jobs.

enraged_camel 1/23/2026|||

On our team there's a very clear distinction between three groups:

- those who have embraced AI and learned to use it well

- those who have embraced AI but treat it as a silver bullet

- those who reject AI

First group is by far the most productive and adds the most value to the team.

kaydub 1/23/2026|||

Yeah, it's similar where I'm at.

If anything the silver bullet people are mostly managers and C levels... some of which don't even use the tools themselves.

Of the devs that rejected it at first, the ones with the same sentiment I'm seeing online in threads like these, we forced one to give it a try. He now treats totters between using it well and treating it as a silver bullet. I still hear him incredulous about the things claude does at meetings, "I had to do <thing> and I thought I'd let claude get a crack at it... did it in one shot"

habinero 1/24/2026|||

I mean, that fits with what I said.

habinero 1/24/2026|||

I mean, not all workplaces hire the best.

alecbz 1/23/2026||||

I have some success but by the time I'm done I'm often not sure if I saved any time.

sjajshha 1/23/2026|||

My (former) coworker who’s heavy into this stuff produced a lot of unmaintainable slop on his way out while singing agents praises to hire-ups. He also felt he was getting a lot of value and had no issues.

kaydub 1/23/2026||

[flagged]

joshstrange 1/23/2026|||

Where is the "super upvote button" when you need it?

YES! I have been playing with vibe coding tools since they came out. "Playing" because only on rare occasions have I created something that is good enough to commit/keep/use. I keep playing with them because, well I have a subscription, but also so I don't fall into the fuddy-duddy camp of "all AI is bad" and can legitimately speak on the value, or lack thereof, of these tools.

Claude Code is super cool, no doubt, and with _highly targeted_ and _well planned_ tasks it can produce valuable output. Period. But, every attempt at full-vibe-coding I've done has gotten hung up at some point and I have to step in an manually fix this. My experience is often:

1. First Prompt: Oh wow, this is amazing, this is the future

2. Second Prompt: Ok, let me just add/tweak a few things

10. 10th prompt: Ugh, everytime I fix one thing, something else breaks

I'm not sure at all what I'm doing "wrong". Flogging the agents along doesn't not work well for me or maybe I am just having trouble letting go of the control and I'm not flogging enough?

But the bottom line is I am generally shocked that something like Gas Town was able to be vibe-coded. Maybe it's a case of the LLM overstating what it's accomplished (typical) and if you look under the hood it's doing 1% of what it says it is but I really don't know. Clearly it's doing something, but then I sit over here trying to build a simple agent with some MCPs hooked up to it using a LLM agent framework and it's falling over after a few iterations.

dceddia 1/23/2026|||

So I’m probably in a similar spot - I mostly prompt-and-check, unless it’s a throwaway script or something, and even then I give it a quick glance.

One thing that stands out in your steps and that I’ve noticed myself- yeah, by prompt 10, it starts to suck. If it ever hits “compaction” then that’s beyond the point of return.

I still find myself slipping into this trap sometimes because I’m just in the flow of getting good results (until it nosedives), but the better strategy is to do a small unit of work per session. It keeps the context small and that keeps the model smarter.

“Ralph” is one way to do this. (decent intro here: https://www.aihero.dev/getting-started-with-ralph)

Another way is “Write out what we did to PROGRESS.md” - then start new session - then “Read @PROGRESS.md and do X”

Just playing around with ways to split up the work into smaller tasks basically, and crucially, not doing all of those small tasks in one long chat.

joshstrange 1/23/2026||

I will check out Ralph (thank you for that link!).

> Another way is “Write out what we did to PROGRESS.md” - then start new session - then “Read @PROGRESS.md and do X”

I agree on small context and if I hit "compacting" I've normally gone too far. I'm a huge fan of `/clear`-ing regularly or `/compact <Here is what you should remember for the next task we will work on>` and I've also tried "TODO.md"-style tracking.

I'm conflicted on TODO.md-style tracking because in practice I've had an agent work through everyone on the list, confidently telling me steps are done, only to find that's not the case when I check its work. Either a TODO.md that I created or one I had the agent create both suffer from this. Also, getting it update the TODO.md has been frustrating, even when I add it to CLAUDE.md "Make sure to mark tasks as complete in the TODO.md as you finish them" or adding the same message to the end of all my prompts, it won't always update it.

I've been interested in trying out beads to see if works better than a markdown TODO file but I haven't played with that yet.

But overall I agree with you, smaller chunks are key to success.

square_usual 1/23/2026||

I hate TODO.mds too. If I ever have to use one, I'll keep track of it manually, and split the work myself into chunks of the size I believe CC/codex can handle. TODO.md is a recipe for failure because you'll quickly have more code than you can review and nothing to trust that it was executed well.

EFreethought 1/23/2026||||

> 10. 10th prompt: Ugh, everytime I fix one thing, something else breaks

Maybe that is the time to start making changes by hand. I think this dream of humans never ever writing any more code might be too far and unnecessary.

theropost 1/23/2026|||

I’ve definitely hit that same pattern in the early iterations, but for me it hasn’t really been a blocker. I’ve found the iteration loop itself isn’t that bad as long as you treat it like normal software work. I still test, review, and check what it actually did each time, but that’s expected anyway. What’s surprised me is how quickly things can scale once the overall architecture is thought through. I’ve built out working pieces in a couple of weeks using Claude Code, and a lot of that time was just deciding on the architecture up front and then letting it help fill in the details. It’s not hands-off, but used deliberately, it’s been quite effective https://robos.rnsu.net

joshstrange 1/23/2026||

I agree that it can be very useful when used like that but I'm referring to fully vibe-coding, the "I've never looked at the code"-people. CC is a great tool when you use plan carefully, review its work, etc but people are building things they say they've never read the code for and that just hasn't been my experience, it always falls over on it's own if I'm not in the code reviewing/tweaking.

kgwgk 1/23/2026|||

> How do these industry figures claim they see no part of a 225K+ line of code and promise that it works?

The only promise is that you will get your face ripped off.

“WARNING DANGER CAUTION - GET THE F** OUT - YOU WILL DIE […] Gas Town is an industrialized coding factory manned by superintelligent robot chimps, and when they feel like it, they can wreck your shit in an instant. They will wreck the other chimps, the workstations, the customers. They’ll rip your face off if you aren’t already an experienced chimp-wrangler.”

kaydub 1/23/2026||

Yeah, I'm at that stage 6 or 7. I'm using multiple agents across multiple terminal windows. I'm not even coding any more, literally I haven't written code in like 2-4 months now beyond changing a config value or something.

But I still haven't actually used Gastown. It looks cool. I think it probably works, at least somewhat. I get it. But it's just not what I need right now. It's bleeding edge and experimental.

The main thing holding me back from even tinkering with it is the cost. Otherwise I'd probably play with it a little, but it's not something I'd expect to use and ship production code right now. And I ship a ton of production code with claude.

skippyboxedhero 1/23/2026|||

There is an incentive for dishonesty about what AI can and cannot do.

People from OpenAI was saying that GPT2 had achieved AGI. There is a very clear incentive for that statement to be made by people who are not using AI for anything productive.

Even as increasingly bombastic claims are made, it is obvious that the best AI cannot one-shot everything if you are an actual user. And the worst ones: was using Gemini yesterday and it wouldn't stop outputting emojis, was using Grok and it refused to give me a code snippet because it claimed its system prompt forbade this...what can you say?

I don't understand why anyone would want to work on a codebase they didn't understand either. What happens when something goes wrong?

Again though, there is massive financial incentive to make these claims, and some other people will fall along with that because it is good for their career, etc. I have seen this in my own company where senior people are shoehorning this stuff in that they clearly do not actually use or understand (to be clear, this is engineering not management...these are people who definitely should understand but do not).

Great tool, but the 100% vibecoding without looking at the code, for something that you are actually expecting others to use, is a bad idea. Feels more like performance art than actual work. I like jokes, I like coding, room for both but don't confuse the two.

rozap 1/24/2026||

> I don't understand why anyone would want to work on a codebase they didn't understand either. What happens when something goes wrong?

It's your coworker's problem. The one who actually understands the big picture and how the system fits into it. They'll deal with it.

turtlebits 1/23/2026|||

No one is promising anything. It's just a giant experiment and the author explicitly tells you not to use it. I appreciate those that try new things, even it it's possibly akin to throwing s** at a wall and seeing what sticks.

Maybe it changes how we code or maybe it doesn't. Vibe coding has definitely helped me write throwaway tools that were useful.

johnmaguire 1/23/2026|||

After listening to Yegge's interview, I'm not sure this is accurate: https://www.youtube.com/watch?v=zuJyJP517Uw

For example, he makes a comment to the effect that anyone using an IDE to look at code in 2026 is a "bad engineer."

eikenberry 1/23/2026||

Hyperbole is very common.

matkoniecz 1/24/2026|||

In LLM field things move so fast that distinguishing accurate statements, mistaken statements, jokes and lies is hard.

A result, hyperbole is more annoying than usual.

johnmaguire 1/25/2026|||

Watch the video - he's very clear that he's not looking at code. I see no indication that he is being hyperbolic.

lovich 1/23/2026|||

> It's just a giant experiment and the author explicitly tells you not to use it.

No, he threw up a hyperbolic warning and then dove deep into how this is the future of all coding in the rest of his talks/writing.

It’s as good a warning as someone saying “I’m not {X} but {something blatantly showing I am X}”

amenhotep 1/23/2026||

Reminds me of Matt Levine on https://www.lesswrong.com/posts/WACraar4p3o6oF2wD/sam-altman...

furyofantares 1/23/2026|||

Who's promising it works?

It's an experiment to discover what the limits are. Maybe the experiment fails because it's scoped beyond the limits of LLMs. Maybe we learn something by how far it gets exactly. Maybe it changes as LLMs get better, or maybe it's a flawed approach to pushing the limits of these.

bbayles 1/23/2026|||

I'm sympathetic to this view, but I also wonder if this is the same thing that assembly language programmers said about compilers. What do you mean that you never look at the machine code? What if the compiler does something inefficient?

gtowey 1/23/2026|||

Not even remotely close.

Compilers are deterministic. People who write them test that they will produce correct results. You can expect the same code to compile to the same assembly.

With LLMs two people giving the exact same prompts can get wildly different results. That is not a tool you can use to blindly ship production code. Imagine if your compiler randomly threw in a syscall to delete your hard drive, or decide to pass credentials in plain text. LLMs can and will do those things.

alecbz 1/23/2026|||

Even ignoring determinism, with traditional source code you have a durable, human-readable blueprint of what the software is meant to do that other humans can understand and tweak. There's no analogy in the case of "don't read the code" LLM usage. No artifacts exist that humans can read or verify to understand what the software is supposed to be doing.

luckydata 1/23/2026||

yeah there is. it's called "documentation" and "requirements". And it's not like you can't go read the code if you want to understand how it works, it's just not necessary to do so while in the process of getting to working software. I truly do not understand why so many people are hung up on this "I need to understand every single line of code in my program" bs I keep reading here, do you also disassemble every library you use and understand it? no, you just use it because it's faster that way.

notpachet 1/23/2026|||

> do you also disassemble every library you use and understand it?

Sometimes.

alecbz 1/23/2026|||

> it's called "documentation" and "requirements"

What I mean is an artifact that is the starting point for generating the software. Compiled binaries can be completely thrown away whenever because you know you have a blueprint (the source code) that can reliably reproduce it.

Documentation & requirements _could_ work this way if they served as input to the LLMs that would then go and create the source code from scratch. I don't think many people are using LLMs this way, but I think this is an interesting idea. Maybe soon we'll have a new generation of "LLM-facing programming languages" that are even higher level software blueprints that will be fed to LLMs to generate code.

TDD is also a potential answer here? You can imagine a world where humans just write test suites and LLMs fill out the code to get it to pass. I'm curious if people are using LLMs this way, but from what I can tell a lot of people use them for writing their tests as well.

> And it's not like you can't go read the code if you want to understand how it works

In-theory sure, but this is true of assembly in-theory as well. But the assembly of most modern software is de-facto unreadable, and LLM-generated source code will start going that way too the more people become okay with not reading it. (But again, the difference is that we're not necessarily replacing it with some higher-level blueprint that humans manage, we're just relying on the LLMs to be able to manage it completely)

> I truly do not understand why so many people are hung up on this "I need to understand every single line of code in my program" bs I keep reading here, do you also disassemble every library you use and understand it? no, you just use it because it's faster that way.

I think at the end of the day this is just an empirical question: are LLMs good enough to manage complex software "on their own", without a human necessarily being able to inspect, validate, or help debug it? If the answer is yes, maybe this is fine, but based on my experiences with LLMs so far I am not convinced that this is going to be true any time soon.

knowknow 1/23/2026|||

Not only that but compiler optimizations are generally based on rigorous mathematical proofs, so that even without testing them you can be pretty sure it will generate equivalent assembly. From the little I know of LLM's, I'm pretty sure no one has figured out what mathematical principles LLM's are generating code from so you cant be sure its going to right aside from testing it.

conartist6 1/23/2026||||

I write JS, and I have never directly observed the IRs or assembly code that my code becomes. Yet I certainly assume that the compiler author has looked at the compiled output in the process of writing a compiler!

For me the difference is prognosis. Gas Town has no ratchet of quality: its fate was written on the wall since the day Steve decided he didn't want to know what the code says: it will grow to a moderate but unimpressive size before it collapses under its own weight. Even if someone tried to prop it up with stable infra, Steve would surely vibe the stable infra out of existence since he does not care about that

luckydata 1/23/2026||

or he will find a way to get the AI to create harnesses so it becomes stable. The lack of imagination and willingness to experiment in the HN crowd is AMAZING me and worrying me at the same time. Never thought a group of engineers would be the most conservative and close minded people I could discuss with.

conartist6 1/23/2026|||

It's a paradox, huh. If the AI harness became so stable it wrote good code he wouldn't be afraid to look at the code he would be eager to look at it, right? But then if it mattered if AI wrote good code or not he couldn't defend his position that the way to create value with code is quantity over quality. He needs to sell the idea of something only AI can do, which means he needs the system to be made up of a lot of bad or low quality code which no person would ever want to be forced to look at.

troupo 1/24/2026||||

There's a difference between "imagination and willingness to experiment" and "blind faith and gullibility".

vardalab 1/23/2026|||

Wait till you meet engineers other than sw engineers. Not even sure most sw people should be called engineers since there are no real accredited standards. I specifically trained as EE in physical electronics because other disciplines at the time seemed really rigid.

There's a saying that you don't want optimists building bridges.

crote 1/23/2026||||

The big difference is that compilation is deterministic: compile the same program twice and it'll generate the same output twice. It also doesn't involve any "creativity": a compiler is mostly translating a high-level concept into its predefined lower-level components. I don't know exactly what my code compiles to, but I can be pretty certain what the general idea of the assembly is going to be.

With LLMs all bets are off. Is your code going to import leftpad, call leftpad-as-a-service, write its own leftpad implementation, decide that padding isn't needed after all, use a close-enough rightpad instead? Who knows! It's just rolling dice, so have fun finding out!

fragmede 1/23/2026||

> The big difference is that compilation is deterministic: compile the same program twice and it'll generate the same output twice.

That's barely true now. Nix comes close, but builds are only bit-for-bit identical if you set a bunch of extra flags that aren't set by default. The most obvious instability is CPU dispatch order (aka modern single computer systems are themselves distributed, racy systems) changes the generated code ever so slightly.

We don't actually care, because if one compiled version of the code uses r8 for a variable but a different compilation uses r9 for that variable, it doesn't matter because we just assume the resulting binary works the same either way. R8 vs r9 are implementation details that don't matter to humans. See where I'm going with this? If the LLM non-deterministically calls the variable fileName one day, and file_name the next time it's given the same prompt, yeah language syntax purists are going to suffer an aneurysm because one of those is clearly "wrong" for the language in use, but it's really more of an implementation detail at this point. Obviously you can't mix them, the generated code has to be consistent in which one it's using, but if compilers get to chose r8 one day and r9 the next, and we're fine with it, why is having the exact variable name that important, as long as it's being used correctly?

tjr 1/23/2026|||

I’ve done builds for aerospace products where the only binary difference between two builds of the same source code is the embedded timestamp. And per FAA review guidelines, this deterministic attribute is required, or else something is wrong in the source code or build process.

I certainly don’t use all compilers everywhere, but I don’t think determinism in compilation is especially rare.

m4rtink 1/23/2026|||

If your builds are not deterministic for the same set of inputs, you are doing something wrong - or victim of supply chain attack.

https://reproducible-builds.org/

mike_hearn 1/24/2026||

No, some compilers aren't deterministic by design, e.g. because they compile stuff in parallel and don't take extra steps to enforce consistent ordering of things (because it doesn't matter).

7777332215 1/23/2026||||

The compiler is deterministic and the translation does not lose semantics. The meaning of your code is an exact reflection of what is produced.

fragmede 1/23/2026||

We can tell you weren't around for the advent of compilers. To be fair, neither was I since the UNIX c compiler came out in '68 and was by far not the first compiler. Modern comilers you can make that claim about, but early compilers weren't.

georgemcbay 1/23/2026|||

I've been programming since 6502/6510 assembly language and all compilers I've used were deterministic (which isn't the same thing as being bug free or producing the correct output for a given input).

fragmede 1/25/2026||

Bullshit.

recursive 1/23/2026||||

All compilers have bugs. Any loss of semantics during compilation would be considered a bug. In order to do that, the source and target language need to be structured and specified. I wasn't around in the 60s either, but I think that hasn't changed.

tjr 1/23/2026|||

Which early compilers were nondeterministic?

3vidence 1/23/2026||||

This analogy has always been bad any time someone has used it. Compilers directly transform via known algorithms.

Vibecoding is literally just random probabilistic mapping between unknown inputs and outputs on an unknown domain.

Feels like saying because I don't know how my engine works that my car could've just been vibe-engineered. People have put 1000s of hours into making certain tools work up to a give standard and spec reviewed by many many people.

"I don't know how something works" != "This wasn't thoughtfully designed"

Why do people compare these things.

anonymous908213 1/23/2026||||

No, it is not what assembly programmers said about compilers, because you can still look at the compiled assembly, and if the compiler makes a mistake, you can observe it and work around it with inline assembly or, if the source is available, improve the compiler. That is not the same as saying "never look at the code".

hilbertseries 1/23/2026||||

I feel like this argument would make a lot more sense if LLMs had anywhere near the same level of determinism as a compiler.

jplusequalt 1/23/2026||||

>but I also wonder if this is the same thing that assembly language programmers said about compilers

But as a programmer writing C code, you're still building out the software by hand. You're having to read and write a slightly higher level encoding of the software.

With vibe coding, you don't even deal with encodings. You just prompt and move on.

zerkten 1/23/2026||

I've wondered if people who write detailed specs, are overly detailed, are in a regulated industry, or even work with offshore teams have success more quickly simply they start with that behavior. Maybe they have a tendency to dwell before moving on which may be slightly more iterative than someone who vibecodes straight through.

gegtik 1/23/2026||||

I wonder if assembly programmers felt this way about the reliability of the electical components which their code relies upon...

beklein 1/23/2026||

I wonder if electrical engineers felt this way about the reliability of the silicon crystal lattice their circuits rely upon…

0xbadcafebee 1/23/2026|||

Do you understand at a molecular level how cooking works? Or do you just do some rote actions according to instructions? How do you know if your cooking worked properly without understanding chemistry? Without looking at its components under a microscope?

Simple: you follow the directions, eat the food, and if it tastes good, it worked.

If cooks don't understand physics, chemistry, biology, etc, how do all the cooks in the world ensure they don't get people sick? They follow a set of practices and guidelines developed to ensure the food comes out okay. At scale, businesses develop even more practices (pasteurization, sanitization, refrigeration, etc) to ensure more food safety. None of the people involved understand it at a base level. There are no scientists directly involved in building the machines or day-to-day operations. Yet the entire world's food supply works just fine.

It's all just abstractions. You don't need to see the code for the code to work.

habinero 1/23/2026||

That's a terrible analogy lol.

1. Chefs do learn the chemistry, at least enough to know why their techniques work.

2. Food scientist is a real job

3. The supply chain absolutely does have scientists involved in day to day operations lol.

A better analogy is just shoving the entire contents of the fridge into a pot, plastic containers and all, and assuming it'll be fine.

0xbadcafebee 1/23/2026||

> Chefs do learn the chemistry, at least enough to know why their techniques work

Cooks are idiots (most are either illegal immigrants with no formal education, or substance-abusing degenerates who failed at everything else) who repeat what they're told. They think ridiculous things, like that searing a stake "seals in the juices", or that adding oil to pasta water "prevents sticking", that alcohol completely "cooks off", that salt "makes water boil faster", etc. They are the auto mechanics of food. A few may be formally educated but the vast majority are not. They're just doing what they were shown to do.

> A better analogy is just shoving the entire contents of the fridge into a pot, plastic containers and all, and assuming it'll be fine.

That would never result in a good meal. On the other hand, vibe coding is curently churning out not just working software, but working businesses. You're sleeping on the real effect this is having. And it's getting better every 6 months.

Back to the topic: most programmers actually suck at programming. Their code is full of bugs, and occasionally the code paths run into those bugs and make them noticeable, but they are always there. AI does the same thing, just faster, and it's getting better at it. If you still write code by hand in a few years you will be considered a dinosaur.

habinero 1/24/2026|||

> Cooks are idiots (most are either illegal immigrants with no formal education, or substance-abusing degenerates who failed at everything else) who repeat what they're told

Jesus Christ, dude. Just because someone works with their hands doesn't mean they're stupid. Good lord. Working in a professional kitchen is an incredibly demanding and difficult job. Don't be elitist to people who work way harder than you.

Especially since some of the dumbest and most intellectually coddled failsons I know went to, like, Yale lol. Or Harvard. A lot of YC startups are like Failson Continuation School. Plenty of people are smart, but a lot of them are just rich.

> On the other hand, vibe coding is curently churning out not just working software, but working businesses

Funny story, I'm evaluating SaaS ETL products and I found one that looked great. So I spent a couple hours testing out some tinkertoy examples with the idea to ask for budget if it worked.

I kept running into small stupid documentation problems and some incredibly stupid behavior in really basic shit (like, screwing up .env files) that no developer would do and then I realized it was all AI generated.

Did it work? Kinda! Mostly! Did it immediately make me put it in the "absolutely not" pile? Sure did.

If the code I can see is that sloppy and poorly reviewed, how bad is the code I can't see? I'm for sure not giving them our sensitive data.

If you think human code is bad, you should just work with better humans. ¯\_(ツ)_/¯

threethirtytwo 1/24/2026||

>Jesus Christ, dude. Just because someone works with their hands doesn't mean they're stupid. Good lord. Working in a professional kitchen is an incredibly demanding and difficult job. Don't be elitist to people who work way harder than you.

You're making a personal comment. It's orthoganol to the point. You said cooks learn the chemistry, he says they don't and they are too stupid too.

As bad as that statement is, it's true. Culinary arts as an occupation has statistically lower IQ than many other occupations. Additionally they don't actually learn the chemistry. You sidetracked off on a tirade talking about someones "elitest" character... but if you stick to the point, what you said was completely and utterly wrong.

>Funny story, I'm evaluating SaaS ETL products and I found one that looked great. So I spent a couple hours testing out some tinkertoy examples with the idea to ask for budget if it worked.

You know Ryan Dahl? Inventor of NodeJS, likely smarter, more successful, and a better coder than you says this: https://x.com/rough__sea/status/2013280952370573666

So you have a funny story, and then there are other smarter competent people saying the EXACT opposite of you. Does that ever make you pause and think? We've all seen evidence of AI fucking up. AI being stupid is a story so obvious that even the proponents of AI know AI can fuck up big time. But have you ever wondered what would make Ryan Dahl say something like that? Does what I'm saying even compute or are you just so stubbornly sure that your "funny story" invalidates everything?

habinero 1/26/2026||

> Culinary arts as an occupation has statistically lower IQ than many other occupations.

Citation very much needed lol. Again, don't be elitist about work you don't do and don't understand. Honestly, given the choice between between a random pool of kitchen staff and a bunch of people with BAYC twitter profiles, I'm taking the people who can pull off a busy Sunday brunch and I'm not thinking twice about it.

> Are you just so stubbornly sure that your "funny story" invalidates everything

It wasn't actually intended to be ha-ha funny, my guy, that's just a stock phrase.

And if you're asking, do I trust my own judgement to critically evaluate claims in my own industry? Yes. Yes, I do.

If you rely on other people telling you what's good and never think for yourself, you're always just going to be a follower. It's like you've never been on a single enterprise software sales call, jeez.

threethirtytwo 1/26/2026||

>Again, don't be elitist about work you don't do and don't understand

There’s a difference between being elite at and being truthful. Don’t weaponize the word elitism and use it to attack truth.

https://brght.org/iq/jobtitle/cook/

Below average iq for cooks. So you’re wrong. Almost everything you talk about is wildly wrong and off base.

> It wasn't actually intended to be ha-ha funny, my guy, that's just a stock phrase.

lol. Did it ever occur to you i was just using the same stock phrase to reference your “funny story”? Takes a certain iq to figure that out.

> And if you're asking, do I trust my own judgement to critically evaluate claims in my own industry? Yes. Yes, I do.

Good. A smart person though wouldn’t completely trust himself because he knows no one is infallible. So he evaluates his own judgements against other judgements. Especially judgements of others smarter than them. Are you a smart person? Maybe ask yourself that question.

> If you rely on other people telling you what's good and never think for yourself, you're always just going to be a follower. It's like you've never been on a single enterprise software sales call, jeez.

lol, never asked you that and it’s the wrong comparison my guy, my dude.

Read what I wrote. It’s a call to evaluate your own statement against others who say the opposite. It’s not a call to rely on what others say. Nor is it a call to just trust everything in your own brain. I asked you to evaluate your judgements and the judgements of others smarter than you as a whole.

You’re like the guy who thinks everyone is a salesman so you mistrust the entire world and you think everything you know and think is 100 percent true. I feel you’re scared of being wrong. Jeeze. Theres nothing to be scared of for being wrong, my dude.

A smart person would think: “hey half the population plus this guy smarter than me (Ryan dhal, who is one of many smart people that have nothing to sell) is saying AI writes all his code now. Maybe consider his perspective alongside mine?”

Understand, my dude?

sarchertech 1/24/2026|||

Cooks also repeatedly cook the exact same recipe designed by someone else over and over again. In our industry cooks are closest to the CPU executing machine code.

0xbadcafebee 1/24/2026||

With the exception that cooks are actually less reliable (sometimes your steak comes out medium rare, sometimes well done). The human world is chaotic and unreliable, yet we wrangle it into a workable form. I think pretty soon we'll see that paralleled in the AI world, in the same ways we categorize and value human labor and businesses.

roberttod 1/23/2026|||

It's unintuitive, but having an llm verification loop like a code reviewer works impeccably well, you can even create dedicated agents to check for specific problem areas like poor error handling.

This isn't about anthropomorphism, it's context engineering. By breaking things into more agents, you get more focused context windows.

I believe gas town has some review process built in, but my comment is more to address the idea that it's all slop.

As an aside, Opus 4.5 is the first model I used that most of the time doesn't produce much slop, in case you haven't tried it. Still produces some slop, but not much human required for building things (it's mostly higher level and architectural things they need guidance on).

fragmede 1/23/2026||

> it's mostly higher level and architectural things they need guidance on

Any examples you can share?

roberttod 1/23/2026||

Mostly, it's not the model that is lacking but the visibility it has. Often the top level business context for a problem is out of reach, spread across slack, email, internal knowledge and meetings.

Once I digest some of this and give it to Claude, it's mostly smooth sailing but then the context window becomes the problem. Compactions during implementation remove a lot of important info. There should really be a Claude monitoring top level context and passing work to agents. I'm currently figuring out how to orchastrate that nicely with Claude Code MD files.

With respect to architecture, it generally makes sound decisions but I want to tweak it, often trading off simplicity vs. security and scale. These decisions seem very subtle and likely include some personal preferences I haven't written anywhere.

mactavish88 1/23/2026|||

In my experience, it really depends on what you're building _and_ how you prompt the LLM.

For some things, LLMs are great. For others, they're absolute dog shit.

It's still early days. Anyone who claims to know what they're talking about either doesn't or what they're saying will be out of date in a month's time (including me).

anonymous908213 1/23/2026|||

The secret is that it doesn't work. None of these people have built real software that anyone outside their bubble uses. They are not replacing anyone, they are just off in their own corner building sand castles.

ryandrake 1/23/2026|||

Just because they're one-off tools that only one person uses doesn't mean it's not "real software". I'm actually pretty excited about the fact that it's now feasible for me to replace all my BloatedShittyCommercialApps that I only use 5% of with vibe-coded bespoke tools that only do the important 5%, just for me to use. If that makes it a "sand castle" to you, fine, but this is real software and I'm seeing real benefit here.

nicoburns 1/23/2026|||

> I'm actually pretty excited about the fact that it's now feasible for me to replace all my BloatedShittyCommercialApps that I only use 5% of with vibe-coded bespoke tools that only do the important 5%, just for me to use.

Aren't you worried that they'll work fine for 3 weeks then delete all your data when you hold them slightly different? Vibe coded software seems to have a similar problem to "Undefined Behaviour", in that just because it works sometimes doesn't mean that it will always work. And there's no limit on what it might do when it doesn't work (the proverbial "nasal demons") - it might well wipe your entire harddrive, not just corrupt it's own data.

You can of course mitigate this by manually reviewing the software, but then you lose at least some of the productivity benefit.

ryandrake 1/23/2026||

> Aren't you worried that they'll work fine for 3 weeks then delete all your data when you hold them slightly different?

It might. It probably won't though. I don't see any code in it that deletes files. And, unlike BloatedShittyCommercialApp (and its cousin, BloatedDoEverythingOpenSourceApp), the code is going to be relatively small and if I do have doubts I can easily check to see what it's doing. I can build it quickly. I can patch it quickly. I don't have to file a bug to someone and beg him to look at it. I don't have to worry that the next release is going to break stuff I want and add stuff I don't want.

I recently moved my home theater PC from Kodi to a tiny bespoke vibed video player app, that basically just wraps libVLC with a minimal Android GUI. It's like 3000 lines of code total. I can practically keep the entire app in my head. If I need to fix something, it's 5 minutes in my dev terminal and then adb install. Ever tried to find and fix a bug in Kodi? The goddamn thing takes forever to even build, let alone debug. And that's even open source. I don't even have a remote chance of getting a bug fixed in professionally-built proprietary software.

suriya-ganesh 1/25/2026||

> the code is going to be relatively small and I do have doubts I can easily check

Continues to make an app with 150K lines.

enraged_camel 1/23/2026|||

The whole "real software" thing is a type of elitism that has existed in our field for a long time, and AI is the new battleground on which it is wielded.

azan_ 1/23/2026||||

> The secret is that it doesn't work.

I have 100% vibecoded software that I now use instead of commercial implementation that cost me almost 200 usd a month (tool for radiology dictation and report generation).

alecbz 1/23/2026|||

Wait, so you're a radiologist and you're using software you vibecoded to generate radiology reports for real patients? Is that, like, allowed?

mbesto 1/23/2026|||

Not saying it's right, but boy do I have stories about the code used in <insert any medical profession> healthcare applications. Not sure how "vibecoded" programming lines of code is any worse.

dullcrisp 1/23/2026||

Because that code is presumably working and the vibe code is probably not?

alecbz 1/23/2026|||

Honestly even if this wasn't vibe-coded I'm still a bit surprised at individual radiologists being able to bring their own software to work, for things that can have such a high effect on patient outcomes.

mbesto 1/23/2026|||

do you have evidence that all vibe coded solutions dont work? Because thats what you're implying.

dullcrisp 1/23/2026||

If I wanted to prove murder, not negligence.

azan_ 1/23/2026||||

Of course it’s allowed. It’s just kind of text editor but with support of speech to text and structured reports (e.g. when reporting spine if I say l3 bd it automatically inserts description of bulging disc in the correct place in the report). I then copy paste it to RIS so there’s absolutely nothing wrong or illegal in that.

d1sxeyes 1/23/2026|||

Depends where in the world they are. Here in Hungary, it’s not uncommon to email your-family-doctor@gmail.com

direwolf20 1/23/2026||

What does that have to do with vibe-coding?

Analemma_ 1/23/2026||||

Vibe-coded radiology reports, finally the 21st century will get its own Therac-25 incident.

azan_ 1/23/2026||

Yes I’m sure that text to speech with very nice fluff on top will have terrible consequences. It’s almost as bad as some radiologists using Word for writing reports which is not fda-approved (shocking I know!)

anonymous908213 1/23/2026||||

And yet I notice you haven't mentioned publishing it and undercutting the market. You could make a lot of money out-competing the existing option if what you produced was production-grade software. I'm guessing the actual case is that you only needed a small subset of the functionality of the paid software, and the LLM stitched together a rough unpolished proof-of-concept that handled your exact specific use case. Which is still great for you! But it's not the future of coding. The world still needs real engineers to make real software that is suitable for the needs of many, and this doesn't replace that.

jcims 1/23/2026|||

>The world still needs real engineers to make real software that is suitable for the needs of many, and this doesn't replace that.

I think azan_ is demonstrating that shipping products 'suitable for the needs of many' is going to have to compete with 'slopping software for the needs of one'.

anonymous908213 1/23/2026||

The only people who think that are programmers already or programmer-adjacent. Your mother is never going to be able to use a Gas Town-like workflow to make software for her own needs, nor is she even going to want to spend her weekends trying. These tools still require a baseline minimum of technical knowledge, and a real time investment, and also a real money investment the way some people are using them. Moreover, most real software has interoperability needs. A world where everyone makes their own Twitter or WhatsApp is a world where nobody can talk to anyone else.

There is a small subset of the population who is now enabled to make proof-of-concepts with less effort than before. This is no way diminishes the need for delivering performant, secure, interoperable software at scale to serve humanity's needs.

blenderob 1/23/2026|||

> Your mother is never going to be able to use a Gas Town-like workflow to make software for her own needs, nor is she even going to want to spend her weekends trying.

I'm going on a tangent here but what's with this constant deprecation of mothers to make a point? There are many people here whose mothers can develop software.

dullcrisp 1/23/2026|||

I think it’s just a generalization. They could have said “your uncle Pete” without actually implying anything about anyone’s uncle named Peter.

anonymous908213 1/23/2026|||

People's mothers are statistically unlikely to be programmers, obviously. My own grandmother was a programmer, but it conveys the idea in two words rather than making up a clunky phrase to describe the exact degree of non-techiness of the hypothetical person.

throwway120385 1/23/2026|||

What if we packaged Gas Town up in an operating system userspace, put it on rails, and gave people an interface to it?

anonymous908213 1/23/2026||

An interface isn't enough. Even if you never look at the code, the results are going to be influenced significantly by having the vocabulary to accurately describe what you want. The less sufficient your technical vocabulary, the more ambiguous your prompts will be and the less likely it is that the Polecats will be able to deliver anything resembling your unspoken imagination. To say nothing of being able to guide the lost critters when they run into problems.

throwway120385 1/23/2026||||

It sounds like a medical device, in which case marketing it may require FDA approval or notification. Whereas vibe-coding a one-off tool for yourself might still require validation but you're the one taking the risk and accepting liability for it.

I think the thing you're missing is that the tool doesn't need to be marketed because someone else could ask their LLM to make them a similar tool but fitting their use case.

anonymous908213 1/23/2026||

If they're using a 100% vibe-coded tool that they've never read the code of to replace something that would require government approval, for use on real-world patients, they're probably committing medical malpractice as we speak. Let us pray that is not the case.

It doesn't matter if the tool "needs" to be marketed. There is a market of paying customers. If other people are paying $200/month, both your and their lives would be improved significantly by you offering a $100/month replacement software. For all the talk about LLMs replacing the need for packaged software, people are still paying for packaged software, and while they are, you could be making large amounts of money while saving them money. If you're altruistic, you could even release it as FOSS and save a lot of people $200/mo. Unless, of course, your vibe-coded app isn't actually remotely capable of replacing the software in question.

azan_ 1/23/2026||

Jumping to conclusion that I’m committing malpractice is completely uncalled for and offensive. > Unless, of course, your vibe-coded app isn't actually remotely capable of replacing the software in question. It is completely capable FOR ME. I’m not interested in publishing it because I love my job and it pays great already.

saidarembrace 1/23/2026|||

Not everything has to be monetized, buddy. It's okay to relax.

anonymous908213 1/23/2026||

> If you're altruistic, you could even release it as FOSS and save a lot of people $200/mo. Unless, of course, your vibe-coded app isn't actually remotely capable of replacing the software in question.

johnmaguire 1/23/2026||||

My partner is a radiologist and I'd love to hear more about what you built. The engineer in me is also curious how much this cost in credits?

kaydub 1/23/2026||

It CAN be cheap.

I built a clinical pharmacist "pocket calculator" kinda app for a specific function. It was like $.60 in claude credits I think. Built with flutter + dart. It's a simple tool suite and I've only built out one of the tools so far.

Now to be fair, that $.60 session was just the coding. I did some brainstorming in chatgpt and generated good markdown files (claude.md, gemini.md, agents.md) before I started.

timeon 1/23/2026||||

How much costs you renting vibecoding tools?

brokensegue 1/23/2026||

such tools cost 10-20/mo usually?

FridgeSeal 1/23/2026|||

Using mystery vibe coded software in a tightly regulated, consequence-heavy environment, that’s so reassuring! /s

Is it _just_ speech-to-text, or god-forbid are you giving it scans and having it write reports for you too?

azan_ 1/23/2026||

It’s text to speech with structured reports support. Jesus Christ stop with the moral panic already.

matkoniecz 1/24/2026||

FYI I also assumed that it is doing something more dangerous, mostly because you mentioned being radiologist as relevant.

Is it calling some external API or doing this text to speech locally?

asadm 1/23/2026||||

no that's not true. I rarely now write a SINGLE line of code both at work or at home. Even simple config switches, I ask codex/gemini to do it.

You always have to review overall diff though and go back to agent with broader corrections to do.

mahogany 1/23/2026||

> You always have to review overall diff though and go back to agent with broader corrections to do.

This thread is about vibe coding _without_ looking at the code.

_zoltan_ 1/24/2026||||

Of course it works. I haven't looked at code for my internal development in months.

I don't know why people keep repeating this but it's wrong. It works.

causalmodels 1/23/2026|||

It is fine to have criticisms of this, I have many, but saying that Yegge hasn't built real software is just not true.

anonymous908213 1/23/2026|||

Yegge obviously built real software in the past. He has not built real software wherein he never looked at the code, as he is now promoting.

causalmodels 1/23/2026||

Ok but this entire idea is very new. Its not an honest criticism to say no one has tried the new idea when they are actively doing it.

Honestly I don't get the hostility. Yegge is running an experiment. I don't think it will work, but it will be interesting and informative to watch.

anonymous908213 1/23/2026|||

The 'experiment' isn't the issue. The problem is the entire culture around it. LLM tools are being shoved into everything, LLMs are soaking up trillions in investment, engineers are being told over and over that everything has changed and this garbage is making us obsolete, software quality is decreasing where wide LLM usage is being mandated (eg. Microsoft). Gas Town does not give the vibe of a neutral experiment but rather looks be a full-on delve into AI psychosis with the way Yegge describes it.

To be clear, I think LLMs are useful technology. But the degree of increasing insanity surrounding it is putting people off for obvious reasons.

causalmodels 1/23/2026||

I share the frustration with the hype machine. I just don't think a guy with a blog is an appropriate target for our frustration with corporate hype culture.

WesolyKubeczek 1/23/2026||||

> Ok but this entire idea is very new. Its not an honest criticism to say no one has tried the new idea when they are actively doing it.

Not really new. Back in the day companies used to outsource their stuff to the lowest bidder agencies in proverbial Elbonia, never looked at the code, and then panickedly hired another agency when the things visibly were not what was ordered. Case studies are abound on TheDailyWTF for the last two decades.

Doing the same with agents will give you the same disastrous results for comparably the same money, just faster. Oh and you can't sue them, really.

Maybe it's better, who knows.

causalmodels 1/23/2026||

Fair point on the Elbonia comparison. But we can't sue the SQLite maintainers either, and yet we trust them with basically everything. The reason is that open source developed its own trust mechanisms over decades. We don't have anything close to that with LLMs today. What those mechanisms might look like is an open question that is getting more important as AI generated code becomes more common.

WesolyKubeczek 1/23/2026||

> But we can't sue the SQLite maintainers either, and yet we trust them with basically everything.

But you don’t pay them any money and don’t enter into contractual relationship with them either. Thus you can’t sue them. Well, you can try, of course, but.

You could sue an Elbonian company, though, for contract breach. LLMs are like usual Elbonian quality with two middlemen but quicker, and you only have yourself to blame when they inevitably produce a disaster.

direwolf20 1/23/2026|||

The experiment is fine if you treat it as an experiment. The problem is the state of the industry where it's treated as serious rather than silly — possibly even by Steve himself.

swiftcoder 1/23/2026|||

> saying that Yegge hasn't built real software is just not true

I mean... I feel like it's somewhat telling that his wikipedia page spends half its words on his abrasive communication style, and the only thing approximating a product mentioned is a (lost) Rails-on-Javascript port, and 25 years spent developing a MUD on the side.

Certainly one doesn't get to stay a staff-level engineer at Google without writing code - but in terms of real, shipping software, Yegge's resume is a bit light for his tenure in BigTech

mkl95 1/23/2026|||

OP defines herself as a mediocre engineer. She's trying to sell you Slop Town, not engineering principles.

alvatar 1/24/2026||

Just writing here a line in defense of Rothko. His paintings are far harder to paint than it looks like. There were hundreds of layers, thinly applied, and carefully thought and with a developed technique. Try to paint that by yourself and you'll see.

durch 1/23/2026||

Design indeed becomes the bottleneck, I think that this points to a step that is implied but still worth naming explicitly -> design isn't just planning upfront. It is a loop where you see output, see if it is directionally right, and refine.

While the agents can generate, they can't exercise that judgement, they can't see nuances and they can't really walk their actions back in a "that's not quite what I meant" sense.

Exercising judgement is where design actually happens, it is iterative, in response to something concrete. The bottleneck isn't just thinking ahead, it's the judgment call when you see the result, its the walking back, as well as thinking forward.

msp26 1/23/2026|

Originally I thought that Gas Town was some form of high level satire like GOODY-2 but it seems that some of you people have actually lost the plot.

Ralph loops are also stupid because they don't make use of kv cache properly.

---

https://github.com/steveyegge/gastown/issues/503

Problem:

Every gt command runs bd version to verify the minimum beads version requirement. Under high concurrency (17+ agent sessions), this check times out and blocks gt commands from running.

Impact:

With 17+ concurrent sessions each running gt commands:

- Each gt command spawns bd version

- Each bd version spawns 5-7 git processes

- This creates 85-120+ git processes competing for resources

- The 2-second timeout in gt is exceeded

- gt commands fail with "bd version check timed out"

tucnak 1/23/2026||

I think it is satire, and pretty obvious one at that; is anybody taking it for real?

skybrian 1/23/2026||

Why not both? I think it's pretty clearly both for fun and serious.

He's thrown out his experiments before. Maybe he'll start over one more time.

tucnak 1/24/2026||

The big challenge for me so far has been about setting up "breakpoints" with sufficient prompt adherence, i.e. conditions for agents to break out of loop, and request actionable feedback, rather than pumping as many tokens as possible. Use cases where pumping tokens in unsupervised manner is warranted, are far and few between. For example, dataset-scale 1:n and n:n transformations have been super easy to set up, but the same implementation typically doesn't lend nicely to agent loops, as batching/KV caching suddenly becomes non-obvious and costs ramp up. Task scheduling, with lockstep batching, is a big, unsolved problem as of yet, and Gas Town is not inspiring confidence to that end.

alex_sf 1/23/2026|||

> Ralph loops are also stupid because they don't make use of kv cache properly.

This is a cost/resources thing. If it's more effective and the resources are available, it's completely fine.

BoneShard 1/24/2026||

Gaslighting town.

More comments...