Building a Personal AI Factory

Posted by derek 1 day ago

Building a Personal AI Factory (www.john-rush.com)

220 points | 121 comments | page 2

mmarian 20 hours ago|

And here I am struggling to get Claude to create a nice-looking search bar a la booking.com, with some adjustments for my personal use case. It does OK, but never gets to the end result, and once I refreshed my Tailwind knowledge it felt much slower than hand coding. I feel like I'm living in a different world.

hamstergene 19 hours ago||

I think coding assistants aren't great at UI/UX yet because they can't see. Their understanding of left/right/lighter/darker is guessed from textual descriptions that accompanied CSS tutorials, but they are never actually imagining the look of what they are working with. I had Cursor repeatedly fix and mess up a CSS grid, over and over again, until I switched to an HTML table so that the browser would handle layout. Once I switched from visuals ("leftmost") to semantics ("first cell in a row"), the agent immediately started getting tasks done right.

I guess keep them on backend/library tasks for now. I am sure the companies are already working on taking a snapshot of a browser page and feeding it back into a multimodal model so it can comprehend what "looking" means.

mmarian 19 hours ago||

Thx for sharing your experience, good to know I'm not the only one struggling ^_^ The advice makes sense as well.

derencius 18 hours ago||

I use Claude and Cursor in parallel. Cursor is doing great on the UI: I took quick screenshots to show the changes I wanted and it got them right.

mmarian 18 hours ago||

Cheers! It's hard to keep track of what's good for what.

caporaltito 19 hours ago||

Show us the code, mate.

skybrian 1 day ago||

> It’s essentially free to fire off a dozen attempts at a task - so I do.

What sort of subscription plan is that?

steveklabnik 1 day ago|

Claude Code's $200 Max subscription can take a lot of usage. I haven't done a dozen things at once, but I have worked on two side projects simultaneously with it before.

ccusage shows me getting over 10x the value of paying via API tokens this month so far...

simonw 1 day ago|||

I had to look that up: https://github.com/ryoppippi/ccusage

  npx ccusage@latest

Outputs a table of your token usage over the last few days, which it reads from the jsonl files that Claude Code leaves tucked away in the ~/.claude/ directory.
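For anyone curious what that kind of tally involves, here is a minimal Python sketch. It is not ccusage's actual implementation, and the record layout is an assumption: it guesses that each JSONL line may carry a `message.usage` object with `input_tokens` and `output_tokens` fields, and it silently skips anything that does not match. The function names are made up for illustration.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterable


def sum_token_usage(lines: Iterable[str]) -> Counter:
    """Sum token counts found under message.usage in JSONL records.

    ASSUMPTION: the field names below are a guess at the log schema,
    not something confirmed in this thread.
    """
    totals: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue  # records without usage data contribute nothing
        for key in ("input_tokens", "output_tokens"):
            value = usage.get(key)
            if isinstance(value, int):
                totals[key] += value
    return totals


def sum_usage_under(root: Path) -> Counter:
    """Walk a directory tree and total usage across every .jsonl file."""
    totals: Counter = Counter()
    if not root.is_dir():
        return totals
    for path in root.rglob("*.jsonl"):
        with path.open(encoding="utf-8") as handle:
            totals.update(sum_token_usage(handle))
    return totals
```

Calling `sum_usage_under(Path.home() / ".claude")` would total whatever matching records exist there. The real tool also breaks usage down by day and estimates cost, which this sketch does not attempt.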

steveklabnik 1 day ago||

Don’t sleep on the other options either; the live updates are cool and show where you’re at in the five-hour session.

Aeolun 1 day ago|||

Given you can nearly run two full code instances with Opus, and Opus is claimed to be 5x more expensive than Sonnet, you can maybe do 10 Sonnet instances at the same time?
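Spelled out, the back-of-envelope estimate above looks like this. It is only a sketch: it assumes the plan's capacity scales linearly with model cost, which nobody in the thread has confirmed, and the function name is made up for illustration.

```python
def equivalent_sonnet_instances(opus_instances: int, opus_to_sonnet_cost_ratio: int) -> int:
    """Estimate how many Sonnet instances fit in the budget of N Opus instances.

    ASSUMPTION: usage limits are proportional to cost, so one Opus instance
    "buys" cost_ratio Sonnet instances. Real rate limits may not work this way.
    """
    return opus_instances * opus_to_sonnet_cost_ratio


# Two Opus instances at a claimed 5x cost ratio.
print(equivalent_sonnet_instances(2, 5))  # prints 10
```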

am17an 1 day ago||

I actually don't understand how you can offload the instruction pointer of the program to another program, permanently. How are you accountable for anything then? You can't debug, you can't program, you're just a tourist in your own home. Own your code, even if AI wrote it.

IncreasePosts 1 day ago||

Okay, what is he actually building with this?

I have a problem where half the time I see people talking about their AI workflow, I can't tell if they are talking about some kind of dream workflow that they have, or something they're actually using productively.

ClawsOnPaws 1 day ago|

I keep coming to the same conclusion, which basically is: if I had an LLM write it for me, I just don't care about it. There are 2 projects out of the maybe 50 or so that are LLM generated, and even for those two I cared enough to make changes myself without an LLM. The rest just sit there because one day I thought huh wouldn't it be neat if, and then realized actually I cared more about having that thought than having the result of that thought.

Then you end up fighting with different models and implementation details and then it messes up something and you go back and forth about how you actually want it to work, and somehow this is so much more draining and exhausting than just getting the work done manually with some slight completion help perhaps, maybe a little bit of boilerplate fill-in. And yes, this is after writing extensive design docs, then having some reasoning LLM figure out the tasks that need to be completed, then having some models talk back and forth about what needs to happen and while it's happening, and then I spent a whole lot of money on what exactly? Questionably working software that kinda sorta does what I wanted it to do?

If I have a clear idea, or an existing codebase, if I end up guiding it along, agents and stuff are pretty cool I guess. But vibe coding? Maybe I'm in the minority here but as soon as it's a non trivial app, not just a random small script or bespoke app kind of deal, it's not fun, I often don't get the results I actually wanted out of it even if I tried to be as specific as I wanted with my prompting and design docs and example data and all that, it's expensive, code is still messy as heck, and at the end I feel like I just spent a whole lot of time actually literally arguing with my computer. Why would I want to do that?

jwpapi 1 day ago|||

I’ve written a full stack monorepo with over 1,000 files on my own now. I started with AI doing a lot of the work, but the percentage goes down and down. For me a good codebase is not about how much you’ve written, but about how it’s architected. I want an app that has the best possible user and dev experience, meaning it’s easy to maintain and easy to extend. This is achieved by making code easy to understand, for yourself and for others.

In my case it’s more like developing a mindset and building a framework than pushing feature after feature. I would think it’s like that for most companies. You can get an unpolished version of most apps easily, but polishing takes 3-5x the time.

Let’s not even talk about development robustness, backend security, etc. AI just has way too many slip-ups for me in these cases.

However, I would still consider myself a heavy AI user, but I mainly use it to discuss plans (what Google used to be for) or to check whether I’ve forgotten anything.

For most features in my app I’m faster typing it out exactly the way I want it (with a bit of auto-complete). The whole brain-coordination works better.

Long talk, I guess, but you’re not alone; trust your instinct. You don’t seem narrow-minded.

ozten 1 day ago||

What does the full stack monorepo do?

jwpapi 19 hours ago||

It’s nothing special. Not in the realm of anything technically outstanding. I just stated that to emphasize that it’s a slightly bigger project than the default single-dev SaaS projects, which are just a single wrapper. We have workers, multiple white-labeled applications sharing a common infrastructure, data scraping modules, AI-powered services, and email processing pipelines.

I’ve had an impossibly steep learning curve over the last year, and although I should, if anything, be biased toward vibe coding, I still use less AI now to make sure things are more consistent.

I think the two camps are different in terms of skill, honestly, but also in terms of needs. Of course you are faster vibe-coding a front-end than writing the code manually, but building a robust backend/processing system is a different tier.

So instead of picking a side, it’s usually best to stay as unbiased as possible and choose the right tool for the task.

tptacek 1 day ago|||

We just had a story last night about a Python cryptography maintainer using Claude to add formally-verified optimizations to LLVM. I think the ship has sailed on skepticism about whether LLMs are going to produce valuable code; you can follow Simon Willison's blog for more examples.

stavros 20 hours ago||

I don't understand people who are sceptical about whether LLMs can give value. We're way past that, now at the stage where we're trying to figure out how to extract the most value out of them, but I guess humans don't like change much.

barrenko 20 hours ago||

> Is 'Azure OpenAI subscription' cheaper than ChatGPT via OpenAI?

hamish-b 19 hours ago||

This sounds great, and is similar to the workflow I get from a high-level standpoint with https://ampcode.com/ - albeit without the model wrangling.

To the author & anyone reading - publicly release your agent harnesses, even if it's shit or vibe coded! I am constantly iterating on my meta and seeking to improve.

vFunct 1 day ago||

The issue I'm facing with multiple agents working on separate work trees is that each independent agent tends to have completely different ideas on absolutely every detail, leading to inconsistent user experience.

For example, an agent working on the dashboard for the Documents portion of my project has a completely different idea from the agent working on the dashboard for the Design portion of my project. The design consistency is not there, not just visually, but architecturally. Database schema and API ideas are inconsistent, for example. Even on the same input things are wildly different. It seems that if it can be different, it will be different.

You start to update instruction files to get things consistent, but then these end up being thousands of lines on a large project just to get the foundations right, eating into the context window.

I think ultimately we might need smaller language models trained on certain rules & schemas only, instead of on the universe of ideas that a prompt could result in. Small language models are likely the correct path.
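For readers who have not tried the setup described above, here is a minimal Python sketch of giving each agent its own git worktree and branch. The `agent/<name>` branch scheme, the sibling-directory layout, and the function name are assumptions made for illustration, not the commenter's actual configuration. The sketch only builds the commands; it does not run them.

```python
from pathlib import Path


def worktree_commands(repo: Path, agent_names: list[str]) -> list[list[str]]:
    """Build one `git worktree add` command per agent.

    ASSUMPTION: each agent gets a branch named agent/<name> checked out in a
    sibling directory named <repo>-<name>. Both conventions are hypothetical.
    """
    commands = []
    for name in agent_names:
        worktree_path = repo.parent / f"{repo.name}-{name}"
        commands.append([
            "git", "-C", str(repo),      # run git as if started inside the main repo
            "worktree", "add",
            "-b", f"agent/{name}",       # create a new branch for this agent
            str(worktree_path),          # and check it out in its own directory
        ])
    return commands


for command in worktree_commands(Path("/tmp/project"), ["documents", "design"]):
    print(" ".join(command))
```

Each command could then be passed to `subprocess.run(command, check=True)`. Note that isolating the checkouts does nothing for the consistency problem itself: the agents still share only whatever instruction files and examples you give them.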

Swizec 1 day ago||

> each independent agent tends to have completely different ideas on absolutely every detail, leading to inconsistent user experience

> The design consistency is not there, not just visually, but architecturally.

Seniors always gonna have to senior. Doesn't matter if the coders are AI or humans. You have to make sure you provide enough structures for the agents to move in roughly the same direction while allowing enough flexibility that you're not better off just writing the code.

pjm331 1 day ago||

I’ve had success with building the first version of a thing mostly by hand and then telling Claude Code to look at it as an example of how to do things when building the next N of them.

swader999 1 day ago||

The things that work on a regular dev team translate well to the agentic mode.

guicen 1 day ago||

This "AI factory for everyone" model may be able to break resource inequality and allow people from more places to participate in truly valuable entrepreneurship.

nico 23 hours ago|

> If you know Factorio you know it’s all about building a factory that can produce itself

This is a very interesting concept

Could this be extended to the point of an LLM producing/improving itself?

If not, what are the current limitations to get to that point?

NitpickLawyer 18 hours ago|

> Could this be extended to the point of an LLM producing/improving itself?

Check out aider writing aider stats here: https://aider.chat/HISTORY.html

nico 6 hours ago||

Super interesting, thank you for the link

Aider writing its own code is definitely cool and within the same concept

I’d love to see an LLM or some sort of coding model that modifies/trains the model itself
