Posted by derek 7/1/2025

Building a Personal AI Factory(www.john-rush.com)
266 points | 159 comments | page 2
puersum 7/3/2025|
I believe we need to find more effective ways to integrate AI into our workflows. Anyone who is actively trying to adopt AI has likely encountered similar challenges, yet a definitive solution has yet to emerge. In my view, a key principle at this stage is to assign AI minimal responsibility and highly specific tasks.

For example, I'm currently experimenting with an agent workflow for stock research. I've set up two AI roles: a 'Bullish Guy' and a 'Bearish Guy' and have them debate the pros and cons of a specific stock. The premise is that through this adversarial process, the AIs are forced to research opposing viewpoints, leading to a more comprehensive understanding and a superior final analysis. The idea was inspired by the kinds of arguments you see on social media.
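A minimal sketch of the debate loop described above, assuming each role is a plain function that receives the ticker and the transcript so far. The names `bullish_guy`, `bearish_guy`, and `debate` are hypothetical, and the agents are stubs standing in for real model calls, not the commenter's actual setup.

```python
from typing import Callable, List, Tuple

# A transcript is a list of (speaker, message) pairs.
Transcript = List[Tuple[str, str]]
# Hypothetical agent interface: (ticker, transcript so far) -> reply.
Agent = Callable[[str, Transcript], str]


def bullish_guy(ticker: str, transcript: Transcript) -> str:
    # Stub: a real agent would call a model and research the upside.
    return f"Bull case for {ticker}, rebutting {len(transcript)} earlier point(s)."


def bearish_guy(ticker: str, transcript: Transcript) -> str:
    # Stub: a real agent would call a model and research the downside.
    return f"Bear case for {ticker}, rebutting {len(transcript)} earlier point(s)."


def debate(ticker: str, bull: Agent, bear: Agent, rounds: int = 2) -> Transcript:
    """Alternate the two roles; each turn sees everything said before it."""
    transcript: Transcript = []
    for _ in range(rounds):
        for name, agent in (("bull", bull), ("bear", bear)):
            transcript.append((name, agent(ticker, transcript)))
    return transcript


if __name__ == "__main__":
    for speaker, message in debate("ACME", bullish_guy, bearish_guy):
        print(f"{speaker}: {message}")
```

Keeping each role this narrow matches the "minimal responsibility, highly specific tasks" principle: the final analysis can be written by a separate step that only reads the transcript.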

MatveySecured 7/4/2025|
hey! Your agent workflow for stock research sounds very interesting. Can you share a link please?
dkdcio 7/2/2025||
I went down this (and even built a bit of internal web tooling) —- it’s like playing multiple games of online poker for me (instead of the factoria analogy here)

it’s really promising, but I found focusing on a single task and doing it well is still more efficient for now. excited for where this goes

codemonkey-zeta 7/2/2025||
> Because most of my day-to-day is in clojure I tend to use sonnet 4 to get the parens right.

In case the author is lurking, you may want to apply the same fix they do in clojure-mcp: https://github.com/bhauman/clojure-mcp/blob/8150b855282babcd...

The insight that team had was that LLMs get confused with parens, but they are excellent at indentation, so if you run parinfer over the LLM's output it will be correct in 99% of cases.
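To illustrate the idea of trusting indentation over parens, here is a deliberately simplified sketch. It is not parinfer and not the clojure-mcp code: the function `infer_parens` is hypothetical, it only handles round parens, and it ignores strings, comments, and square or curly brackets, all of which the real tool has to deal with.

```python
def infer_parens(source: str) -> str:
    """Re-derive trailing ')' from indentation (round parens only)."""
    out = []      # output lines
    stack = []    # columns of "(" that are still open
    last = None   # index in `out` of the last non-blank line

    for raw in source.splitlines():
        # Drop whatever closing parens the author put at the end of the line.
        body = raw.rstrip(") \t")
        if not body:
            if not raw.strip():
                out.append("")  # keep genuinely blank lines
            continue            # a line of only ")" is dropped entirely

        indent = len(body) - len(body.lstrip(" "))

        # A line indented at or left of an open "(" ends that form,
        # so close it at the end of the previous non-blank line.
        closers = 0
        while stack and stack[-1] >= indent:
            stack.pop()
            closers += 1
        if last is not None:
            out[last] += ")" * closers

        # Track parens that open and close within this line.
        for col, ch in enumerate(body):
            if ch == "(":
                stack.append(col)
            elif ch == ")" and stack:
                stack.pop()

        out.append(body)
        last = len(out) - 1

    # Close everything still open at the end of the input.
    if last is not None:
        out[last] += ")" * len(stack)
    return "\n".join(out)


if __name__ == "__main__":
    broken = "(defn add [a b]\n  (let [s (+ a b)]\n    (println s\n    s"
    print(infer_parens(broken))
```

The sketch throws away every trailing close paren and rebuilds them from the indentation alone, which is why both missing and surplus parens at line ends get repaired.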

skybrian 7/1/2025||
> It’s essentially free to fire off a dozen attempts at a task - so I do.

What sort of subscription plan is that?

steveklabnik 7/1/2025|
Claude Code's $200 Max subscription can take a lot of usage. I haven't done a dozen things at once, but I have worked on two side projects simultaneously with it before.

ccusage shows me getting over 10x the value of paying via API tokens this month so far...

simonw 7/2/2025|||
I had to look that up: https://github.com/ryoppippi/ccusage

  npx ccusage@latest
Outputs a table of your token usage over the last few days, which it reads from the jsonl files that Claude Code leaves tucked away in the ~/.claude/ directory.
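For anyone curious what reading those files could look like, here is a rough sketch of summing token counts from JSONL lines. It is not how ccusage works internally, and the record layout is an assumption: the code expects each line to be a JSON object shaped like {"message": {"usage": {"input_tokens": ..., "output_tokens": ...}}} and skips anything else. The function name `sum_usage` is hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def sum_usage(lines: Iterable[str]) -> Counter:
    """Sum token counts over JSONL lines (field names are assumed)."""
    totals: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue  # record carries no usage information
        for key in ("input_tokens", "output_tokens"):
            value = usage.get(key)
            if isinstance(value, int):
                totals[key] += value
    return totals
```

To try it on real data you could feed it the lines of each .jsonl file found under ~/.claude/ (for example via pathlib's rglob), adjusting the field names to whatever the files actually contain.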
steveklabnik 7/2/2025||
Don’t sleep on the other options either: the live updates are cool, and you can see where you’re at in the five-hour session.
Aeolun 7/2/2025|||
Given you can nearly run two full code instances with Opus, and Opus is claimed to be 5x more expensive than Sonnet, you can maybe do 10 sonnet instances at the same time?
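The back-of-the-envelope estimate above, spelled out. This assumes the plan's capacity scales linearly with cost and that a Sonnet instance consumes about as many tokens as an Opus one, neither of which is confirmed in the thread. The function name is hypothetical.

```python
def equivalent_sonnet_instances(opus_instances: float, opus_cost_ratio: float) -> float:
    """Convert concurrent Opus instances into a Sonnet-equivalent count.

    opus_cost_ratio is how many times more expensive Opus is than Sonnet
    (claimed to be 5x in the comment above).
    """
    return opus_instances * opus_cost_ratio


if __name__ == "__main__":
    # Two Opus instances at a claimed 5x cost ratio.
    print(equivalent_sonnet_instances(2, 5))  # prints 10
```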
caporaltito 7/2/2025||
Show us the code, mate.
neurostimulant 7/9/2025||
Imagine a future where a program's source code is a bunch of .md files, the build script tells some AI agents to execute the plan in the .md files, and creating a new version is done by redoing the same steps with a newer and smarter AI agent.

I'll probably become a farmer by then.

nilirl 7/2/2025||
"Fix inputs" => The assumption is there exists some perfect input that will give you exactly what you want.

It probably works well for small inputs and tasks well-represented in the training data (like writing code for well-represented domains).

But how does this work for old code, large codebases, and emergencies?

- Do you still "learn" the system like you used to before?

- How do you think of refactoring if you don't get a feel for the experience of working through the code base?

Overall: I like it. I think this adds speed for code that doesn't need to be reinvented. But new domains, new tools, new ways to model things, the parts that are fun to a developer, are still our monsters to slay.

_1tem 7/2/2025|
> But how does this work for old code, large codebases, and emergencies?

Have you actually tried Claude Code? It works pretty well on my old code, a medium-size SaaS codebase. I’ve had it build entire features end to end (backend, front end, data migrations, tests) in one or two prompts.

PhilippGille 7/3/2025||
> Next claude code execute the plan, either with sonnet 3.7 or sonnet 4 depending on the complexity of the task. Because most of my day-to-day is in clojure I tend to use sonnet 4 to get the parens right.

This made me chuckle.

Perfect example of why heavily LLM-driven devs and processes might want to pick a popular programming language which the LLM had a ton of training data for. Or a strong point for specialized LLMs (e.g. here it could be a smaller/cheaper/faster Clojure-specialized model).

mfalcon 7/2/2025||
"Outputs are disposable; plans and prompts compound."

I agree with this, and it aligns with the general opinion about the true value that SWEs bring to the table.

GTP 7/2/2025|
> That loop is the factory: the code itself is disposable; the instructions and agents are the real asset.

Why do I hear the words "technical debt"? More to the point, the risk I see with this approach is that the author would end up throwing away working and well-tested code to implement some minor change. This has a high risk of introducing many easily avoidable bugs.
