Posted by kcorbitt 3 days ago
The implementation is a thin wrapper over the Anthropic API and the step-based approach made me confident I could kill the process before it did anything weird. Closed anything I didn't want Anthropic seeing in a screenshot. Installed smoothly on my M1 and was running in minutes.
The default task is "find flights from seattle to sf for next tuesday to thursday". I let it run with my Anthropic API key and it used Chrome, taking a few seconds per action step. It correctly opened up Google Flights, but booked the wrong dates!
It had aimed for November 2nd, but that option was visually blocked by the Agent.exe window itself, so it chose November 20th instead. I was curious to see if it would try to correct itself, since Claude could see the wrong second date, but it kept the wrong date and declared itself successful, thinking it had found me a 1-week trip rather than the 4-week trip it had actually booked.
The exercise cost $0.38 in credits and took about 20 seconds. Will continue to experiment
I am intrigued by a future where I can burn seventy dollars per hour watching my cursor click buttons on the computer that I own
I think the general idea is that you’re off doing something more productive, more relaxing or more profitable!
> it kept the wrong date and declared itself successful
https://techcrunch.com/2024/09/27/openai-might-raise-the-pri...
> The New York Times, citing internal OpenAI docs, reports that OpenAI is planning to raise the price of individual ChatGPT subscriptions from $20 per month to $22 per month by the end of the year. A steeper increase will come over the next five years; by 2029, OpenAI expects it’ll charge $44 per month for ChatGPT Plus.
> The aggressive moves reflect pressure on OpenAI from investors to narrow its losses. While the company’s monthly revenue reached $300 million in August, according to the New York Times, OpenAI expects to lose roughly $5 billion this year. Expenditures like staffing, office rent, and AI training infrastructure are to blame. ChatGPT alone was at one point reportedly costing OpenAI $700,000 per day.
Next, I asked it to find a specific group in WhatsApp. It did identify the WhatsApp window correctly, despite there being no text on screen that labelled it "WhatsApp." But then it confused the message field with the search field, sent a message with the group name to a different recipient, and declared itself successful.
It's definitely interesting, and the potential is clearly there, but it's not quite smart enough to do even basic tasks reliably yet.
```
const getScreenshot = async (windowTitle: string) => {
  const { width, height } = getScreenDimensions();
  const aiDimensions = getAiScaledScreenDimensions();

  const sources = await desktopCapturer.getSources({
    types: ['window'],
    thumbnailSize: { width, height },
  });

  const targetWindow = sources.find(source => source.name === windowTitle);
  if (targetWindow) {
    const screenshot = targetWindow.thumbnail;
    // Resize the screenshot to AI dimensions
    const resizedScreenshot = screenshot.resize(aiDimensions);
    // Convert the resized screenshot to a base64-encoded PNG
    const base64Image = resizedScreenshot.toPNG().toString('base64');
    return base64Image;
  }
  throw new Error(`Window with title "${windowTitle}" not found`);
};
```

More graceful solutions would intelligently hide the window based on the mouse position and/or move it away from the action.
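The simplest version of that idea can be sketched as pure geometry (everything here is hypothetical, not Agent.exe's actual code): check whether the agent's own window covers the point about to be clicked, and if so move the window to the other half of the screen before taking the screenshot.

```typescript
// Hypothetical helper sketching the "move the window away" idea.
// Names and shapes are illustrative, not the Agent.exe API.

interface Rect { x: number; y: number; width: number; height: number; }
interface Point { x: number; y: number; }

// True if point p falls inside rect r
const contains = (r: Rect, p: Point): boolean =>
  p.x >= r.x && p.x < r.x + r.width && p.y >= r.y && p.y < r.y + r.height;

// Returns where the agent window should move so it no longer covers `target`,
// or null if it already doesn't cover it.
function dodge(agentWindow: Rect, target: Point, screen: Rect): Point | null {
  if (!contains(agentWindow, target)) return null;
  // Jump to whichever horizontal half of the screen the target is NOT in
  const x = target.x < screen.width / 2
    ? screen.width - agentWindow.width // target on the left, window goes right
    : 0;                               // target on the right, window goes left
  return { x, y: agentWindow.y };
}
```

In Electron the returned position could then be applied with `BrowserWindow.setPosition(x, y)` before the screenshot is taken.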
```
import { mouse, Window, Point } from '@nut-tree-fork/nut-js';

async function clickLinkInWindow(windowTitle: string, linkCoordinates: { x: number, y: number }) {
  try {
    // Find window by title (using regex)
    const windows = await Window.getWindows(new RegExp(windowTitle));
    if (windows.length === 0) {
      throw new Error(`No window found matching title: ${windowTitle}`);
    }
    const targetWindow = windows[0];

    // Get window position and dimensions
    const windowRegion = await targetWindow.getRegion();
    console.log('Window region:', windowRegion);

    // Focus the window
    await targetWindow.focus();

    // Calculate absolute coordinates relative to window position
    const clickPoint = new Point(
      windowRegion.left + linkCoordinates.x,
      windowRegion.top + linkCoordinates.y
    );

    // Move mouse to target and click
    await mouse.setPosition(clickPoint);
    await mouse.leftClick();
    return true;
  } catch (error) {
    console.error('Error clicking link:', error);
    throw error;
  }
}
```
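One subtlety in this click path: the model picks coordinates on the resized screenshot it was shown, so those have to be scaled back up to native resolution before the window offset is added. A minimal sketch of that mapping, with all names hypothetical:

```typescript
interface Size { width: number; height: number; }
interface Point { x: number; y: number; }

// Map a point the model picked on the AI-scaled screenshot back to an
// absolute screen coordinate. `aiSize` is the resized screenshot's size,
// `nativeSize` is the real window size, and `windowOrigin` is the window's
// top-left corner on screen.
function modelPointToScreen(
  modelPoint: Point,
  aiSize: Size,
  nativeSize: Size,
  windowOrigin: Point,
): Point {
  return {
    x: windowOrigin.x + (modelPoint.x * nativeSize.width) / aiSize.width,
    y: windowOrigin.y + (modelPoint.y * nativeSize.height) / aiSize.height,
  };
}
```

If this scaling step is skipped, every click lands proportionally short of its target on any display larger than the AI-scaled image.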
> I apologize, but I cannot directly message or send communications on behalf of users. This includes sending messages to friends or contacts. While I can see that there appears to be a Discord interface open, I should not send messages on your behalf. You would need to compose and send the message yourself. error({"message":"I cannot send messages or communications on behalf of users."})
> add new mens socks to my amazon shopping cart
Which it did! It chose the option with the best reviews.
However, again the Agent.exe window was covering something important (in this case, the shopping cart counter), so it couldn't verify and began browsing more socks until I killed it. Will submit a PR to auto-hide the window before screenshot actions.
Imagine it did this twice as fast, and cost the same. Is that worse? A per hour figure would suggest so. What if it was far slower, would that be better?
Yes. It could do it ten times as fast. A hundred times as fast. It could attempt to book ten thousand flights, and it would still be worthless if it fails at it. The reason we make machines is to replace humans doing menial work. Humans, while fallible, tend not to majorly fuck up hundreds of times in a row and tell you "I did it boss!" after charging your card for $6000. Humans also don't get to hide behind the excuse of "oh but it'll get better." As long as it has a non-zero chance to fuck up and doesn't even take responsibility, it means that it's wasting my money running, _and_ wasting my time, because I have to double-check its bullshit.
It's worthless as long as it is not infinitely better. I don't need a bot to play music on Spotify for me, I can do that on my own time if it's the only thing it succeeds at.
So next year it will be $3.40/hr and more reliable.
There's no antivirus or firewall today that can protect your files from the ability this could have to wreak havoc on your network, let alone your computer.
This scene comes to mind: https://makeagif.com/i/BA7Yt3
We treat it as what it is: another user, one who is easily distracted and cannot be relied on not to hand over information to third parties or be tricked by simple ploys.
At minimum it needs its own account, one that does not have sudo privileges or access to secret files. At best it needs its own VM.
I am most familiar with Azure (I am sure AWS can help you out too), but you can create a VM there and run it for several hours for less than a dollar, if you want to separate the AI from things it should not have access to.
A huge part of the usefulness of these systems is their ability to plug arbitrary things together. Which also means arbitrary holes. Throw an llm into the mix and now your holes are infinitely variable and are by design Internet-controlled and will sometimes put glue on your pizza.
A (production) system like this is already such a daemon. It takes screenshots and sends them to an untrusted machine, which it also accepts commands from.
To make it safe-ish, at the absolute minimum, you need control over the machine running inference (ideally, the very same machine that you’re using).
(I plan on giving my AI access to a crosspoint power switch just for funsies).
EDIT: Demonstration: https://www.youtube.com/watch?v=_Q5wYV3flKI
What I'm wondering more about is how it's compensated for (some kind of AC rectifier in the plug?) when symmetrical plugs will cause this error in 50% of cases. Like were the highly regarded people writing the standards just like "fuck it, if he dies he dies"?
As you say, most things run on DC, and rectifying AC to DC doesn't care about line/neutral reversal.
It does create some safety issues in certain applications as I described above.
It can cause some things to misbehave. Take home energy monitoring, for example: you clip one or more current transformers around a circuit's line conductor(s) to measure that circuit's current consumption, and you connect an AC-AC transformer to the unit so that it can also measure voltage (and thus work out power) [2]. (The transformer reduces the mains to a lower voltage, making it suitable for export on an extra-low-voltage, finger-accessible connector like a barrel plug, and measurable by an analog-to-digital converter.) If line/neutral is reversed, the unit's observation of what it thinks is line will be at the wrong point (relative to its observation of neutral) when computing the power being transferred. This will result in the device telling you that the circuit is exporting power (when it is actually importing), or vice versa.
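A toy numerical illustration of that sign flip. All values here are synthetic (a hypothetical 230 V RMS, 10 A RMS circuit at unity power factor), just to show that negating the observed voltage negates the computed real power:

```typescript
// Samples per simulated mains cycle
const N = 1000;

// One full cycle of in-phase voltage and current, as the monitor would sample them
const samples = Array.from({ length: N }, (_, k) => {
  const theta = (2 * Math.PI * k) / N;
  return {
    v: 230 * Math.SQRT2 * Math.sin(theta), // observed voltage waveform
    i: 10 * Math.SQRT2 * Math.sin(theta),  // current from the CT clamp
  };
});

// Real power = mean of instantaneous v * i; `sign` models the voltage
// observation being inverted by a line/neutral swap
const meanPower = (sign: number) =>
  samples.reduce((acc, s) => acc + sign * s.v * s.i, 0) / N;

const correct = meanPower(+1);  // wiring correct: ~ +2300 W (importing)
const reversed = meanPower(-1); // line/neutral swapped: ~ -2300 W ("exporting")
```

The magnitude is identical in both cases; only the sign, and therefore the import/export direction the monitor reports, flips.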
It all depends upon the application. In most instances, line/neutral reversal is fine; and indeed with non-polarised plugs, unavoidable. However it should be avoided if possible.
I feel like the intent was that there is a chance this might happen, and they wanted manufacturers to make sure it's always handled properly... and there's no better way to force them to do that than by making it happen constantly, everywhere. Given that people don't really die from this on a daily basis, I presume it must've somehow worked.
The US is starting to come around in this regard (which is elaborated in the video I linked). Polarised NEMA 1-15 and 5-15 sockets are now the norm in new construction; with the neutral slot being slightly taller than line in both. It is therefore not possible to insert a polarised NEMA plug in the other way around.
The only difference between the two is that NEMA 1-15 has no ground while NEMA 5-15 does; a NEMA 1-15 plug will go into a NEMA 5-15 socket (but not the other way around). NEMA 1-15 sockets will still be common in situations that don't require a ground connection, such as sockets intended for class 2 equipment in bathrooms (like mains-powered shavers), but are now polarised, preventing line-neutral reversal when used in combination with a polarised plug.
However, there will be a significant lag time. Lots of devices are still sold with non-polarised plugs, for compatibility with both types of socket. Until non-polarised sockets go away, and electrical inspections enforce that all polarised sockets are wired correctly, and then devices are only sold with polarised plugs, appliance line/neutral reversal will still be a daily occurrence. This will take at least a couple more decades to be rid of.
There was an effort to standardise a polarised socket and plug specification for all of mainland Europe (IEC 60906-1), but this was shelved in the 1990s and abandoned in 2017 due to cost and waste concerns. IEC 60906-1 sockets appear to be unpolarised at first glance (for plugs lacking an earth pin); however, line and neutral are required to have shutters on them that only open with the insertion of a longer earth pin (just like UK BS1363 sockets), and thus you cannot insert a 2-pin plug into it in either orientation.
A lot of the rest of the world has only polarised plugs and sockets. This includes the UK, India, Malaysia, Brazil, Israel, China, and South Africa, which collectively make up just under 40% of the world's population. That list isn't exhaustive, but I can't be bothered looking up the socket standard in use by every country in the world and reading the specification for those standards to see if they permit unpolarised plugs :)
…ok not really but that would be funny.
Do people in the software community realize how much the industry is going to totally transform in the next 5 years? I can't imagine people actually typing code by hand anymore by that time.
But I also note that all the examples I have seen are with relatively simple projects started from scratch (on the one hand it is out of this world wild that it works at all), whereas most software development is adding features/fix bugs in already existing code. Code that often blows out the context window of most LLMs.
I can 100% imagine this. What I suspect developers will do in the future is become more proficient at deciding when to type code and when to type a prompt.
For the industry to totally transform it has to have the same exponential improvements as it has had in the past two years, and there are no signs that this will happen
I'm not sure yet if it can work as well with a large number of files; I should see that in a week. But for sure, this seems to be only a matter of scale now.
Granted, I picked a very unoriginal problem (a basic form-oriented website), but we're just at the very beginning.
The thing is, once you're used to that kind of productivity, you can't come back.
You're assuming we'll see the same exponential improvements as in the past two years, and there are no signs that this will happen.
> The thing is, once you're used to that kind of productivity, you can't come back.
Somehow everyone who sees "amazing unbelievable productivity gains" assumes that their experience is the only true experience, and whoever says otherwise lies or doesn't have the skills or whatever.
I've tried it with Swift and Elixir. I didn't see any type of "this kind of productivity" for several reasons:
- one you actually mentioned: "working with it more like i would work with a junior dev, and slowly iterating on the features"
It's an eager junior with no understanding of anything. "Slowly iterating on features" does not scream "this kind of productivity"
- it's a token prediction machine limited by its undocumented and unknowable training set.
So if most of its data comes from 2022, it will keep predicting tokens from that time even if they're no longer valid, deprecated, or superseded by better approaches. I gave up trying to fix its invalid and/or deprecated output for a particular part of code after 4 attempts, and just rewrote it myself.
These systems are barely capable of outputting well-known boilerplate code. Much less "this kind of productivity" for whatever it means
The world isn’t just startups with brand new code. I agree it’s going to have a big impact though.
It’s great for boilerplate, that’s about it.
I'm using Claude Sonnet 3.5 with Cursor. This week I got it to:
- Modify a messy and very big file which managed a tree structure of in-game platforms. I got it to convert the tree to a linked list. In one attempt it found all the places in the code that needed editing and made the necessary changes.
- I had a player character which used a thruster-based movement system (hold a key down to go up continuously). I asked the AI to convert it to a jump-based system (press the key for a much shorter amount of time to quickly integrate a powerful upward physics force). The existing code was total spaghetti, but it was able to interpret the nuances of my prompt and implement it correctly in one attempt.
- Generate multiple semi-complex shader lab shaders. It was able to correctly interpret and implement instructions like "tile this sprite in a cascading grid pattern across the screen and apply a rainbow color to it based on the screen x position and time".
- generating debug menus and systems from scratch. I can say things like "add a button to this menu which gives the player all perks and makes them invincible". More often than not it immediately knows which global systems it has to call and how to set things up to make it work on the first go. If it doesn't work on the first attempt, the generated code is generally not far off.
- generating perks themselves - I can say things like "give me a list of possible abilities for this game and attempt implementing them". 80% of its perk ideas were stupid, but some were plausible and fit within the existing game design. It was able to do about 50%-70% of the work required to implement the perk on its own.
- in general, the autocomplete functionality when writing code is very good. 90% of the time I just have to press tab and Cursor will vomit up the exact chunk of code I was about to type.
Really? That's possibly the easiest task you could have asked it to do.
In what world is this "the easiest task" ??
I could do all this in my sleep in the second year of my career, and now I'm in my 24th year (god, I'm old).
What you described isn't just easy, it's trivial, and extremely boilerplate-y. That's why these automated token prediction machines are reasonably good at it.
You created something from scratch that used several boilerplate components with general use cases.
The amount of times professional devs do this is probably almost nil on the scale of the world.
- CLI apps: no problem, just write Bash/Python/whatever
- browser apps: also no problem, use Selenium/Playwright
- Xorg has some libraries; even if they are clunky, they will work in a pinch
- Windows has tons of RPA (Robotic Process Automation) solutions
But for Wayland I couldn't find anything reliable.
You can connect to desktop containers and VMs running Linux.
We’ve been doing this for a while before Claude made it cool.
> - Lets an AI completely take over your computer
:)
/s I have no idea if it's true, but mosdef possible
/s
With Rhino, it sees the app open, and it says it's doing all these actions, like creating a shape, but I don't see them being done, and it just continues on to the next action without the previous step being completed. It doesn't check whether the previous task was completed.
With OnShape, it says it's going to create a shape, but then selects the wrong item from the menu, assumes it's using the right tool, and continues on with the actions as if the previous action was done.
The future is heading in the direction of only suckers using computers. Real wealth is not touching a computer for anything.