If the LLM fails, either you didn't describe your outcome sufficiently, or it misinterpreted what you said, or it couldn't do it (rare).
Common errors should be encoded as context for future similar tasks; don't bloat skills with stuff that isn't shown to be necessary.
This is not true for anything complex. They’re instruction followers, of which task completion is just one facet.
They’re also extremely eager to complete tasks without enough information, and to get them wrong. If you just describe the task to be completed, then despite your best efforts there are always some oversights or things you didn’t even realize were underspecified.
So it helps a lot to add some process around it, eg “look up relevant project conventions and information. think through how to complete the task. ask me clarifying questions to resolve ambiguities. blah blah”. This type of prompt will also help with the new Opus 4.7 adaptive thinking to ensure it thinks through the task properly.
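For concreteness, here's a minimal sketch of what that kind of process preamble can look like when prepended to a raw task before it goes to the model. The wording and the `build_prompt` helper are purely illustrative, not tied to any particular model or API:

```python
# A reusable "process" preamble prepended to the raw task description.
# Wording is illustrative; adapt it to your project's conventions.
PROCESS_PREAMBLE = """\
Before writing any code:
1. Look up the relevant project conventions and existing patterns.
2. Think through how you would complete the task end to end.
3. Ask me clarifying questions about anything ambiguous or underspecified.
Only start implementing once the above is resolved.
"""

def build_prompt(task: str) -> str:
    """Wrap a raw task with the process preamble (hypothetical helper)."""
    return f"{PROCESS_PREAMBLE}\nTask:\n{task}"

if __name__ == "__main__":
    print(build_prompt("Add pagination to the /orders endpoint."))
```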
Yes, not everything I use LLMs for is going to have the same level of ambiguity or complex requirements. Optimizing by choosing to skip over parts of the process is exactly what Addy is talking about in this article.
Prompting is just the first part. To get the outcome, you need other systems to steer the agent as it gets things wrong. Proper deterministic tests work. But there is also stuff that needs to happen during the LLM execution, like cycle detection (sketched below). All of this adds up.
You cannot just prompt an LLM and hope for a good outcome. It might work in small isolated scenarios, but it just does not work consistently enough to call it reliable.
Without further guardrails enforced by the process or the harness, LLMs do not have sufficient capability to complete a task to a consistent standard.
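Here's a minimal sketch of what such guardrails can look like in a hand-rolled harness, assuming you already have some `call_agent` step that proposes the next action; the cycle detection just flags repeated identical actions, and the pytest run stands in for whatever deterministic checks you use. All the names here are made up for illustration:

```python
import subprocess
from typing import Callable

def run_agent_with_guardrails(
    call_agent: Callable[[list[str]], str],  # your agent step: history -> next action
    max_steps: int = 30,
    repeat_limit: int = 3,
) -> bool:
    """Drive an agent loop with a step budget, simple cycle detection,
    and a deterministic test gate. Hypothetical harness glue."""
    history: list[str] = []
    for _ in range(max_steps):
        action = call_agent(history)
        # Cycle detection: abort if the agent keeps proposing the same action.
        if history[-repeat_limit:].count(action) >= repeat_limit:
            print("Aborting: agent is looping on the same action.")
            return False
        history.append(action)
        if action == "DONE":
            break
    else:
        print("Aborting: step budget exhausted.")
        return False
    # Deterministic gate: the outcome only counts if the test suite passes.
    result = subprocess.run(["pytest", "-q"], capture_output=True)
    return result.returncode == 0
```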
I prefer the start-small-and-iterate approach to arrive at a result.
Then I ask it to summarize. Sometimes after that I ask it to generalize.
Good test cases.
Clear and concise documentation.
CI/CD.
Best practices and onboarding docs.
Managing LLMs is becoming more and more similar to managing teams of people.
Yep: benchmarks, with/without comparisons, samples of generated code with and without. This kind of stuff matters, and without real analysis you may be making your agent stupider or getting worse results (a rough sketch of such a comparison is below).
Also this prose reads like the author has drunk the Google kool-aid and not much else.
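Along those lines, here's a minimal sketch of a with/without comparison, assuming you have some `run_task(prompt, skill)` function that runs your agent and a deterministic `check` per task. Everything named here is hypothetical glue, not a real benchmarking API:

```python
from typing import Callable, NamedTuple

class Task(NamedTuple):
    prompt: str
    check: Callable[[str], bool]  # deterministic pass/fail check on the output

def compare_with_without(
    run_task: Callable[[str, str | None], str],  # (prompt, optional skill text) -> output
    tasks: list[Task],
    skill: str,
) -> None:
    """Run the same tasks with and without the skill and print pass rates."""
    for label, extra in (("without skill", None), ("with skill", skill)):
        passed = sum(task.check(run_task(task.prompt, extra)) for task in tasks)
        print(f"{label}: {passed}/{len(tasks)} passed")
```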
This (SDLC == working backwards & bar raiser) is so horribly wrong that I hope it was an LLM hallucination.
In general, I'm starting to see these agent scaffolding systems as an anti-pattern: people obsess over systems for guiding agents, construct elaborate Rube Goldberg machines, and then others cargo-cult them wholesale, all in an effort to optimize and control a random process and minimize human involvement.
But I don't expect anyone to ever use my stuff. It's complicated as hell. But it's for me, and it works without me having to remotely think about the complexity.
I love that.
Agents do read that. And actually remember it. Because it's tiny compared with the other things you are cramming into their context.
Very grateful for this repository and everyone who contributed to it!