Agent Skills - Hacker News

Posted by mooreds 10 hours ago

326 points | 187 commentspage 2

time0ut 8 hours ago|

I am working on a domain specific agent that includes the concept of skills. I only allow one to be active at a time to reduce the chances for conflicting instructions. I use a small sub-agent to select/maintain/change the active skill at the start of each turn. It uses a small fast model to match the recent conversation to a skill (or none). I tried other approaches, but for my use case this was worked well.

My model for skills is similar to this, but I extended it to have explicit use when and don’t use when examples and counter examples. This helped the small model which tended to not get the nuances of a free form text description.

evanmoran 8 hours ago|

You should consider calling these "behaviors" to mimic behavior trees in game / robot AI. They follow the same notion of a single behavior being active at once: https://en.wikipedia.org/wiki/Behavior_tree_(artificial_inte...

Frannky 9 hours ago||

I started playing with skills yesterday. I'm not sure if it's just easier for the LLM to call APIs inside the skill — and then move the heavier code behind an endpoint that the agent can call instead.

I have a feeling that otherwise it becomes too messy for agents to reliably handle a lot of complex stuff.

For example, I have OpenClaw automatically looking for trending papers, turning them into fun stories, and then sending me the text via Telegram so I can listen to it in the ElevenLabs app.

I'm not sure whether it's better to have the story-generating system behind an API or to code it as a skill — especially since OpenClaw already does a lot of other stuff for me.

replwoacause 9 hours ago||

Are you spending a fortune on running OpenClaw?

Frannky 8 hours ago||

It's free with qwen oauth

empath75 9 hours ago||

They're basically all trade-offs between context-size/token-use and flexibility. If you can write a bash or a python script, or an api or an MCP to do what you want, then write a bash or python script to do it. You can even include it in the skill.

My general design principle for agents, is that the top level context (ie claude.md, etc) is primarily "information about information", a list of skills, mcps, etc, a very general overview, and a limited amount of information that they always need to have with every request. Everything more specific is in a skill, which is mostly some very light touch instructions for how to use various tools we have (scripts, apis and mcps).

I have found that people very often add _way_ to much information into claude.md's and skills. Claude knows a lot of stuff already! Keep your information to things specific whatever you are working on that it doesn't already know. If your internal processes and house style are super complicated to explain to claude and it keeps making mistakes, you might want to adapt to claude instead of the other way around. Claude itself makes this mistake! If you ask it to build a claude md, it'll often fill it with extraneous stuff that it already knows. You should regularly trim it.

Frannky 8 hours ago||

Thanks, super useful!

bob1029 6 hours ago||

I think pre-canned "skills" are an anti-pattern with the frontier models. Arguably, these skills already exist within the LLM. We don't need to explain how to do things they already know how to do.

I prefer to completely invert this problem and provoke the model into surfacing whatever desired behavior & capability by having the environment push back on it over time.

You get way more interesting behavior from agents when you allow them to probe their environment for a few turns and feed them errors about how their actions are inappropriate. It doesn't take very long for the model to "lock on" to the expected behavior if you are detailed in your tool feedback. I can get high quality outcomes using blank system prompts with good tool feedback.

jjice 4 hours ago||

I think skills actually complement what you're saying very well.

> You get way more interesting behavior from agents when you allow them to probe their environment for a few turns and feed them errors about how their actions are inappropriate. It doesn't take very long for the model to "lock on" to the expected behavior if you are detailed in your tool feedback. I can get high quality outcomes using blank system prompts with good tool feedback.

My primary way of developing skills (and previously cursor rules) is to start blank, let the LLM explore, and correct it as we go until the problem is solved. I then ask it to generate a skill (or rule) that explains the process in a way that it could refer to to repeat this again. Next time something like that comes up, we use the skill. If any correction is needed, I tell it to update the skill.

That way we get to have it explore and get more context initially, and then essentially "cache" that summarized context on the process for another time.

bob1029 4 hours ago||

Error feedback from tools could be argued to be isomorphic with skills (or the development of them). It tracks with how we learn things in meatspace. Whatever strings we return in response to a bad SQL query or compiler error could also include the contents of some skill.md file.

Danidada 4 hours ago||

What about libraries that are not in their training data? (e.g. new libraries, private libraries)

Or knowledge that is in their training data, but the majority of its training data isn't following the best practices? (e.g. Web Content Accessibility Guidelines)

I think there is a fair point in those cases of having a bunch of markdown docs files detailing them

appsoftware 5 hours ago||

I don't think a general public set of skills like this is going to work. I see value in vendors producing skills for their own products, and end users maintaining skills to influence agents according to their preferences, but too much in these skills files is opinion. Where does this end? Ordering of skills by specificity, such as org > user > workspace? And we know that skills aren't reliably picked up anyway. And then there's the additional attack surface area for prompt injection.

joshribakoff 4 hours ago|

The post you have commented on is not pertaining to a general set of skills at all. Its a link to a specification for skills.

appsoftware 9 hours ago||

I use a common README_AI.md file, and use CLAUDE.md and AGENTS.md to direct the agent to that common file. From README_AI.md, I make specific references to skills. This works pretty well - it's become pretty rare that the agent behaves in a way contrary to my instructions. More info on my approach here: https://www.appsoftware.com/blog/a-centralised-approach-to-a... ... There was a post on here a couple of days ago referring to a paper that said that the AGENTS file alone worked better than agent skills, but a single agents file doesn't scale. For me, a combination where I use a brief reference to the skill in the main agents file seems like the best approach.

cjonas 7 hours ago||

Implementation Notes:

- There is no reason you have to expose the skills through the file system. Just as easy to add tool-call to load a skill. Just put a skill ID in the instruction metadata. Or have a `discover_skills` tool if you want to keep skills out of the instructions all together.

- Another variation is to put a "skills selector" inference in front of your agent invocation. This inference would receive the current inquiry/transcript + the skills metadata and return a list of potentially relevant skills. Same concept as a tool selection, this can save context bandwidth when there are a large number of skills

mkagenius 6 hours ago|

> Or have a `discover_skills` tool

Yes, treating the "front matter" of skill as "function definition" of tool calls as kind of an equivalence class.

This understanding helped me create an LLM agnostic (also sandboxed) open-skills[1] way before this standardization was proposed.

1. Open-skills: https://github.com/instavm/open-skills

alsetmusic 8 hours ago||

A link from a couple weeks back suggests that putting them in first-person makes them get adopted reliably. Something like, "If this is available, I will read it," vs "Always read this." Haven't tried it myself, but plan to.

ef2k 8 hours ago||

I'm not disagreeing with standards but instead of creating adapters, can't we prompt the agent to create its own version of a skill using its preferred guidelines? I don't think machines care about standards in the way that humans do. If we maintain pure knowledge in markdown, the agents can extract what they need on demand.

nstfn 8 hours ago||

Started to work on a tool to synchronize all skills with symlinks. Its ok for my needs at the moment but feel free to improve it its on GH: https://github.com/Alpha-Coders/agent-loom

baalimago 8 hours ago|

Please help me understand. Is a "skill" the prompt instructing the LLM how to do something? For example, I give it the "skill" of writing a fantasy story, by describing how the hero's journey works. Or I give it the "curl" skill by outputting curl's man page.

headcanon 8 hours ago||

Its additional context that can be loaded by the agent as-needed. Generally it decides to load based on the skill's description, or you can tell it to load a specific skill if you want to.

So for your example, yes you might tell the agent "write a fantasy story" and you might have a "storytelling skill" that explains things like charater arcs, tropes, etc. You might have a separate "fiction writing" skill that defines writing styles, editing, consistency, etc.

All of this stuff is just 'prompt management' tooling though and isn't super commplicated. You could just paste the skill content into your context and go from there, this just provides a standardized spec for how to structure these on-demand context blocks.

lxgr 8 hours ago||

Yes, pretty much.

LLM-powered agents are surprisingly human-like in their errors and misconceptions about less-than-ubiquitous or new tools. Skills are basically just small how-to files, sometimes combined with usage examples, helper scripts etc.

More comments...