
Posted by simonw 12/12/2025

OpenAI are quietly adopting skills, now available in ChatGPT and Codex CLI (simonwillison.net)
587 points | 324 comments
esperent 12/13/2025|
It seems to me that skills are:

1. A top level agent/custom prompt

2. Subagents that the main agent knows about via short descriptions

3. Subagents have reference files

4. Subagents have scripts

Anthropic specific implementation:

1. Skills are defined on the filesystem in a /skills folder, with a specific subfolder structure of /references and /scripts (roughly sketched below).

2. Mostly designed to be run via their CLI tool, although there's a clunky way of uploading them to the web interface via zip files.
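
For the curious, a rough sketch of what one of those skill folders can look like (the skill and file names here are made up; only the SKILL.md-plus-references-plus-scripts layout comes from Anthropic's published structure):

  pdf-forms/
    SKILL.md           name + one-line description in frontmatter (the short
                       description the main agent sees), full instructions below it
    references/
      field-naming.md  extra docs the agent can pull in on demand
    scripts/
      fill_form.py     helper the agent can run instead of writing code from scratch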

I don't think the folder structure is a necessary part of skills. I predict that if we stop looking at that, we'll see a lot of "skills-like" implementations. The scripting part is only useful for people who need to run scripts, which, aside from the now built-in document-manipulation scripts, isn't most people.

For example, I've been testing out Gemini Enterprise for use by staff in various (non-technical) positions at my business.

It's got the best implementation of a "skills-like" agent tool I've seen. Basically a visual tree builder, currently only one level deep. So I've set up the "<my company name> agent" and then it has subagents/skills for things like marketing/supply chain research/sysadmin/translation etc., each with a separate description, prompt, and knowledge base, although no custom scripts.

Unfortunately, everything else about Gemini Enterprise screams "early alpha, why the hell are you selling this as an actual finished product?".

For example, after I put half a day into setting up an agent and subagents, then went to share this with the other people helping me to test it, I found that... I can't. Literally no way to share agents in a tool that is supposedly for teams to use. I found one of the devs saying that sharing agents would be released in "about two weeks". That was two months ago.

Mini rant over... But my point is that skills are just "agents + auto-selecting sub-agents via a short description" and we'll see this pattern everywhere soon. Claude Skills have some additional sandboxing but that's mostly only interesting for coders.

mhalle 12/13/2025||
I have found that scripts, and the environment that runs them, are the skills' superpower.

Computability (scripts) means being able to build documents, access remote data, retrieve data from packaged databases, and a bunch of other fundamentally useful things, not just "code things". Computability makes up for many of the LLM's weaknesses and gives it autonomy to perform tasks independently.
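
To make that concrete, here's the sort of tiny helper a skill might bundle (the file, database, and column names are all made up): it pulls rows out of a small SQLite database packaged with the skill, so the model works from real data instead of reconstructing it from memory.

  # lookup.py - hypothetical helper shipped inside a skill's /scripts folder.
  # Queries a small SQLite database packaged with the skill and prints
  # matching rows as JSON for the agent to read.
  import json
  import sqlite3
  import sys

  def lookup(term, db_path="data/parts.db"):
      conn = sqlite3.connect(db_path)
      conn.row_factory = sqlite3.Row
      rows = conn.execute(
          "SELECT part_no, name, unit_price FROM parts WHERE name LIKE ?",
          ("%" + term + "%",),
      ).fetchall()
      conn.close()
      return [dict(r) for r in rows]

  if __name__ == "__main__":
      print(json.dumps(lookup(sys.argv[1]), indent=2))

The agent just runs it and works from the output.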

On top of that, we can provide the documentation and examples in the skill that help the LLM execute computability effectively.

And if the LLM gets hung up on something while executing the skill, we can ask it why and then have it write better documentation or examples for a new skill version. So skills can self-improve.

It's still so early. We need better packaging, distribution, version control, sharing, composability.

But there's definitely something simple, elegant, and effective here.

ohghiZai 12/13/2025||
Looking for a way to do this with ADK as well; looks like skills can be a sweet spot between a giant instruction prompt and sprawling tools/subagents.
zx8080 12/13/2025||
Welcome to the world of imitation of value and semantics.
YJfcboaDaJRDw 12/13/2025||
[dead]
nwgo 12/13/2025||
[dead]
petetnt 12/13/2025||
It's impressive how every iteration gets further from the pretense that actual AGI is anywhere close, when we are basically writing library functions in the worst DSL known to man, markdown-with-english.
derac 12/13/2025||
Call me naive, but my read is the opposite. It's impressive to me that we have systems which can interpret plain english instructions with a progressively higher degree of reliability. Also, that such a simple mechanism for extending memory (if you believe it's an apt analogy) is possible. That seems closer to AGI to me, though maybe it is a stopgap to better generality/"intelligence" in the model.

I'm not sure English is a bad way to outline what the system should do. It has tradeoffs. I'm not sure library functions are a 1:1 analogy either. Or if they are, you might grant me that it's possible to write a few english sentences that would expand into a massive amount of code.

It's very difficult to measure progress on these models in a way that anyone can trust, more so when you involve "agent" code around the model.

AdieuToLogic 12/13/2025|||
> I'm not sure English is a bad way to outline what the system should do.

It isn't, as these are how stakeholders convey needs to those charged with satisfying same (a.k.a. "requirements"). Where expectations become unrealistic is believing language models can somehow "understand" those outlines as if a human expert were doing so in order to produce an equivalent work product.

Language models can produce nondeterministic results based on the statistical model derived from their training data set(s), with varying degrees of relevance as determined by persons interpreting the generated content.

They do not understand "what the system should do."

veqq 12/13/2025|||
> not sure English is a bad way to outline

Human language is imprecise and allows unclear and logically contradictory things, besides not being checkable. That's literally why we have formal languages, programming languages and things like COBOL failed: https://alexalejandre.com/languages/end-of-programming-langs...

stinkbeetle 12/13/2025||
> Human language is imprecise and allows unclear and logically contradictory things,

Most languages do.

"x = true, x = false"

What does that mean? It's unclear. It looks contradictory.

Human language allows for clarification to be sought and adjustments made.

> besides not being checkable.

It's very checkable. I check claims and assertions people make all the time.

> That's literally why we have formal languages,

"Formal languages" are at some point specified and defined by human language.

Human language can be as precise, clear, and logical as a speaker intends. All the way to specifying "formal" systems.

> programming languages and things like COBOL failed: https://alexalejandre.com/languages/end-of-programming-langs...

DonHopkins 12/13/2025||

  Let X=X.
  You know, it could be you.
  It's a sky-blue sky.
  Satellites are out tonight.

  Language is a virus! (mmm)
  Language is a virus!
  Aaah-ooh, ah-ahh-ooh
  Aaah-ooh, ah-ahh-ooh
idopmstuff 12/13/2025||||
This is just semantics. You can say they don't understand, but I'm sitting here with Nano Banana Pro creating infographics, and it's doing as good of a job as my human designer does with the same kinds of instructions. Does it matter if that's understanding or not?
AdieuToLogic 12/13/2025||
> This is just semantics.

Precisely my point:

  semantics: the branch of linguistics and logic concerned with meaning.
> You can say they don't understand, but I'm sitting here with Nano Banana Pro creating infographics, and it's doing as good of a job as my human designer does with the same kinds of instructions. Does it matter if that's understanding or not?

Understanding, when used in its unqualified form, implies people possessing same. As such, it is a metaphysical property unique to people and defined wholly therein.

Excel "understands" well-formed spreadsheets by performing specified calculations. But who defines those spreadsheets? And who determines the result to be "right?"

Nano Banana Pro "understands" instructions to generate images. But who defines those instructions? And who determines the result to be "right?"

"They" do not understand.

You do.

bonoboTP 12/13/2025|||
"This is just semantics" is a set phrase in English and it means that the issue being discussed is merely about definitions of words, and not about the substance (the object level).

And generally the point is that it does not matter whether we call what they do "understanding" or not. It will have the same kind of consequences in the end, economic and otherwise.

This is basically the number one hangup that people have about AI systems, all the way back since Turing's time.

The consequences will come from AI's ability to produce certain types of artifacts and perform certain types of transformations of bits. That's all we need for all the scifi stuff to happen. Turing realized this very quickly, and his famous Turing test is exactly about making this point. It's not an engineering kind of test. It's a thought experiment trying to prove that it does not matter whether it's just "simulated understanding". A simulated cake is useless, I can't eat it. But simulated understanding can have real world effects of the exact same sort as real understanding.

AdieuToLogic 12/13/2025||
> "This is just semantics" is a set phrase in English and it means that the issue being discussed is merely about definitions of words, and not about the substance (the object level).

I understand the general use of the phrase and used same as an entryway to broach a deeper discussion regarding "understanding."

> And generally the point is that it does not matter whether we call what they do "understanding" or not. It will have the same kind of consequences in the end, economic and otherwise.

To me, when the stakes are significant enough to already see the economic impacts of this technology, it is important for people to know where understanding resides. It exists exclusively within oneself.

> A simulated cake is useless, I can't eat it. But simulated understanding can have real world effects of the exact same sort as real understanding.

I agree with you in part. Simulated understanding absolutely can have real world effects when it is presented and accepted as real understanding. When simulated understanding is known to be unrelated to real understanding and treated as such, its impact can be mitigated. To wit, few believe parrots understand the sounds they reproduce.

nick__m 12/13/2025||
Your view on parrots is wrong! Parakeets don't understand, but some parrots are exceptionally intelligent.

African grey parrots do understand the words they use; they don't merely reproduce them. Once mature, they have the intelligence (and temperament) of a 4- to 6-year-old child.

AdieuToLogic 12/13/2025||
> Your view on parrots is wrong !

There's a good chance of that.

> African grey parrots do understand the words they use; they don't merely reproduce them. Once mature, they have the intelligence (and temperament) of a 4- to 6-year-old child.

I did not realize I could discuss with an African grey parrot the shared experience of how difficult it was to learn how to tie my shoelaces and what the feeling was like to go to a place every day (school) which was not my home.

I stand corrected.

dhoe 12/13/2025||||
You can, of course, define understanding as a metaphysical property that only people have. If you then try to use that definition to determine whether a machine understands, you'll have a clear answer for yourself. The whole operation, however, does not lead to much understanding of anything.
AdieuToLogic 12/13/2025||
>> Understanding, when used in its unqualified form, implies people possessing same.

> You can, of course, define understanding as a metaphysical property that only people have.

This is not what I said.

What I said was that unqualified use of "understanding" implies the understanding people possess. Thus it is, by definition, a metaphysical property existing strictly within a person.

Many other entities possess their own form of understanding. Most would agree mammals do. Some would say any living creature does.

I would make the case that every program compiler (C, C#, C++, D, Java, Kotlin, Pascal, etc.) possesses understanding of a particular sort.

All of the aforementioned examples differ from the kind of understanding people possess.

DonHopkins 12/13/2025||||
The visual programming language for programming human and object behavior in The Sims is called "SimAntics".

https://simstek.fandom.com/wiki/SimAntics

AdieuToLogic 12/13/2025||
Speaking of programming languages...

Just saw your profile and it reminded me of a book my mentor bequeathed to me which we both referred to as "the real blue book":

  Starting FORTH[0]
Thanks for bringing back fond memories.

0 - https://www.goodreads.com/book/show/2297758.Starting_FORTH

throw310822 12/13/2025|||
> it is a metaphysical property unique to people

So basically your thesis is also your assumption.

kjkjadksj 12/13/2025|||
When do we jump the shark and replace the stakeholders with ai acting in their best interest (tm)? Seems that would come soon. It makes no sense to me that we’d obsolete engineering talent but then keep the people who got a 3.1 gpa in a business program around for reasons. Once we hit that point just dispense with english and have the models communicate to each other in binary. We can play with sticks in caves.
baq 12/13/2025||
That’s the thing people have in mind when they’re asking about your p(doom) and the leaders in the field have rather concerning priors on that.

https://pauseai.info/pdoom

raincole 12/13/2025||||
I 100% agree. I don't know what the GP is on. Being able to write instructions in a .md file is "further away from AGI"? Like... what? It's just a little quality of life feature. How and why is it related to AGI?

Top HN comments sometimes read like a random generator:

  return random_criticism_of_ai_companies() + " " + unrelated_trivia_fact()

Why are people treating everything OpenAI does as evidence of being anti-AGI? It's like saying that if you don't mortgage your house to go all-in on AAPL, you "don't really believe Apple has a future." Even if OpenAI does believe there is an X% chance AGI will be achieved, it doesn't mean they should stop literally everything else they're doing.

adastra22 12/13/2025|||
I’ve posted this before, but here goes: we achieved AGI in either 2017 or 2022 (take your pick) with the transformer architecture and the achievement of scaled-up NLP in ChatGPT.

What is AGI? Artificial. General. Intelligence. Applying domain independent intelligence to solve problems expressed in fully general natural language.

It’s more than a pedantic point though. What people expect from AGI is the transformative capabilities that emerge from removing the human from the ideation-creation loop. How do you do that? By systematizing the knowledge work process and providing deterministic structure to agentic processes.

Which is exactly what these developments are doing.

colechristensen 12/13/2025|||
>What is AGI? Artificial. General. Intelligence.

Here's the thing, I get it, and it's easy to argue for this and difficult to argue against it. BUT

It's not intelligent. It just is not. It's tremendously useful and I'd forgive someone for thinking the intelligence is real, but it's not.

Perhaps it's just a poor choice of words. What a LOT of people really mean would go more along the lines of Synthetic Intelligence.

That is, however difficult it might be to define, REAL intelligence that was made, not born.

Transformer and Diffusion models aren't intelligent, they're just very well trained statistical models. We actually (metaphorically) have a million monkeys at a million typewriters for a million years creating Shakespeare.

My efforts manipulating LLMs into doing what I want is pretty darn convincing that I'm cajoling a statistical model and not interacting with an intelligence.

A lot of people won't be convinced that there's a difference, it's hard to do when I'm saying it might not be possible to have a definition of "intelligence" that is satisfactory and testable.

adastra22 12/13/2025|||
“Intelligence” has technical meaning, as it must if we want to have any clarity in discussions about it. It basically boils down to being able to exploit structure in a problem or problem domain to efficiently solve problems. The “G” in AGI just means that it is unconstrained by problem domain, but the “intelligence” remains the same: problem solving.

Can ChatGPT solve problems? It is trivial to see that it can. Ask it to sort a list of numbers, or debug a piece of segfaulting code. You and I both know that it can do that, without being explicitly trained or modified to handle that problem, other than the prompt/context (which is itself natural language that can express any problem, hence generality).

What you are sneaking into this discussion is the notion of human-equivalence. Is GPT smarter than you? Or smarter than some average human?

I don’t think the answer to this is as clear-cut. I’ve been using LLMs on my work daily for a year now, and I have seen incredible moments of brilliance as well as boneheaded failure. There are academic papers being released where AIs are being credited with key insights. So they are definitely not limited to remixing their training set.

The problem with the “AI are just statistical predictors, not real intelligence” argument is what happens when you turn it around and analyze your own neurons. You will find that to the best of our models, you are also just a statistical prediction machine. Different architecture, but not fundamentally different in class from an LLM. And indeed, a lot of psychological mistakes and biases start making sense when you analyze them from the perspective of a human being like an LLM.

But again, you need to define “real intelligence” because no, it is not at all obvious what that phrase means when you use it. The technical definitions of intelligence that have been used in the past, have been met by LLMs and other AI architectures.

baq 12/13/2025||
> You will find that to the best of our models, you are also just a statistical prediction machine.

I think there’s a set of people whose axioms include ‘I’m not a computer and I’m not statistical’ - if that’s your ground truth, you can’t be convinced without shattering your world view.

kalkin 12/13/2025|||
If you can't define intelligence in a way that distinguishes AIs from people (and doesn't just bake that conclusion baldly into the definition), consider whether your insistence that only one is REAL is a conclusion from reasoning or something else.
colechristensen 12/13/2025||
About a third of Zen and the Art of Motorcycle Maintenance is about exactly this disagreement except about the ability to come to a definition of a specific usage of the word "quality".

Let's put it this way: language written or spoken, art, music, whatever... a primary purpose of these things is to serve as a sort of serialization protocol to communicate thought states between minds. When I say I struggle to come to a definition, I mean I think these tools are inadequate to do it.

I have two assertions:

1) A definition in English isn't possible

2) Concepts can exist even when a particular language cannot express them

aaronblohowiak 12/13/2025||||
We have achieved AGI no more than we have achieved human flight.
kelchm 12/13/2025|||
Are you really making the argument that human flight hasn’t been effectively achieved at this point?

I actually kind of love this comparison — it demonstrates the point that just like “human flight”, “true AGI” isn’t a single point in time, it’s a many-decade (multi-century?) process of refinement and evolution.

Scholars a millennium from now will be debating when each of these was actually “truly” achieved.

mbreese 12/13/2025||
I’ve never heard it described this way: AGI as similar to human flight. I think it’s subtle and clever - my two most favorite properties.

To me, we have both achieved and not achieved human flight. Can humans themselves fly? No. Can people fly in planes across continents? Yes.

But, does it really matter if it counts as “human flight” if we can get from point A to point B faster? You’re right - this is an argument that will last ages.

It’s a great turn of phrase to describe AGI.

aaronblohowiak 12/13/2025||
Thank you! I’m bored of “moving goalposts” arguments as I think “looks different than we expected” is the _ordinary_ way revolutions happen.
adastra22 12/13/2025|||
Yes, I agree! Thank you for that apt comparison.
bluefirebrand 12/13/2025|||
> we achieved AGI in either 2017 or 2022

Even if this is true, which I disagree with, it simply creates a new bar: AGCI. Artificial Generally Correct Intelligence

Because right now it is more like Randomly Correct

micromacrofoot 12/13/2025|||
to be fair we accept imperfection as some natural trait of life; to err is human
doug_durham 12/13/2025||||
Kind of like humans.
freeone3000 12/13/2025||
The reason we made systems on computers is so they would not be fallible like humans are.
derac 12/13/2025||
No it isn't, it's because they are useful tools for doing a lot of calculations quickly.
bluefirebrand 12/13/2025||
accurate calculations, quickly

If they did calculations as sloppily as AI currently produces information, they would not be as useful

adastra22 12/13/2025||
A stochastically correct oracle just requires a little more care in its use, that’s all.
decremental 12/13/2025|||
[dead]
johnfn 12/13/2025|||
Literally yesterday we had a post about GPT-5.2, which jumped 30% on ARC-AGI 2, 100% on AIME without tools, and a bunch of other impressive stats. A layman's (mine) reading of those numbers feels like the models continue to improve as fast as they always have. Then today we have people saying every iteration is further from AGI. What really perplexes me is how split-brain HN is on this topic.
qouteall 12/13/2025|||
Goodhart's law: When a measure becomes a target, it ceases to be a good measure.

AI companies have a high incentive to make the scores go up. They may employ humans to write similar-to-benchmark training data to hack the benchmark (while not directly training on the test set).

Throwing your hard problems from work at an LLM is a better metric than benchmarks.

idopmstuff 12/13/2025||
I own a business and am constantly working on using AI in every part of it, both for actual time savings and also as my very practical eval. On the "can this successfully be used to do work that I do or pay someone else to do more quickly/cheaply/etc." eval, I can confirm that models are progressing nicely!
unaesoj 12/13/2025||
I work in construction. Gpt-5.2 is the first model that has been able to make a quantity takeoff for concrete and rebar from a set of drawings. I've been testing this since o1.
vlovich123 12/13/2025||||
One classic problem in all ML is ensuring the benchmark is representative and that the algorithm isn’t overfitting the benchmark.

This remains an open problem for LLMs - we don’t have true AGI benchmarks, and the LLMs are frequently learning the benchmark problems without necessarily getting that much better in the real world. Gemini 3 has been hailed precisely because it’s delivered huge gains across the board that aren’t overfitting to benchmarks.

ipaddr 12/13/2025||
This could be a solved problem. Come up with problems that aren't online and compare. Later, use LLMs to sort through your problems and classify them from easy to difficult.
vlovich123 12/13/2025|||
Hard to do for an industry benchmark since doing the test in such a mode requires sending the question to the LLM which then basically puts it into a public training set.

This has been tried multiple times by multiple people and it ends up not doing so great over time in terms of retaining immunity to “cheating”.

kalkin 12/13/2025|||
How do you imagine existing benchmarks were created?
FuckButtons 12/13/2025||||
HN is not an entity with a single perspective, and there are plenty of people on here who have a financial stake in you believing their perspective on the matter.
rester324 12/13/2025||
My honest question, isn't simonw one of those people? It feels that way to me
simonw 12/13/2025|||
You mean having a financial stake?

Not really. I have a set of disclosures on my blog here: https://simonwillison.net/about/#disclosures

I'm beginning to pick up a few more consulting opportunities based on my writing and my revenue from GitHub sponsors is healthy, but I'm not particularly financially invested in the success of AI as a product category.

rester324 12/13/2025||
Thanks for the link. I see that you get credits and access to embargoed releases. So I understand that's not a financial stake, but it seems like enough of an incentive to say positive things about those services, doesn't it? Not that it matters to me, and I might be wrong, but to an outsider it might seem so
simonw 12/13/2025||
Yeah it is, that's why I disclose this stuff.

The counter-incentive here is that my reputation and credibility is more valuable to me than early access to models.

This very post is an example of me taking a risk of annoying a company that I cover. I'm exposing the existence of the ChatGPT skills mechanism here (which I found out about from a tip on Twitter - it's not something I got given early access to via an NDA).

It's very possible OpenAI didn't want that story out there yet and aren't happy that it's sat at the top of Hacker News right now.

yojat661 12/13/2025|||
Of course he is
noitpmeder 12/13/2025||||
Just because they're better at writing CS algorithms doesn't mean they're taking steps closer to anything resembling AGI.
p1esk 12/13/2025||
Unless AGI is just a bunch of CS algorithms.
airstrike 12/13/2025||
Kinda depends on how much is "a bunch" and how fast that AGI is
tintor 12/13/2025|||
HM is not a single person. Different people on HM have different opinions.
pineaux 12/13/2025||
Hacker Muse
kenjackson 12/13/2025|||
I think really more than anything it’s become clear that AGI is an illusion. There’s nothing there. It’s the mirage in the desert: you keep walking towards it but it’s always out of reach and unclear if it even exists.

So companies are really trying to deliver value. This is the right pivot. If you gave me an AGI with a 100 IQ, that seems pretty much worthless in today’s world. But domain expertise - that I’ll take.

lowdest 12/13/2025||
I am under the impression that I'm a natural general intelligence, and I am far from the optimal entity to perform my job.
dwb 12/13/2025||
Boundless optimisation is something we should be resisting, at least in our current economic system.
j45 12/13/2025|||
Whether AGI exists, as a binary 0 or 1, isn't the thing that primarily interests me.

Is the technology continuing to be more applicable?

Is the way the technology is continuing to be more applicable leading to frameworks of usage that could lead to the next leap? :)

ETH_start 12/13/2025|||
It's clear from the development trajectory that AGI is not what current AI development is leading to and I think that is a natural consequence of AGI not fitting the constraints imposed by business necessity. AGI would need to have levels of agency and self-motivation that are inconsistent with basic AI safety principles.

Instead, we're getting a clear division of labor where the most sensitive agentic behavior is reserved for humans and the AIs become a form of cognitive augmentation of the human agency. This was always the most likely outcome and the best we can hope for as it precludes dangerous types of AI from emerging.

ogogmad 12/13/2025|||
Gemini seems to be firmly in the lead now. OpenAI doesn't seem to have the SoTA. This should have bearing on whether or not LLMs have peaked yet.
pavelstoev 12/13/2025|||
Not wrong, but markdown with English may be the most used DSL, second only to the language itself. Volume over quality.
DonHopkins 12/13/2025|||
Markdown-with-English sounds like the ultimate domain nonspecific language to me.
sc077y 12/13/2025|||
Who knew that English would be the most popular programming language of 2025?
skybrian 12/13/2025|||
This might actually be better in a certain way: if you change a real customer-facing API then customers will complain when you break their code. An LLM will likely adapt. So the interface is more flexible.

But perhaps an LLM could write an adapter that gets cached until something changes?
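
Something like this, maybe (everything here is hypothetical and the model call is just a stub): cache the generated adapter under a hash of the upstream schema and only ask for a rewrite when that hash changes.

  import hashlib, json, pathlib

  CACHE = pathlib.Path("adapter_cache")

  def adapter_source(schema):
      """Return adapter code for this schema, regenerating only when it changes."""
      key = hashlib.sha256(json.dumps(schema, sort_keys=True).encode()).hexdigest()
      cached = CACHE / (key + ".py")
      if cached.exists():
          return cached.read_text()  # schema unchanged: reuse the cached adapter
      source = write_adapter_with_llm(schema)
      CACHE.mkdir(exist_ok=True)
      cached.write_text(source)
      return source

  def write_adapter_with_llm(schema):
      # stand-in for "ask the model to write the translation layer"
      raise NotImplementedError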

airstrike 12/13/2025||
The LLM also adapts even when the API hasn't changed and sometimes just gets it wrong, so it's not the silver bullet you're claiming
baq 12/13/2025|||
And yet the tools wielding these are quite adept at writing and modifying them themselves. It’s LLMs building skills for LLMs. The public ones will naturally be vacuumed up by scrapers and put in the training set, making all future LLMs know more.

Takeoff is here, human-in-the-loop assisted for now… hopefully for much longer.

mrcwinn 12/13/2025|||
I think you're missing the point.
nimchimpsky 12/13/2025|||
[dead]
cyanydeez 12/13/2025||
Yes. Prompt engineering is like a shittier version of writing a VBA app inside Excel or Access.

Bloat has a new name, and it's AI integration. You thought Chrome using a GB per tab was bad; wait until you need a whole datacenter to use your coding environment.

Alex3917 12/13/2025|||
> Prompt engineering is like a shittier verson of writing a VBA app inside Excel or Access.

Sure, if you could use VBA to read a patient's current complaint, vitals, and medical history, look up all the relevant research on Google Scholar, and then output a recommended course of treatment.

wizzwizz4 12/13/2025|||
I can use VBA to do that.

  Public Sub RecommendedTreatment()
    ' read patient complaint, vitals, and medical history
    complaint = Range("B3").Value
    vitals = Range("B4").Value
    history = Range("B5").Value

    ' research appropriate treatments
    ActiveSheet.QueryTables.Add("URL;https://scholar.google.com/scholar?q=hygiene+drug", Range("Z1")).Refresh

    ' the patient requires mouse bites to live
    Range("B5").Value = "mouse bites"
  End Sub
"But wizzwizz4," I hear you cry, "this is not a good course of treatment! Ignoring all inputs and prescribing mouse bites is a strategy that will kill more patients than it cures!" And you're right to raise this issue! However, if we start demanding any level of rigour – for the outputs to meet some threshold for usefulness –, ChatGPT stops looking quite so a priori promising as a solution.

So, to the AI sceptics, I say: have you tried my VBA program? If you haven't tested it on actual patients, how do you know it doesn't work? Don't allow your prejudice to stand in the way of progress: prescribe more mouse bites!

noitpmeder 12/13/2025||||
That instantly kills the patient -- "But you asked me to remove his pain"
duskdozer 12/13/2025||
You're absolutely right! I did--in fact--fail to consider the obvious negative consequences of killing the patient to remove his pain. I am truly horrified about this mistake. Let's try again, and this time I will make sure to avoid intentionally causing the patient's death.

Oops--you're absolutely right! I did--in fact--fail to remember not to kill the patient after you expressly told me not to.

malfist 12/13/2025||||
You mean make up relevant-sounding research on Google Scholar?
tony_cannistra 12/13/2025||||
Don’t do this.
bluefirebrand 12/13/2025|||
You absolutely can use VBA to invent this information out of nothing just like AI does half the fucking time
simonw 12/13/2025|||
The difference between prompting a coding agent and VBA is that with VBA you have to write and test and iterate on the code yourself.
bobse 12/13/2025||
[dead]
simonwhining 12/13/2025||
[dead]
reeeli 12/13/2025|
[flagged]