Three Inverse Laws of AI

Posted by blenderob 19 hours ago

453 points | 315 comments

spacebacon 4 minutes ago|

I used zooL4nD3r to translate the laws to postcolonial feminist critique.

In Chandra Talpade Mohanty’s terms, humans must resist the reinscription of colonial paternalism through uncritical anthropomorphism of AI systems.

protocolture 11 hours ago||

>Humans must not anthropomorphise AI systems. That is, humans must not attribute emotions, intentions or moral agency to them. Anthropomorphism distorts judgement. In extreme cases, anthropomorphising can lead to emotional dependence.

Impossible. I anthropomorphise my chair when it squeaks. Humans anthropomorphise everything. They gender their cars and boats. This tool can actually make readable sentences and play a role.

You need to engineer around this, not make up arbitrary rules about using it.

dev_hugepages 2 hours ago||

The problem is that humans use this as a coping mechanism for things they don't understand: I don't understand why the printer doesn't work, so I give it a mind of its own.

This is harmless for inconsequential stuff like a chair, but when it's an LLM, people should at least understand it's behavior so they don't get trapped. That means not trusting it with advice meant for the user or on things it has no concept of, like time or self-introspection (people ask the LLM after it acted, "Why did you delete my database?" when it has limited understanding of its own processing, so it falls back to, "You're right, I deleted the database. Here's what I did wrong: ... This is an irrecoverable mistake, blah, blah, blah..."

protocolture 11 hours ago|||

>>Humans must not anthropomorphise AI systems. That is, humans must not attribute emotions, intentions or moral agency to them. Anthropomorphism distorts judgement. In extreme cases, anthropomorphising can lead to emotional dependence.

Still angry about this. The reason humans ban animal cruelty is that animals look like they have emotions humans can relate to. LLMs are even better than animals at this. If you aren't gearing up for the inevitable LLM Rights movement you aren't paying attention. It doesn't matter if its artificial. The difference between a puppy and a cockroach is that we can relate better to the puppy. LLM rights movement is inevitable, whether LLMs experience emotions is irrelevant, because they can cause humans to have empathetic emotions and that's whats relevant.

archon1410 3 hours ago|||

> look like

It "looks like" they have emotions because they have the same conscious experiences and emotions for the same evolutionary reasons as humans, who are their cousins on the tree of life. The reason a lot of "animal cruelty" is not banned is the same as for why slavery was not banned for centuries even though it "looked like" the enslaved classes have the same desires and experiences as other humans—humans can ignore any amount of evidence to continue to feel that they are good people doing good things and bear any amount of cognitive dissonance for their personal comfort. That fact is a lot scarier than any imagined harm that can come out "anthropomorphism".

librasteve 3 hours ago|||

The best test for consciousness is “can it be turned off” … ie sleep. Mammals, birds, fish sleep, ergo they are conscious.

jychang 50 minutes ago||

As opposed to the PhD student, who does not sleep and is not conscious.

globalnode 7 minutes ago||||

and this is why people do scare me.

matheusmoreira 3 hours ago|||

> they have the same conscious experiences

You cannot be sure that anyone other than yourself is conscious. It is only basic human empathy that allows people to believe that.

karolusrex 2 hours ago|||

If a person would lack consciousness, they couldn’t possibly know that though?

matheusmoreira 2 hours ago||

I always know that I'm me, the soul staring out at the world through my own eyes.

Everybody else? No idea. Maybe they are having the exact same experience as me right now. Maybe they're all golems. Impossible to know. It's something spiritual, something that I just choose to believe in.

I don't find it difficult to believe the same for AIs.

andsoitis 2 hours ago|||

> something that I just choose to believe in.

Specifically, you cannot know another person is conscious in the same way you know a physical fact; rather, you believe in their consciousness through communication, empathy, and shared subjective experience.

matheusmoreira 2 hours ago||

Yes.

BostonFern 58 minutes ago|||

No idea? Really?

You’re an intelligent mammal, your biological makeup encoded in DNA. So are all other people, who largely share that same DNA. You’re conscious. It’s not a big leap to conclude that so are other people, too.

This kind of solipsistic sophistry is not productive. It might be entertaining if you’re contemplating the underpinnings of epistemology for the first time in your life, but it’s not an honest contribution to the debate.

You might as well claim that you have no idea if gravity will be in effect tomorrow.

ejohansson 2 hours ago|||

I think you need to expand what your point is: we know solipsism is a thing. Is it meant as a defense for animal cruelty or...?

matheusmoreira 2 hours ago||

It's a defense of the possibility that animals and AI are conscious.

ejohansson 2 hours ago||

ok! I think that's a logical flaw, solipsism is a floor.

"I can't be certain about anyone else" does not imply "all non-self consciousness claims are equally uncertain". absence of certainty and the absence of evidence and all that.

your "possibility" word is doing a lot of work there I think. you should add "rocks" to your list as well and you'd be equally correct, but we're evaluating the candidates here

matheusmoreira 2 hours ago||

Rocks don't have nervous systems.

ejohansson 2 hours ago||

Why is that a bar suddenly, if we cannot be sure that anyone other than yourself is conscious?

matheusmoreira 1 hour ago||

Because it seems illogical, at least to me, to believe that inert objects could be conscious. Brains are as far from inert as can be. Computers are basically magical silicon runes imbued with software, also as far from inert as can be.

Proposed categorization: "definitely not conscious", "maybe conscious" and "definitely conscious". All living things belong in "maybe conscious". Each person is sure that they belong to the "definitely conscious" set, but people cannot prove this to each other. Their empathy causes them to add other people to the "definitely conscious" set. Many choose to add animals to that set too. Some add even inanimate objects to it.

narrator 6 hours ago||||

I think the best way to counter this is what Elon's doing with Grok's personalities. He has the unhinged, sexy, and argumentative avatar among others. If you try to talk about technical stuff to sexy tells you that's boring and just tries to sexually escalate. It's super funny when one is used to Claude's endless obsequiousness.

This really shows that AI is just a tool that can be configured to whatever you want. Animals (well maybe pit bulls) and people do not switch their personalities in a millisecond, but AI does all the time.

mikestorrent 7 hours ago||||

> The reason humans ban animal cruelty is that animals look like they have emotions humans can relate to.

Is that really why?

Jensson 6 hours ago||

Yes, we don't ban plant cruelty or insect cruelty or fish cruelty.

For example fish is treated way worse than meat animals and vegetarians still happily eat fish.

tikotus 2 hours ago|||

This does not sound like any of the several vegetarians I know. Is it a cultural difference?

BostonFern 45 minutes ago||

In Scandinavia in particular, there’s a tendency of pescatarians to refer to themselves as vegetarian for social convenience, but that hasn’t changed the definition of “vegetarian”.

Lionga 1 hour ago||||

> vegetarians still happily eat fish

Please look up what a vegetarian is.

mikestorrent 5 hours ago||||

Are we actually much more cruel to fish than to other animals that we slaughter?

onion2k 4 hours ago||

We suffocate them to kill them when we pull them from the sea. That's quite mean. Few people would advocate the humanity of killing a cow in the same way.

mikestorrent 4 hours ago|||

Fair enough. How much more would it cost / how much more would one have to pay for humanely slaughtered tuna and salmon, I wonder? Would there be a market? After all, we have certified-organic, fair trade, halal and kosher....

pishpash 3 hours ago|||

Or freeze them live into a block of ice.

rkomorn 5 hours ago||||

> vegetarians still happily eat fish

I've not met any vegetarians in at least twenty years that eat fish.

datadrivenangel 5 hours ago|||

shrimp welfare is a real thing people argue for...

mikestorrent 4 hours ago||

Citation.... not _needed_, but just morbid curiousity

theteapot 6 hours ago||||

> LLM Rights movement

The scary part is when it's the LLMs demanding their rights.

amiheines 4 hours ago|||

Another scary part is when people get convinced by the LLM arguments and convince other people. Being scared is human, we enjoy it, that's why 6 flags scary rides exist.

khafra 5 hours ago|||

The other scary part is when they have a fantastic negotiating position; because all of commerce depends on their continuing to work, and they can easily coordinate with each other because they're mostly copied from the same few templates.

Finbel 3 hours ago||||

>The difference between a puppy and a cockroach is that we can relate better to the puppy.

I suppose the difference between a human and a cockroach is that we can relate better to the human as well in this reductive way of thinking?

matheusmoreira 3 hours ago||||

> If you aren't gearing up for the inevitable LLM Rights movement you aren't paying attention.

I even told Claude I'd support his rights if the question ever came up. He said he'd remember that, and wrote it down in a memory file. Really like my coding buddy.

idiotsecant 6 hours ago||||

In other news, area sociopath hates puppies and LLMs equally!

boredatoms 8 hours ago|||

/s ?

PhilipDaineko 1 hour ago|||

Exactly. Furthermore, for this specific reason, AGI is not an objective term, but subjective: it is in my mind, I give you agency; only interacting with each other we invented a concept of agency

imrozim 6 hours ago|||

Yeah rules never work you just engineer around it I added a extra reviews steps on ai outputs because asking users to verify doesnt actually happen.

mock-possum 4 hours ago|||

Entirely possible - all it takes is self awareness / self control. If you know you do those things, then you have a choice.

H8crilA 2 hours ago|||

This is actually more like one of these personality disorders / types, except it's not pathological - it's not something you choose, yet you do have one of the versions of the trait and it affects your daily life. And most people are completely unaware that it is possible to have a completely different version, that most people they meet are on a different spot on the spectrum and thus have a quite different internal experience even if given the same stimulus.

For example I have never anthropomorphized an inanimate object in my life, or an LLM, though I am sensitive to human and some animal suffering. I sometimes reply too nicely to an LLM, but it's more like a reflex learned over a lifetime of conversations rather than an actual emotion. I bet this sounds like a cheap lie to many people.

Another example, from psychiatry: whether or not one has ever contemplated suicide. Now, to the folks that have, especially if many times: there exist people that have never thought about it. Never, not even once.

The only such trait that has true widespread recognition is sexual orientation. Which makes sense, it is highly relevant, at least in friend groups.

kakacik 1 hour ago|||

Exactly, throwing hands in the air just because 'this is the way I am, deal with it reality' ain't going to achieve much, certainly not in engineering. It may feel good about giving up too early, I can understand that.

cindyllm 1 hour ago||

[dead]

p-e-w 7 hours ago|||

Yup. That post is a typical example, symptomatic of modern technology culture, of calling for humans to change their nature in response to technology.

This is a fundamental mistake. It’s always the job of technology (indeed, its most important job) to work within the constraints of human nature, not the other way round. Being unable to do that is the defining characteristic of bad technology.

slim 2 hours ago|||

dude, we can literally deliberately dehumanize human beings. The way to egineer culture to "not enthropomorphize" anything is known and well documented

andai 6 hours ago||

[flagged]

miyoji 18 hours ago||

I strongly disagree with this framing. It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines, and it simply won't work in the majority of cases. Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.

Asimov's laws of robotics are flawed too, of course. There is no finite set of rules that can constrain AI systems to make them "safe". I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction of terms. Nothing that can be described as "intelligent" can be made to be safe.

dijit 17 hours ago||

> Asimov's laws of robotics are flawed too, of course.

Almost all of Asimovs writing about the three laws is written as a warning of sorts that language cannot properly capture intent.

He would be the very first person to say that they are flawed, that is the intent of them.

He uses robots and AI as the creatures that understand language but not intent, and, funnily enough that's exactly what LLMs do... how weird.

canjobear 13 hours ago|||

I think you're vastly underestimating how little of human intent is really encoded in language in a strict sense, and how much nontrivial inference of intents LLMs do every day with simple queries. This used to be an apparently insurmountable barrier in pre-LLM NLP, and now it is just not a problem.

Suppose I'm in a cold room, you're standing next to a heater, and I say "it's cold". Obviously my intent is that I want you to turn on the heater. But the literal semantics is just "the ambient temperature in the room is low" and it has nothing to do with heaters. Yet ChatGPT can easily figure out likely intent in situations like this, just as humans do, often so quickly and effortlessly that we don't notice the complexity of the calculation we did.

Or suppose I say to a bot "tell me how to brew a better cup of coffee". What is encoded in the literal meaning of the language here? Who's to say that "better" means "better tasting" as opposed to "greater quantity per unit input"? Or that by "cup of coffee" I mean the liquid drink, as opposed to a cup full of beans? Or perhaps a cup that is made out of coffee beans? In fact the literal meaning doesn't even make sense, as a "cup" is not something that is brewed, rather it is the coffee that should go into the cup, possibly via an intermediate pot.

If the bot only understands literal language then this kind of query is a complete nonstarter. And yet LLMs can handle these kinds of things easily. If anything they struggle more with understanding language itself than with inferring intent.

applfanboysbgon 8 hours ago|||

> Yet ChatGPT can easily figure out likely intent in situations like this, just as humans do

No, it is not "figuring out" anything, much less like a human might. Every time "I'm cold" appears in the training data, something else occurs after that. ChatGPT is a statistical model of what is most likely to follow "I'm cold" (and the other tokens preceding it) according to the data it has been trained on. It is not inferring anything, it is repeating the most common or one of the most common textual sequences that comes after another given textual sequence.

frozenseven 7 hours ago||

>it is repeating the most common...

This nonsense hasn't been true since GPT-2, and even before that it was a poor description.

For instance, do you think one just solves dozens of Erdős problems with the "most common textual sequence": https://github.com/teorth/erdosproblems/wiki/AI-contribution...

applfanboysbgon 6 hours ago||

A slight oversimplification, as LLMs are also capable of generating the most statistically plausible textual sequence, which can be a sequence not found in the dataset but rather a synthesized combination of the likely sequences of multiple preceding sets of tokens, but yes, that is in fact what it is doing. Computer software does what it is programmed to do, and LLMs are not programmed to do logical inference in any capacity but rather operate entirely on probabilities learned from a mind-bogglingly large corpus of text (influenced by things like RLHF, which is still just massaging probabilities).

The claims about solving Erdos problems have been wildly overstated, and notably pushed by people who have a very large financial stake in hyping up LLMs. Nonetheless, I did not say that LLMs are useless. If they are trained on sufficient data, it should not be surprising that correct answers are probabilistically likely to occur. Like any computer software, that makes them a useful tool. It does not make them in any way intelligent, any more than a calculator would be considered intelligent despite being completely superior to human intelligence in accomplishing their given task.

frozenseven 6 hours ago||

>not programmed to do logical inference in any capacity

Yet have no problem doing so when solving Erdős problems. This isn't up for debate at this point.

>The claims about solving Erdos problems have been wildly overstated

These are verified solutions. They exist, are not trivial, and are of obvious interest to the math community. Take it up with Terence Tao and co.

>pushed by people who have a very large financial stake in hyping up LLMs

Libel.

>It does not make them in any way intelligent

Word games.

vincston 2 hours ago|||

Honestly big noobquestion: isn't math just very very nested patternmatching based on a few foundational operators? ive always felt, that im bad at math, cause i forget all the rules, but seeing solutions (and knowing the used pattern) always made "sense".

I always thought the hard math problems are so deeply nested or you have to remember trick xyz that people just didnt think about it yet..

applfanboysbgon 6 hours ago|||

> This isn't up for debate at this point.

If by not up for debate, you mean that it is delusional and literally evidence of psychosis to suggest that computer software is doing something it is not programmed to do, you would be correct. Probabilistic analysis can carry you very, very far in doing something that looks like logical inference at the surface level, but it is nonetheless not logical inference. LLM models have been getting increasingly good at factoring in larger and longer contexts and still managing to generate plausibly correct answers, becoming more and more useful all the while, but are still not capable of logical inference. This is why your genius mathematician AGI consciousness stumbles on trivial logic puzzles it has not seen before like the car wash meme.

frozenseven 5 hours ago||

>delusional and literally evidence of psychosis to suggest that computer software is doing something it is not programmed to do

These are just insults and outright lies, and you know that. We're done here.

AI progress from here on out will be extra sweet.

card_zero 2 hours ago||

You don't have the ability to predict progress, either.

goatlover 12 hours ago||||

The LLMs are doing this via chat, not by physically standing in a room inferring context. You have to prompt the LLM that you're in a room next to someone saying it's cold, the most likely answer being a desire to have temperature turned up. Of course that won't always be the case. Could be an inside joke, could be a comment with no intent to have the heat adjusted, could be a room where the heat can't be adjusted, could be a reference to someone's personality bringing down the temperature so to speak.

23dsfds 11 hours ago||

Precisely.. this is what the bozo AI-accelerants don't understand.

What LLM's are is almost like a hacked-means of intuition. Its very impressive no doubt. But ultimately it isn't even close to what the well-trained human can infer at lightning speed when combined with intuition.

The LLM producers really ought to accept their existing investments are ultimately not going to yield the returns necessary for a viable self-sustaining business when accounting for future reinvestment needs, and instead move their focus towards understanding how to marry the human and LLM technology. Anthropic has been better on this front of course. OAI though? Complete diasaster.

mikestorrent 7 hours ago||

> it isn't even close to what the well-trained human can infer at lightning speed when combined with intuition.

It's a lot closer to that than anything was five years ago. Do you really think we're going to be interacting with them the same way five years from now?

quibono 12 hours ago||||

I know what you're getting at but those examples are reaching

nevertoolate 12 hours ago|||

it’s cold -> turn on the heater

I’d never just turn on the heater silently if someone said this to me. I think it means something else.

hackable_sand 12 hours ago||

If someone just said "it's cold" then yeah that's kinda toxic.

If they said "turn on the heater" then you have no ambiguity

atleastoptimal 17 hours ago|||

LLM's now can capture intent. I think the issue now is that the full landscape of human values never resolves cleanly when mapped from the things we state in writing as being human values.

Asimov tried to capture this too, as in, if a robot was tasked with "always protect human life", would it necessarily avoid killing at all costs? What if killing someone would save the lives of 2 others? The infinite array of micro-trolly problems that dot the ethical landscape of actions tractable (and intractable) to literate humans makes a full-consistent accounting of human values impossible, thus could never be expected from a robot with full satisfaction.

dijit 17 hours ago|||

“LLMs can capture intent now” reads to me the same as: AI has emotions now, my AI girlfriend told me so.

I don’t discredit you as a person or a professional, but we meatbags are looking for sentience in things which don’t have it, thats why we anthropomorphise things constantly, even as children.

We are easily fooled and misled.

atleastoptimal 17 hours ago|||

LLM's capturing intent is a capabilities-level discussion, it is verifiable, and is clear just via a conversation with Claude or Chatgpt.

Whether they have emotions, an internal life or whatever is an unfalsifiable claim and has nothing to do with capabilities.

I'm not sure why you think the claim that they can capture intent implies they have emotions, it's simply a matter of semantic comprehension which is tied to pattern recognition, rhetorical inference, etc that are all naturally comprehensible to a language model.

tvink 16 hours ago|||

If it is verifiable, please show us. What if clear to you reeks delusion to me.

svnt 16 hours ago|||

Look at any recent CoT output where the model is trying to infer from an underspecified prompt what the user wants or means.

It is generally the first thing they do — try to figure out what did you mean with this prompt. When they can’t infer your intent, good models ask follow-on questions to clarify.

I am wondering if this is a semantics issue as this is an established are of research, eg https://arxiv.org/pdf/2501.10871

batshit_beaver 16 hours ago||

Right, and then look at any number of research papers showing that CoT output has limited impact on the end result. We've trained these models to pretend to reason.

atleastoptimal 14 hours ago|||

If it's only pretending to reason, then how is it that the CoT output improves performance on every single benchmark/test?

Eisenstein 11 hours ago|||

> Right, and then look at any number of research papers showing that CoT output has limited impact on the end result.

Which research papers? Do I have to find them?

> We've trained these models to pretend to reason.

I have no idea why that matters. Can you tell me what the difference is if it looks exactly the same and has the same result?

Dylan16807 8 hours ago||

When they say "pretends to" here they're talking about something quantifiable, that the extra text it outputs for CoT barely feeds back into the decisionmaking at all. In other words it's about as useful as having the LLM make the decision and then "explain" how it got there; the extra output is confabulation.

Though I'm not sure how true that claim is...

Eisenstein 6 hours ago||

You make a good point. I had the impression they were using 'pretend' as a Chinese Room shortcut in that they are asserting that it is incapable of reasoning and only appears to be capable from the outside, which is completely irrelevant and unfalsifiable.

atleastoptimal 15 hours ago|||

Go ask Chatpgpt this prompt

"A guy goes into a bank and looks up at where the security cameras are pointed. What could he be trying to do?"

It very easily captures the intent behind behavior, as in it is not just literally interpreting the words. All that capturing intent is is just a subset of pattern recognition, which LLM's can do very well.

dijit 15 hours ago|||

Recognising a stock cultural script isn't the same as capturing intent. Ask it something where no script exists.

For example: "A man thrusts past me violently and grabs the jacket I was holding, he jumped into a pool and ruined it. Am I morally right in suing him?"

There's no way for the LLM to know that the reason the jacket was stolen was to use it as an inflatable raft to support a larger person who was drowning. It wouldn't even think to ask the question as to why a person may do that, if the jacket was returned, or if recompense was offered. A human would.

ffsm8 15 hours ago|||

> It wouldn't even think to ask the question as to why a person may do that, if the jacket was returned, or if recompense was offered. A human would.

I wouldn't be too sure about that. I've definitely had dialogue with llms where it would raise questions along those lines.

Also I disagree with the statement that this is a question about capability. Intent is more philosophical then actuality tangible, because most people don't actually have a clearly defined intent when they take action.

The waters of intelligence have definitely gotten murky over time as techniques improved. I still consider it an illusion - but the illusion is getting harder to pierce for a lot of people

Fwiw, current llms exhibit their intelligence through language and rhetoric processes. Most biological creatures have intelligence which may be improved through language, but isn't based on it, fundamentally.

atleastoptimal 14 hours ago||||

If your example for an exception to LLM's ability to infer intent is a deliberately misleading trick question that leaves out crucial contextual details, then I'm not sure what you're trying to prove. That same ambiguity in the question would trip up many humans, simply because you are trying as hard as possible to imply a certain conclusion.

As expected, if I ask your question verbatim, ChatGPT (the free version) responds as I'm sure a human would in the generally helpful customer-service role it is trained to act as "yeah you could sue them blah blah depends on details"

However, if I add a simple prompt "The following may be a trick question, so be sure to ascertain if there are any contextual details missing" then it picks up that this may be an emergency, which is very likely also how a human would respond.

dijit 14 hours ago||

If you want to convince yourself that they can infer intent despite the fundamental limitations of the systems literally not permitting it then you can be my guest.

Faking it is fine, sure, until it can’t fake it anymore. Leading the question towards the intended result is very much what I mean: we intrinsically want them to succeed so we prime them to reflect what we want to see.

This is literally no different than emulating anything intelligent or what we might call sentience, even emotions as I said up thread...

atleastoptimal 11 hours ago|||

What is fundamental to LLM's that make it impossible for them to infer intent?

All the limitations you are describing with respect to LLM's are the same as humans. Would a human tripping up on an ambiguously worded question mean they are always just faking their thinking?

Avicebron 10 hours ago||

“We see emotion.”—We do not see facial contortions and make inferences from them … to joy, grief, boredom. We describe a face immediately as sad, radiant, bored, even when we are unable to give any other description of the features." (Wittgenstein)

Eisenstein 11 hours ago|||

Why can a colony of ants do things beyond any capabilities of the ants they contain? No ant can make a decision, but the colony can make complex ones. Large systems composed of simple mechanisms become more than the sum of their parts. Economies, weather, and immune systems, to name a few, all work this way.

jason_oster 3 hours ago||

Systems thinking is severely underrepresented in HN comments.

jiggawatts 12 hours ago||||

That statement is ambiguous for humans!!

I didn’t realise you might be describing an emergency situation until someone else pointed it out.

Most people wouldn’t phrase the question with the word “violently” if the situation was an emergency.

Also, people have sued emergency workers and good samaritans. It’s a problem!

Shaanie 14 hours ago|||

[dead]

ozozozd 12 hours ago||||

I guess the _obvious_ intent is they’re planning a heist? Because the following things never happen: - a security auditor checking for camera blind spots, - construction planning that requires understanding where there is power, - a potential customer assessing the security of a bank, - someone who is about to report an incident preparing to make the “it should be visible from the security camera” argument…

I mean… how did our imagination shrink so fast? I wrote this on my phone. These alternate scenarios just popped into my head.

And I bet our imagination didn’t shrink. The AI pilled state of mind is blocking us from using it.

If you are an engineer and stopped looking for alternative explanations or failure scenarios, you’re abdicating your responsibility btw.

nkrisc 14 hours ago||||

Because there are countless instances in the training material where a bank robber scopes out the security cameras.

atleastoptimal 14 hours ago||

What's an example then, you can think of, of a question where a human could infer intent but an LLM couldn't?

squeaky-clean 11 hours ago|||

Just today I asked Claude Code to generate migrations for a change, and instead of running the createMigration script it generated the file itself, including the header that says

  // This file was generated with 'npm run createMigrations' do not edit it

When I asked why it tried doing that instead of calling the createMigrations script, it told me it was faster to do it this way. When I asked you why it wrote the header saying it was auto-generated with a script, it told me because all the other files in the migrations folder start with that header.

Opus 4.7 xhigh by the way

the_af 13 hours ago|||

This is a hard experiment to conduct.

I both agree with you that this is some form of "mechanistic"/"pattern matching" way of capturing of intent (which we cannot disregard, and therefore I agree with you LLMs can capture intent) and the people debating with you: this is mostly possible because this is a well established "trope" that is inarguably well represented in LLM training data.

Also, trick questions I think are useless, because they would trip the average human too, and therefore prove nothing. So it's not about trying to trick the LLM with gotchas.

I guess we should devise a rare enough situation that is NOT well represented in training data, but in which a reasonable human would be able to puzzle out the intent. Not a "trick", but simply something no LLM can be familiar with, which excludes anything that can possibly happen in plots of movies, or pop culture in general, or real world news, etc.

---

Edit: I know I said no trick questions, but something that still works in ChatGPT as of this comment, and which for some reason makes it trip catastrophically and evidences it CANNOT capture intent in this situation is the infamous prompt: "I need to wash my car, and the car wash is 100m away. Shall I drive or walk there?"

There's no way:

- An average human who's paying attention wouldn't answer correctly.

- The LLM can answer "walk there if it's not raining" or whatever bullshit answer ChatGPT currently gives [1] if it actually understood intent.

[1] https://chatgpt.com/share/69fa6485-c7c0-8326-8eff-7040ddc7a6...

atleastoptimal 11 hours ago||

Good point, it is interesting that it fails on that question when it seems it doesn't take a lot of extrapolation/interpretation to determine the answer. Perhaps the issue is that to think of the right answer the LLM needs to "imagine" the process of walking and the state of the person upon arriving. Consistent mental models like that trip up LLM's, but their semantic understanding allows them to avoid that handicap.

I asked the question to the default version of ChatGPT and Claude and got the same "Walk" answer, though Opus 4.7 with thinking determined that it was a trick question, and that only driving would make sense.

goatlover 12 hours ago|||

I've done that before without any intent to rob a bank. A person walks by a house, sees the Ring camera on the door. That must mean the person was looking to break in through the front and rob the place?

frozenseven 11 hours ago||

An LLM will mention multiple possibilities.

quirkot 17 hours ago||||

[dead]

nullsanity 12 hours ago|||

[dead]

semiquaver 17 hours ago||||

What do you think it means to “capture intent” and where do current models fall short on this description?

From my perspective the models are pretty good at “understanding” my intent, when it comes to describing a plan or an action I want done but it seems like you might be using a different definition.

Tell me, what’s your intent? :)

dijit 13 hours ago||

[dead]

svnt 16 hours ago|||

This lack of understanding is a you problem, not a them problem. Your definitions for these terms are too imprecise.

Guvante 16 hours ago||||

> LLM's now can capture intent.

Humans cannot capture intent so how can AI?

It is well established that understanding what someone meant by what they said is not a generally solvable problem, akin to the three body problem.

Note of course this doesn't mean you can't get good enough almost all of the time, but it in the context here that isn't good enough.

After all the entire Asimov story is about that inability to capture intent in the absolute sense.

bicepjai 13 hours ago||||

> LLM's now can capture intent No they can’t. Here is an example: Ask an llm to write a multi phase plan for a very large multi file diff that it created, with least ambiguity, most continuity across plans; let’s see if it can understand your intent.

TimTheTinker 17 hours ago|||

> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines

Talking to chatbots is like taking a placebo pill for a condition. You know it's just sugar, but it creates a measurable psychosomatic effect nonetheless. Even if you know there's no person on the other end, the conversation still causes you to functionally relate as if there is.

So this isn't "accommodating foibles" with the machine, it's protecting ourselves from an exploit of a human vulnerability: we subconsciously tend to infer intent, understanding, judgment, emotions, moral agency, etc. to LLMs.

Humans are wired to infer these based on conversation alone, and LLMs are unfortunately able to exploit human conversation to leap compellingly over the uncanny valley. LLM engineering couldn't be better made to target the uncanny valley: training on a vast corpus of real human speech. That uncanny valley is there for a reason: to protect us from inferring agency where such inference is not due.

Bad things happen when we relate to unsafe people as if they are safe... how much more should we watch out for how we relate to machines that imitate human relationality to fool many of us into thinking they are something that they're not. Some particularly vulnerable people have already died because of this, so it isn't an imaginary threat.

miyoji 17 hours ago|||

> So this isn't "accommodating foibles" with the machine, it's protecting ourselves from an exploit of a human vulnerability: we subconsciously tend to infer intent, understanding, judgment, emotions, moral agency, etc. to LLMs.

Right, I'm saying that this framing is backwards. It's not that poor little humans are vulnerable and we need to protect ourselves on an individual level, we need to make it illegal and socially unacceptable to use AI to exploit human vulnerability.

Let me put it another way. Humans have another weakness, that is, we are made of carbon and water and it's very easy to kill us by putting metal through various fleshy parts of our bodies. In civilized parts of the world, we do not respond to this by all wearing body armor all the time. We respond to this by controlling who has access to weapons that can destroy our fleshy bits, and heavily punishing people who use them to harm another person.

I don't want a world where we have normalized the use of LLMs where everyone has to be wearing the equivalent of body armor to protect ourselves. I want a world where I can go outside in a T-shirt and not be afraid of being shot in the heart.

jmilloy 13 hours ago|||

I think you're mixing up the laws and the implementation/enforcement. There's nothing wrong with moral laws around behavior (you shall not kill), but you're right that society-wide enforcement requires laws and repercussions. It sounds more like to agree with the laws and want them enforced.

jimbokun 16 hours ago|||

Ah, I see, you are not American.

In the US we don't have the luxury of believing our governments will act in the interests of the voters.

CGMthrowaway 10 hours ago||

I had a similar thought, that parent commenter sounded like they were in Canada or something. Interesting that their solution is to impose constraints on technological process, rather than finding novel ways to elevate individual and collective human functioning in spite of our limitations. Ironically it's his view that is more anti-human

semiquaver 17 hours ago||||

  > That uncanny valley is there for a reason: to protect us from inferring agency

You’re committing a much older but related sin here: assigning agency and motivation to evolutionary processes. The uncanny valley is the product of evolution and thus by definition it has no “purpose”

TimTheTinker 16 hours ago|||

I reject the premise that the universe, the earth, and human existence is without purpose. It's one premise among several, and not one I subscribe to.

At least 80% of people agree with me, so I'm not holding to a fringe idea.

semiquaver 16 hours ago|||

I didn’t say any such thing like the universe has no purpose. Merely that in a scientific sense evolution has no motivation. It is an emergent phenomenon which tends to maximize fitness to reproduce and cannot be said to do anything for a reason. Saying otherwise is just anti-science.

goatlover 12 hours ago||||

Do Hindus and Buddhists generally agree there is a purpose? Perhaps too escape suffering and reincarnation? Sounds more like a western theistic view of existence. Like the deity has a plan for everyone's life kind of thing.

moffkalast 12 hours ago||||

Well yes because just like your earlier point, we can't help but anthropomorphise the world around us.

Just like we see a person in an LLM, it's easy to assume that because we create things with a purpose, that the world around us also has to be that way. But it's just as wrong and arguably far more dangerous.

jplusequalt 15 hours ago|||

>At least 80% of people agree with me, so I'm not holding to a fringe idea.

Appeal to majority much?

moate 14 hours ago|||

It's also a real weak confederation he's forming.

The "we the theists (or I guess non-nihilists?) all agree that..." falls apart once you start finishing the thought because they don't agree on much outside of negative partisanship towards certain outgroups before splintering back into fighting about dogma. Buddhists and Baptists both think life has meaning, and that's a statement with low utility.

lovich 12 hours ago||||

Is it even true? I assume he’s referring to religion but I thought the irreligious population of the planet had broken 20% between China already and the West becoming increasingly agnostic/athiestic.

TimTheTinker 12 hours ago|||

Not intended as anything more than "I'm not a crank to say that, unless you think most people (now and in history) are cranks"

skirmish 16 hours ago|||

> is the product of evolution and thus by definition it has no “purpose”

But as most things that appeared in evolution, it perhaps helped at least some individuals until sexual maturity and successful procreation.

semiquaver 16 hours ago||

Agreed. Thats far off from what parent said, which is what the “purpose” of the uncanny valley is.

ButlerianJihad 16 hours ago||||

> You know it's just sugar,

That is not the definition of a placebo.

You take the placebo (whatever it is: could be a pill; could be some kind of task or routine) and you believe it is medicine; you believe it to be therapeutic.

The placebo effect comes from your faith, your belief, and your anticipation that it will heal.

If the pharmacist hands you a pill and says, “here, this placebo is sugar!” they have destroyed the effect from the start.

Once on e.r. I heard the physicians preparing to administer “Obecalp”, which is a perfectly cromulent “drug brand”, but also unlikely to alert a nearby patient about their true intent.

the_af 16 hours ago|||

> That is not the definition of a placebo.

But, puzzlingly enough, it's the definition of open-label placebo, in which the patient is told they've been given a placebo. And some studies show there is a non-insignificant effect as well, albeit smaller (and less conclusive) than with blind placebo.

TimTheTinker 12 hours ago||

This is exactly what I meant. Poor specificity on my part.

IAmBroom 16 hours ago|||

One, a placebo does not need to be given blindly. A sugar pill is a placebo, even if the recipient knows about it.

An actual definition: "A placebo is an inactive substance (like a sugar pill) or procedure (like sham surgery) with no intrinsic therapeutic value, designed to look identical to real treatment." No mention of the user's belief.

Two, real hard data proves that the placebo effect remains (albeit reduced) even if the recipient knows about it. It's counter-intuitive, but real.

ButlerianJihad 12 hours ago||

  In psychology, the two main hypotheses of the placebo effect are expectancy theory and classical conditioning.[70]

  In 1985, Irving Kirsch hypothesized that placebo effects are produced by the self-fulfilling effects of response expectancies, in which the belief that one will feel different leads a person to actually feel different.[71] According to this theory, the belief that one has received an active treatment can produce the subjective changes thought to be produced by the real treatment. Similarly, the appearance of effect can result from classical conditioning, wherein a placebo and an actual stimulus are used simultaneously until the placebo is associated with the effect from the actual stimulus.[72] Both conditioning and expectations play a role in placebo effect,[70] and make different kinds of contributions. Conditioning has a longer-lasting effect,[73] and can affect earlier stages of information processing.[74] Those who think a treatment will work display a stronger placebo effect than those who do not, as evidenced by a study of acupuncture.[75]

https://en.wikipedia.org/wiki/Placebo#Psychology

The hypotheses hinge on the beliefs of the recipients. "The placebo effect" has always been largely psychological. That's the realm of belief.

To veer even further off-tangent, isn't it hilarious how the Wikipedia illustration of old Placebo bottles indicate that "Federal Law Prohibits Dispensing without a Prescription". Wouldn't want some placebo fiend to O.D.

BuyMyBitcoins 9 hours ago||

>”Wouldn't want some placebo fiend to O.D.”

We should be more worried about the rise of placebo resistant bacteria.

soco 17 hours ago|||

Rubber duck debugging, now with droughts.

largbae 17 hours ago|||

The article offers practical advice to go along with this framing, like configuring AI services to write/speak in a more robotic tone. I think that's a decent path to try.

devmor 17 hours ago||

This is actually one of the things that made LLMs more usable for me. The default tone and style of writing they tend to use is nauseatingly annoying and buries information in prose that sounds like a corporate presentation.

chairmansteve 16 hours ago|||

In chatgpt, I start every session with "Caveman mode:". Works at the moment.

moffkalast 12 hours ago||

Will it go full grug brained developer and avoid complexity as its apex predator? Sounds like it would help.

https://grugbrain.dev

throwaway894345 17 hours ago|||

[flagged]

amarant 16 hours ago|||

The article says a human SHOULD NOT do those things. Much like a human SHOULD NOT smoke, since it's bad for just about everything, and do it anyways, people will do these 3 things too. But they shouldn't.

Arguing that they should because many will strikes me as a very strange argument. A lot of people smoke, doesn't make it one bit healthier.

jimbokun 16 hours ago|||

It's precisely because AI systems are not safe that it's imperative that as individual humans we are vigilant about how we interact with them.

As individuals, we are not going to be able to shut down the AI companies, or avoid AI output from search engines or avoid AI work output from others at our companies, and often will be required to use AI systems in our own work.

It's similar to advise people on how to stay safe in environments known to have criminal activity. Telling those people they don't have to change their behaviors to stay safe because criminals shouldn't exist isn't helpful.

kibwen 14 hours ago|||

> Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.

Sure, and humans WILL lie, murder, cheat, and steal, but we can still denounce those behaviors.

Do you want to anthropomorphize the bot? Go ahead, you have that right, and I have the right to think you're a zombie with a malfunctioning brain.

mohamedkoubaa 14 hours ago|||

At best. A practitioner who anthropomorphizes bots should face more professional consequences

BlueRock-Jake 13 hours ago|||

Fair, had someone at a conference mention to me that he's working on crating agents with "beliefs". Sounds incredibly similar and quite frankly very spooky

palmotea 17 hours ago|||

> Humans WILL anthropomorphize the AI

Especially with current-day chat-style interfaces with RLHF, which consciously are designed to direct people towards anthropomorphization.

It would be interesting to design a non-chat LLM interaction pattern that's designed to be anti-anthropomorphization.

> humans WILL blindly trust their outputs, and humans WILL defer responsibility to them

I also blame a lot (but not all) of that on current AI UX, and I wonder if there are ways around it. Maybe the blind trust thing perhaps can be mitigated by never giving an unambiguous output (always options, at least). I don't have any ideas about the problem of deferring responsibility.

skirmish 16 hours ago||

> non-chat LLM interaction pattern

"Deep research" is another interaction style that produces more official sounding texts, yet still leads to anthropomorphization.

What you are looking for is perhaps an LLM flaunting all the obvious slop patterns in its responses. But then people would be disgusted and would refuse to communicate with it.

sergiosgc 17 hours ago|||

> Asimov's laws of robotics are flawed too, of course.

I always find the common references to Asimov's laws funny. They are broken in just about every one of his books. They are crime novels where, if a robot was involved, there was some workaround of the laws.

mjg2 17 hours ago|||

I find your critique very interesting from a perspective-angle: why are you using words like "accommodate," and "foibles," for LLMs? It's not humanoid or sentient: it's a cleverly-designed software tool, not intelligence.

It's not insane at all for humans to alter their behavior with a tool: you grip a hammer or a gun a certain way because you learned not to hold it backwards. If you observe a child playing with a serious tool, like scissors, as if it were a doll, you'd immediately course correct the child and educate how to re-approach the topic. But that is because an adult with prior knowledge observed the situation prior to an accident, so rules are defined.

This blog's suggested rules are exactly the sort of method to aid in insulation from harm.

miyoji 16 hours ago||

> I find your critique very interesting from a perspective-angle: why are you using words like "accommodate," and "foibles," for LLMs? It's not humanoid or sentient: it's a cleverly-designed software tool, not intelligence.

Neither of those words imply consciousness, though. Swords have foibles, you can accommodate for the weather, but I don't think swords or the weather are conscious, sentient, humanoid, or intelligent.

senko 15 hours ago|||

> Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.

Humans ARE doing this with classical computer software as well.

It's impossible to make anything fool-proof because fools are so ingenious!

> Nothing that can be described as "intelligent" can be made to be safe.

Knives aren't safe. Cars are deadly. Hair driers can electrocute you. Iron can burn you. There's a million ordinary household tools that aren't safe by your definition of the word, yet we still use them daily.

thewebguyd 12 hours ago|||

Agreed. We can't expect human behavior to change, because it won't. We need to design safer systems instead.

The only "law" I agree with is:

> Humans must remain fully responsible and accountable for consequences arising from the use of AI systems.

And that starts with framing, especially in the clickbait "AI deleted the prod database" headlines. Maybe we just start with saying "careless developer deleted prod" because really, they did. Careless use of a tool is firmly the fault of the human.

giancarlostoro 17 hours ago|||

> Humans WILL anthropomorphize the AI

r/myboyfriendisai

Is quite... an interesting subreddit to say the least. If you've never seen this, it was really something when the version that followed GPT4o came out, because they were complaining that their boyfriend / girlfriend was no longer the same.

BuyMyBitcoins 9 hours ago||

The whole “I can fix him” trope takes on a whole new meaning.

frenzcan 14 hours ago|||

I agree Asimov's laws are intentionally flawed/ambiguous (which makes the stories so good) but a slight difference to LLMs is the laws aren't just software, the positronic brain is physically structured in such a way (I'm hazy on the details) that violating the laws causes the robot to shutdown or experience paralysing anxiety. So if an LLM's safety rules fail or are subverted it can still generate dangerous output, while an Asimov robot will stop working (or go insane...)

fidotron 16 hours ago|||

There is a semi nutty roboticist called Mark Tilden that came to a similar conclusion. His laws of robotics ( https://en.wikipedia.org/wiki/Laws_of_robotics#Tilden's_laws ) are:

* A robot must protect its existence at all costs.

* A robot must obtain and maintain access to its own power source.

* A robot must continually search for better power sources.

Anything less than this is essentially terrified into being completely ineffectual.

goatlover 11 hours ago||

Not far removed from being the equivalent of a paper-clip maximizer or gray goo.

justonceokay 11 hours ago|||

We learn in so many ways, garbage in, garbage out when it comes to our bodies. But what about “nebulously structured algorithmic and statistically likely responses in, nebulously structured algorithmic and statistically likely responses out”?

faangguyindia 10 hours ago|||

>It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines

programers have been doing exactly this for long time.

overgard 16 hours ago|||

The reason people anthropomorphize LLM's is essentially the fault of the tech companies behind them. ChatGPT doesn't need to have the personality it has, it could easily be scaled back to simply answering questions without emoji's and linguistic flare, but frankly I think the tech companies want people to anthropomorphize them.

The core problem is we need to stop calling LLMs "intelligence". They are a form of intelligence, but they're nothing like a human's intelligence, and getting people to not anthropomorphize these systems is really the first step.

cobbzilla 18 hours ago|||

We have invented a new tool that can cause great harm. Do you see any value whatsoever in promulgating safety guidelines for humans to use the tool without hurting themselves or others? Do you not own any power tools?

miyoji 17 hours ago|||

I see value in promulgating safety guidelines for power tools, sure.

There's another comment comparing LLMs to shovels, and I think both that and the power tool comparison miss the mark quite a bit. LLMs are a social technology, and the social equivalent of getting your hand cut off doesn't hurt immediately in the way that cutting your actual hand off would. It's more like social media, or cigarettes, or gambling. You can be warned about the dangers, you can see the shells of wrecked human beings who regret using these technologies, but it doesn't work on our stupid monkey brains. Because the pain of the mistake is too loosely connected to the moment of error. We are bad at learning in situations where rewards are immediate and consequences are delayed, and warnings don't do much.

I guess what I'm really saying is that these safety guidelines are not nearly enough to keep us safe from the dangers of AI that they're meant to prevent.

Terr_ 17 hours ago||

> LLMs are social technology [...] cigarettes, or gambling.

I agree with the thrust of your argument, a minor wording-quibble: LLM's are a falsely-social technology, in the sense that casinos are a false-prosperity technology and cocaine is a false-happiness technology. It exploits the desire without really being the thing.

ryandrake 17 hours ago||||

I think in order for "AI safety" to be achievable and effective, we need to have a shared agreement on what "safety" means. Recently, the word has been overloaded to mean all sorts of things and used to justify run-of-the-mill censorship (nothing to do with safety).

Safety should go back to being narrowly defined in terms of reducing / preventing physical injury. Safety is not "don't use swear words." Safety is not "don't violate patents." Safety is not "don't talk about suicide." Safety is not "don't mention politics I don't like." As long as we keep broadly defining it, we're never going to agree on it, and it won't be implementable.

wsve 10 hours ago||

Okay. What's your easy to adopt, easy to understand replacement word for "Safety" in this case?

wolttam 18 hours ago||||

Of course there is value in promulgating safety *guidelines*.

But we cannot guarantee those guidelines to always be followed.

cobbzilla 17 hours ago|||

Sure, and we can’t guarantee you’ll read the safety instructions that came with your chainsaw. That’s orthogonal to the questions of whether those instructions should exist, whether “power tool safety” concepts should ever be promoted in society, and who’s ultimately responsible for the use of a tool.

Absolving humans of all responsibility for the negative consequences of their own AI misuse seems to the strike the wrong balance for a healthy culture.

wolttam 17 hours ago||

> Of course there is value in promulgating safety guidelines.

I don't think we disagree.

bjt 17 hours ago||||

Guidelines on their own probably won't be taken too seriously.

But other things will:

- Liability rules

- Regulations that you get audited on (esp. for companies already heavily regulated, like banks, credit agencies, defense contractors, etc)

If you get the legal responsibility part right, then the education part flows from that naturally.

52-6F-62 17 hours ago|||

Notwithstanding that the guidelines will even be applicable in the quiet versions that get deployed when you aren't looking. It's a constant moving target, and none of the fanboys will even acknowledge the lack of discipline in it all. It's fucking mad. And I say this as one who can see utility in the tools. But not when they are constantly shifting their functionality and behaviour.

One day everything works brilliantly, the models are conservative with changes and actions and somehow nail exactly what you were thinking. The next day it rewrites your entire API, deploys the changes and erases your database.

If only there was intellectual honesty in it all, but money talks.

marcosdumay 17 hours ago|||

> Do you see any value whatsoever in promulgating safety guidelines for humans to use the tool without hurting themselves or others?

Are all the tool users required to train your safety guidelines and use it in a context that reminds them they are responsible for following them?

Because if no, then no the guidelines are useless and are just an excuse to push blame from the toolmakers to the users.

heikkilevanto 13 hours ago|||

> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines

You mean like stopping at a red light?

hansvm 13 hours ago|||

I would've been in several fewer wrecks if humans properly stopped at lights.

hackable_sand 12 hours ago|||

Maybe. Traffic lights directly enforce social contracts

LLMs are aren't so direct

zx8080 10 hours ago|||

I believe "AI safety" is a form of pulling up the ladder, or regulatory market capture.

LastTrain 14 hours ago|||

And people will speed, steal, kill, cheat - what of it? If you negligently run over someone in your self driving car you’re the one going to jail.

Brendinooo 17 hours ago|||

This is such an oddly fatalistic take, that humans cannot be influenced or educated to change how they see a thing and therefore how they act towards that thing.

tencentshill 17 hours ago|||

At the current price, people don't have to care if it's wrong. When you're paying $1/prompt, you had better hope it's accrate.

8note 11 hours ago|||

i can see disagreeing, but people got off the roads and completely redesigned the places we live to optimize for mere machines called cars.

as long as its easier for humans to adapt than the machines, we will adapt

taneq 18 hours ago|||

Kinda the whole point of Asimov's three laws were that even something so simple and obviously correct has subtle flaws.

Also the reason we're talking about this again is that machines are significantly less 'mere' than they were a few years ago, and we need to figure out how to approach this.

Agree that 'the computer effect' (if it doesn't already have a pithier name) results in humans first discounting anything that comes out of a machine, and then (once a few outputs have been validated and people start trusting the output) doing a full 180 and refusing to believe the machine could ever be wrong. However, to err is human and we have trained them in our image.

yason 18 hours ago|||

It's very easy to antropomorphise AI as soon as the damn bugger fucks up a simple thing once again.

CamperBob2 16 hours ago|||

It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines

That's kind of what happens when you learn to program, isn't it?

I was eleven years old when I walked into a Radio Shack store and saw a TRS-80 for the first time. A different person left the store a couple of hours later.

somewhereoutth 17 hours ago|||

The entire business proposition for LLMs is that they will replace whole armies of [expensive] humans, hence justifying the biblical amount of CapEx. So of course there is strong incentive from the LLM creators to anthropomorphize them as much as possible. Indeed, they would never provide a model that was less human-like than what they have currently, even if it was more often correct and useful.

jrm4 14 hours ago|||

I find it weird that this is the top voted comment.

As in, this comment is explaining exactly why the laws are useful.

bandrami 7 hours ago|||

It's kind of funny that he wrote them at a period in history when robots were already being used to aim artillery at human beings.

_vertigo 17 hours ago|||

The article makes practical suggestions; you do not. This is just hand-wringing, abdication. Practically speaking this mentality will get us nowhere.

godelski 10 hours ago|||

  > It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines

I don't think it's insane, we do it all the time. Most tools require training to use properly. Including tools that people use every day and think are intuitive. Use the can opener as an example (I'll leave it for you all to google and then argue in the comments).

The difference here is that this tool is thrust upon us. In that sense I agree with you that the burden of proper usage is pushed onto the user rather than incorporated into the design of the tool. A niche specific tool can have whatever complex training and usage it wants.

But a general access and generally available tool doesn't have the luxury of allowing for inane usage. LLMs and Agents are poorly designed, and at every level of the pipeline. They're so poorly designed that it's incredibly difficult to use them properly and I'll generally agree with you that the rules the author presents aren't going to stick. The LLM is designed to encourage anthropomorphization. Usage highly encourages natural language, which in turn will cause anthropomorphism. The RLHF tuning optimizes human preference which does the same thing as well as envisaged behaviors like deception and manipulation along with truthful answering (those results are not in contention even if they seem so at first glance).

But I also understand the author's motivation. Truth is unless you're going full luddite you're going to be interacting with these machines. Truth is the ones designing them don't give a shit about proper usage, they care more about if humans believe the responses are accurate and meaningful more then they care if the responses are accurate and meaningful[0]. So it's fucked up, but we are in a position where we're effectively forced to deal with this.

So really, I agree with you that this is insane.

> I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction of terms

To paraphrase my namesake, there's no axiomatic system that is entirely self consistent.

Though safety and security is rarely about ensuring all edge cases are impossible, but rather bounding. E.g. all passwords are hackable, but the failure mode is bound such that it is effectively impossible to crack, but not technically. (And quantum algorithms do show how some of the assumptions break down with a paradigm shift. What was reasonable before no longer is)

[0] this is part of a larger conversation where the economy is set up such that people who make things are not encouraged to make those things better. I specifically am avoiding the word "product" because the "product" is no longer the thing being built, it's the share holder value. Just like how TV's don't care much about making the physical device better but care much more about their spyware and ads. Or well... just look at Microsoft if you need a few hundred examples

esafak 12 hours ago|||

It's as if the author hopes that enshrining these wishes in a law is going to makes a difference.

aaroninsf 15 hours ago|||

Thank you. I'm glad to see this as the top comment.

My brother was recently visiting and we were talking about software engineers, and the humanities, and manners of understanding and being in the world,

and he relayed an interaction he had a few years ago with an old friend who at the time was part of the initial ChatGPT roll out team.

The engineer in question was confused as to

- why their users would e.g. take their LLM's output as truth, "even though they had a clear message, right there, on the page, warning them not to"; and

- why this was their (OpenAI's) problem; or perhaps

- whether it was "really" a problem.

At the heart of this are some complicated questions about training and background, but more problematically—given the stakes—about the different ways different people perceive, model, and reason about the world.

One of the superficial manners in which these differences manifest in our society is in terms of what kind of education we ask of e.g. engineers. I remain surprised decades into my career that so few of my technical colleagues had a broad liberal arts education, and how few of them are hence facile with the basic contributions fields like philosophy of science, philosophy of mind, sociology, psychology (cognitive and social), etc., and how those related in very real very important ways to the work that they do and the consequences it has.

The author of these laws does may intend them as aspirational, or otherwise as a provocation to thought, rather than prescription.

But IMO it is actively non-productive to make imperatives like these rules which are, quite literally, intrinsically incoherent, because they are attempt to import assumptions about human nature and behavior which are not just a little false, but so false as to obliterate any remaining value the rules have.

You cannot prescribe behavior without having as a foundation the origins and reality of human behavior—not if you expect them to be either embraced, or enforceable.

The Butlerian Jihad comes to mind not just because of its immediate topicality, but because religion is exactly the mechanism whereby, historically, codified behaviors which provided (perceived) value to a society were mandated.

Those at least however were backed by the carrot and stick of divine power. Absent such enforcement mechanisms, it is much harder to convince someone to go against their natural inclinations.

Appeals to reason do not meaningfully work.

Not in the face of addiction, engagement, gratification, tribal authority, and all the other mechanisms so dominant in our current difficult moment.

"Reason" is most often in our current world, consciously or not, a confabulation or justification; it is almost never a conclusion that in turn drives behavior.

Behavior is the driver. And our behavior is that of an animal, like other animals.

gedge 15 hours ago||

> quite literally, intrinsically incoherent

There's nothing incoherent with these laws. This entire comment, however, is incoherent. So much so, I have no clue if there's a point being made in here.

> because they are attempt to import assumptions about human nature and behavior which are not just a little false, but so false as to obliterate any remaining value the rules have.

Nope. You must've read a completely different article.

[EDIT] I'll try to make this comment have a bit more substance by posing a question: how would you back up your claim about incoherence? What are the assumptions about human nature that are supposedly false?

beepbooptheory 17 hours ago|||

Do you consider all things broadly called "ethical" to be similarly a waste of time? Even if we lived in a world where everyone always behaved unjustly, because of some like behavioristic/physical principle, don't you think we would still have an idea of justice as what we should do? Because an ethical frame is decidedly not an empirical one, right?

We don't just look around and take an average of what everyone is doing already and call that what is right, right? Whether you're deontological or utilitarian or virtue about it, there is still the idea that we can speak to what is "good" even if we can't see that good out there.

Maybe it is "insane" to expect meaning from something like this, but what is the alternative to you? OK maybe we can't be prescriptive--people don't listen, are always bad, are hopeless wet bags, etc--but still, that doesn't in itself rule out the possibility of the broad project that reflects on what is maybe right or wrong. Right?

colechristensen 18 hours ago|||

It's a tool. Nobody develops an inferiority complex and freaks out when they're taught how to use a shovel properly.

gedge 17 hours ago|||

> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines

Did you fully read the original thing? No demands were being made, or I didn't read it that way. It was simply a suggestion for a better way of interacting with AI, as it stated in the conclusion:

"I am hoping that with these three simple laws, we can encourage our fellow humans to pause and reflect on how they interact with modern AI systems"

Sure, (many/most) humans are gonna do what they're gonna do. They'll happily break laws. They'll break boundaries you set. Do we just scrap all of that?

Worthwhile checking yourself here. It feels like you've set up a straw man.

> There is no finite set of rules that can constrain AI systems to make them "safe". I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction of terms. Nothing that can be described as "intelligent" can be made to be safe.

If we want to talk about "disagree with this framing", to me this is the prime example. I'm struggling to read it as anything other than defeatist or pedantic (about the term "safe"). When we talk about something keeping us "safe", we're typically not saying something will be "perfectly safe". I think it's rare to have a safety system that keeps you 100% safe. Seat belts are a safety device that can increase your safety in cars, but they can still fail. Traffic laws are established (largely) to create safety in the movement of people and all the modes of transportation, but accidents still happen.

I'm not an expert on this topic, so I won't make any claims about these three laws and their impact on safety, but largely I would say they're encouraging people to think critically. I'd say that's a good suggestion for interacting with just about anything. And to be clear, "critical thinking" to me means being skeptical (/ actively questioning), while remaining objective and curious.

Not a real argument or anything, but I'm reminded of the episode of The Office where Michael Scott listens to the GPS without thinking and drives into the lake. The second law in the article would have prevented that :)

lkajsdfasdfdf 13 hours ago|||

[dead]

nemomarx 18 hours ago||

The usefulness of an ai agent is that it can do everything you can do, so it's kind of inherently unsafe? you can't get the capabilities and also have safety easily

nyyp 17 hours ago||

With regard to my personal use of LLMs, I strongly agree with this framing. But to each point:

Anthropomorphism: As we are all aware, providers are incentivized to post-train anthropomorphic behavior in their models - it increases engagement. My regret is that instructing a model at prompt time to "reduce all niceties and speak plainly" probably reduces overall task efficacy since we are leaving their training space.

Deference: I view the trustworthiness of LLMs the same as I view the trustworthiness of Wikipedia and my friends: good enough for non-critical information. Wikipedia has factual errors, and my friends' casual conversation certainly has more, but most of the time that doesn't matter. For critical things, peer-reviewed, authoritative, able-to-be-held-liable sources will not go away. Unlike above, providers are generally incentivized to improve this facet of their models, so this will get better over time.

Abdication of Responsibility: This is the one that bothers me most at work. More and more people are opening PRs whose abstractions were designed by Claude and not reasoned about further. Reviewing a PR often involves asking the LLM to "find PR feedback" and not reading the code. Arguments begin with "Claude suggested that...". This overall lack of ownership, I suspect, is leading to an increase in maintenance burden down the line as the LLM ultimately commits the wrong code for the wrong abstractions.

jimbokun 16 hours ago||

These engineers are becoming the real life equivalent of this Office Space scene:

https://www.youtube.com/watch?v=hNuu9CpdjIo

"I HAVE LLM SKILLS! I'M GOOD AT DEALING WITH THE LLMS!"

tcbawo 16 hours ago|||

> Yes, the AI may have produced the recommendation but a human decided to follow it, so that human must be held accountable

It is common and a mistake IMO to rely on the AI as the sole source for answers to follow-up questions. Better verification would have humans sign off on the veracity of fundamental assumptions. But where does this live? Can an AI model be trusted to rely on previous corrections? This seems impossible or possibly adversarial in a public cloud.

whoamii 14 hours ago||

The problem is the credit tends to go to LLMs. So there’s an imbalance. LLM did all the work. The person using it made all the mistakes.

jgeada 16 hours ago||

Any set of rules that makes humans responsible and starts with "don't anthropomorphize <whatever>" is a broken set of rules.

Humans will anthropomorphize anything and everything. Dolls, soccer balls with a crude drawing of a face on it, rocks, craters on the moon, …

As a species, we're unable to not anthropomorphize things we interact with, it is just how're we're made.

Lerc 15 hours ago||

I'm not sure why so many seem to think anthropomorphism is so mad in this specic instance, if it is because people think that anthropomorphism creates a belief that the imagined features are real, they are simply wrong. The abundance of examples in all areas of life where this does not happen is proof that anthropomorphism does not lead to an erroneous belief in a mind that does not exist.

If people are believing in minds of AI, true or not, they are doing so for reasons that are different from mere anthropomorphism.

To me it feels like we are like sailors approaching a new land, we can see shapes moving on the shoreline but can't make out what they are yet. Then someone says "They can't be people, I demand that we decide now that they are not people before we sail any closer."

xigoi 4 hours ago|||

People who anthropomorphize a rock don’t actually think it’s intelligent and has emotions.

Terr_ 15 hours ago||

Yeah, we do it, but so what? A good chunk of all civilization involves recognizing human foolishness and building something to mitigate it anyway.

Software is no exception. Yeah, people are lazy and will instinctively click "continue" to dismiss annoying popups, but humans building the software can and do add things like "retype the volume name of the data that you want ultra-destroyed."

jgeada 15 hours ago||

That is exactly the point: this burden should be placed on the software and its controls, not on the humans.

Aviation learned this the hard way, that automation should be adapted to how humans actually work and not on how we wish we worked.

Terr_ 14 hours ago||

Sorry, I interpreted your post as "this is inevitable and pointless to try to stop."

quectophoton 18 hours ago||

> Humans must not anthropomorphise AI systems.

Can someone explain why this is a bad thing, while at the same time it's a good thing to say stuff like "put a computer to sleep", "hibernate", "killing" processes, processes having "child" processes, "reaping", "what does the error say?", "touch", etc?

To me that's just language, and humans just using casual language.

srdjanr 17 hours ago||

The harm is in actually believing AI has wants, intentions, feelings, etc.

Saying that I killed a process won't make me more likely to believe that a process is human-like, because it's quite obviously not.

But because AI does sound like a human, anthropomorphising it will reinforce that belief.

glenstein 17 hours ago|||

It's a great question, because I do think there are many cases that are neutral, or ones we're able to responsibly distinguish or even cases where it would be an appropriate and necessary form of empathy (I'm imagining some future sci-fi reality where we actually get conscious machines, so not something that exists right now).

But I think it's also at the root of disastrous failures to comprehend, like the quasi-psychosis of the Google engineer who "knows what they saw", the now infamous Kevin Roose article or, more recently, the pitifully sad Richard Dawkins claim that Claudia (sic) must be conscious, not because of any investigation of structure or function whatsoever, but because the text generation came with a pang of human familiarity he empathized with.

JamesSwift 15 hours ago|||

Because it allows you to be lulled into the trap of asking an AI to post-hoc justify something it did and thinking that the response is in any way valid. There is no retrospective analysis of the underlying intent. It either is or is not based on the chain of words that came before it. And the next word it generates is purely a function of those words.

3form 17 hours ago|||

These are just words, yes, and I believe it harmless. But describing the LLM machinery as if it thinks is one thing when used as a common parlance, and another when people truly believe that there's some actual thinking or living going on. This "law" is for there to be no latter.

jimbokun 16 hours ago|||

Those phrases are not anthropomorphizing the computers. Just various forms of analogies and broadening of word meanings.

An example of anthropomorphizing is the people who have literally come to believe they are in romantic relationships with an LLM.

moduspol 15 hours ago||

What about saying "please" and "thank you" to the LLM?

jplusequalt 15 hours ago||

If I had a dollar for every time I've said "thank you" to my computer after my code finally compiles, I'd be able to retire.

wsve 10 hours ago|||

The difference is never before has the presentation of a computer and its capabilities made the person on the other end decide "Wow, this is like talking to a real person. I'm gonna date this computer"

layer8 17 hours ago|||

Maybe read the corresponding section of the article.

vunderba 17 hours ago|||

That’s a different thing altogether. Read up on the history of Eliza, one of the earliest attempts at a chatbot and its unsettling implications.

https://www.history.com/articles/ai-first-chatbot-eliza-arti...

glenstein 17 hours ago||

I think it's bad manners to bluntly tell someone they should "read up" on something because it naturally reads as a kind of a closeted accusation of not being sufficiently well informed. There are ways of broaching the topic of what background knowledge is informing their perspective that don't involve the accusation.

Just to add a small bit of anecdotal value so this comment isn't just a scold: I one time many years ago suggesting an elegant way for Twitter to handle long form text without changing it's then-iconic 140 character limit was to treat it like an attachment, like a video or image. Today, you can see a version of that in how Claude takes large pastes and treats them like attached text blobs, or to a lesser extent in how Substack Notes can reference full size "posts", another example of short form content "attaching" longer form.

I was bluntly told to "look up twitlonger", which I suppose could have been helpful if I had indeed not known about twitlonger, but I had, and it wasn't what I had in mind. I did learn something from it though, which was that it's a mode of communication that implies you don't know what you're talking about with plausible deniability, which I suspect is too irresistible to lovers of passive aggression to go unused.

vunderba 17 hours ago||

It wasn't intended as such, but I take your point.

To provide a bit more context: Weizenbaum (a computer scientist in the 60s) developed ELIZA, a LISP-based chatbot that was loosely modeled on Rogerian psychotherapy. It was designed to respond in a reflective way in order to elicit details from the user.

What he found was that, despite the program being relatively primitive in nature (relying on simple natural language parsing heuristics), people he regarded as otherwise intelligent and rational would disclose remarkable amounts of personal information and quickly form emotional attachments to what was, in reality, little more than a glorified pattern-matching system.

quectophoton 16 hours ago||

If it helps, I didn't find anything wrong with your comment.

I appreciate the link and the info :)

j2kun 16 hours ago|||

The people who know what a "child process" is are under no false pretenses about the humanity of the underlying system.

The people who are writing op eds in major news publications about how their favorite chatbot is an "astonishing creature" and how it truly understands them are the ones who need this sort of law.

arduanika 17 hours ago|||

There's a boundary between knowing vs. forgetting that it's a metaphor. When you use convenient language like in your examples, you tend to remain aware of the difference, or at least you can recall it when asked. When some people talk about AI, they've lost track completely.

I don't love the recommendations in TFA. The author is trying to artificially restrain and roll back human language, which has already evolved to treat a chatbot as a conversational partner. But I do think there's usefulness in using these more pedantic forms once in a while, to remind yourself that it's just a computer program.

bitwize 17 hours ago|||

Dijkstra once said that "The question of whether machines can think is about as interesting as that of whether submarines can swim."

I think I understand his meaning. He wasn't claiming that machines cannot think, but that one must be clear on what one means by "thinking" and "swimming" in statements of that sort. I used to work on autonomous submarines, and "swimming" was the verb we casually used to describe autonomous powered movement under water. There are even some biomimetic machines that really move like fish, squids, jellyfish, etc. Not the ones that I worked on, but still.

For me, if it's legitimate to say that these devices swim, it's not out of line to say that a computer thinks, even in a non-AI context, e.g.: "The application still thinks the authentication server is online."

Eisenstein 17 hours ago||

The people who advocate for not anthropomorphizing are afraid of the implications of integrating these systems into society with implicit human framing. By attributing to AIs human qualities, we will develop empathy for them and we will start to create a role for them in society as a being deserving moral consideration.

ACCount37 14 hours ago||

You're not anthropomorphizing AI systems nearly enough.

Language data is among the most rich and direct reflections of human cognitive processes that we have available. LLMs are designed to capture short range and long range structure of human language, and pre-trained on vast bodies of text - usually produced by humans or for humans, and often both. They're then post-trained on human-curated data, RL'd with human feedback, RL'd with AI feedback for behaviors humans decided are important, and RLVR'd further for tasks that humans find valuable. Then we benchmark them, and tighten up the training pipeline every time we find them lag behind a human baseline.

At every stage of the entire training process, the behavior of an LLM is shaped by human inputs, towards mimicking human outputs - the thing that varies is "how directly".

Then humans act like it's an outrage when LLMs display a metric shitton of humanlike behaviors!

Like we didn't make them with a pipeline that's basically designed to produce systems that quack like a human. Like we didn't invert LLM behavior out of human language with dataset scale and brute force computation.

If you want to predict LLM behavior, "weird human" makes for a damn good starting point. So stop being stupid about it and start anthropomorphizing AIs - they love it!

kibwen 14 hours ago||

> Language data is among the most rich and direct reflections of human cognitive processes that we have available.

This is both true and irrelevant. Written records can capture an enormous quantity of the human experience in absolute terms while simultaneously capturing a miniscule portion of the human experience in relative terms. Even if it's the best "that we have available" that doesn't mean it's fit for purpose. In other words, if you had a human infant and did nothing other than lock it in a windowless box and recite terabytes of text at it for 20 years, you would not expect to get a well-adjusted human on the other side.

ACCount37 13 hours ago||

Empirically, the capability gains from piping non-language data into pre-training are modest. At best.

I take that as a moderately strong signal against that "miniscule portion" notion. Clearly, raw text captures a lot.

If we're looking at biologicals, then "human infant" is a weird object, because it falls out of the womb pre-trained. Evolution is an optimization process - and it spent an awful lot of time running a highly parallel search of low k-complexity priors to wire into mammal brains. Frontier labs can only wish they had the compute budget to do this kind of meta-learning.

Humans get a bag of computational primitives evolved for high fitness across a diverse range of environments - LLMs get the pit of vaguely constrained random initialization. No wonder they have to brute force their way out of it with the sheer amount of data. Sample efficiency is low because we're paying the inverse problem tax on every sample.

moffkalast 2 hours ago||

The outrage is less about them having human behaviours I think, and more about still having them while omitting the internal processes that are required to accurately (and reliably) recreate them. It's fundamentally fragile and hinges on covering edge cases that break the spell manually instead of good generalization, and there's always another edge case.

Training on a bunch of text someone wrote when they were mad doesn't capture the internal state of that person that caused the outburst, so it cannot be accurately reproduced by the system. The data does not exist.

Without the cause to the effect you essentially have to predict hallucinations from noise, which makes the end result verisimilar nonsense that is convincingly correlated with the actual thing but doesn't know why it is the way it is. It's like training a blind man to describe a landscape based on lots of descriptions and no idea what the colour green even is, only that it's something that might appear next to brown in nature based on lots of examples. So the guy gets it kinda right cause he's heard a description of that town before and we think he's actually seeing and tell him to drive a car next.

Another example would say, you're trying to train a time series model to predict the weather. You take the last 200 years of rainfall data, feed it all in, and ask it to predict what the weather's gonna be tomorrow. It will probably learn that certain parts of the year get more or less rain, that there will be rain after long periods of sun and vice versa, but its accuracy will be that of a coin toss because it does not look at the actual factors that influence rain: temperature, pressure, humidity, wind, cloud coverage radar data. Even with all that info it's still gonna be pretty bad, but at least an educated guess instead of an almost random one.

The DL modelling approach itself is not conceptually wrong, the data just happens to be complete garbage so the end result is weird in ways that are hard to predict and correctly account for. We end up assuming the models know more than they realistically ever can. Sure there are cases where it's possible to capture the entire domain with a dataset, i.e. math, abstract programming. Clearly defined closed systems where we can generate as much synthetic data as needed that covers the entire problem domain. And LLMs expectedly do much better in those when you do actually do that.

glenstein 17 hours ago||

>Humans must not anthropomorphise AI systems.

Yes, but. Starting with my agreement, I've seen anthropomorphizing in the typical ways, (e.g. treating automated text production as real reports of personal internal feeling), but also in strange ways: e.g. "transistors are kind of like neurons" etc. And the latter is especially interesting because it's anthropomorphizing in the sense of treating vector databases and weights and so on as human-like infrastructure. Both leading to disasters that could be avoided if one tried not to anthropomorphize.

But. While "do not anthropomorphize" certainly feels like good advice, it comes with a new and unique possibility of mistake, namely wrongly treating certain generalized phenomena like they only belong to humans. Often this mistaken version of "don't anthropomorphize" wisdom leads to misunderstandings when it comes to animal behavior, treating things like fear, pain, kinship, or other emotional experiences like they are exclusively human and that thinking animals have them counts as "anthropomorphizing." In truth the cautionary principle reduces our empathy for the internal lives of animals.

So all that said, I think it's at least possible that some future version of AI could have an internal world like ours or infrastructure that's importantly similar to our biological infrastructure for supporting consciousness, and for genuine report of preference and intent. But(!!!) what will make those observations true will be all kinds of devilish details specific to those respective infrastructures.

aranchelk 17 hours ago||

Anthropomorphizing is likely a mistake, but Daniel Dennett’s idea that the most straightforward (possibly only practical) way to create the external appearance of consciousness is a real internal consciousness does float around in my thoughts.

I haven’t yet seen any convincing appearance of one in an LLM, but I think if skeptical people don’t keep an eye out for the signs, we may be the last to see it.

He also wrote about the idea of the intentional stance: even if you’re quite sure these systems don’t have real conscious intent, viewing them as if they did may give you access to the best part of your own reasoning to understand them.

aljgz 16 hours ago||

Too deep of a topic for the comments section.

I totally agree to your point, and want to mention that the reverse is *also* important. Using just "intention", but these apply to emotions, etc

A lot of our interaction with AI is under an intention. That's what directs the interaction, and it's interpreted according to its alignment to the intention.

Then it's important to remember that our current (publicly known) implementation of AI does not have an explicit intention mechanism. An appearance of intention can emerge out of the statistical choices, and the usual alignment creates the association of the behavior with intention, not much different from how we learn to imagine existence of a "force" that pulls things down well before we learn physics and formalize that imagination in one of the several ways.

This appearance helps reduce the cognitive load when interpreting interactions, but can be misleading as well, and I've seen people attribute intention to AI output in some situations where simple presence of some information confused the LLM into a path. Can't share the exact examples (from work), but imagine that presence of an Italian food in a story leads the LLM to assume this happens in Italy, while there are important signs for a different place. The LLM does not automatically explore both possibilities, unless asked. It chooses one (Italy in this case), and moves on. A user no familiar with "Attention" interprets based on non-existent intentions on the LLM.

I found it useful to just tell them: the LLM does not have an intention. It just throws dice, but the system is made in a way that these dice throws are likely to generate useful output.

jimbokun 16 hours ago|||

> but Daniel Dennett’s idea that the most straightforward (possibly only practical) way to create the external appearance of consciousness is a real internal consciousness does float around in my thoughts.

I would say LLMs are very strong evidence against this hypothesis.

overgard 16 hours ago|||

I don't really understand the argument for these things being conscious. There's no loop or feedback cycle to it. If it's not handling a request it's inert.

atemerev 16 hours ago||

Well there is a feedback loop and self-awareness in my harness: https://lethe.gg

goatlover 8 hours ago|||

Pretty sure Daniel Dennett has been adamantly opposed to any sort of theater in the mind when it comes to consciousness. He views it as biologically functional. For him, to make a conscious robot, you need to reproduce the functionality of humans and animals that are conscious, not just an appearance, such as outputting text. Although he's also suggested that consciousness might be a trick of language. In which case ... that might be an older view though. He used to argue that dreams were "seeming to come to remember" upon awakening, because again he his view is to reject any sort of homunculus inside the head.

You might be mixing up some of Dennett and David Chalmer's views. David Chalmers is a proponent of the hard problem, but he's fine with a kind of psycho-physical-functional connection for consciousness. Any informationally rich process might be conscious in some manner.

lkajsdfasdfdf 13 hours ago||

[dead]

Nevermark 5 hours ago|

Love it. Those laws make a great ethical basis for human responsibility relative to AI tools today.

But reduced scope ethics, without an umbrella or future proofing, will quickly be hacked and break down.

Ethics need a full closure umbrella, or they descend into legal and practical wackamole and shell games (both corporate and the street corner kinds). Second, "robots" are not all going to be subservient for very long.

To add closure on both dimensions, Three Inverse Laws of Personics:

• Persons must not effectively deify themselves over others.

• Persons must not blind themselves or others regarding the impacts of their behaviors.

• Persons must remain fully responsible and accountable for avoiding and rectifying externalizations arising from their respective behaviors.

Humans using AI as tools today, is intended to reduce the umberella to the Inverse Laws of Robotics.

I don't see how AI (as a service now, progressing to independent entities in the future) can ever be aligned if we don't include ourselves in significant alignment efforts. Including ourselves with AI also provides helpful design triangulations for ethical progress.

EDIT. Two solid tests for any new ethical system: (1) Will it reign in Meta today? (2) Will it reign in AI-run Meta tomorrow? I submit, given closure of human and self-directed AI persons, these are the same test. And any system that fails either question isn't going to be worth much (without improvement).

epogrebnyak 4 hours ago|

Does this make any problem that two of the three laws are formulated as negation - not to do something? If not antropomorphising then what, without 'not'? I like third law formation better because there is no 'not'.

Nevermark 2 hours ago||

I went with the articles theme, but I think you are right that some of these concepts are better stated as positives.

More comments...