Posted by pretext 16 hours ago

Changes in the system prompt between Claude Opus 4.6 and 4.7 (simonwillison.net)
196 points | 115 comments
SoKamil 14 hours ago|
New knowledge cutoff date means this is a new foundation model?
lkbm 13 hours ago||
Yes, but doesn't the tokenizer change already mean that?
clickety_clack 4 hours ago||
You can train a tokenizer on old data just like you can train a model on old data.
wongarsu 2 hours ago||
But you can't use an old model with a new tokenizer. Changing the tokenizer implies you trained the model from scratch.
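A minimal sketch of why (using hypothetical two-token vocabularies, nothing like a real tokenizer): the model's embedding table is indexed by token id, so a new tokenizer's ids select the wrong rows of an old model's table.

```java
import java.util.Map;

// Hypothetical illustration: embedding rows are tied to the token ids
// the model was trained with, so swapping tokenizers scrambles lookups.
class TokenizerCoupling {
    // Rows the old model learned, one per old token id.
    static final String[] EMBEDDINGS = {"vector-for-hello", "vector-for-world"};

    static String lookup(Map<String, Integer> vocab, String token) {
        return EMBEDDINGS[vocab.get(token)];
    }

    public static void main(String[] args) {
        // Old tokenizer: "hello" -> 0. New tokenizer: "hello" -> 1.
        Map<String, Integer> oldVocab = Map.of("hello", 0, "world", 1);
        Map<String, Integer> newVocab = Map.of("world", 0, "hello", 1);

        System.out.println(lookup(oldVocab, "hello")); // row the model expects
        System.out.println(lookup(newVocab, "hello")); // wrong row under the new tokenizer
    }
}
```

With the new vocabulary, "hello" fetches the row learned for "world", which is why a new tokenizer effectively requires retraining the model.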
mwexler 12 hours ago||
Interesting that it's not a direct "you should" but an omniscient 3rd person perspective "Claude should".

Also full of "can" and "should" phrases: it feels both passive and subjunctive, wishes rather than strict commands (I guess these are better termed “modals”, but I'm not an expert).

zmmmmm 3 hours ago||
Yes, I was interested in that too. It suggests that when writing our own guidance we should follow a similar style, but I rarely if ever see people doing that. Most people still stick to "You" or an abstract voice: "There is ...", "Never do ...", etc.

It must be that they are training the sense of identity as Claude very deeply into the model. Which makes me wonder how it works when it is asked to assume a different identity - "You are Bob, a plumber who specialises in advising on the design of water systems for hospitals". Now what? Is it confused? Will it still think all the verbiage about what "Claude" does applies?

KolenCh 4 hours ago|||
“Claude” is more specific than “you”. Why rely on attention to figure out who the subject is? Also, it is the belief of people at Anthropic that rule-based alignment won’t work, which is why they wrote the soul document as “something like you’d write to your child to show them how they should behave in the world” (I paraphrase). I guess the system prompt should be similar in this respect.
saagarjha 5 hours ago||
That’s because Anthropic does not consider their model to have a personality; rather, it simulates the experience of an abstract entity named Claude.
akdor1154 4 hours ago||
That sounds really interesting, but my google-fu is not up to the task here; I'm getting pages and pages of nonsense asking if Claude is conscious. Can you elaborate?
saagarjha 3 hours ago|||
I actually think this is pretty straightforward if you think of it as something like

  class Claude {
    void greet() { System.out.println("Hello, I'm Claude."); }
  }

  Claude anthropicInstance = new Claude();
  anthropicInstance.greet();
Just like a "Cat" object in Java is supposed to behave like a cat but is not a cat: there is no way for Cat@439f5b3d to "be" a cat, only to act like one. When Anthropic spins up a model and "runs" it, they are asking the matrix multiplications to simulate the concept of a person named Claude. It is not conscious, but it is supposed to simulate a person who is. That is how they view it, anyway.
EMM_386 3 hours ago|||
You can read the latest Claude Constitution plus more info here:

https://www.anthropic.com/news/claude-new-constitution

dmk 15 hours ago||
The acting_vs_clarifying change is the one I notice most as a heavy user. Older Claude would ask 3 clarifying questions before doing anything. Now it just picks the most reasonable interpretation and goes. Way less friction in practice.
bavell 13 hours ago||
Haven't had a chance to test 4.7 much, but one of my pet peeves with 4.6 is how eager it is to jump into implementation. Maybe 4.7 is smarter about this now.
sersi 12 hours ago|||
I really hate that change; it now regularly picks a bad interpretation instead of asking.
verve_rat 6 hours ago||
Yeah, that really feels like a choice that should be user preference.
poszlem 5 hours ago||
I have the opposite experience. It now picks the most inane interpretation or makes wild assumptions, and I have to keep interrupting it more than ever.
ikidd 13 hours ago||
I had seen reports that it was clamping down on security research, and that things like web-scraping projects were getting caught up in that and could no longer use the model easily. But I don't see any changes in the prompt that seem likely to have affected that, and the system prompt is where I would expect such changes to be implemented.
embedding-shape 13 hours ago||
I think it depends on how badly they want to avoid it. Stuff that is "We prefer if the model didn't do these things when the model is used here" goes into the system prompt, meanwhile stuff that is "We really need to avoid this ever being in any outputs, regardless of when/where the model is used" goes into post-training.

So I'm guessing they want none of the model's users (web UI + API) to be able to do those things, rather than blocking them only in the web UI. The changes mentioned in the submission are just for claude.ai AFAIK, not for API users, so the "disordered eating" stuff will only be prevented for API users if they prompt against it in their own system prompts; it isn't required.

kaoD 13 hours ago|||
I wonder if the child safety section "leaks" behavior into other risky topics, like malware analysis. There's overlap in how the reports describe it: once a safety check has been tripped, it becomes even more reluctant to work, which seems to match the instructions here for child safety.
bakugo 12 hours ago||
It's built into the model, not part of the system prompt. You'll get the same refusals via the API.
varispeed 15 hours ago||
Before Opus 4.7, 4.6 had become nearly unusable: it was flagging normal data-analysis scripts it wrote itself as cybersecurity risks. I got several sessions blocked, was unable to finish my research with it, and had to switch to GPT-5.4, which has its own problems but at least isn't eager to interfere with legitimate work.

edit: to be fair, Anthropic should be giving money back for sessions terminated this way.

ceejayoz 14 hours ago|
> edit: to be fair Anthropic should be giving money back for sessions terminated this way.

I asked it for one and it told me to file a Github issue.

Which I interpreted as "fuck off".

slashdave 2 hours ago||
You asked the agent directly for a refund?
mannanj 13 hours ago||
Personally, as someone who has been lucky enough to completely cure "incurable" diseases with diet, self-experimentation, and learning from experts who disagreed with the common societal beliefs at the time - I'm concerned that an AI model and an AI company are planting beliefs and limiting what people can and can't learn through their own will and agency.

My concern is that these models revert all medical, scientific, and personal inquiry to the norms and averages of what's socially acceptable. That's very anti-scientific in my opinion and feels dystopian.

gausswho 9 hours ago|
While I share your concern about a winner-take-all model getting bent, I'm optimistic that models we've never heard of are plugging away at challenging conclusions in the medical canon. We will have both vaccine-denying AND vaccine-authoring models.