
Posted by glth 7 hours ago

Why XML tags are so fundamental to Claude (glthr.com)
126 points | 82 comments
RadiozRadioz 2 hours ago|
> a contrast between Claude’s modern approach [...] XML, a technology dating back to 1998

Are we really at the point where some people see XML as a spooky old technology? The phrasing dotted around this article makes me feel that way. I find this quite strange.

coldtea 2 hours ago||
XML has been "spooky old technology" for over a decade now. Its heyday was around 2002.

Nobody dares advertise the XML capabilities of their product (which back then everybody did), nobody considers it either hot new thing (like back then) or mature - just obsolete enterprise shit.

It's about as popular now as J2EE, except to people that think "10 years ago" means 1999.

fc417fc802 13 minutes ago|||
It's not the hot new thing but when has hype ever mattered for getting shit done? I don't think anyone who considers it obsolete has an informed opinion on the matter.

Typically a more primitive (sorry, minimal) format such as JSON is sufficient in which case there's no excuse to overcomplicate things. But sometimes JSON isn't sufficient and people start inventing half baked solutions such as JSON-LD for what is already a solved problem with a mature tech stack.

XSLT remains an elegant and underused solution. Guile even includes built-in XML facilities named SXML.

rhdunn 55 minutes ago||||
XML is used a lot in standards and publishing industries -- JATS, EPUB, ODF, DOCX/XLSX/..., DocBook, etc. are all XML based/use XML.
graemep 3 minutes ago|||
Also in finance. XBRL and FIXML although I do not know how widely used the latter is.
michaelbarton 31 minutes ago|||
Without being facetious, isn’t HTML a dialect of XML and very widely used?
Twisol 27 minutes ago|||
HTML is actually a dialect of SGML. XHTML was an attempt to move to an XML-based foundation, but XML's strictness in parsing worked against it, and eventually folks just standardized how HTML parsers should interpret ill-formed HTML instead.
vitejose 26 minutes ago||||
No, HTML was historically supposed to be a subset of SGML; XML is also an application of SGML. XHTML is the XML version of HTML. As of HTML5, HTML is no longer technically SGML or XML.
girvo 28 minutes ago||||
I kind of miss SOAP. Ahead of its time? Probably not, but I built some cool things on top of it
pfraze 15 minutes ago||
atproto's lexicon-based rpc is pretty soap-like
vlovich123 1 hour ago||||
For me, even when it was first released, I considered it obsolete enterprise shit. That view has not diminished, as the sorry state of performance and security in that space has just reaffirmed that perception.
cyanydeez 1 hour ago||||
20 years old means 1980!
eduction 49 minutes ago||||
Obsolete enterprise shit I guess includes podcasting. Impressive for the enterprise.

I’d be very curious what lasting open formats JSON has been used to build.

himata4113 1 hour ago|||
didn't know html was spooky tech, TIL. /s
shams93 5 minutes ago|||
It has a number of security issues which have not been fixed and which could be used for really interesting exploitation.
oytis 1 hour ago|||
XML is still around, but I don't think many people would choose it as a serialization format today for something new.
dathanb82 14 minutes ago||
The use of XML as a data serialization format was always a bad choice. It was designed as a document _markup_ language (it’s in the name), which is exactly the way it’s being used for Claude, and is actually a good use case.
hbarka 2 hours ago|||
If you think XML is old tech, wait until you hear of EDI, still powering Walmart and Amazon logistics. XML came in like a wrecking ball with its self-documenting promise, designed to replace that cryptic, pesky payload called EDI. XML promised to solve world hunger. It spawned SOAP, XML-RPC, DOM, DTD; the heyday was beautiful and Microsoft was leading the charge. C# also arrived right around this time. Consulting firms bloomed, charged with delivering the asynchronous revolution, the loosely coupled messaging promises of XML. I think it succeeded, and it’s now quietly in the halls of the warehouse having a beer or two with its older cousin the Electronic Data Interchange, aka EDI.
actionfromafar 47 minutes ago||
EDI is XML now.
intrasight 2 hours ago|||
Yup. Kids these days...
theowaway213456 2 hours ago||
You have to admit, though, that the evidence suggests XML was never that popular with the general audience.

For Web markup, as an industry we tried XHTML (HTML that was strictly XML) for a while, and that didn't stick, and now we have HTML5 which is much more lenient as it doesn't even require closing tags in some cases.

For data exchange, people vastly prefer JSON as an exchange format for its simplicity, or protobuf and friends for their efficiency.

As a configuration format, it has been vastly overtaken by YAML, TOML, and INI, due to their content-forward syntax.

Having said all this I know there are some popular tools that use XML like ClickHouse, Apple's launchd, ROS, etc. but these are relatively niche compared to (e.g.) HTML

icermann 1 hour ago||
MS Office and Open-/LibreOffice use zipped XML files (e.g. .docx, .xlsx and .odt). SVG vector graphics are XML; the X in AJAX stands for XML (although replaced by JSON by now). SOAP (which probably counts as the predecessor of REST) is XML-based.

XML was definitely popular in the "well used" sense. How popular it was in the "well liked" sense can maybe be up for debate, but it was the best tool for the job at the time for a lot of use cases.

kid64 4 hours ago||
The thesis here seems to be that delimiters provide important context for Claude, and for that purpose we should use XML.

The article even references English's built-in delimiter, the quotation mark, which is represented as a token for Claude, part of its training data.

So are we sure the lesson isn't simply to leverage delimiters, such as quotation marks, in prompts, period? The article doesn't identify any way in which XML is superior to quotation marks in scenarios requiring the type of disambiguation quotation marks provide.

Rather, the example XML tags shown seem to be serving as a shorthand for notating sections of the prompt ("treat this part of the prompt in this particular way"). That's useful, but seems to be addressing concerns that are separate from those contemplated by the author.

sheept 1 hour ago||
XML is a bit more special/first class to Claude because it uses XML for tool calling:

    <antml:invoke name="Read">                                                    
      <antml:parameter name="file_path">/path/to/file</antml:parameter>             
      <antml:parameter name="offset">100</antml:parameter>                          
      <antml:parameter name="limit">50</antml:parameter>                            
    </antml:invoke>
I'm sure Claude can handle any delimiter and pseudo markup you throw at it, but one benefit of XML delimiters over quotation marks is that you repeat the delimiter name at the end, which I'd imagine might help if its contents are long (it certainly helps humans).
jinushaun 3 hours ago||
Except quotation marks look like regular text. I regularly use quotes in prompts for, ya know, quotes.
wolttam 3 hours ago||
The GP isn't suggesting to literally use quotes as the delimiter when prompting LLMs. They're pointing out that we humans already use delimiters in our natural language (quotation marks to delimit quotes). They're suggesting that delimiters of any kind may be helpful in the context of LLM prompting, which to me makes intuitive sense. That Claude is using XML is merely a convention.
hkbuilds 40 minutes ago||
This matches my experience building AI-powered analysis tools. Structured output from LLMs is dramatically more reliable when you give the model clear delimiters to work with.

One thing I've found: even with XML tags, you still need to validate and parse defensively. Models will occasionally nest tags wrong, omit closing tags, or hallucinate new tag names. Having a fallback parser that extracts content even from malformed XML has saved me more than once.

The real win is that XML tags give you a natural way to do few-shot prompting with structure. You can show the model exactly what shape the output should take, and it follows remarkably well.

Jcampuzano2 1 hour ago||
But should this extend to anything that could end up in Claude's context? Should we be using XML even in skills, for instance, or commands, custom subagents, etc.?

And then do we end up over indexing on Claude and maybe this ends up hurting other models for those using multiple tools.

I just dislike how much of AI is people saying "do this thing for better results" with no definitive proof, but alas, that comes with the non-determinism.

At least this one has the stamp of approval by Claude codes team itself.

lmeyerov 1 hour ago||
My intuition is it comes down to error-correcting codes. We're dealing with lossy systems that get off track, so including parity bits helps.

Ex: <message>...</message> helps keep track. Even better? <message78>...</message78>. That's ugly XML, but great for LLMs. Likewise, using standard ontologies for identifiers (ex: we'll do OCSF, ATT&CK, & CIM for splunk/kusto in louie.ai), even if they're not formally XML.

For all these things, these intuitions need backing by evals in practice, and that's part of why I begrudgingly flipped from JSON to XML.

Lerc 2 hours ago||
I am unconvinced.

To me it seems like handling symbols that start and end sequences that could contain further start and end symbols is a difficult case.

Humans can't do this very well either; we use visual aids such as indentation and syntax highlighting, or resort to just plain counting of levels.

Obviously it's easy to throw parameters and training at the problem, you can easily synthetically generate all the XML training data you want.

I can't help but think that training data should have a metadata token per content token. A way to encode the known information about each token that is not represented in the literal text.

Especially tagging tokens explicitly as fiction, code, code from a known working project, something generated by itself, something provided by the user.

While it might be fighting the bitter lesson, I think for explicitly structured data there should be benefits. I'd even go as far as to suggest the metadata could handle nesting if it contained dimensions that performed RoPE operations to keep track of the depth.

If you had such a metadata stream per token there's also the possibility of fine tuning instruction models to only follow instructions with a 'said by user' metadata, and then at inference time filter out that particular metadata signal from all other inputs.

It seems like that would make prompt injection much harder.

scotty79 2 hours ago||
Transformers look like perfect tech for keeping track of how deep and inside of what we are at the moment.
thesz 1 hour ago||
Transformers are able to recognize balanced brackets grammar at 97% success rate: https://openreview.net/pdf?id=kaILSVAspn

This is 3% or infinitely far away from the perfect tech.

The perfect tech is the stack.
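For reference, the stack solution recognizes the balanced-brackets (Dyck) language exactly, at any depth; a minimal sketch in Python:

```python
def balanced(s: str) -> bool:
    """Check balanced brackets with a stack:
    push on open, pop-and-match on close."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # every open bracket must have been closed

print(balanced("([]{})"))  # True
print(balanced("([)]"))    # False
```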

cyanydeez 1 hour ago||
Basically, the only way you're separating user input from model meta-input is using some kind of character that'll never show up in the output of either users or LLMs.

While technically possible, it'd be like a unicode conspiracy that had to quietly update everywhere without anyone being the wiser.

michaelcampbell 4 hours ago||
Total tangent, but what vagary of HTML (or the Brave Browser, which I'm using here) causes words to be split in very odd places? The "inspect" devtools certainly didn't show anything unusual to me. (Edit: Chrome, MS Edge, and Firefox do the same thing. I also notice they're all links; wonder if that has something to do with it.)

https://i.imgur.com/HGa0i3m.png

werdnapk 4 hours ago||
CSS on the <a> tags:

word-break: break-all;

knallfrosch 3 hours ago|||
It's an error in the site's CSS. CSS has way better methods, like splitting words correctly depending on the language and hyphenating it.

Although I can never remember the correct incantation, should be easy for LLMs.
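For what it's worth, the incantation is roughly the following (a sketch, not the site's actual stylesheet; note that `hyphens: auto` only works when the document declares a language, e.g. `<html lang="en">`):

```css
a {
  overflow-wrap: break-word; /* break only when a word would overflow */
  hyphens: auto;             /* language-aware hyphenation, needs lang attribute */
}
```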

fancy_pantser 4 hours ago|||
CSS word-break property
rosstex 2 hours ago||
Ask Claude?
strongpigeon 2 hours ago||
This seems like an actual good use for XML. Using it as a serialization format always rubbed me the wrong way (it’s super verbose, the named closing tags are unnecessary grammar-wise, the attribute-or-child question, etc.). But to mark up and structure LLM prompts and responses, it feels better than markdown (which doesn’t stream that well).
apwheele 4 hours ago||
I think XML is good to know for prompting (similar to how <think></think> was popular for outputs, you can do that for other sections). But I have had much better experience just writing JSON and using line breaks, colons, etc. to demarcate sections.

E.g. instead of

    <examples>
      <ex1>
        <input>....</input>
        <output>.....</output>
      </ex1>
      <ex2>....</ex2>
      ...
    </examples>
    <instructions>....</instructions>
    <input>{actual input}</input>
Just doing something like:

    ...instructions...
    input: ....
    output: {..json here}
    ...maybe further instructions...
    input: {actual input}
For my use case, document processing/extraction (with both Haiku and OpenAI models), the latter example works much better than the XML.

N of 1 anecdote anyway for one use case.

galaxyLogic 2 hours ago||
XML helps because it a) lets you describe structures and b) makes a clear context change, which makes it clear you are not "talking in XML", you are "talking about XML".

I assume you are right too: JSON is a less verbose format which allows you to express any structure you can express in XML, and it should be as easy for AI to parse. Although that probably depends on the training data too.

I recently asked AI why .md files are so prevalent with agentic AI and the answer is ... because .md files also express structure, like headers and lists.

Again, depends on what the AI has been trained on.

I would go with JSON, or some version of it which would also allow comments.

ekjhgkejhgk 4 hours ago|||
Could you clarify: do those tags need to be tags which already exist, and we need to learn about them and how to use them? Or can we put whatever we want inside them, and just by virtue of being tags, Claude understands them in a special way?
ezfe 4 hours ago|||
They probably don’t need to be specific values. The model is fine tuned to see the tags as signals and then interprets them
galaxyLogic 2 hours ago||
If it walks like a duck ... AI thinks it is something like a duck.
apwheele 4 hours ago|||
All the major foundation models will understand them implicitly, so it was popular to use <think>, but you could also use <reason> or <thinkhard> and the model would still go through the same process.
cyanydeez 1 hour ago||
<ponderforamoment>HTML is a large subsection of their training data, so they're used to seeing a somewhat semantic worldview</ponderforamoment>
marxisttemp 2 hours ago||
XML is much more readable than JSON, especially if your data has characters that are meaningful JSON syntax
galaxyLogic 2 hours ago||
I think readability is in the eye of the reader. JSON is less verbose, no ending tags everywhere, which I think makes it more readable than XML.

But I'd be happy to hear about studies that show evidence for XML being more readable, than JSON.

ezfe 5 minutes ago||
I disagree that XML is more readable in general, but for the purpose of tagging blocks of text as <important>important</important> in freeform writing, JSON is basically useless
TutleCpt 2 hours ago|
I think this article is 100% relevant to you today. Anthropic put out a training video a number of months ago saying that XML should be highly encouraged for prompts. See https://m.youtube.com/watch?v=ysPbXH0LpIE