Structured outputs create false confidence

Posted by gmays 12/21/2025

Structured outputs create false confidence (boundaryml.com)

155 points | 66 comments | page 3

TZubiri 12/21/2025|

They worked fine for me. Keep working at it until results are positive instead of rabbit holing into a failure mode with a blog post.

It's usually more productive to write about how LLMs work rather than how they don't. In this case especially, there are improvements that can be made to the schema without giving up on the idea of schemas altogether.

refulgentis 12/21/2025||

"CoT x JSON means you can't get JSON" is a 2024 problem.

Every model has built-in segmentation between reasoning/CoT + JSON.

Veen 12/21/2025||

Doesn't the Claude API's recently introduced ability to combine extended thinking with structured outputs overcome this issue? You get the unconstrained(ish) generation in the extended thinking blocks and then structured formatting informed by that thinking in the final output.

Oras 12/21/2025||

I would like to see a real example; the one given assumes you want a float but declares the field as an int.

What if you put “float” instead of int to get the required number?

Also the post is missing another use case, enums in structured data. I’ve been using it successfully for a few months now and it’s doing a fantastic job.
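A minimal sketch of the schema shape this comment suggests, assuming a hypothetical `Review` record with a `score` field declared as float and a `sentiment` field restricted to an enum. The names are illustrative only and are not taken from the article or from any SDK; the check runs on an already decoded JSON object using the standard library.

```python
from dataclasses import dataclass
from enum import Enum


class Sentiment(Enum):
    # Hypothetical label set; an enum limits the model to known values
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


@dataclass
class Review:
    score: float          # declared as float so 4.5 is not forced to 4
    sentiment: Sentiment  # anything outside the enum is rejected


def parse_review(payload: dict) -> Review:
    """Validate a decoded JSON object against the hypothetical Review schema."""
    score = payload["score"]
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError(f"score must be a number, got {score!r}")
    # Enum lookup by value raises ValueError for unknown labels
    return Review(score=float(score), sentiment=Sentiment(payload["sentiment"]))
```

With this shape, `parse_review({"score": 4.5, "sentiment": "positive"})` keeps the fractional score, while an unknown label such as `"meh"` fails loudly instead of passing through.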

sebazzz 12/21/2025||

If this analysis is sound, I wonder if it can be mitigated by using tools instead of structured outputs.

machinationu 12/21/2025||

or tell it to output the data at the end as markdown and then do a second pass with a cheaper model to build the structured output

also, xml works much better than json, all the model guides say this

dzrmb 12/21/2025||

Interesting read and perspective. I had very good results with structured outputs across text, images, and tool calling. Also, a lot of SDKs are using it, including the Vercel AI SDK.

Thanks for sharing

alienbaby 12/22/2025||

I've wondered if it's because structured outputs rely on visual cues to impart meaning, and turning them into token streams loses that spatial structure.

ursAxZA 12/22/2025||

This seems less like a failure of structured outputs and more like expecting LLMs to behave like deterministic parsers — or am I missing something?

IshKebab 12/22/2025|

This seems pretty silly to me. Their solution for how to get structured output is pretty much just "don't". Well, we still need the structured output, so what do we do then?

> you need a parser that can find JSON in your output and, when working with non-frontier models, can handle unquoted strings, key-value pairs without comma delimiters, unescaped quotes and newlines; and you need a parser that can coerce the JSON into your output schema, if the model, say, returns a float where you wanted an int, or a string where you wanted a string[].

Oh cool, I'm sure that will be really reliable. Facepalm.
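For concreteness, here is a rough sketch of the easier half of what the quoted passage asks for: locating a JSON object inside free-form text and coercing a float to an int or a string to a string[]. It assumes the embedded JSON is already valid, so it does not attempt the unquoted-string or missing-comma repairs the quote mentions. `extract_json`, `coerce`, and the target type names are hypothetical helpers, not the article's parser.

```python
import json
from typing import Any


def extract_json(text: str) -> Any:
    """Return the first valid JSON object embedded in free-form text."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one value starting at `start` and ignores the rest
            value, _ = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found")


def coerce(value: Any, target: str) -> Any:
    """Coerce a decoded value to a hypothetical target type name."""
    if target == "int":
        # Only lossless float -> int conversions are accepted (an assumption)
        if isinstance(value, float) and value.is_integer():
            return int(value)
        if isinstance(value, int) and not isinstance(value, bool):
            return value
    elif target == "string[]":
        if isinstance(value, str):
            return [value]  # wrap a bare string in a list
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            return value
    raise ValueError(f"cannot coerce {value!r} to {target}")
```

Even this small version has to make policy decisions, such as refusing to turn 3.5 into an int, which is where the reliability concern in this comment comes from.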

> Allow it to respond in a free-form style: let it refuse to count the number of entries in a list, let it warn you when you've given it contradictory information, let it tell you the correct approach when you inadvertently ask it to use the wrong approach

This makes zero sense. The whole point of structured output is that it's a (non-AI) program reading it. That program needs JSON input with a given schema. If it is able to handle contradictory-information warnings, or being told you're using the wrong approach then that will be in the schema anyway!

I think the point about thinking models is interesting, but the solution to that is obviously to allow it to think without the structuring constraint, and then feed the output from that into a query with the structured output constraint.
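The two-step approach described here can be sketched as follows. This is only an outline under stated assumptions: `reason` and `structure` are hypothetical callables standing in for an unconstrained model call and a schema-constrained model call, and no real provider API is used.

```python
import json
from typing import Callable


def two_pass(
    question: str,
    reason: Callable[[str], str],
    structure: Callable[[str], str],
) -> dict:
    """Reason in free form first, then ask for JSON in a second call."""
    # Pass 1: no schema constraint, so the model can think, warn, or refuse
    notes = reason(question)
    # Pass 2: the structuring call sees the notes and only has to format them
    prompt = f"Question: {question}\nNotes: {notes}\nReturn the answer as JSON."
    return json.loads(structure(prompt))
```

In practice `reason` could be a larger model and `structure` a cheaper one with structured outputs enabled; the program consuming the result still only ever sees the JSON from the second pass.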
