Posted by carlos-menezes 4/12/2025
The 1.2 spec just treats all scalar types as opaque strings, along with a configurable mechanism[0] for auto-converting non-quoted scalars if you so please.
As such, I really don't quite grok why upstream libraries haven't moved to YAML 1.2. Would love to hear details from anyone with more info.
[0]:https://yaml.org/spec/1.2.2/#chapter-10-recommended-schemas
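To make the difference concrete, here is a minimal sketch of how an unquoted scalar gets resolved to a boolean under the YAML 1.1 bool type versus the YAML 1.2 core schema. The regular expressions follow the patterns published in those two documents; the function name `resolves_to_bool` is made up for illustration, and real libraries may deviate from either list.

```python
import re

# Patterns for unquoted booleans: the YAML 1.1 bool type is broad,
# the YAML 1.2 core schema only accepts spellings of true/false.
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)
YAML12_CORE_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")


def resolves_to_bool(plain_scalar: str, version: str) -> bool:
    """Return True if an unquoted scalar would be tagged as a boolean."""
    pattern = YAML11_BOOL if version == "1.1" else YAML12_CORE_BOOL
    return pattern.fullmatch(plain_scalar) is not None


# Compare how a few common config values are treated by each version.
for scalar in ["no", "NO", "off", "false", "disabled"]:
    print(scalar, resolves_to_bool(scalar, "1.1"), resolves_to_bool(scalar, "1.2"))
```

Under these patterns, `no` and `off` are booleans only in 1.1, which is the behaviour most of the complaints in this thread are about.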
This:
isRegistered: true
could be replaced with accountStatus: "UNREGISTERED"
And for this reason, "logging: false" would be clearer than "logging: no" to represent "I do not want logging".
1. Specify in the key
loggingEnabled: false
2. Specify in the value
logging: disabled
Don’t use bool at all.
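A rough sketch of the "specify in the value" option on the consuming side, assuming the loader hands back a plain dict. `LoggingMode` and `parse_logging_mode` are hypothetical names, not part of any YAML library. The point is that an explicit string value can be validated strictly, so a loader that silently turned `no` into a boolean gets caught instead of being trusted.

```python
from enum import Enum


class LoggingMode(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"


def parse_logging_mode(raw: object) -> LoggingMode:
    """Accept only the documented string values; reject booleans and anything else."""
    if not isinstance(raw, str):
        raise TypeError(f"logging must be a string, got {type(raw).__name__}")
    try:
        return LoggingMode(raw.strip().lower())
    except ValueError:
        raise ValueError(f"logging must be 'enabled' or 'disabled', got {raw!r}") from None


# What a YAML loader would typically return for `logging: disabled`.
config = {"logging": "disabled"}
print(parse_logging_mode(config["logging"]))
```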
While it has the YAML-like significant whitespace, it looks nice because it doesn't try to be clever.
That sounds like a breaking change that caused old YAML documents to be parsed differently.
The tag schema used is supposed to be modifiable, folks!
And why anyone would still be using 1.1 at this point is just forehead palming foolishness.
The yaml.org website also lists a bunch of implementations, with about 1/3 supporting 1.2. I'm guessing that the users of those libraries just happily hum along and we never hear from them!
The issue is that downstream consumers of popular languages with a vocal community here on HN tend to just pull in libyaml, PyYAML being the major offender in my mind.
https://www.theverge.com/2020/8/6/21355674/human-genes-renam...
and, setting that aside, the very next paragraph says that this is a legit representation of -2.0 which means something has gone gravely wrong
value: -
# change this to 3.14 one day
2.0
User:
  Name: >-
    Bob
  Phone: >-
    01234 56789
  Description:>-
    This is a
    multi line
    description
That’s both readable and parses your records as strings.
Edit: This Stack Overflow link provides more details: https://stackoverflow.com/questions/3790454/how-do-i-break-a...
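For anyone unfamiliar with `>-`: it is a folded block scalar with "strip" chomping. Below is a deliberately simplified sketch of what folding does to the lines of such a block. `fold_and_strip` is a made-up helper, not a parser; it assumes the indentation has already been removed and that no line is indented further than the others (real YAML keeps line breaks around more-indented lines and also preserves leading blank lines, which this sketch ignores).

```python
def fold_and_strip(block_lines):
    """Simplified folding for a `>-` block scalar.

    Single line breaks between content lines become spaces, blank lines
    become newlines, and trailing blank lines are dropped (the `-` part).
    """
    parts = []
    pending_blank = 0
    for line in block_lines:
        if line.strip() == "":
            pending_blank += 1
            continue
        if parts:
            # One newline per blank line, otherwise fold the break into a space.
            parts.append("\n" * pending_blank if pending_blank else " ")
        pending_blank = 0
        parts.append(line)
    return "".join(parts)


description = fold_and_strip(["This is a", "multi line", "description"])
phone = fold_and_strip(["01234 56789"])
print(repr(description), repr(phone))
```

The result is always a string, which is why a value like the phone number keeps its leading zero instead of being guessed at as a number.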
Seeing that used systemically, versus just for "risky" fields makes me want to draw attention to the fantastic remarshal tool[1], which offers a "--yaml-style >" (and "|" and the rest) which will render yaml fields quoted as one wishes
1: https://github.com/remarshal-project/remarshal#readme and/or $(brew install remarshal)
The trailing ‘:’ was there right after the ‘n’.
Examples of this syntax:
https://github.com/lmorg/murex/blob/master/builtins/core/arr...
I do agree it’s a bit of a kludge. But if you want data types and unquoted strings, then anything you do to the syntax to denote strings over other data types becomes a bit of a kludge.
The one good thing about this kludge is it allows for string literals (ie no complicated escaping rules).
> Seeing that used systemically, versus just for "risky" fields makes me want to draw attention to the fantastic remarshal tool[1], which offers a "--yaml-style >" (and "|" and the rest) which will render yaml fields quoted as one wishes
I don’t really understand what you’re alluding to here.
$ /usr/local/opt/ansible/libexec/bin/python3 -c 'import sys, yaml; print(yaml.safe_load(sys.stdin.read()))' <<YML
User:
Name: >-
Bob
Phone: >-
01234 56789
Description:>-
This is a
multi line
description
YML
yaml.scanner.ScannerError: while scanning a simple key
in "<unicode string>", line 6, column 6:
Description:>-
$ gojq --yaml-input . <<YML
User:
Name: >-
Bob
Phone: >-
01234 56789
Description:>-
This is a
multi line
description
YML
gojq: invalid yaml: <stdin>:6
6 | Description:>-
^ could not find expected ':'
That's because, for better or worse, yaml considers that a legitimate key name, just missing its delimiter
$ gojq --yaml-input . <<YML
User:
Name: >-
Bob
Phone: >-
01234 56789
Description:>-:
This is a
multi line
description
YML
{
"User": {
"Description:>-": "This is a multi line description",
"Name": "Bob",
"Phone": "01234 56789"
}
}
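The behaviour in the two transcripts above comes down to one rule: in block style, a ':' only ends a key when it is followed by whitespace or the end of the line. Here is a small sketch of that rule in isolation. `split_key` is a hypothetical helper that ignores quoting, flow collections, and comments, so treat it as an illustration rather than a description of how any real scanner is written.

```python
import re

# A ':' acts as the key/value separator only when followed by
# whitespace or the end of the line.
KEY_SEPARATOR = re.compile(r":(?=\s|$)")


def split_key(line):
    """Return (key, rest) for a simple block-style line, or None if no separator is found."""
    text = line.strip()
    match = KEY_SEPARATOR.search(text)
    if match is None:
        return None
    return text[:match.start()], text[match.end():].strip()


for candidate in ["Description: >-", "Description:>-", "Description:>-:"]:
    print(repr(candidate), "->", split_key(candidate))
```

With the space, the key is `Description` and `>-` is the block scalar header. Without it, there is no separator at all, and adding a trailing ':' turns the whole of `Description:>-` into the key.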
This exchange in a thread complaining about the whitespace sensitivity doesn't escape me
As for remarshal, it was the systemic application of that quoting style that made me think of it, since writing { Name: >- Bob} is the worst of both worlds: not as legible as the plain unquoted version, not suitable for grep, and indentation sensitive
Further to that point, none of the example links I’ve shared have the : at the end and I have production code that works using the formatting I’ve described. So you’re flat out wrong there with your assumption that block keys always terminate with :
> As for remarshal, it was the systemic application of that quoting style that made me think of it, since writing { Name: >- Bob} is the worst of both worlds: not as legible as the plain unquoted version, not suitable for grep, and indentation sensitive
You wouldn’t write code like that because >- denotes a block and you’re now inlining a string.
I mean I’ve shared links explaining how this works and you’re clearly not reading them.
At the end of the day, I’m not going to argue that >- (and its ilk) solves everything. It clearly doesn’t. If you want to write “minimized” YAML using JSON syntax then you’re far far better off quoting the string.
But if you are writing a string in YAML and either don’t want to deal with quotation marks, or need that string to be a string literal (ie not having to escape things like quotation marks) then my suggestion is an option.
It’s not there as a silver bullet but it is a lesser known feature of YAML. Hence me sharing.
Now go read the links and understand it better. You might genuinely find it useful under some scenarios ;)
And yet I brought receipts for my claims, and you just bring "reed the manul, n00b"
Secondly, your "receipts" were incorrect. Neither of your examples follows the examples I cited, and your second example creates a key named "Description:>-", which is clearly wrong. Hence why ">-" needs to be after the colon.
Here are more examples and evidence of how to use >- and why your "receipts" were also incorrect:
https://go.dev/play/p/1B4ba-dUARq
Here you can clearly see my example:
Foo: >-
  hello
  world
produces: { "Foo": "hello world" }
which is correct.
Whereas your example:
Bar:>-:
  hello
  world
produces { "Bar:\u003e-": "hello world" }
which is incorrect.
----
One final point: I don't understand why you're being so argumentative here. I posted a lesser-known YAML feature in case it helps some people and you've turned it into some kind of pissing match based on bad-faith interpretations of my comments. There was no need for you to do that.
It’s a bit of a sore spot in the YAML community as to why PyYAML can’t / won’t support YAML 1.2. It was in maintenance mode for a while. YAML 1.2 also introduced breaking changes.
From a SO comment: “As long as you're okay with the YAML 1.1 standard, PyYAML is still perfectly fine, secure, etc. If you want to support the YAML 1.2 spec (released in 2009), you can use ruamel.yaml, which started out as a fork of PyYAML. – CrazyChucky Commented Mar 26, 2023 at 20:51”
So people work around the little paper cuts, while still hitting the traps from time to time as they forget them.
> generate YAML
I have a hard time finding a situation where I'd want to do that. Usually YAML is chosen for human readability, but here we're already in a higher-level language first. JSON sounds like a more appropriate target most of the time?
> In my opinion, instead of pressuring and insulting people who actually clarify issues with YAML and the wrong statements of some of its proponents, I would kindly suggest reading the JSON spec (which is not that difficult or long) and finally make YAML compatible to it, and educating users about the changes, instead of spreading lies about the real compatibility for many years and trying to silence people who point out that it isn't true.
> Addendum/2009: the YAML 1.2 spec is still incompatible with JSON, even though the incompatibilities have been documented (and are known to Brian) for many years and the spec makes explicit claims that YAML is a superset of JSON. It would be so easy to fix, but apparently, bullying people and corrupting userdata is so much easier.
Well that’s disappointing.
I guess software is human text after all.
I’m just pointing out that it should be very simple to swap a YAML file for a JSON file in any system that accepts YAML.
Configuration files for programs. These tend to be short.
DSLs which are large manifests for things like cloud infrastructure. These tend to be long, they grow over time.
My pet hypothesis is these DSLs exist mostly for neutrality - the vendor can't assume you have Python or something present. But as a user, you can assume just that and gain a lot by authoring in a proper language and generating YAML.
See https://github.com/cloudtools/troposphere for a great example for AWS CloudFormation.
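As a rough illustration of the "author in a proper language, generate the manifest" idea, here is a sketch that builds a config as ordinary Python data and emits JSON, on the assumption that the consuming tool's YAML loader accepts JSON input. The `service` helper and the field names are invented for the example and do not correspond to any real vendor schema. Given the compatibility caveats raised elsewhere in this thread, it is worth checking the output against the actual loader.

```python
import json


def service(name, replicas, logging_enabled):
    """Hypothetical helper: build one service entry as plain Python data."""
    return {"name": name, "replicas": replicas, "loggingEnabled": logging_enabled}


# Loops and functions replace copy-pasted blocks in a hand-written manifest.
manifest = {
    "services": [
        service(f"worker-{i}", replicas=2, logging_enabled=(i == 0))
        for i in range(3)
    ]
}

# JSON output: every string is quoted and booleans are spelled true/false,
# so there is no scalar guessing left for the consumer to do.
document = json.dumps(manifest, indent=2, sort_keys=True)
print(document)
```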
This is where I use YAML and it shines there. IMO easier to read and write by hand than JSON, and short sweet config files don't have the various problems people run into with YAML. It's great.
On cloud infra, yes, having one or two layers of languages is a natural situation. GCP and AWS both accepting (encouraging?) JSON as a subset of YAML makes it a simpler choice when picking a target for auto-generation.
You mention people wanting to author the generated files, I think in other situations tweaking the auto-generated files will be seen as riskier with potential overwriting issues, so lower readability will be seen as a positive.
!!boolean
https://dev.to/kalkwst/a-gentle-introduction-to-the-yaml-for...
Trying to find a tag-line for it I like, maybe “markdown for config”?