Posted by qbow883 12/27/2025
My impression is that their audience equates file size with quality, so the bigger the file, the more "value" they got from the creator. This is frustrating because bigger files mean hitting transfer limits, slower downloads, slower copies, more disk space used, etc...
Unless one lives in a country where the internet is slow and/or hard drives are expensive, I think the audience does not care.
My hypothesis is that they use a really high quality value, and that there are diminishing returns there.
The way I understand it, we've got YCbCr being converted to an RGB value that directly corresponds to how bright we drive the R, G, and B subpixels. So wouldn't the entire range already be available? As in, post-conversion to RGB you've got 256 levels for each channel, which can be anywhere from 0 to 255, or 0% to 100%? We could go to 10-bit color, which would give you finer control with 1024 levels per channel instead of 256, but you still have the same range of 0% to 100%. Does the YCbCr -> RGB conversion not use the full 0-255 range in RGB?
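To make my mental model concrete, here's roughly how I picture that step (a sketch assuming 8-bit full-range values and BT.709-style coefficients; that full-range assumption may be exactly where I'm wrong):

    # how I currently picture it: 8-bit full-range YCbCr -> 8-bit full-range RGB
    def ycbcr_to_rgb(y, cb, cr):
        yn, pb, pr = y / 255.0, (cb - 128) / 255.0, (cr - 128) / 255.0
        r = yn + 1.5748 * pr
        g = yn - 0.1873 * pb - 0.4681 * pr
        b = yn + 1.8556 * pb
        clamp = lambda v: max(0, min(255, round(v * 255)))
        return clamp(r), clamp(g), clamp(b)

    print(ycbcr_to_rgb(0, 128, 128))    # (0, 0, 0)       -> 0% on every subpixel
    print(ycbcr_to_rgb(255, 128, 128))  # (255, 255, 255) -> 100% on every subpixel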
Naturally, we can stick brighter backlights in our monitors to make the difference between light and dark more significant, but that wouldn't change the on-disk or on-the-wire formats. Those formats have changed (video files are specifically HDR or SDR and operating systems need to support HDR to drive HDR monitors), so clearly I am missing something but all of my searches only find people comparing the final image without digging into the technical details behind the shift. Anyone care to explain or have links to a good source of information on the topic?
This was probably not the most accurate explanation, but hopefully it's enough to point you in the right direction.
It's actually the opposite that makes the biggest difference with the physical monitor. CRTs always had a residual glow that caused blacks to be grays. It was very hard to get true black on a CRT unless it was off and had been for some time. It wasn't until you could actually have no light from a pixel that black was actually black.
Sony did a demo when they released their OLED monitors where they had the top of each monitor type side by side: CRT, LCD, OLED. The CRT was just gray while the OLED was actually black. To the point that I was thinking in my head that surely this is a joke and the OLED wasn't actually on. That's precisely when the narrator said "and just to show that the monitors are all on" as the video switched to a test pattern.
As for the true question you're getting at, TFA mentions things like color matrix, primaries, and transfer settings in the file. Depending on those values, the decoder makes decisions about the math used to calculate the output values. You can apply any of those settings to the same video and arrive at different results. Using the wrong ones will make your video look bad, so ensuring your file has the correct values is important.
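To make the "wrong values" point concrete, here's a toy sketch (full-range math, not real decoder code; the coefficients are just the standard published ones) showing the same YCbCr sample landing on different RGB depending on which matrix the decoder assumes:

    # same YCbCr sample, interpreted with two different color matrices
    def ycbcr_to_rgb(y, cb, cr, kr, kb):
        kg = 1.0 - kr - kb
        yn, pb, pr = y / 255.0, (cb - 128) / 255.0, (cr - 128) / 255.0
        r = yn + 2 * (1 - kr) * pr
        b = yn + 2 * (1 - kb) * pb
        g = (yn - kr * r - kb * b) / kg
        clamp = lambda v: max(0, min(255, round(v * 255)))
        return clamp(r), clamp(g), clamp(b)

    sample = (110, 90, 180)
    print(ycbcr_to_rgb(*sample, kr=0.299,  kb=0.114))   # decoded as BT.601
    print(ycbcr_to_rgb(*sample, kr=0.2126, kb=0.0722))  # decoded as BT.709 -> a visibly different color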
From TFA: https://gist.github.com/arch1t3cht/b5b9552633567fa7658deee5a...
For what it's worth, the display I liked best was a monochrome terminal, a VT220. Let me explain: a CRT does not really have pixels as we think of them on a modern display, but a color CRT does have a shadow mask, which is nearly the same thing. However, a monochrome CRT (as found in a terminal or oscilloscope) has no shadow mask. The text on those VT220s was tight; it was a surprisingly good reading experience.
If so, try this: https://gregbenzphotography.com/hdr-gain-map-gallery/
Clicking the "Limit to SDR" and "Allow Full HDR (as supported)" should show a significant difference if you device supports HDR. If you don't see a difference then your device doesn't support HDR (or your browser)
For these images, there's a specific extension to JPEG where they store the original JPEG like you've always seen, plus a separate embedded gain map that adds brightness if the device supports it. That's for stills (JPEGs), not video, but the "on the wire" difference is that gain map.
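Very roughly (a hand-wavy sketch of the idea, not the actual spec math; the parameter names and the max_boost simplification are mine), applying that gain map on an HDR-capable device looks something like:

    import numpy as np

    # sdr_linear: the normal JPEG, decoded and converted to linear light
    # gain_log2:  the embedded gain map, one log2 "boost" value per pixel
    # headroom:   how much brighter than SDR white the display can go (1.0 = none)
    # max_boost:  stand-in for the boost range the real format carries as metadata
    def apply_gain_map(sdr_linear, gain_log2, headroom, max_boost=4.0):
        # scale the boost down if the display can't show the full range
        w = np.clip(np.log2(headroom) / np.log2(max_boost), 0.0, 1.0)
        return sdr_linear * np.exp2(gain_log2 * w)

On an SDR display (headroom = 1.0) the weight is zero and you just get the base JPEG back, which is why the format degrades gracefully.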
I'm not an expert, but for videos, ATM, afaict, they switched to 10 bits (SDR is 8 bits) and added metadata to map those 10 bits to values above "white", where white = 100 nits. That metadata (PQ or HLG) can map those 10 bits up to 10,000 nits.
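For the PQ case specifically, the mapping from a code value to nits is a fixed curve (the SMPTE ST 2084 EOTF). A quick sketch using the published constants, together with the 64/940 limited-range video levels quoted elsewhere in the thread:

    # SMPTE ST 2084 (PQ) EOTF: normalized signal in [0, 1] -> absolute luminance in nits
    M1, M2 = 0.1593017578125, 78.84375
    C1, C2, C3 = 0.8359375, 18.8515625, 18.6875

    def pq_to_nits(e):
        p = e ** (1 / M2)
        y = max(p - C1, 0.0) / (C2 - C3 * p)
        return 10000.0 * y ** (1 / M1)

    def code10_to_signal(code):
        # 10-bit limited-range video levels: 64 = black, 940 = nominal peak
        return min(max((code - 64) / (940 - 64), 0.0), 1.0)

    print(pq_to_nits(code10_to_signal(940)))  # 10000 nits (nominal peak)
    print(pq_to_nits(code10_to_signal(502)))  # ~92 nits: half the code range still covers "SDR-ish" brightness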
1. A larger color space, allowing for more colors (through different color primaries) and a higher brightness range (through a different gamma function)
2. Metadata (either static or per-scene or per-frame), like a scene's peak brightness or concrete tonemapping settings, which can help players and displays map the video's colors to the set of colors the display can actually show.
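As a toy example of what that metadata enables (this is just one simple curve, an "extended Reinhard" rolloff, not what any particular player or TV actually does): knowing the scene's peak lets a player compress highlights into the display's range instead of hard-clipping them.

    # toy tonemap: squeeze [0, scene_peak] nits into [0, display_peak] nits,
    # with a smooth rolloff so the scene peak lands exactly on the display peak
    def tonemap(nits, scene_peak, display_peak):
        l = nits / display_peak
        lw = scene_peak / display_peak
        return display_peak * (l * (1 + l / (lw * lw))) / (1 + l)

    for n in (50, 200, 1000, 4000):
        # e.g. a scene mastered to 4000 nits, shown on a 600-nit panel
        print(n, "->", round(tonemap(n, scene_peak=4000, display_peak=600), 1))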
I actually have a more advanced but more compact "list of resources" on video stuff in another gist; that has a section on color spaces and HDR:
https://gist.github.com/arch1t3cht/ef5ec3fe0e2e8ae58fcbae903...
If you expand limited YCbCr to a large HDR range you'll get a "blurred" output.
Imagine converting a 1-bit image (0 or 1, black or white pixels) to full-range HDR RGB - it's still black and white.
> In 10 bits per sample, Rec. 2020 uses video levels where the black level is defined as code 64 and the nominal peak is defined as code 940. Codes 0–3 and 1,020–1,023 are used for the timing reference. Codes 4 through 63 provide video data below the black level, while codes 941 through 1,019 provide video data above the nominal peak.
https://en.wikipedia.org/wiki/Rec._2020
Compare to
Edit: From some googling, it looks like encoding is encoding, whether it's used for recording or rendering footage. In that case the same quality arguments the article is making should apply to recording too. I only did a cursory search though, and have not had a chance to test, so if anyone knows better, feel free to respond.
GPU acceleration could be used to accelerate a CPU encode in a quality-neutral way, but NVENC and the various other HW accelerators available to end users are designed for realtime encoding for broadcast or for immediate storage (for example, to an SD card).
For distribution, you can either distribute the original source (if bandwidth and space are no concern), or you can ideally encode in a modern, efficient codec like HEVC (x265) or AV1. AV1 might be particularly useful if you have a noisy source, since denoising and classifying the noise is part of the algorithm. The reference software encoders are considered the best-quality, but often the slowest, options.
GPU is best if you need to transcode on the fly (for Plex), or you want to make a working copy for temporary distribution before a final encode.
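For what that looks like in practice, a rough sketch (the file names are made up, and the exact NVENC options vary by ffmpeg build and GPU):

    import subprocess

    src = "master.mov"  # hypothetical source file

    # software x265 encode: slow, but the better quality-per-bit choice for distribution
    subprocess.run(["ffmpeg", "-i", src, "-c:v", "libx265", "-preset", "slow",
                    "-crf", "20", "-c:a", "copy", "dist_hevc.mp4"], check=True)

    # NVENC hardware encode: much faster (realtime), fine for a working copy or live use
    subprocess.run(["ffmpeg", "-i", src, "-c:v", "hevc_nvenc", "-preset", "p5",
                    "-cq", "22", "-c:a", "copy", "working_nvenc.mp4"], check=True)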
ffmpeg seems ridiculously complicated, but in fact it's amazing the amount of work that happens under the hood when you do
ffmpeg -i input.mp4 output.webm
and tbh they've made the interface about as smooth as can be given the scope of the problem.

Grump grump grumpity grump. Same experience with every dashcam I've bought over the years.
I think it might be one of those classic “everyone should just get good like me” style opinions you find polluting some subject matter communities.
VLC was how you could get any movie to work (instead of messing with all these codecs, which apparently, per another comment in this thread, aren't really codecs).
The idea that YCbCr is only here because of "legacy reasons", and that we only discard half of the chrominance because of equally "legacy reasons", is bonkers, though.
Similarly, chroma subsampling is motivated by psychovisual aspects, but I truly believe that enforcing it on a format level is just no longer necessary. Modern video encoders are much better at encoding low-frequency content at high resolutions than they used to be, so keeping chroma at full resolution with a lower bitrate would get you very similar quality but give much more freedom to the encoder (not to mention getting rid of all the headaches regarding chroma location and having to up- and downscale chroma whenever needing to process something in RGB).
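For anyone unfamiliar with what 4:2:0 actually throws away, here's a toy round trip (real codecs use proper filters and have chroma-siting rules, which is exactly the headache mentioned above):

    import numpy as np

    def subsample_420(chroma):
        # 4:2:0 keeps one chroma sample per 2x2 block of luma samples (here: a plain average)
        h, w = chroma.shape
        return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

    def upsample_nearest(chroma):
        # crude nearest-neighbour reconstruction back to full resolution
        return np.repeat(np.repeat(chroma, 2, axis=0), 2, axis=1)

    cb = np.random.rand(8, 8)                 # a stand-in full-resolution chroma plane
    roundtrip = upsample_nearest(subsample_420(cb))
    print(np.abs(cb - roundtrip).max())       # chroma detail lost before the encoder even runs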
Regarding the tone of the article, I address that in my top-level comment here.
It really isn't. You have to scroll 75% of the way through the document before it tells you what to actually type in. Everything before that (9,000+ words) is just ranty exposition that might be relevant, but is hardly "quick".