Did the site get the HN kiss of death?
I, too, was reading about the new Epstein files, wondering what text artifact was causing things to look like that.
https://nitter.net/AFpost/status/2017415163763429779?s=201
Something clearly went wrong in the process.
I'm glad to know the real reason!
I wonder why there's even a max line length limit in the first place? I.e., is it there for a technical reason, or is it just display-related?
I wonder if the person who had the idea of virtualizing the typewriter carriage knew how much trouble they would cause over time.
It would’ve been far less messy to make printers process linefeed like \n acts today, and omit the redundant CR. Then you could still use CR for those overstrike purposes but have a 1-byte universal newline character, which we almost finally have today now that Windows mostly stopped resisting the inevitable.
I've been trying to get Visual Studio to stop mucking with line endings and encodings for years. I've searched and set all the relevant settings I could find, including using a .editorconfig file, but it refuses to be consistent. Someone please tell me I'm wrong and there's a way to force LF and UTF-8 no-BOM for all files all the time. I can't believe how much time I waste on this, mainly so diffs are clean.
How far can you get with setting core.autocrlf on your machine? See https://git-scm.com/book/en/v2/Customizing-Git-Git-Configura...
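In particular, the combination that usually gets recommended (assuming the goal is LF in the repo no matter what the editor does) is a .gitattributes checked into the repo plus autocrlf=input locally. Sketching from memory, so verify against the docs linked above:

    # .gitattributes, committed to the repo (overrides everyone's local settings)
    * text=auto eol=lf

    # local one-time setup: strip CR on commit, leave files alone on checkout
    git config --global core.autocrlf input

It won't stop Visual Studio from writing CRLF or a BOM into your working copy, but it does keep what gets committed, and therefore the diffs, clean.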
Now, if you want to use CR by itself for fancy overstriking etc. you'd need to put something else into the character stream, like a space followed by a backspace, just to kill time.
In any event, wouldn't you have to either buffer or use flow-control to pause receiving while a CR was being processed? You wouldn't want to start printing the next line's characters in reverse while the carriage was going back to the beginning.
My suspicion is there was a committee that was more bent on purity than practicality that day, and they were opposed to the idea of having CR for "go to column 0" and newline for "go to column 0 and also advance the paper", even though it seems extremely unlikely you'd ever want "advance the paper without going to column 0" (which you could still emulate with newline + tab or newline + 43 spaces for those exceptional cases).
If you look at the schematics for an ASR-33, there are just two transistors in the whole thing (https://drive.google.com/file/d/1acB3nhXU1Bb7YhQZcCb5jBA8cer...). Even the serial decoding is done electromechanically (per https://www.pdp8online.com/asr33/asr33.shtml), and the only "flow control" was that if you sent XON, the teletype would start the paper tape reader -- there was no way, as far as I can tell, for the teletype to ask the sender to pause while it processes a CR.
These things ran at 110 baud. If you can't do flow control, your only option if CR takes more than 1/10th of a second is to buffer... but if you can't do flow control, and the computer continues to send you stuff at 110 baud, you can't get that buffer emptied until the computer stops sending, so each subsequent CR will fill your buffer just a little bit more until you're screwed. You need the character following CR (which presumably takes about 2/10ths of a second) to be a non-printing character... so splitting out LF as its own thing gives you that and allows for the occasional case where doing a linefeed without a carriage return is desirable.
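Back-of-the-envelope, using my assumed numbers (the ASR-33 frames each character as 11 bits: 1 start, 8 data, 2 stop):

    import math

    # 110 baud / 11 bits per character = 10 chars/sec, i.e. 100 ms per character
    BAUD, BITS_PER_CHAR = 110, 11
    char_time_ms = 1000 * BITS_PER_CHAR / BAUD   # 100.0 ms
    cr_travel_ms = 200                           # assumed worst-case carriage return time
    # non-printing character slots needed after CR so nothing prints mid-flight;
    # the LF covers the first slot for free
    fill_after_lf = math.ceil(cr_travel_ms / char_time_ms) - 1   # = 1 extra NUL

One extra fill character after the LF, which lines up nicely with the NUL-after-CR workaround below.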
Curious Marc (https://www.curiousmarc.com/mechanical/teletype-asr-33) built a current loop adapter for his ASR-33, and you'll note that one of the features is "Pin #32: Send extra NUL character after CR (helps to not loose first char of new line)" -- so I'd guess that on his old and probably worn-out machine, even sending LF after CR doesn't buy enough time and the next character sometimes gets "lost" unless you send a filler NUL.
Now, I haven't really used serial communications in anger for over a decade, and I've never used a printing terminal, so somebody with actual experience is welcome to come in and tell me I'm wrong.
But see, that's why I think there has to be more to it. That extra LF character wouldn't be enough to satisfy the timing requirements, so you'd also need to send NUL to appropriately pad the delay time. And come to think of it, the delay time would be proportional to the column the carriage was on when you sent the CR, wouldn't it? I guess it's possible that it always went to the end but that seems unlikely, not least because if that were true then you'd never need to send CR at all, just send NUL or space until you calculated it was at EOL.
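Which is roughly what the old Unix tty drivers' CR-delay settings did, if I remember right. A hypothetical sketch of column-proportional padding (all the timing numbers and the function itself are made up):

    # Hypothetical: emit CR + LF plus enough NUL padding for the carriage
    # to return from `column`, assuming a full-width return costs max_return_ms.
    def cr_with_padding(column, width=72, max_return_ms=200, char_ms=100):
        travel_ms = max_return_ms * column / width      # proportional to distance
        pads = max(0, round(travel_ms / char_ms) - 1)   # the LF slot kills 100 ms already
        return b"\r\n" + b"\x00" * pads

    cr_with_padding(72)   # b'\r\n\x00' - full-width return needs one extra NUL
    cr_with_padding(5)    # b'\r\n'     - short hop, the LF alone covers it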
Edit: yes, I think that's most likely what it is (and the limit is SHOULD 78 chars; MUST 998 chars). I was forgetting that the spec also mandates the CRLF usage, so it's not (necessarily) related to Windows at all here, as described in TFA.
Here it is in my 'notmuch-more' email lib: https://github.com/OJFord/amail/blob/8904c91de6dfb5cba2b279f...
The article doesn't claim that it's Windows-related. The article is very clear in explaining that the spec requires =CRLF (3 characters), then mentions (in passing) that CRLF is the typical line ending on Windows, then speculates that someone replaced the two-character CRLF with a one-character newline, as on Unix and other OSs.
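For concreteness, a made-up example of what a QP soft break looks like on the wire; each trailing = plus the CRLF after it is removed on decode, rejoining the pieces into one logical line:

    This is one logical line that the enc=
    oder split with soft line breaks to s=
    tay under the 76-character limit.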
It's just so hacky, I can't believe it's a real-life solution.
Consider converting the original text (maintaining the author’s original line wrapping and indentation) to base64. Has anything been “inserted” into the text? I would suggest not. It has been encoded.
Now consider an encoding that leaves most of the text readable, translates some things based on a line length limit, and some other things based on transport limitations (e.g. passing through 7-bit systems.) As long as one follows the correct decoding rules, the original will remain intact - nothing “inserted.” The problem is someone just knowledgeable enough to be aware that email is human readable but not aware of the proper decoding has attempted to “clean up” the email for sharing.
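To make that concrete, a quick round trip through Python's quopri module (which implements the QP half of this):

    import quopri

    original = b"one deliberately long line, well past the 76-character limit, with no breaks in it"
    encoded = quopri.encodestring(original)
    # encoded now contains "=" + newline soft breaks to satisfy the length limit...
    assert quopri.decodestring(encoded) == original  # ...and decoding recovers the input exactly

Nothing was inserted into the text, only into its encoded form.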
Infinite line length = infinite buffer. Even worse, QP is 7-bit (because SMTP started out ASCII-only), so every byte >127 gets encoded as three bytes (an equals sign, then two hex digits); 500 bytes of non-ASCII UTF-8 balloon to 1500 on the wire.
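A quick check of that math with Python's quopri (250 two-byte UTF-8 characters, so 500 bytes, all of them >127):

    import quopri
    raw = ("é" * 250).encode("utf-8")   # 500 bytes of non-ASCII UTF-8
    enc = quopri.encodestring(raw)
    print(len(raw), len(enc))           # 500 -> ~1540: 3 encoded bytes per byte, plus "=" + newline soft breaks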
It all made sense at the time. Not so much these days when 7-bit pipes only exist because they always have.
But I agree with the sibling comment: it makes more sense when it's called "encoding" instead of "inserting chars into the original stream".
Digital communication is based on the postmen reading, transcribing, and copying your letters. There is a reason why digital communication is treated differently than letters by the law, and why the legally mandated secrecy for letters doesn't apply to emails.
cat title | sed 's/anyway/in email/'
would save a click for those already familiar with =20 etc.

On a side note: There are actually products marketed as kosher bacon (it's usually beef or turkey). And secular Jews frequently make jokes like this about our kosher bros who aren't allowed to eat the real stuff for some dumb reason like it has too many toes.
That said, there is a _possibly_ kosher pig: https://en.wikipedia.org/wiki/Babirusa#Relationship_with_hum...
Yeah clearly you guys are the biggest victims in all this... get in there and make it about you!