Posted by todsacerdoti 10/28/2024
There is already an RTP protocol [1] for UDP media streaming, and that is already frequently confused with RDP [2], for GUI remoting.
There is also an rtp:// URI scheme in use in various media applications [3, 4] to identify streaming endpoints.
So I'd kindly request not to increase the confusion even more.
[1] https://en.wikipedia.org/wiki/Real-time_Transport_Protocol
[2] https://en.wikipedia.org/wiki/Remote_Desktop_Protocol
[3] https://ffmpeg.org/ffmpeg-protocols.html#rtp
[4] https://wiki.2n.com/faqaudio/en/rtp-destinations-how-to-test...
It's even stranger (to me) that they picked an existing protocol's name, given that they never even define "RTP" in their paper.
If it's just arbitrary letters, might as well avoid the collision. If it isn't arbitrary letters, please, if you're describing the protocol, you should start by describing the damn acronym.
I think if you don't insist on them making sense as an acronym, there are still some interesting 3-letter combinations free (if you avoid the obvious minefields, that is).
E.g., I've never heard of qqq:// . There is a QQQ Trust apparently, but no network protocol AFAIK.
And since this proposed protocol operates over TCP, there's relatively little that can be done to achieve the performance goals vs what you can already do with HTTP.
And because "everything" already speaks HTTP, you can get pretty close to max performance just via client side intelligence talking to existing backend infrastructure, so there's no need to try to get people to adopt a new protocol. Modern CDNs have gobs of endpoints worldwide.
A relatively simple client can do enough range requests in parallel to saturate typical last-mile pipes, and more intelligent clients can do fancy things to get max performance.
For example, some clients will do range requests against all IPs returned from DNS resolution to detect which servers are "closer" or less busy, and for really large downloads, they'll repeat this throughout the download to constantly meander towards the fastest sources. Another variation (which might be less common these days), is if the initial response is a redirect, it may imply redirects are being used as a load distribution mechanism, so again clients can ask again throughout the download to see if a different set of servers gets offered up as potentially faster sources. Again, all of this works today with plain old HTTP.
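A rough sketch of the "relatively simple client" idea, using only the Python standard library. The function names, the chunk count, and the assumption that the total size is already known (for example from an earlier HEAD request) are illustrative, not taken from any particular download tool, and there is no retry or mirror-selection logic here.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request


def split_ranges(total_size, parts):
    """Split [0, total_size) into at most `parts` contiguous, inclusive byte ranges."""
    if total_size <= 0 or parts <= 0:
        return []
    chunk = -(-total_size // parts)  # ceiling division
    return [(start, min(start + chunk, total_size) - 1)
            for start in range(0, total_size, chunk)]


def fetch_range(url, start, end):
    """Fetch one inclusive byte range; insist on 206 Partial Content."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        if resp.status != 206:
            raise RuntimeError(f"server ignored the Range header (status {resp.status})")
        return resp.read()


def parallel_download(url, total_size, parts=4):
    """Download all ranges concurrently and reassemble them in order."""
    ranges = split_ranges(total_size, parts)
    if not ranges:
        return b""
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        # pool.map yields results in the order of `ranges`, so a plain join is safe.
        return b"".join(pool.map(lambda r: fetch_range(url, *r), ranges))
```

A smarter client would time each range and shift later requests toward whichever server answered fastest; that part is left out of this sketch.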
Since then I've started to check the sums after downloading, just to be sure.
I wish every binary format would include a hash of the content.
Also this is something that can be in HTTP – it's kind of silly I need to manually download a separate sum file and run a command to check it. Servers can send a header, and user agents can verify the hash. I don't know why this isn't part of HTTP already, because it seems pretty useful to me.
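For reference, the manual check described above is only a few lines when done in code. This is a minimal sketch assuming the publisher provides a SHA-256 hex digest; `sha256_of_file` and `verify_download` are made-up helper names.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB blocks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True if the file's SHA-256 matches the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```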
I’m guessing that for your very large file download you had an unusually high number of corrupted TCP packets and some of those were extra unlucky and still had valid checksums.
Also, it's quite possible that the HTTP client didn't even know that the download failed: a common pattern is for the server to send no Content-Length at all, and simply close the connection when it's done sending all of the traffic (i.e. set the TCP FIN flag on the last data packet). If the server decides to abandon the connection early for any reason, then it will... close the connection - which the client will just interpret as the end of the body, and have no idea that the file failed to download fully.
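To make the failure mode concrete, here is a small sketch of the check a client can (and cannot) do. `body_complete` is a hypothetical helper; it takes the response headers as a plain dict plus the number of body bytes actually read.

```python
def body_complete(headers, bytes_received):
    """Return True or False when completeness can be judged, None when it cannot.

    With a Content-Length header, a short read is detectable. Without one
    (and without chunked encoding), the body ends when the connection closes,
    so an early close is indistinguishable from a finished download.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    if "chunked" in lowered.get("transfer-encoding", "").lower():
        # Chunked bodies signal their end with a zero-size chunk;
        # this sketch does not inspect that framing.
        return None
    length = lowered.get("content-length")
    if length is None:
        return None
    return bytes_received == int(length)
```

The `None` case is exactly the situation described above: the client has nothing to compare against, which is one more reason to verify a hash afterwards.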
It could probably be improved, but HTTP does support this already:
Commenting on this proposal directly, I don't see how a stateful protocol could ever be simpler than a subset of HTTP/1.1 with range requests.
I remember hearing that range requests are clunky to implement for HTTP (reverse) proxies, but otherwise they seem to do their job just fine?
A pure proxy (reverse or not) should have no problem with a range request.
They really should be big-endian, because that’s network byte order.
IMHO it makes sense to use decimal-encoded ASCII digits instead and keep the protocol readable. Nothing like ‘telnet host.example 80’ followed by ‘GET / HTTP/1.0.’
> (1 bit) request_type: integer representing type of request
With two types already defined. No room for future extensions, then. Is the idea to just use another protocol altogether if change is necessary?
That's not a good reason. It's the byte order used for some network protocols, but definitely not all. And given that protocols aren't interchangeable there's no advantage from having all protocols use the same endianness.
Little endian makes way more sense today because all modern computers are little endian. The network itself doesn't care what you use. It's just a stream of bytes to the network.
I'm all for silly names, but I think this one went a little too much into obscure references and metaphors.
The 80 in port 80 is not ASCII encoded on the wire. That's a UI feature of telnet and/or your OS. (The 1.0 in HTTP is, though.)
Yup! You’re exactly right. I honestly thought that was obvious in context.
There’s something just right about being able to manually connect to a web server and run queries against it, with very little in the way of tooling to do so. Technically, of course, both telnet(1) and nc(1) are tools, and even one’s TCP/IP stack and OS are, too, which is why I write ‘very little.’ It’s a heck of a lot fewer tools, and more general tools, than JSON-over-HTTP or RTP prober.
And — please — do not refer to me as ‘they.’ I am not a collective, but rather a human being. I find being referred to as ‘they’ to be profoundly offensive.
I can't seriously believe you're not trolling with this. The use of singular "they" to refer to an individual of unknown gender has been a thing for literally centuries.
Some people, like me, grew up on the Internet assuming every person they interacted with was male by default, because it was a statistically sound assumption. We would then use the right pronoun if corrected (unless we thought you might be a G.I.R.L.) Some people, like the person you're ostensibly trolling, do not make this assumption, and just use the singular, gender-neutral "they", which is arguably the nicer, more humane approach anyway.
Being offended by such an approach just smacks of trying to "own the libs", or just otherwise being contrarian for the sake of it. Neither involves being genuinely offended by anything, and is instead a transparent piece of performative posturing. Please don't bring that here.
The most compelling reason for big endian is that it's easier to read in hex dumps.
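A quick illustration of the hex dump point, using Python's `struct` module; the value 0x12345678 is just an example.

```python
import struct

value = 0x12345678

big = struct.pack(">I", value)     # big-endian ("network byte order")
little = struct.pack("<I", value)  # little-endian

print(big.hex())     # 12345678
print(little.hex())  # 78563412
```

In the big-endian dump the bytes appear in the same order you would write the number; in the little-endian dump you have to reverse them in your head.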
So it's with some amusement that I see this new protocol is named RTP, which clashes perfectly with the RTP that represents how the Net heads and Bell people finally came to terms with each other. What next, SIP for Super Internet Protocol?
It’s odd that someone would propose a new protocol but not check the IANA port number registry for whether the name is already in use. Surely you’ll eventually want a port for your new protocol when it’s massively successful.
> RTP is commonly used as an abbreviation for the Real-time Transport Protocol, and is not descriptive enough. This is a protocol for reliably downloading large files. It is not designed to be a drop-in replacement for HTTP or BitTorrent.
> I am currently drafting a successor proposal that addresses these issues.
https://www.cisco.com/c/en/us/support/docs/ip/enhanced-inter...
And, why not call it RDLF (reliably download large files protocol) instead of RTP? And what does RTP stand for in this case anyway?
* For the variable length data (uri in open request, error payload, etc) you'll want to specify a length in the message. It allows for more efficient parsing (knowing a buffer size up front) and prevents some security(ish) issues - e.g. I could just send an open followed by a uri that follows the scheme but is essentially gibberish until the connection is closed by the server (or the server chokes on it somehow).
You may also want to consider a length field for READ responses - in the specified range where the server has all of the resource this is redundant, but if the server doesn't have all the resource, it allows the client to request from another at an adjusted offset even while still receiving the values from the first READ.
* Tokens: If I'm understanding the draft (incl errata) correctly you're using tokens as handles for message pairs, additionally one of the tokens is used to associate the open request and read request, and they are all chosen by the client.
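A minimal sketch of what a length-prefixed variable field could look like. The 2-byte big-endian prefix and the 4096-byte cap are assumptions chosen for illustration; the draft does not specify either, and the helper names are made up.

```python
import struct

MAX_FIELD_LEN = 4096  # hypothetical cap; a real spec would pick its own limit


def encode_field(data: bytes) -> bytes:
    """Prefix `data` with its length as a 2-byte big-endian integer."""
    if len(data) > MAX_FIELD_LEN:
        raise ValueError("field too long")
    return struct.pack(">H", len(data)) + data


def decode_field(buf: bytes):
    """Return (field, remaining_bytes), or None if more bytes are needed."""
    if len(buf) < 2:
        return None
    (length,) = struct.unpack_from(">H", buf)
    if length > MAX_FIELD_LEN:
        # Reject oversized fields up front instead of buffering until close.
        raise ValueError("declared length exceeds limit")
    if len(buf) < 2 + length:
        return None
    return buf[2:2 + length], buf[2 + length:]
```

Because the receiver learns the size after two bytes, it can allocate once and refuse an oversized uri immediately rather than reading gibberish until the connection drops.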
Like this:
Client                                               Server
Open  ------- new_token(val1) --------------------->
      <------ request_token(val1) ------------------ OpenResponse
Read  ------- new_token(val2), open_token(val1) --->
      <------ request_token(val2) ------------------ ReadResponse
I'd suggest making the payload of the OPEN response be a server chosen open_token. Having the client manage all of the token values forces the server to track tokens and sender IP or other similar uniquely identifying information. It also opens the door for various token collision and/or exhaustion attacks.
* Specify various edge cases well (client closes connection early, connection breaks, etc), because they will have interop consequences, and affect server design (e.g. how tokens are handled in the program) too.
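A sketch of the server-chosen open_token idea from the Tokens point, as it might look on the server side. The class name, the 16-byte token size, and the `max_open` limit are all assumptions for illustration and do not come from the draft.

```python
import secrets


class OpenTable:
    """Server-side table mapping server-chosen open tokens to resources."""

    def __init__(self, max_open=1024):
        self._open = {}
        self._max_open = max_open  # bounds memory use against exhaustion attempts

    def open(self, uri):
        """Register a resource and return a fresh, unpredictable token for it."""
        if len(self._open) >= self._max_open:
            raise RuntimeError("too many open resources")
        token = secrets.token_bytes(16)  # random, so clients cannot collide or guess
        self._open[token] = uri
        return token

    def lookup(self, token):
        """Return the uri for a token, or None if the token is unknown."""
        return self._open.get(token)

    def close(self, token):
        """Forget a token; closing an unknown token is a no-op."""
        self._open.pop(token, None)
```

Since the server picks the value, it does not need the sender's IP to disambiguate two clients that happened to choose the same token.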
I'd expect the "one" protocol to be able to sync files, especially if they are "large", as advertised. In other words, instead of the special case "transfer", implement a universal "sync" to "rule them all".