Posted by nnx 12 hours ago
Coalgebras might seem too academic, but so were monads at some point, and now they are everywhere.
https://github.com/ralusek/streamie
allows you to do things like
infiniteRecords
  .map(item => doSomeAsyncThing(item), { concurrency: 5 });
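For illustration, here is a minimal sketch of what a concurrency-limited async map can look like under the hood. This is a hypothetical helper, not streamie's actual internals: a pool of workers pulls from a shared iterator, so at most `concurrency` calls are in flight at once.

```typescript
// Hypothetical sketch of a concurrency-limited async map
// (in the spirit of streamie's { concurrency } option, not its real code).
async function mapConcurrent<T, R>(
  items: Iterable<T>,
  fn: (item: T) => Promise<R>,
  concurrency: number,
): Promise<R[]> {
  const results: R[] = [];
  const iter = items[Symbol.iterator]();
  let index = 0;

  // Each worker pulls the next item synchronously (so iter.next() and
  // index++ stay in lockstep on the single JS thread), then awaits fn.
  async function worker(): Promise<void> {
    for (;;) {
      const next = iter.next();
      if (next.done) return;
      const i = index++;
      results[i] = await fn(next.value);
    }
  }

  await Promise.all(Array.from({ length: concurrency }, worker));
  return results;
}
```

Results land at their original index, so output order is preserved even when items finish out of order.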
And then, because I found that I often want to switch between batching items vs. dealing with single items:

infiniteRecords
  .map(item => doSomeAsyncSingularThing(item), { concurrency: 5 })
  .map(groupOf10 => doSomeBatchThing(groupOf10), { batchSize: 10 })
  // Can flatten back to single items
  .map(item => backToSingleItem(item), { flatten: true });

The objection is
> The Web streams spec requires promise creation at numerous points — often in hot paths and often invisible to users. Each read() call doesn't just return a promise; internally, the implementation creates additional promises for queue management, pull() coordination, and backpressure signaling.
But that's 95% manageable by altering buffer sizes.
And as for that last 5%....what are you doing with JS to begin with?
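Concretely, "altering buffer sizes" means raising the queue's high-water mark so the internal pull()/promise overhead amortizes over a bigger buffer instead of happening once per chunk. A sketch, assuming a runtime with WHATWG streams (Node 18+ or the browser); the counter source is just a stand-in producer:

```typescript
// Hypothetical producer; the point is the queuing strategy, not the source.
let n = 0;
const stream = new ReadableStream<number>(
  {
    pull(controller) {
      // With highWaterMark: 1024, the runtime keeps calling pull()
      // until 1024 chunks are buffered, rather than once per read().
      controller.enqueue(n++);
    },
  },
  new CountQueuingStrategy({ highWaterMark: 1024 }),
);
```

Each `read()` still returns a promise, but far fewer internal round-trips happen per chunk consumed.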
I may be naive in asking this, but what leads someone to build high-perf data tools in JS? JS doesn't seem to me like the tool of choice for such things.
Performance-wise, I get about half the throughput I had with the same processing done in Rust, which doesn't change anything for my use case.
However, that's not really relevant to the context of the post, as I'm using Node.js streams, which are both saner and fast. I'm guessing the post is relevant to people using server-side runtimes that only implement web streams.
To your question, I was about to point out Firefox[1], but realized you clarified 'mainstream'[2]...
right now when i need to wrangle bytes, i switch languages to Golang. it’s an easy gc’d language, and all its IO is built around a BYOB api:

interface Reader { read(b: Uint8Array): [number, Error?] }

you pass in your own Uint8Array allocation (in go terms, a []byte); the reader fills at most the entire thing and returns (bytes filled, error). it’s a fully pull-based stream api with one method at its core. the api gets to be that simple because it’s always sync, and blocks until the reader can fill data into the buffer or returns an error indicating no data is available right now.
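That contract is easy to mimic over an in-memory source. A hypothetical TypeScript rendering — synchronous, so it sketches only the shape of the API, not real blocking IO:

```typescript
type ReadResult = [number, Error | null];

interface Reader {
  // Fills at most b.length bytes of the caller-owned buffer;
  // returns [bytes filled, error].
  read(b: Uint8Array): ReadResult;
}

// Hypothetical in-memory reader; real IO would have to block or await.
class BytesReader implements Reader {
  private pos = 0;
  constructor(private readonly data: Uint8Array) {}

  read(b: Uint8Array): ReadResult {
    if (this.pos >= this.data.length) return [0, new Error("EOF")];
    const n = Math.min(b.length, this.data.length - this.pos);
    b.set(this.data.subarray(this.pos, this.pos + n));
    this.pos += n;
    return [n, null];
  }
}
```

The caller reuses one buffer across reads, which is where the zero-allocation hot loop comes from.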
go has a TeeReader with no buffering - it too just blocks until it can write to the forked stream.
https://pkg.go.dev/io#TeeReader
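In the same spirit, a tee over that one-method interface is only a few lines. A hypothetical synchronous sketch mirroring io.TeeReader's no-buffering behavior (the sync version sidesteps the coordination a real async tee would need):

```typescript
type ReadResult = [number, Error | null];
interface Reader { read(b: Uint8Array): ReadResult; }
interface Writer { write(b: Uint8Array): void; }

// Hypothetical TeeReader: no internal buffer; every successful read
// is mirrored to the writer before returning to the caller.
function teeReader(r: Reader, w: Writer): Reader {
  return {
    read(b: Uint8Array): ReadResult {
      const [n, err] = r.read(b);
      if (n > 0) w.write(b.subarray(0, n));
      return [n, err];
    },
  };
}
```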
we can’t do the same api in JS, because go gets to insert `await` wherever it wants with its coroutine/goroutine runtime. but we can dream of such simplicity combined with zero allocation performance.
This is what UDP is for. Everything actually has to be async all the way down and since it’s not, we’ll just completely reimplement the OS and network on top of itself and hey maybe when we’re done with that we can do it a third time to have the cloud of clouds.
The entire stack we’re using right down to the hardware is not fit for purpose and we’re burning our talent and money building these ever more brittle towering abstractions.
A stream API can layer over UDP as well (reading in order of arrival with packet-level framing), but such a stream would be a bit weird and incompatible with many stream consumers (e.g. [de]compression). A UDP API is simpler and more naturally event (packet) oriented. The concepts don’t mix well.
Still, it would be nice if the browser supported a UDP API instead of the weird and heavy DTLS and QUIC imitations.
I was trying to be open minded about that and conceive a stream API over a UDP socket. It’d work IMHO, but be a little odd compared to an event-like API.
Sadly it will never happen. WebAssembly failed to keep some of its promises here.
classic case of not using an await before your promise