Posted by nathell 22 hours ago
A few possible solutions I explored:
1) You can try and consume (and possibly write to) the frame buffer directly. https://github.com/ddvk/remarkable2-framebuffer was my starting point. This gets you instant updates about what's going on. I guess you could pair this with speculative decoding to get a much faster output.
2) You can use the streaming API on the device to stream the screen to a beefy server, possibly over Tailscale, letting you do everything off device.
3) You can write your own Qt app; ddvk's repos are a good starting point here.
I ultimately had Claude write me my own app. It worked well enough to scratch my itch, though I never use it. But this was five months ago, an eternity in vibe hobby projects, so perhaps modern tooling would let me get it into shape and make it more usable. Basically, it worked, but VLMs weren't great at what I wanted -- decorating a blank page with a grimoire-style answer to a written question while leaving the original text alone -- and starting / stopping it from the pen UI is difficult.
Where my mind goes for your project is that I think it'd be nicest to keep a sort of Jupyter notebook somewhere as the canonical representation: it would hold your handwritten blocks, an interpretation of each, and then the output. Then a render layer would get it back onto the screen. At that point, I don't think I'd care very much if it's stored as a PDF on the device, which points back to having this be an app.
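To make the notebook idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Cell` and `Notebook` names, the `strokes_ref` field (a made-up pointer to wherever the raw handwriting lives), and the plain-text `render` output standing in for a real render layer.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cell:
    # Hypothetical reference to the raw handwritten strokes for this block.
    strokes_ref: str
    # Transcribed text (the "interpretation" of the handwriting).
    interpretation: str = ""
    # Result of evaluating the interpretation, if it has been run.
    output: Optional[str] = None


@dataclass
class Notebook:
    cells: List[Cell] = field(default_factory=list)

    def render(self) -> List[str]:
        """Toy render layer: one line per interpretation, one per output."""
        lines = []
        for index, cell in enumerate(self.cells):
            lines.append(f"[{index}] {cell.interpretation}")
            if cell.output is not None:
                lines.append(f"    => {cell.output}")
        return lines


notebook = Notebook([Cell("page1#block0", "(+ 1 2)", "3")])
rendered = notebook.render()
```

With this shape, the handwriting stays the source of truth while the interpretation and output can be regenerated, and the on-device file format becomes a detail of the render layer.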
Either way, it's fun to tinker! The RM line is very hackable, though I still wish it were even easier; the hardware just makes you think of so many possibilities.
> Why do it? It's so impractical!
Because you can and it's fun is always a perfectly valid answer here!
Was it exported from handwriting on the reMarkable? How did he include a link in the text that he wrote?
[0]: https://handwritten.danieljanus.pl/2022-10-01-hyperlinks-in-...
What I'd find interesting is a trace of those 14 seconds: how much is the reMarkable processing, how much is the Claude transcription, how much is the let-go start-up / processing, etc.
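A rough way to get such a breakdown is to wrap each stage in a timer and report each stage's share of the total. This is only a sketch: the `stage` and `breakdown` helpers are hypothetical, and the stage name in the usage line is a placeholder, not something taken from the article's pipeline.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name.
timings = {}


@contextmanager
def stage(name):
    """Time the enclosed block and add it to timings[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[name] = timings.get(name, 0.0) + elapsed


def breakdown(stage_seconds):
    """Map each stage to (seconds, fraction of the total)."""
    total = sum(stage_seconds.values())
    return {
        name: (secs, secs / total if total else 0.0)
        for name, secs in stage_seconds.items()
    }


# Usage: wrap each step of the pipeline, then inspect the shares.
with stage("placeholder_step"):
    pass
report = breakdown(timings)
```

If the on-disk notebook update dominates, as the FAQ quote further down suggests, the other stages would show up as small fractions here.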
The author did not find a way to trigger the file write earlier or more frequently.
Searching the article again, I see in the FAQ:
> Xochitl takes approximately 12 seconds to update the notebook on-disk
12 definitely isn't "several" in my understanding, but regardless, I guess there's little the author can do then.
I tried the model on your handwriting and it worked great. My handwriting is bad enough that it doesn't work :)
The challenge comes from the way the e-paper works. To turn a pixel from white to black (or vice versa) it needs multiple actual frames. The pixel data must also be packed in a specific format. Instructions for how many frames a single operation requires are coded in a wbf[0] file, which comes included with the OEM firmware.
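As a toy model of the idea (not the real wbf format or the real panel protocol), the sketch below looks up a per-frame sequence of drive operations for each pixel transition and packs each frame's operations into bytes. The operation codes, the four-frame table, and the 2-bits-per-pixel packing are all assumptions made up for illustration.

```python
# Hypothetical per-frame drive operations for one pixel.
NOOP, DARKEN, LIGHTEN = 0, 1, 2

# Hypothetical waveform table: (old, new) -> one operation per frame.
# A real table has many gray levels and depends on temperature and mode.
TOY_WAVEFORM = {
    ("white", "black"): [DARKEN] * 4,
    ("black", "white"): [LIGHTEN] * 4,
    ("white", "white"): [NOOP] * 4,
    ("black", "black"): [NOOP] * 4,
}


def frames_for_update(old_pixels, new_pixels):
    """Return a list of frames; each frame holds one op per pixel."""
    per_pixel = [TOY_WAVEFORM[(o, n)] for o, n in zip(old_pixels, new_pixels)]
    frame_count = max((len(seq) for seq in per_pixel), default=0)
    return [
        [seq[i] if i < len(seq) else NOOP for seq in per_pixel]
        for i in range(frame_count)
    ]


def pack_frame(ops):
    """Pack four 2-bit ops per byte, first pixel in the low bits (assumed)."""
    packed = bytearray()
    for start in range(0, len(ops), 4):
        byte = 0
        for offset, op in enumerate(ops[start:start + 4]):
            byte |= (op & 0b11) << (2 * offset)
        packed.append(byte)
    return bytes(packed)


frames = frames_for_update(
    ["white", "black", "white", "black"],
    ["black", "white", "white", "black"],
)
first_frame_bytes = pack_frame(frames[0])
```

The point is only that a single logical update expands into several physical frames, each of which has to be encoded the way the panel expects.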
The most commonly used approach is hooking into xochitl, since it handles all of the user facing stuff like the notebooks but also the actual drawing. This is somewhat brittle and tends to break with software updates, because all of the actual function addresses have to be updated as well.
I was excited to find waved[1], a C++ library that lets you drive the display directly through a sane API. Although it hasn't been updated for quite a while, it still works and you can compile and run it yourself.
Since I was interested in driving the display myself, I tried rewriting waved in Zig. It works - I can now get pixels up on the screen. Unfortunately the code is a mess, and its only redeeming quality (stemming from being written in Zig) is that I can cross-compile a statically linked binary that 'just werks' with the Zig toolchain as the only dependency. For debugging purposes, I also implemented an SDL emulator for the display. [2]
While writing this, I stumbled upon a recent Rust implementation of the same ideas. Nice. [3]
Since the xochitl display implementation is the most optimized, that's probably the reason why it's being used most commonly*, even though it might be 'ugly'.
* Citation needed
[0] https://gitlab.com/zephray/glider#understanding-waveform
[1] https://github.com/matteodelabre/waved
I really want to try reverse engineering xochitl further now that I've heard AI tools have gotten so good at it.
Man, I really wish that one day I could run a mainline Linux kernel on this tablet with the display just working.
Programming with handwritten code on tablets is still a pain, when a tablet would be ideal as digital executable paper.
You must be a Recurser, right? This is so very Recurser.