Posted by imtomt 9 hours ago

Show HN: Building a web server in assembly to give my life (a lack of) meaning (github.com)
This is ymawky, a static file web server for MacOS written entirely in ARM64 assembly. It supports GET, PUT, DELETE, HEAD, and OPTIONS requests, and supports Range: bytes=X-Y headers (which allows scrubbing for video streaming). It decodes percent-encoded URLs, strictly enforces docroot, serves custom error pages for any HTTP error response, supports directory listing, and has (some) mitigations against slowloris-like attacks.

I’ve also written a more detailed writeup here: https://imtomt.github.io/ymawky/
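
For readers unfamiliar with the Range: bytes=X-Y support mentioned in the post, here is a minimal Python sketch of how a single byte range can be resolved against a file size. It is not ymawky's implementation (which is ARM64 assembly); the function name parse_byte_range and the choice to return None for both malformed and unsatisfiable ranges are assumptions made for this illustration, and multi-range requests are not handled.

```python
from typing import Optional, Tuple


def _is_number(text: str) -> bool:
    # Accept plain ASCII digits only, so int() below cannot raise.
    return text.isascii() and text.isdigit()


def parse_byte_range(value: str, size: int) -> Optional[Tuple[int, int]]:
    """Resolve a Range header value such as 'bytes=0-99' for a file of `size` bytes.

    Returns inclusive (start, end) offsets, or None if the value is
    malformed or cannot be satisfied. (Hypothetical helper for illustration.)
    """
    unit, sep, spec = value.strip().partition("=")
    if unit.strip().lower() != "bytes" or not sep or "," in spec:
        return None
    first, dash, last = spec.strip().partition("-")
    if not dash:
        return None
    if first == "":
        # Suffix form 'bytes=-N': the final N bytes of the file.
        if not _is_number(last) or int(last) == 0 or size == 0:
            return None
        count = min(int(last), size)
        return size - count, size - 1
    if not _is_number(first):
        return None
    start = int(first)
    if start >= size:
        return None
    if last == "":
        # Open-ended form 'bytes=X-': from X to the end of the file.
        return start, size - 1
    if not _is_number(last) or int(last) < start:
        return None
    # Clamp the end offset to the last byte of the file.
    return start, min(int(last), size - 1)
```

A video player scrubbing to the middle of a 1000-byte file might send bytes=500-, which this sketch resolves to offsets 500 through 999; a server would then typically answer with 206 Partial Content and a matching Content-Range header.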

298 points | 139 comments
dalleh 6 hours ago|
With the bubble of LLMs, these projects are really appreciated. Keep up the good work!

P.S.: I would love a copy of that book please!

dragontamer 5 hours ago||
Hmmmm.

One of my first assembly projects was a CGI Script 100% in x86 assembly.

A full web server is certainly more impressive! Though I'd recommend to beginners to look up CGI and mod_cgi in Apache first lol

imtomt 3 hours ago|
Woah! I honestly feel more intimidated writing a CGI script in assembly than I was writing a server, lol. CGI support has been on my mind for a couple weeks, but I haven't really dug into it yet. I'd love to see yours if it's hosted anywhere! Could be a great reference when I do.
dragontamer 3 hours ago||
Really? It's a bit of a nonsense that I did so long ago so it's weird to hear someone interested in it...

The script has been lost to time. I wrote it 5+ computers ago and I don't even know where I put that backup...

The overall gist is that the CGI specification sets environment variables, STDIN, and STDOUT to various values. A minimal pure-assembly program that writes <h1> Hello World </h1> to stdout is your minimalist CGI script.

A bit of research into what those STDIN/environment variables are is needed for more. I knew this maybe 20+ years ago but have long forgotten....

With access to the various input parameters offered over CGI, you can easily access form data (buttons and whatever clicked by the user). Use some smart file writing to store sessions and off you go....
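
As a rough illustration of the CGI flow described above, here is a minimal Python sketch (not the lost assembly script). It assumes the usual CGI conventions: the server places the part of the URL after "?" in the QUERY_STRING environment variable, and the script writes a header block, a blank line, and then the body to stdout. The form field "name" and the function cgi_response are hypothetical names chosen for this example.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def cgi_response(environ) -> str:
    """Build a complete CGI response: header block, blank line, body."""
    # The web server passes GET form data in the QUERY_STRING variable.
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    # 'name' is a hypothetical form field used only for this illustration.
    name = fields.get("name", ["World"])[0]
    # Escape user input before echoing it back into HTML.
    body = "<h1> Hello " + html.escape(name) + " </h1>\n"
    # A CGI script emits at least a Content-Type header, then an empty line.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # When run by the web server, the real environment supplies the inputs.
    sys.stdout.write(cgi_response(os.environ))
```

POST form data arrives on stdin instead, with its length given by the CONTENT_LENGTH variable; the sketch leaves that out to stay short. The same structure carries over to C or assembly: read the environment, then write the header and body to file descriptor 1.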

-------

Maybe start with a Perl CGI tutorial. Then go backwards to C, and finally raw assembly by hand

mappu 6 hours ago||
Syscalls on macOS aren't guaranteed to be stable - Go found out the hard way and in 1.12 they changed to call libSystem.dylib instead.

In general, stable syscall numbers are just a Linux thing. Everyone else uses blessed system libraries

imtomt 6 hours ago|
Yeah, I know MacOS syscalls aren't stable. Interesting point about Go, I hadn't heard about that. Unfortunately I'm a masochist though, and want to avoid libSystem.dylib as much as possible. The only reason I link against it at all is because MacOS requires it for executables to run, I never actually call into it. Figured I'd just update the syscall numbers if/when they change.
ybouane 6 hours ago||
We are moving to AI and stopped writing code / scratching our heads, and you're here writing a web server in assembly.

Humbling.

dwedge 5 hours ago|
Yeah, humbling - I know which path I prefer
thatxliner 9 hours ago||
I'm wanting to read this repository as a learning tool, so it'd also be nice to include docs about the architecture of the code (even AI-generated docs, but obviously I'd prefer docs with your own design notes and decisions).

Really cool project though!

imtomt 9 hours ago||
Thanks, I appreciate it a lot! I tried to comment my code pretty heavily (~3000 lines of code, ~1000 lines of comments all together), since this was a learning project for myself in the first place. Hopefully those will be of some use. But separate in-depth documentation is definitely a good idea, I'll work on adding that. In the meantime I'm always down to answer any questions about it!
thatxliner 9 hours ago|||
My first question would be where should I start reading? It seems like you modularized it into multiple assembly files (how does that even work?)
imtomt 8 hours ago||
Honestly, read the main file, ymawky.S first. Then I'd read through get.S maybe, checking parse.S on an as-needed basis for parsing-related functions. delete.S or options.S are pretty short, too, so give those a read too.

Modularizing it into multiple files was easier than I expected it to be, you basically have other functions/labels in other files, and mark them as .global at the top. The Makefile compiles each file into their own .o, which you then link all together. You can "b" or "bl" to any label from any other file, as long as it's global and linked together. Same with data in .bss or .data, mark them as .global and they can be accessed from elsewhere.

vasco 6 hours ago||
If you'd be happy with that then you can generate them yourself!
Ati985 6 hours ago||
Your determination to make this happen was remarkable — and you truly accomplished it. Congratulations
cylinder714 8 hours ago||
Here's a piece on writing portable ARM64 assembly: https://ariadne.space/2023/04/12/writing-portable-arm-assemb...
imtomt 7 hours ago|
Thanks for the link, bookmarking. I should note ymawky's main portability issues are unfortunately at the syscall layer rather than the asm layer. proc_info() and getdirentries64() are pretty Darwin-specific, so making it portable would require reworking that whole area rather than adjusting register/calling conventions.
_the_inflator 8 hours ago||
I feel the guy’s suspicion towards any high level language. I exclusively programmed in assembly on C64, Amiga and then recognized that this ain’t sustainable on PC because there are more and more edge cases or different machine configurations.

I had a very hard time simply using and even utilizing C++ or Java.

C and Turbo Pascal especially were easier because the compiled code very much resembled hand-written code.

As the author described, you can do in 4.000 lines what others can do with way less pain in 100.

So you build macros, come up with your own library and in the end you kind of build a meta language built on top of assembly because some lines are so hard to grasp that you delegate working code into a library for reuse.

It is funny how much we take conventions for numbers for granted. If you happen to know assembly and its intricacies you immediately will learn to work with sign bits, which mark negative numbers. But how do you know? Maybe you use the whole addressable space only for positive numbers.
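
The point about sign bits can be made concrete with a small Python sketch. It assumes two's complement, which is what ARM64 and x86 use; the helper names as_unsigned and as_signed are made up for this illustration. The same bit pattern yields two different numbers depending purely on the convention the reader applies.

```python
def as_unsigned(bits: int, width: int = 8) -> int:
    """Read the low `width` bits as a non-negative number."""
    return bits & ((1 << width) - 1)


def as_signed(bits: int, width: int = 8) -> int:
    """Read the low `width` bits as a two's-complement signed number."""
    value = bits & ((1 << width) - 1)
    sign_bit = 1 << (width - 1)
    # If the top bit is set, the value represents value - 2**width.
    return value - (1 << width) if value & sign_bit else value


# The byte 0xFF is 255 under one convention and -1 under the other.
print(as_unsigned(0xFF), as_signed(0xFF))
```

The hardware stores only the bits; whether 0xFF means 255 or -1 is decided by which instructions the programmer chooses, for example a signed versus an unsigned comparison or branch.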

Small things that make a huge difference.

Nice article, I enjoyed your adventures and would do the same.

imtomt 8 hours ago||
Thank you! The thing about eventually building your own meta language ends up happening all the time with bigger assembly projects. I do have a fair few quality-of-life macros too, but probably fewer than I should. I did end up needing to implement by hand what would be standard functions, things like atoi, itoa, strlen, memcpy, streqn.

Higher level languages are more convenient for 99% of things, but the directness of Assembly gives me a rush unlike any other. I didn't live through the C64/Amiga, but I was obsessed with old C64/ZX emulators growing up.

qingcharles 4 hours ago||
I don't know. Certainly the PC had a lot of options, but it wasn't impossible. My first piece of commercial software was written entirely in x86 assembler and had to navigate things like graphics card options and multiple sound card options. It could be done, it was just a lot more of a PITA.

Once I was doing 3D I quickly started moving everything but the inner loops to Turbo C, because I'm not a total masochist :)

digitaltrees 8 hours ago||
I don’t know why, but this project has me irrationally excited!
AppAttestationz 5 hours ago|
I suspect that the test suite isn't great. Bun has so many different behaviors compared to other JS engines, sometimes just plain wrong or contradicting the spec. The test suite didn't catch those.