Posted by rbanffy 12/18/2025
I like doing development work on a Mac, but this has to be my biggest bugbear with the system.
But would it be possible to use RoCE with these boxes rather than RDMA over Thunderbolt? And what would the expected performance be? As I understand it, RDMA should be 7-10 times faster than going via TCP. And if I understand correctly, RoCE is RDMA over Converged Ethernet, i.e. it runs over raw Ethernet frames at a lower layer rather than over TCP.
10G Thunderbolt adapters are fairly common, but you can also find 40G and 80G Thunderbolt Ethernet adapters from Atto. Probably not cheap - but it would be fun to test! Even if the bandwidth is there, though, we might get killed by latency.
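If anyone wants to put numbers on that latency worry before buying one of those Atto boxes, a TCP ping-pong is the quickest test. A minimal Python sketch (the address is a placeholder for the peer Mac on the Thunderbolt bridge or adapter link; the peer runs the echo loop shown in the comment):

```python
import socket
import statistics
import time

HOST, PORT = "10.0.0.2", 9999  # placeholder: the peer's address on the Thunderbolt/Ethernet link
N = 1000

# On the peer, run a trivial echo loop first:
#   s = socket.create_server(("0.0.0.0", 9999)); c, _ = s.accept()
#   while True: c.sendall(c.recv(64))

sock = socket.create_connection((HOST, PORT))
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # don't let Nagle's algorithm hide the RTT

samples = []
for _ in range(N):
    t0 = time.perf_counter_ns()
    sock.sendall(b"x" * 64)   # tiny payload: we're measuring round-trip time, not bandwidth
    sock.recv(64)             # 64 bytes comes back in a single segment
    samples.append((time.perf_counter_ns() - t0) / 1000)  # microseconds

print(f"p50 {statistics.median(samples):.1f} us, p99 {statistics.quantiles(samples, n=100)[98]:.1f} us")
```

For rough calibration: RDMA fabrics do round trips in the low single-digit microseconds, so if this loop reports tens or hundreds of microseconds, that's the gap the RDMA-vs-TCP numbers above are about.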
Imagine this hardware with a PCIe slot. The InfiniBand hardware is there - then we "just" need the driver.
Then you _just_ need the driver. Fascinating that Apple ships MLX5 drivers - that's crazy, IMO. I understand it's something they might need internally, but shipping that on iPadOS is wild. https://kittenlabs.de/blog/2024/05/17/25gbit/s-on-macos-ios/
InfiniBand is way faster and lower latency than a NIC. These days NIC == Ethernet.
Instead we get gimmicks over Thunderbolt.
It seems like every time someone does an AI hardware “review” we end up with figures for just a single instance, which simply isn’t how the target demographic for a $40k cluster is going to be using it.
Jeff, I love reading your reviews, but I can’t help feeling this was a wasted opportunity for some serious benchmarking of LLM performance.
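For the next reviewer: the number that matters is throughput under concurrency, not a lone chat session. A rough sketch of what I mean, assuming the cluster is fronted by an OpenAI-compatible endpoint (the URL and model name are placeholders for whatever llama.cpp/exo/etc. setup is serving it):

```python
import json
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

URL = "http://mac-studio.local:8080/v1/chat/completions"  # placeholder endpoint
CONCURRENCY = 8   # the interesting variable: single-stream tokens/sec tells you almost nothing
PROMPT = "Summarize the plot of Moby-Dick in three paragraphs."

def one_request(_):
    body = json.dumps({
        "model": "local",  # placeholder model name
        "messages": [{"role": "user", "content": PROMPT}],
        "max_tokens": 256,
    }).encode()
    t0 = time.perf_counter()
    with urlopen(Request(URL, body, {"Content-Type": "application/json"})) as r:
        out = json.load(r)
    dt = time.perf_counter() - t0
    return out["usage"]["completion_tokens"] / dt  # per-stream tokens/sec

with ThreadPoolExecutor(CONCURRENCY) as pool:
    rates = list(pool.map(one_request, range(CONCURRENCY)))

# Summing per-stream rates is only an approximation of aggregate throughput,
# since the streams don't overlap perfectly - good enough for a first look.
print(f"per-stream tps: {statistics.median(rates):.1f} median, "
      f"aggregate ~{sum(rates):.1f} tps at concurrency {CONCURRENCY}")
```

The curve worth publishing is aggregate tokens/sec as CONCURRENCY sweeps from 1 to 32 - that's what tells the target demographic whether the $40k is worth it.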
I definitely would not be buying an M3 Ultra right now on my own dime.
I have an M4 Max I can use to bridge any gap...
Which I guess is the point of this for Apple, but still.
Does anyone remember a guy here posting about linking Mac Studios with Thunderbolt for HPC/clustering? I wasn't able to find it with a quick search.
Edit: I think it was this?
https://buildai.substack.com/p/kv-cache-sharding-and-distrib...
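If that's the one, the gist (as I understand it) is that each node keeps the KV cache for its own subset of attention heads, so only small per-token activations cross the Thunderbolt link while the multi-gigabyte cache stays put. A toy numpy sketch of head-wise sharding, with made-up dimensions:

```python
import numpy as np

# Toy dimensions: 8 attention heads split across 2 nodes, 4 heads each.
heads, d_head, seq = 8, 64, 128
q = np.random.randn(heads, 1, d_head)               # query for the newest token
kv_cache = np.random.randn(2, heads, seq, d_head)   # [K, V] for the whole context

def local_attention(node, q, kv_cache):
    # Each node stores and attends over only its own shard of heads (4 per node here).
    shard = slice(node * 4, (node + 1) * 4)
    k, v = kv_cache[0, shard], kv_cache[1, shard]
    scores = q[shard] @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)        # softmax over the sequence
    return weights @ v                               # (4, 1, d_head)

# The "network" step: concatenate each node's head outputs. Only these small
# activations cross the wire - the KV cache itself never moves between nodes.
out = np.concatenate([local_attention(0, q, kv_cache),
                      local_attention(1, q, kv_cache)], axis=0)
print(out.shape)  # (8, 1, 64)
```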
Texas Memory Systems was in the business of making large "RAM drives". They had a product line known as RamSan, which exposed many gigabytes/terabytes of DDR as block storage over InfiniBand and Fibre Channel. The control layer was implemented in an FPGA.
I recall a press release from 2004 publicizing a US government purchase of a 2.5TB RamSan. They later expanded into SSDs and were acquired by IBM in 2012.
https://en.wikipedia.org/wiki/Texas_Memory_Systems
https://www.lhcomp.com/vendors/tms/TMS-RamSan300-DataSheet.p...
https://gizmodo.com/u-s-government-purchases-worlds-largest-...
https://www.lhcomp.com/vendors/tms/TMS-RamSan20-DataSheet.pd...
https://www.ibm.com/support/pages/ibm-plans-acquire-texas-me...
But the industry knows this, and there’s a technology that is electrically compatible with PCIe and intended for attaching RAM, among other things: CXL. I wonder if anyone will ever build CXL over USB-C.
It is a little sad that they gave someone an uber machine and this was the best he could come up with.
Question answering is interesting but not the most interesting thing one can do, especially with a home rig.
The realm of the possible:

Video generation:
- CogVideoX at full resolution, longer clips
- Mochi or Hunyuan Video with extended duration

Image generation at scale:
- FLUX batch generation: 50 images simultaneously

Fine-tuning:
- Actually train something: show LoRA on a 400B model, or full fine-tuning on a 70B (see the LoRA sketch after this list)
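In case the LoRA item needs unpacking: the reason it's feasible on a 400B model is that you freeze the full weights and train only a tiny low-rank update. A toy numpy sketch of the idea - dimensions are made up, and a real run would go through something like mlx-lm or PEFT rather than this:

```python
import numpy as np

# A frozen base weight, standing in for one linear layer of the big model.
d_out, d_in, r, alpha = 1024, 1024, 8, 16
W = np.random.randn(d_out, d_in).astype(np.float32)   # frozen: never updated

# The LoRA adapter: two small matrices, the only trainable parameters.
A = np.random.randn(r, d_in).astype(np.float32) * 0.01
B = np.zeros((d_out, r), dtype=np.float32)            # zero-init, so training starts exactly at W

def forward(x):
    # y = W x + (alpha / r) * B (A x): the frozen base path plus the low-rank update
    return W @ x + (alpha / r) * (B @ (A @ x))

# The parameter count is the whole point:
print(f"full layer: {W.size:,} params; LoRA adapter: {A.size + B.size:,} params")
```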
But I suppose "you have it for the weekend" means chatbot go brrrrr and snark.
Yeah, that's what I wanted to see too.
Use them for something creative, write a short story on spec, generate images.
Or, the best option: give it tools and let it actually DO something, like "read my message history with my wife, find the top 5 gift ideas she might have hinted at, and search for options to purchase them". That's perfect for a local model - there's no way in hell I'd feed my messages to a public LLM, but the one sitting next to me that I can turn off the second it twitches the wrong way? Sure.
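The tool-use loop for something like that is surprisingly little code. A hedged sketch, assuming a local OpenAI-compatible server with tool-calling support (llama.cpp, Ollama, etc.); the endpoint, model name, and search_messages tool are hypothetical placeholders:

```python
import json
from openai import OpenAI  # pointed at a local server, not OpenAI's cloud

client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")  # placeholder local endpoint

# A hypothetical tool: the model asks for messages, your own code reads them locally.
tools = [{
    "type": "function",
    "function": {
        "name": "search_messages",
        "description": "Search the local message history for a contact",
        "parameters": {
            "type": "object",
            "properties": {"contact": {"type": "string"}, "query": {"type": "string"}},
            "required": ["contact", "query"],
        },
    },
}]

def search_messages(contact: str, query: str) -> str:
    # Placeholder: in reality you'd query the local message store (e.g. chat.db on macOS).
    return json.dumps([f"...messages with {contact} mentioning {query}..."])

messages = [{"role": "user", "content":
             "Read my message history with my wife and list 5 gift ideas she hinted at."}]

while True:
    resp = client.chat.completions.create(model="local", messages=messages, tools=tools)
    msg = resp.choices[0].message
    if not msg.tool_calls:           # no more tool requests: the model is done
        print(msg.content)
        break
    messages.append(msg)
    for call in msg.tool_calls:      # run each requested tool, feed the result back
        args = json.loads(call.function.arguments)
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": search_messages(**args)})
```

Nothing in that loop ever leaves the machine, which is exactly why it suits a local model.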
Because web search is so broken these days: you want a clean answer instead of wading through pages of SEO nonsense. It's really common, even amongst non-techy friends, for "I'll ask ChatGPT" to have replaced "I'll Google it".
Google is useless