Posted by ricardbejarano 8 hours ago
I've been working with quite a few open-weight models for the last year, and especially for things like images, models from 6 months ago would quickly return garbage data. These days Qwen 3.5 is incredible, even the 9B model.
But yes, if there is a choice, I want quality over speed. At the same quality, I definitely want speed.
It's using WebGPU as a proxy to estimate system resources. Chrome tends to use as many resources (compute + memory) as the OS makes available. Safari tends to be more efficient.
Maybe this was obvious to everyone else, but it's worth reiterating for those of us who skim HN :)
I considered buying an M3 Ultra, and it feels like the most often discussed option for running local LLMs on Apple hardware, where speed might be okay but prompt processing can take ages.