Can I run AI locally?

Posted by ricardbejarano 13 hours ago

884 points | 231 comments | page 5

AstroBen 9 hours ago|

This doesn't look accurate to me. I have an RX9070 and I've been messing around with Qwen 3.5 35B-A3B. According to this site I can't even run it, yet I'm getting 32tok/s ^.-

mongrelion 4 hours ago||

Which quantization are you running and what context size? 32tok/s for that model on that card sounds pretty good to me!
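For anyone wondering why the quantization matters here, a rough weights-only estimate is sketched below. It assumes the card has 16 GiB of VRAM and that the model has about 35B total parameters (read off the name), and it ignores the KV cache and runtime overhead, which grow with context size. The function name and the bit widths are illustrative only, not taken from any particular runtime.

```python
def weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


VRAM_GIB = 16  # assumption: VRAM on the card, adjust for your hardware
PARAMS_B = 35  # assumption: total parameter count in billions, from the model name

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    size = weight_gib(PARAMS_B, bits)
    verdict = "fits" if size <= VRAM_GIB else "does not fit"
    print(f"{label}: {size:.1f} GiB of weights, {verdict} in {VRAM_GIB} GiB")
```

By this estimate even a 4-bit quant sits slightly above 16 GiB, which may be why the site says the model can't run. If the A3B suffix means roughly 3B active parameters per token, as in earlier Qwen mixture-of-experts naming, a runtime that keeps part of the weights in system RAM could plausibly still reach usable speeds, but that is a guess rather than something confirmed in this thread.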

misnome 8 hours ago||

It seems to be missing a whole load of the quantized Qwen models. Qwen3.5:122b works fine on the 96GB GH200 (a machine that is also missing here...)

amelius 8 hours ago||

It would be great if something like this was built into ollama, so you could easily list available models based on your current hardware setup, from the CLI.

rootusrootus 8 hours ago|

Someone linked to llmfit. That would be a great tool to integrate with ollama. Just highlight the one you want and tell it to install.

Quick, someone go vibe code that.

dugidugout 6 hours ago||

The latest level of abstraction! You just release your ideas half baked in some internet connected box and wake up with products! Yahoo! Onwards into the Gestell!

SXX 7 hours ago||

Sorry if this has already been answered, but will there be a metric for latency, aka time to first token?

I ask because I considered buying an M3 Ultra, and it feels like the most often discussed Apple hardware for running local LLMs, where generation speed might be okay but prompt processing can take ages.
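To make the distinction concrete, here is a minimal sketch of how time to first token relates to the two speeds. It assumes the prompt is processed at one constant rate and the output is generated at another. The numbers are made up for illustration and are not measurements of any Apple hardware.

```python
def response_time(prompt_tokens: int, output_tokens: int,
                  prefill_tps: float, decode_tps: float) -> tuple[float, float]:
    """Return (time to first token, total time) in seconds.

    prefill_tps: prompt processing speed in tokens/s
    decode_tps:  generation speed in tokens/s
    """
    ttft = prompt_tokens / prefill_tps
    total = ttft + output_tokens / decode_tps
    return ttft, total


# Hypothetical numbers: a long prompt, a short answer.
ttft, total = response_time(prompt_tokens=8000, output_tokens=500,
                            prefill_tps=200, decode_tps=30)
print(f"time to first token: {ttft:.1f}s, total: {total:.1f}s")
```

With a long prompt, most of the wait in this example happens before the first token appears, which a tok/s figure for generation alone would not show.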

teaearlgraycold 7 hours ago|

Wait for the M5 Ultra. It will get the 4x prompt processing speeds from the rest of the M5 product line. I hear rumors it will be released this year.

sdingi 8 hours ago||

When running models on my phone - either through the web browser or via an app - is there any chance it uses the phone's NPU, or will these be GPU only?

I don't really understand how the interface to the NPU chip looks from the perspective of a non-system caller, if it exists at all. This is a Samsung device but I am wondering about the general principle.

sshagent 9 hours ago||

I don't see my beloved 5060 Ti. Looks great though.

urba_ 4 hours ago||

Man, I wonder when there will be AI server farms made from iCloud locked jailbroken iPhone 16s with backported MacOS

ementally 4 hours ago||

The mobile section is missing Tensor chips (used by Google Pixel devices).

kuon 6 hours ago||

I have an AMD 9700 and it is not listed, even though it is great LLM hardware because it has 32GB for a reasonable price. I tried the "custom" option but it didn't seem to work.

The tool is very nice though.

vova_hn2 9 hours ago||

It says "RAM - unknown", but doesn't give me an option to specify how much RAM I have. Why?

zitterbewegung 7 hours ago|

The M4 Ultra doesn't exist, and there are more credible rumors for an M5 Ultra. I wouldn't publish a projection like that without highlighting that this processor doesn't exist yet.
