Saw this one on X the other day; it was recently updated with Gemma 4, and it also has the built-in Apple Foundation model, Qwen3.5, and other models:
Locally AI - https://locallyai.app/
If it runs locally, I assume it is the 26B A4B one?
Google really ought to shut down their phone chip team. Literally every chip from them has been a disappointment. As much as I hate to say it, sticking with Qualcomm would have been the right choice.
Actually, I found official performance numbers from Google: iPhone gets 56 tok/s and Qualcomm gets 52. They don't even bother listing Tensor in their table, maybe because it would be too embarrassing. Ouch! https://ai.google.dev/edge/litert-lm/overview
Second idea is input audio in other languages, like Czech, Polish, or French.