Posted by meetpateltech 12/17/2025
Developer Blog: https://blog.google/technology/developers/build-with-gemini-...
Model Card [pdf]: https://deepmind.google/models/model-cards/gemini-3-flash/
Gemini 3 Flash in Search AI mode: https://blog.google/products/search/google-ai-mode-update-ge...
Deepmind Page: https://deepmind.google/models/gemini/flash/
its almost as good as 5.2 and 4.5 but way faster and cheaper
Firstly, 3 Flash is wicked fast and seems to be very smart for a low latency model, and it's a rush just watching it work. Much like the YOLO mode that exists in Gemini CLI, Flash 3 seems to YOLO into solutions without fully understanding all the angles e.g. why something was intentionally designed in a way that at first glance may look wrong, but ended up this way through hard won experience. Codex gpt 5.2 xhigh on the other hand does consider more angles.
It's a hard come-down off the high of using it for the first time because I really really really want these models to go that fast, and to have that much context window. But it ain't there. And turns out for my purposes the longer chain of thought that codex gpt 5.2 xhigh seems to engage in is a more effective approach in terms of outcomes.
And I hate that reality because having to break a lift into 9 stages instead of just doing it in a single wicked fast run is just not as much fun!
I'm more excited to see 3 Flash Lite. Gemini 2.5 Flash Lite needs a lot more steering than regular 2.5 Flash, but it is a very capable model and combined with the 50% batch mode discount it is CHEAP ($0.05/$0.20).