Posted by meetpateltech 12/17/2025
Developer Blog: https://blog.google/technology/developers/build-with-gemini-...
Model Card [pdf]: https://deepmind.google/models/model-cards/gemini-3-flash/
Gemini 3 Flash in Search AI mode: https://blog.google/products/search/google-ai-mode-update-ge...
DeepMind page: https://deepmind.google/models/gemini/flash/
Skatval is a small local area I live in, so I know when it's bullshitting. Usually, I get a long-winded answer that is PURE Barnum-statement, like "Skatval is a rural area known for its beautiful fields and mountains" and bla bla bla.
Even with minimal thinking (it seems to do none), it gives an extremely good answer. I am really happy about this.
I also noticed it had VERY good scores on tool-use, terminal, and agentic stuff. If that is TRUE, it might be awesome for coding.
I'm tentatively optimistic about this.
This might be fun for vibecoding: just let it go crazy and not stop until an MVP is working. But I'm actually afraid to turn on agent mode with this now.
If it were just over-eager, that would be fine, but it's also not LISTENING to my instructions. As in the previous example: I didn't ask it to install a testing framework, I asked it for options fitting my project. And this happened many times. It feels like it treats user prompts/instructions as "suggestions for topics you could work on."
For now, the vendors I pay are 90% Google and 10% a combination of Chinese models and Mistral, the French company.
I love the new Gemini 3 Flash model - it hits so many sweet-spots for me. The API is inexpensive enough for my use cases that I don’t even think about the cost.
My preference is using local open models with Ollama and LM Studio, but commercial models are also a large part of my use cases.
-> 2.5 Flash Lite is super fast & cheap (~1-1.5s inference), but response quality is poor.
-> 2.5 Flash gives high-quality responses, but is fairly expensive & slow (5-7s inference).
I really just need something in between Flash and Flash Lite on cost and performance. Right now, users have to wait up to 7s for a quality response.
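The trade-off above (fast-but-weak vs. slow-but-strong) can be handled with a simple latency-budget router that picks the strongest model whose typical response time still fits the caller's budget. A minimal sketch, assuming illustrative latency numbers from the comment; the `gemini-3-flash` profile and all figures are placeholders, not official values:

```python
# Hypothetical model router: given a per-request latency budget, pick the
# highest-quality model expected to respond in time. Latency numbers are
# rough assumptions taken from the comment above, not measured values.

from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str
    typical_latency_s: float  # assumed upper bound on inference time

# Ordered fastest/weakest first, slowest/strongest last,
# so the last profile that fits the budget is the best choice.
PROFILES = [
    ModelProfile("gemini-2.5-flash-lite", 1.5),
    ModelProfile("gemini-3-flash", 3.0),   # assumed mid-tier placeholder
    ModelProfile("gemini-2.5-flash", 7.0),
]

def pick_model(latency_budget_s: float) -> str:
    """Return the strongest model whose typical latency fits the budget."""
    fitting = [p for p in PROFILES if p.typical_latency_s <= latency_budget_s]
    if not fitting:
        # Nothing fits: fall back to the fastest option available.
        return PROFILES[0].name
    return fitting[-1].name

print(pick_model(2.0))   # latency-sensitive path -> gemini-2.5-flash-lite
print(pick_model(10.0))  # quality path -> gemini-2.5-flash
```

With a genuine mid-tier model in the list, latency-sensitive requests no longer have to choose between a 1.5s weak answer and a 7s strong one.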
You're not doing anything wrong. Everyone knows what you're doing. You have no secrets to hide.
Yet you value your privacy anyway. Why?
Also - I have no problem using Anthropic's cloud-hosted services. Being opposed to some cloud providers doesn't mean I'm opposed to all cloud providers.
Anthropic - one of GCP’s largest TPU customers? Good for you.
https://www.anthropic.com/news/expanding-our-use-of-google-c...
1, has anyone actually found 3 Pro better than 2.5 (on non-code tasks)? I struggle to find a difference beyond the quicker reasoning time and fewer tokens.
2, has anyone found any non-thinking models better than 2.5 or 3 Pro? So far I find the thinking ones significantly ahead of non-thinking models (from any company, for that matter).