Posted by meetpateltech 12/17/2025

Gemini 3 Flash: Frontier intelligence built for speed (blog.google)
Docs: https://ai.google.dev/gemini-api/docs/gemini-3

Developer Blog: https://blog.google/technology/developers/build-with-gemini-...

Model Card [pdf]: https://deepmind.google/models/model-cards/gemini-3-flash/

Gemini 3 Flash in Search AI mode: https://blog.google/products/search/google-ai-mode-update-ge...

DeepMind Page: https://deepmind.google/models/gemini/flash/

1102 points | 580 comments | page 5
jtrn 12/17/2025|
This is the first flash/mini model that doesn't make a complete ass of itself when I prompt for the following: "Tell me as much as possible about Skatval in Norway. Not general information. Only what is uniquely true for Skatval."

Skatval is a small local area I live in, so I know when it's bullshitting. Usually, I get a long-winded answer that is PURE Barnum-statement, like "Skatval is a rural area known for its beautiful fields and mountains" and bla bla bla.

Even with minimal thinking (it seems to do none), it gives an extremely good answer. I am really happy about this.

I also noticed it had VERY good scores on tool-use, terminal, and agentic stuff. If that is TRUE, it might be awesome for coding.

I'm tentatively optimistic about this.

amunozo 12/17/2025||
I tried the same with my father's little village (Zarza Capilla, in Spain), and it gave a surprisingly good answer in a couple of seconds. Amazing.
peterldowns 12/17/2025|||
That's a really cool prompt idea, I just tried it with my neighborhood and it nailed it. Very impressive.
kingstnap 12/17/2025|||
You are effectively describing SimpleQA, but with a single question instead of a comprehensive benchmark, and you can see the dramatic increase in performance there.
jtrn 12/17/2025||
I tested it for coding in Cursor, and the disappointment is real. It's completely INSANE when it comes to just doing anything agentic. I asked it to give me an option for how to best solve a problem, and within 1 second it was NPM installing into my local environment without ANY thinking. It's like working with a manic patient. It's like it thinks: I just HAVE TO DO SOMETHING, ANYTHING! RIGHT NOW! DO IT DO IT! I HEARD TEST!?!?!? LET'S INSTALL PLAYWRIGHT RIGHT NOW LET'S GOOOOOO.

This might be fun for vibecoding, just letting it go crazy and not stopping until an MVP is working, but I'm actually afraid to turn on agent mode with this now.

If it were just over-eager, that would be fine, but it's also not LISTENING to my instructions. As in the example above, I didn't ask it to install a testing framework; I asked it for options fitting my project. And this happened many times. It feels like it treats user prompts/instructions as: "Suggestions for topics that you can work on."

mark_l_watson 12/18/2025||
I only use commercial LLM vendors who I consider to be “commercially viable.” I don’t want to deal with companies who are losing money selling me products.

For now the vendors I pay for are 90% Google, and 10% a combination of Chinese models and models from the French company Mistral.

I love the new Gemini 3 Flash model - it hits so many sweet-spots for me. The API is inexpensive enough for my use cases that I don’t even think about the cost.

My preference is using local open models with Ollama and LM Studio, but commercial models are also a large part of my use cases.

Workaccount2 12/17/2025||
Really hoping this is used for real time chatting and video. The current model is decent, but when doing technical stuff (help me figure out how to assemble this furniture) it falls far short of 3 pro.
speedgoose 12/17/2025||
I’m wondering why Claude Opus 4.5 is missing from the benchmarks table.
anonym29 12/17/2025|
I wondered this, too. I think the emphasis here was on faster / lower-cost models, but that would suggest Haiku 4.5 should be the Anthropic entry in the table instead. They also didn't use the most powerful xAI model, opting for the fast one instead. Regardless of which Anthropic model it's compared against, this new Gemini 3 Flash is good enough that Anthropic should be feeling pressure on both price and output quality simultaneously, which is ultimately good for the consumer.
bennydog224 12/17/2025||
From the article, speed & cost match 2.5 Flash. I'm working on a project where there's a huge gap between 2.5 Flash and 2.5 Flash Lite in both performance and cost.

-> 2.5 Flash Lite is super fast & cheap (~1-1.5s inference), but gives poor-quality responses.

-> 2.5 Flash gives high-quality responses, but is fairly expensive & slow (5-7s inference).

I really just need something in between Flash and Flash Lite on cost and performance. Right now, users have to wait up to 7s for a quality response.
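A tiering tradeoff like this can be scripted. Here is a minimal sketch of a latency-budget router, using hypothetical model identifiers and the rough latency figures quoted above; the real model names and numbers would come from Google's API docs and your own measurements.

```python
# Hypothetical model router: pick the highest-quality Gemini tier whose
# typical latency fits a per-request budget. Model names and latency
# figures are illustrative assumptions, not official values.
TIERS = [
    ("gemini-2.5-flash-lite", 1.5),  # ~1-1.5s inference, cheap, lower quality
    ("gemini-3-flash", 3.0),         # assumed in-between tier
    ("gemini-2.5-flash", 7.0),       # ~5-7s inference, higher quality
]


def pick_model(max_latency_s: float, prefer_quality: bool = True) -> str:
    """Return the best model whose typical latency fits the budget.

    Tiers are ordered fastest-to-slowest (and roughly cheapest-to-best),
    so the last fitting candidate is the highest-quality one.
    """
    candidates = [name for name, latency in TIERS if latency <= max_latency_s]
    if not candidates:
        # Nothing fits the budget: fall back to the fastest tier.
        return TIERS[0][0]
    return candidates[-1] if prefer_quality else candidates[0]
```

For example, a 4-second budget would route to the assumed mid tier, while a sub-second budget falls back to the Lite tier.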

anonym29 12/17/2025||
I never have, do not, and conceivably never will use gemini models, or any other models that require me to perform inference on Alphabet/Google's servers (i.e. gemma models I can run locally or on other providers are fine), but kudos to the team over there for the work here, this does look really impressive. This kind of competition is good for everyone, even people like me who will probably never touch any gemini model.
oklahomasports 12/17/2025|
You don’t want Google to know that you are searching for like advice on how much a 61 yr old can contribute to a 401k. What are you hiding?
anonym29 12/17/2025||
Why do you close the bathroom stall door in public?

You're not doing anything wrong. Everyone knows what you're doing. You have no secrets to hide.

Yet you value your privacy anyway. Why?

Also - I have no problem using Anthropic's cloud-hosted services. Being opposed to some cloud providers doesn't mean I'm opposed to all cloud providers.

happyopossum 12/17/2025||
> I have no problem using Anthropic's cloud-hosted services

Anthropic - one of GCP’s largest TPU customers? Good for you.

https://www.anthropic.com/news/expanding-our-use-of-google-c...

jijji 12/17/2025||
I tried Gemini CLI the other day, typed in two one-line requests, and it responded that it would not go further because I had run out of tokens. I've heard other people complain that it will re-write your entire codebase from scratch and that you should make backups before even starting any codebase work with the Gemini CLI. I understand they are trying to compete against Claude Code, but this is not ready for prime time IMHO.
d4rkp4ttern 12/18/2025||
Curious how well it would do in Gemini CLI. Probably not that good, at least judging from the terminal-bench-2 benchmark, where it's significantly behind Gemini-3-Pro (47.6% vs 54.2%), and I didn't really like G3Pro in Gemini-CLI anyway. Also curious that the posted benchmark omitted a comparison with Opus 4.5, which in Claude Code is anecdotally at/near the top right now.
SubiculumCode 12/17/2025||
In Gemini Pro interface, I now have Fast, Thinking, and Pro options. I was a bit confused by that, but did find this: https://discuss.ai.google.dev/t/new-model-levels-fast-thinki...
user_7832 12/17/2025|
Two quick questions to Gemini/AI Studio users:

1, has anyone actually found 3 Pro better than 2.5 (on non code tasks)? I struggle to find a difference beyond the quicker reasoning time and fewer tokens.

2, has anyone found any non-thinking models better than 2.5 or 3 Pro? So far I find the thinking ones significantly ahead of non thinking models (of any company for that matter.)

Workaccount2 12/17/2025||
Gemini 3 is a step change up against 2.5 for electrical engineering R&D.
Davidzheng 12/17/2025|||
I think it's probably actually better at math, though still not enough to be useful in my research in a substantial way. I suspect this will change suddenly at some point as the models move past a certain threshold. For now they are heavily limited by being very bad at not giving wrong proofs/counterexamples: even if the models produce useful success rates, the labor of sorting through a bunch of trash makes it hard to justify.
tmaly 12/17/2025||
Not for coding but for the design aspect, 3 outshines 2.5