Posted by meetpateltech 6 days ago

Gemini 3 Flash: Frontier intelligence built for speed(blog.google)
Docs: https://ai.google.dev/gemini-api/docs/gemini-3

Developer Blog: https://blog.google/technology/developers/build-with-gemini-...

Model Card [pdf]: https://deepmind.google/models/model-cards/gemini-3-flash/

Gemini 3 Flash in Search AI mode: https://blog.google/products/search/google-ai-mode-update-ge...

Deepmind Page: https://deepmind.google/models/gemini/flash/

1102 points | 579 comments
caminanteblanco 6 days ago|
Does anyone else understand what the difference is between Gemini 3 'Thinking' and 'Pro'? Thinking is described as "Solves complex problems" and Pro as "Thinks longer for advanced math & code".

I assume these are just different reasoning levels for Gemini 3, but I can't find any mention of there being two versions, and the API doesn't even mention the Thinking/Pro dichotomy.

peheje 6 days ago||
I think:

Fast = Gemini 3 Flash without thinking (or very low thinking budget)

Thinking = Gemini 3 flash with high thinking budget

Pro = Gemini 3 Pro with thinking

sunaookami 6 days ago||
It's this, yes: https://x.com/joshwoodward/status/2001350002975850520

>Fast = 3 Flash

>Thinking = 3 Flash (with thinking)

>Pro = 3 Pro (with thinking)

caminanteblanco 6 days ago||
Thank you! I wish they had clearer labelling (or at the very least some documentation) explaining this.
flakiness 6 days ago|||
It seems:

   - "Thinking" is Gemini 3 Flash with higher "thinking_level"
   - "Pro" is Gemini 3 Pro. It doesn't mention "thinking_level", but I assume it's set to high-ish.
lysace 6 days ago||
Really stupid question: how is Gemini-like 'thinking' different from artificial general intelligence (AGI)?

When I ask Gemini 3 Flash this question, the answer is vague but agency comes up a lot. Gemini thinking is always triggered by a query.

This seems like a higher-level programming issue to me. Turn it into a loop. Keep the context. Those two things make it costly, for sure. But does that make it an AGI? Surely Google has tried this?
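
A toy sketch of what I mean, using the google-genai Python SDK (the model name and the DONE convention are my own assumptions):

    from google import genai

    client = genai.Client()  # reads GEMINI_API_KEY from the environment
    # A chat session keeps the context; the loop supplies the "agency".
    chat = client.chats.create(model="gemini-3-flash-preview")  # assumed name
    reply = chat.send_message("Work on task X. Say DONE when finished.")
    while "DONE" not in reply.text:
        reply = chat.send_message("Continue.")  # loop until the model says it's done
    print(reply.text)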

CamperBob2 6 days ago|||
I don't think we'll get genuine AGI without long-term memory, specifically in the form of weight adjustment rather than just LoRAs or longer and longer contexts. When the model gets something wrong and we tell it "That's wrong, here's the right answer," it needs to remember that.

Which obviously opens up a can of worms regarding who should have authority to supply the "right answer," but still... lacking the core capability, AGI isn't something we can talk about yet.

LLMs will be a part of AGI, I'm sure, but they are insufficient to get us there on their own. A big step forward but probably far from the last.

bananaflag 6 days ago||
> When the model gets something wrong and we tell it "That's wrong, here's the right answer," it needs to remember that.

The problem is that once we figure out how to do this, each copy of the original model will diverge in wildly unexpected ways. Just as we have 8 billion different people in this world, we'll have 16 gazillion different AIs, all of them interacting with each other and remembering all those interactions. This world scares me greatly.

criley2 6 days ago||||
Advanced reasoning LLMs simulate many parts of AGI and feel really smart, but fall short in many critical ways.

- An AGI wouldn't hallucinate, it would be consistent, reliable and aware of its own limitations

- An AGI wouldn't need extensive re-training, human reinforced training, model updates. It would be capable of true self-learning / self-training in real time.

- An AGI would demonstrate real genuine understanding and mental modeling, not pattern matching over correlations

- It would demonstrate agency and motivation, not be purely reactive to prompting

- It would have persistent integrated memory. LLMs are stateless and driven by the current context.

- It should even demonstrate consciousness.

And more. I agree that what we've designed is truly impressive and simulates intelligence at a really high level. But true AGI is far more advanced.

waffletower 6 days ago|||
Humans can fail at some of these qualifications, often without guile: people are not universally consistent, do not always know their limitations, and do not universally demonstrate effective understanding and mental modeling.

I don't believe the "consciousness" qualification is at all appropriate, as I would argue that it is a projection of the human machine's experience onto an entirely different machine with a substantially different existential topology -- relationship to time and sensorium. I don't think artificial general intelligence is a binary label which is applied if a machine rigidly simulates human agency, memory, and sensing.

versteegen 6 days ago||||
> - It should even demonstrate consciousness.

I disagreed with most of your assertions even before I hit the last point. This is just about the most extreme thing you could ask for. I think very few AI researchers would agree with this definition of AGI.

lysace 6 days ago|||
Thanks for humoring my stupid question with a great answer. I was kind of hoping for something like this :).
dcre 6 days ago||||
This is what every agentic coding tool does. You can try it yourself right now with the Gemini CLI, OpenCode, or 20 other tools.
andai 6 days ago|||
AGI is hard, but we can solve most tasks with artificial stupidity in an `until done` loop.
lysace 6 days ago||
Just a matter of time and cost. Eventually...
xpil 6 days ago||
My main issue with Gemini is that business accounts can't delete individual conversations. You can only enable or disable Gemini, or set a retention period (3 months minimum), but there's no way to delete specific chats. I'm a paying customer, prices keep going up, and yet this very basic feature is still missing.
testfrequency 6 days ago||
This is the #1 thing that keeps me from going all in on Gemini.

Their retention controls for both consumer and business suck. It’s the worst of any of the leaders.

strstr 6 days ago|||
For my personal usage of AI Studio, I had to use AutoHotkey to record and replay my mouse deleting my old chats. I thought about cooking up a browser extension, but never got around to it.
ComputerGuru 6 days ago||
Use it over the API.
outside2344 6 days ago||
I don't want to say OpenAI is toast for general chat AI, but it sure looks like they are toast.
Gigachad 6 days ago||
I’ve fully switched over to Gemini now. It seems significantly more useful, and is less of an automatic glaze machine that just restates your question and how smart you are for asking it.
radicality 6 days ago|||
How do I get Gemini to be more proactive in finding/double-checking itself against new world information and doing searches?

This is why I still find ChatGPT way better for me: many things I ask it, it first goes off to do online research and comes back with up-to-date information. That's surprising, as you would expect Google to be way better at this. For example, I recently asked Gemini 3 Pro how to do something with an "RTX 6000 Blackwell 96GB" card, and it told me this card doesn't exist and that I probably meant the RTX 6000 Ada. Or just today I asked about something on macOS 26.2, and it told me to be cautious as it's a beta release (it's not). Whereas with ChatGPT I trust the final output more, since it very often goes to find live sources and info.

leemoore 6 days ago|||
Gemini is bad at this sort of thing, but I find all models tend to do this to some degree. You have to know this could be coming and give it indicators: assume your training data is going to be out of date, and web-search for the latest as of today or this month. They aren't taught to ask themselves "is my understanding of this topic based on info that is likely out of date?", though they understand it after the fact. I usually just get annoyed and lowkey condescend to it for assuming its old-ass training data is sufficient grounding for correcting me.

That epistemic calibration is something they are capable of thinking through if you point it out, but they aren't trained to stop and check themselves on how confident they have a right to be. This metacognitive interrupt is socialized into girls between 6 and 9, and into boys between 11 and 13; calibrating to appropriate confidence levels of knowledge is a cognitive skill that models aren't taught and that humans learn socially by pissing off other humans. It's why we get pissed off at models when they correct us with old, bad data. Our anger is the training tool to stop doing that; they just can't take in that training signal at inference time.
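
A rough sketch of the kind of nudge I mean, using the google-genai Python SDK with its Google Search grounding tool (the model name is an assumption):

    from google import genai
    from google.genai import types

    client = genai.Client()
    resp = client.models.generate_content(
        model="gemini-3-flash-preview",  # assumed preview name
        contents="What is the current macOS version?",
        config=types.GenerateContentConfig(
            system_instruction=(
                "Assume your training data is out of date. Before asserting "
                "anything about versions, dates, or product names, search the web."
            ),
            # grounding tool so the model can actually check
            tools=[types.Tool(google_search=types.GoogleSearch())],
        ),
    )
    print(resp.text)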

andai 6 days ago|||
Yeah any time I mention GPT-5, the other models start having panic attacks and correcting it to GPT-4. Even if it's a model name in source code!

They think GPT-5 won't be released until the distant future, but what they don't realize is we have already arrived ;)

niek_pas 6 days ago|||
That’s funny, I’ve had the exact opposite experience. Gemini starts every answer to a coding question with, “you have hit upon a fundamental insight in zyx”. ChatGPT usually starts with, “the short answer? Xyz.”
scrollop 6 days ago|||
Looking at this, they are:

https://artificialanalysis.ai/evaluations/omniscience

https://youtu.be/4p73Uu_jZ10?si=x1gZopegCacznUDA&t=582

pawelduda 6 days ago||
They have been for a while. They had a first-mover advantage that kept them in the lead, but it's not anything others couldn't throw money at and eventually catch up on. I remember when, not so long ago, everyone was talking about how Google had lost the AI race, and now it feels like they're chasing Anthropic.
SyrupThinker 6 days ago||
I wonder if this suffers from the same issue as 3 Pro: it frequently "thinks" for a long time about date incongruity, insisting that it is 2024 and that information it receives must be incorrect or hypothetical.

Just avoiding/fixing that would probably speed up a good chunk of my own queries.

robrenaud 6 days ago||
Omg, it was so frustrating to say:

Summarize recent working arxiv url

And then it tells me the date is in the future and simply refuses to fetch the URL.

zhyder 6 days ago||
Glad to see a big improvement on the SimpleQA Verified benchmark (28% -> 69%), which is meant to measure factuality (built-in, i.e. without adding grounding resources). That's one benchmark where all models seemed to have low scores until recently. Can't wait to see a model go over 90%... then it'll be years till the competition is over the number of 9s in such a factuality benchmark, but that'd be glorious.
jug 6 days ago|
Yes, that's very good, because that's my main use case for Flash: queries depending on world knowledge. Not science or engineering problems, but the kind of thing you'd ask someone who has really broad knowledge about things and can give quick and straightforward answers.
primaprashant 6 days ago||
Pricing is $0.50 / $3 per million input / output tokens. 2.5 Flash was $0.30 / $2.50. That's a ~67% increase in input token pricing and a 20% increase in output token pricing.

For comparison, from 2.5 Pro ($1.25 / $10) to 3 Pro ($2 / $12), there was a 60% increase in input token pricing and a 20% increase in output token pricing.
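
A quick sanity check on those numbers (all prices in dollars per million tokens):

    old = {"flash_in": 0.30, "flash_out": 2.50, "pro_in": 1.25, "pro_out": 10.00}
    new = {"flash_in": 0.50, "flash_out": 3.00, "pro_in": 2.00, "pro_out": 12.00}
    for k in old:
        print(f"{k}: +{(new[k] / old[k] - 1) * 100:.0f}%")
    # flash_in: +67%, flash_out: +20%, pro_in: +60%, pro_out: +20%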

simonw 6 days ago||
Calculating price increases is made more complex by the difference in token usage. From https://blog.google/products/gemini/gemini-3-flash/ :

> Gemini 3 Flash is able to modulate how much it thinks. It may think longer for more complex use cases, but it also uses 30% fewer tokens on average than 2.5 Pro.
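
Taking that claim at face value, a back-of-the-envelope comparison of effective output cost against 2.5 Pro (a sketch, not a benchmark):

    # $3/M output tokens at 30% fewer tokens, vs 2.5 Pro's $10/M baseline
    flash_3_cost = 3.00 * 0.70
    pro_25_cost = 10.00 * 1.00
    print(flash_3_cost / pro_25_cost)  # 0.21 -> roughly a fifth of the output cost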

Tiberium 6 days ago||
Yes, but also most of the increase in 3 Flash is in the input context price, which isn't affected by reasoning.
int_19h 6 days ago||
It is affected if it has to round-trip, e.g. because it's making tool calls.
prvc 6 days ago||
Apples to oranges.
meetpateltech 6 days ago||
Deepmind Page: https://deepmind.google/models/gemini/flash/

Developer Blog: https://blog.google/technology/developers/build-with-gemini-...

Model Card [pdf]: https://deepmind.google/models/model-cards/gemini-3-flash/

Gemini 3 Flash in Search AI mode: https://blog.google/products/search/google-ai-mode-update-ge...

simonw 6 days ago||
For anyone from the Gemini team reading this: these links should all be prominent in the announcement posts. I always have to hunt around for them!
meetpateltech 6 days ago||
Google actually does something similar for major releases - they publish a dedicated collection page with all related links.

For example, the Gemini 3 Pro collection: https://blog.google/products/gemini/gemini-3-collection/

But having everything linked at the bottom of the announcement post itself would be really great too!

simonw 6 days ago||
Sadly there's nothing about Gemini 3 Flash on that page yet.
minimaxir 6 days ago||
Documentation for Gemini 3 Flash in particular: https://ai.google.dev/gemini-api/docs/gemini-3
zurfer 6 days ago||
It's a cool release, but if someone on the Google team reads this: Flash 2.5 is awesome in terms of latency and total response time without reasoning. In quick tests this model seems to be 2x slower, so for certain use cases, like quick one-token classification, Flash 2.5 is still the better model. Please don't stop optimizing for that!
edvinasbartkus 6 days ago||
Did you try setting thinkingLevel to minimal?

    thinkingConfig: { thinkingLevel: "minimal" }

More about it here: https://ai.google.dev/gemini-api/docs/gemini-3#new_api_featu...

zurfer 6 days ago||
Yes, I tried it with minimal, and it's roughly 3 seconds for prompts that take Flash 2.5 one second.

On that note, it would be nice to get these benchmark numbers for each of the different reasoning settings.
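
For what it's worth, a minimal way to time a one-shot call like this (google-genai Python SDK, model name assumed):

    import time
    from google import genai

    client = genai.Client()
    t0 = time.perf_counter()
    client.models.generate_content(
        model="gemini-3-flash-preview",  # assumed preview name
        contents="positive or negative: 'meh'",
    )
    print(f"{time.perf_counter() - t0:.2f}s")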

retropragma 6 days ago|||
That's more of a flash-lite thing now, I believe
Tiberium 6 days ago|||
You can still set thinking budget to 0 to completely disable reasoning, or set thinking level to minimal or low.
andai 6 days ago||
>You cannot disable thinking for Gemini 3 Pro. Gemini 3 Flash also does not support full thinking-off, but the minimal setting means the model likely will not think (though it still potentially can). If you don't specify a thinking level, Gemini will use the Gemini 3 models' default dynamic thinking level, "high".

https://ai.google.dev/gemini-api/docs/thinking#levels

Tiberium 6 days ago||
I was talking about Gemini 3 Flash, and you absolutely can disable reasoning, just try sending thinking budget: 0. It's strange that they don't want to mention this, but it works.
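
E.g. with the google-genai Python SDK (the preview model name is an assumption):

    from google import genai
    from google.genai import types

    client = genai.Client()
    resp = client.models.generate_content(
        model="gemini-3-flash-preview",  # assumed preview name
        contents="Sentiment, one word: 'the battery died in a day'",
        config=types.GenerateContentConfig(
            # thinking_budget=0 disables reasoning tokens entirely
            thinking_config=types.ThinkingConfig(thinking_budget=0),
        ),
    )
    print(resp.text)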
andai 6 days ago||
Gemini 3 Flash is in the second sentence.
throwaway127482 5 days ago||
See, this is what happens when you turn off thinking completely.
bobviolier 6 days ago||
This might also have to do with it being a preview, and only being available in the global region?
rohitpaulk 6 days ago||
Wild how this beats 2.5 Pro in every single benchmark. Don't think this was true for Haiku 4.5 vs Sonnet 3.5.
FergusArgyll 6 days ago|
Sonnet 3.5 might have been better than Opus 3. That's my recollection, anyhow.
hubraumhugo 6 days ago|
You can get your HN profile analyzed and roasted by it. It's pretty funny :) https://hn-wrapped.kadoa.com
onraglanroad 6 days ago||
I didn't feel roasted at all. In fact I feel vindicated! https://hn-wrapped.kadoa.com/onraglanroad
echelon 6 days ago|||
This is hilarious. The personalized pie charts and XKCD-style comics are great, and the roast-style humor is perfect.

I do feel like it's not an entirely accurate caricature (recency bias? limited context?), but it's close enough.

Good work!

You should do a "Show HN" if you're not worried about it costing you too much.

SubiculumCode 6 days ago|||
dang https://hn-wrapped.kadoa.com/dang
WhereIsTheTruth 6 days ago|||
This is exactly why you keep your personal life off the internet
apparent 6 days ago|||
Pretty fucking hilarious, if completely off-topic.
knicholes 5 days ago|||
That cut deep
peheje 6 days ago||
This is great. I literally "LOL'd".