
Posted by MallocVoidstar 13 hours ago

Gemini 3.1 Pro (blog.google)
Preview: https://console.cloud.google.com/vertex-ai/publishers/google...

Card: https://deepmind.google/models/model-cards/gemini-3-1-pro/

578 points | 723 comments
pawelduda 12 hours ago|
Is it safe to assume they'll be releasing an improved Gemini Flash soon? The current one is so good and fast that I rarely switch to Pro anymore.
derac 11 hours ago||
When 3 came out they mentioned (via an HN comment) that Flash included many improvements that didn't make it into Pro. I imagine this release includes those.
tucnak 10 hours ago||
Gemini 3 Pro (high) is a joke compared to Gemini 3 Flash in Antigravity, except it's not even funny. Flash is insane value, and super capable, too. I've had it implement a decompiler for very obscure bytecode, and it was passing all tests in no time. PITA to refactor later, but not insurmountable. Gemini 3 Pro (high) choked on this problem in the early stages... I'm looking forward to comparing 3.1 Pro vs 3.0 Flash, hopefully they have improved on it enough to finally switch over.
datakazkn 6 hours ago||
One underappreciated reason for the agentic gap: Gemini tends to over-explain its reasoning mid-tool-call in a way that breaks structured output expectations. Claude and GPT-4o have both gotten better at treating tool calls as first-class operations. Gemini still feels like it's narrating its way through them rather than just executing.
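To make the failure mode concrete: if a harness expects the model's response to be a bare JSON tool call, any narration before the payload breaks a strict parse. Below is a minimal sketch of that situation and a common fallback (the function name, tool names, and wire format are hypothetical illustrations, not Gemini's or any vendor's actual API):

```python
import json

def parse_tool_call(raw: str) -> dict:
    """Parse a response that is expected to be a bare JSON tool call.

    A strict parser fails when the model narrates before the payload;
    the fallback scans for the first balanced {...} block instead.
    (Naive sketch: assumes no braces inside JSON string values.)
    """
    try:
        # Happy path: the response is exactly one JSON object.
        return json.loads(raw)
    except json.JSONDecodeError:
        # Fallback: find the first balanced brace block and parse that.
        start = raw.index("{")
        depth = 0
        for i, ch in enumerate(raw[start:], start):
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    return json.loads(raw[start : i + 1])
        raise ValueError("no balanced JSON object found")

# A narrated response of the kind the comment describes:
narrated = ('Okay, I will now search the codebase. '
            '{"tool": "grep", "args": {"pattern": "main"}}')
call = parse_tool_call(narrated)
```

The fallback recovers the call, but it is exactly the kind of brittle workaround that disappears when a model treats tool calls as first-class structured output instead of narrating around them.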
carbocation 6 hours ago|
I agree with this; it feels like the model most likely to drop its high-level reasoning into code comments.
attentive 6 hours ago||
A lot of Gemini bashing. But Flash 3.0 with opencode is a reasonably good and reliable coder.

I'd rate it between Haiku 4.5 (also pretty good for the price) and Sonnet. Closer to Sonnet.

Sure, if I weren't cost-sensitive I'd run everything in Opus 4.6, but alas.

n4pw01f 6 hours ago||
I created a nice harness and visual workflow builder for my Gemini agent chains; it works very well. I did this so it would create code the way I do, i.e. very editable.

In contrast, the VS Code plugin was pretty bad, and did crazy things like mixing languages.

upmind 10 hours ago||
In my experience, while Gemini does really well in benchmarks, I find it much worse when I actually use the model. It's too verbose and doesn't follow instructions very well. Let's see if that changes with this model.
josalhor 12 hours ago||
I speculated that 3 pro was 3.1... I guess I was wrong. Super impressive numbers here. Good job Google.
refulgentis 12 hours ago|
> I speculated that 3 pro was 3.1

?

josalhor 11 hours ago||
Sorry... I speculated that 3 Deep Think was 3.1 Pro. Model names are confusing.
alwinaugustin 5 hours ago||
I use Gemini when I need to write something in my native language, Malayalam, or for translation. It works very well for writing in Indian regional languages.
markerbrod 12 hours ago||
Blogpost: https://blog.google/innovation-and-ai/models-and-research/ge...
Murfalo 11 hours ago||
I like to think that all these pelican riding a bicycle comments are unwittingly iteratively creating the optimal cyclist pelican as these comment threads are inevitably incorporated in every training set.
alpineman 11 hours ago|
More like half of Google's AI team is hanging out on HN, and they can optimise for that outcome to get a good rep among the dev community.
kridsdale3 10 hours ago|||
Hello.

(I'm not aware of anyone doing this, but GDM is quite info-siloed these days, so my lack of knowledge is not evidence it's not happening)

alpineman 9 hours ago||
Hello.

Please push internally for more reliable tool use across Gemini models. Intelligence is useless if it can't be applied :)

Barbing 11 hours ago|||
See: fish in bike front basket
impulser_ 12 hours ago|
Seems like they actually fixed some of the problems with the model. The hallucination rate seems much better. It also seems like they tuned the reasoning; maybe that's where they got most of the improvements.
whynotminot 11 hours ago|
The hallucination rate with the Gemini family has always been my problem with them. Over the last year they’ve made a lot of progress catching the Gemini models up to/near the frontier in general capability and intelligence, but they still felt very late 2024 in terms of hallucination rate.

Which made the Gemini models untrustworthy for anything remotely serious, at least in my eyes. If they’ve fixed this or at least significantly improved, that would be a big deal.

SubiculumCode 10 hours ago||
Maybe I haven't kept up with how ChatGPT and Claude are doing, but six months or so ago I thought Gemini was leading on that front.