
Posted by davidbarker 9 hours ago

Nano Banana 2: Google's latest AI image generation model(blog.google)
467 points | 454 comments | page 3
runamuck 8 hours ago|
I saw an item for sale in an AliExpress video and thought, "Wow, they hired some really attractive actors to pitch their little gadget." Thirty seconds in, I realized they had used GenAI. Not because it looked AI, but because the production values looked too high and professional for the item. I would get in on this if you sell anything online.
arctic-true 7 hours ago||
One thing I notice is that the voices in video AI are absolute hogwash. Voice AI is great, video AI is great, but AI videos where humans speak give me the feel of really poorly dubbed foreign TV - the timing is not quite right and the facial expressions don’t always match up with the words being spoken.
coffeebeqn 8 hours ago||
They can even combine the models: create the presenters with Nano Banana, then use that as the reference for a video model and paste in your product.
monster_truck 6 hours ago||
Kind of surprised it hasn't been pulled yet. I've seen some very disturbing (Grok-tier) examples of completely bypassing whatever censors they have in place by simply asking Gemini to write the prompt.
MaxikCZ 7 hours ago||
I have Google AI Ultra. Where can I test this? They say it's in AI Studio, which says it's a paid model and that I need to set up billing (as if paying for Ultra isn't enough). They say it's available in Antigravity, but I can't seem to find it there.
Sevii 6 hours ago|
Works for me with a Pro sub at https://gemini.google.com/app
pietz 9 hours ago||
I'm officially done with the Nano Banana name. It was fun, but can we go back to just calling it Gemini Image?
bonoboTP 9 hours ago||
Name recognition has big value. People remember what an advancement the first banana was. Nowadays it's no longer so unique; ChatGPT's and Grok's image editors are also strong.
PunchTornado 7 hours ago||
I really like it. Nano banana is like the best product name in AI.
aliljet 9 hours ago||
I really, really want to see how these images are starting to form into videos. The stills are clearly getting better and better, but what about when you need them to organically conform to a keyed script?
Mizza 9 hours ago||
Check out Seedance 2: https://seed.bytedance.com/en/seedance2_0

Nano Banana was technically impressive the first time, but after Seedance it isn't really anymore. It's all just an internet-pollution machine anyway.

rany_ 8 hours ago||
The page looks promising but how can I try it out?
rabf 7 hours ago||
They have an API.
progbits 9 hours ago|||
I'm seeing more and more AI video memes and they are getting really good. It's still just a bunch of short clips, and long shots aren't working well enough, but typical Hollywood movies use few-second cuts anyway, so this is almost good enough to make a Marvel fanfic.
vessenes 9 hours ago||
The workflow right now would be to take these images, make a sequence of them for key "shots", and send them to an I2V (image-to-video) model. LTX-2 is the model the r/stablediffusion folks are playing with right now, but there are a fair few.
meowface 9 hours ago||
How does it compare to Nano Banana Pro?
divan 3 hours ago||
What's annoying about Nano Banana is how bad the experience is when you try to iterate or, especially, repeat the same task on another photo. After the second or third image it starts randomly answering with complete nonsense like "I'm just a language model and can't assist with that" or "I can't do that" (with exactly the same prompt that had no issues two photos in a row in the same chat).

It also gaslights me when I point out an error. I tried to create a cartoon portrait of the person from one photo and use the background from another photo. It got the order of the photos wrong. I provided filenames and explicitly told it which one was for the person and which for the background. It generated it wrong again, and all attempts to explain the mistake were met with "No, it's YOU who's incorrect". So frustrating.

vessenes 9 hours ago||
Interesting that they get to rev this with the release of a new flash model. I'm speculating that part of the distill pipeline includes the image-gen stuff; if true, that seems like internal tooling that will pay dividends over time: new frontier model -> automatic new image model. Even if it's just incremental updates, it's good for both the product cadence and compounding improvements.
WarmWash 9 hours ago||
The confusion here is dense: 3.1 Flash Image is not 3.1 Flash.

The banana (image) models are different from the mainline models, but they confusingly share the same naming scheme.

NitpickLawyer 8 hours ago||
> the distil pipeline

I don't have inside info, but everything we've seen about Gemini 3.0 makes me think they aren't doing distillation for their models. They are likely training different archs/sizes in parallel. Gemini 3.0 Flash was better than 3.0 Pro on a bunch of tasks; that shouldn't happen with distillation. So my guess is that they work in parallel on different arches, try stuff out on -flash first (since it's smaller and faster to train), and then apply the learnings to -pro training runs. (The same thing kinda happened with 2.5 Flash, which got better upgrades than 2.5 Pro at various points last year.) Ofc I might be wrong, but that's my guess right now.

vessenes 3 hours ago||
Interesting. Whatever they are doing, it's a bit different from Anthropic and OpenAI, which is good for the consumer. I'm curious about their ML Ops internally; it would be fascinating to learn more.
minimaxir 9 hours ago||
Google updated it early in AI Studio so I've been experimenting:

- Base pricing for a 1024x1024 image is about 1.7x that of normal Nano Banana ($0.067 vs. $0.039); however, you can now get a 512x512 image for less, or a 4K image for less than the cost of four 1K images: https://ai.google.dev/gemini-api/docs/pricing#gemini-3.1-fla...

- Thinking is now configurable between `Minimal` and `High` (was not the case with Nano Banana Pro)

- Safety filtering appears stricter, so typical copyright-infringing/NSFW content is difficult to generate (it refused to let me generate cartoon characters having taken psychedelics)

- Generation speed is really slow (2-3min per image) but that may be due to load.

- Prompt adherence on my trickier prompts from my Nano Banana Pro testing (https://minimaxir.com/2025/12/nano-banana-pro/) is much worse, unsurprisingly. For example, I asked it to make a 5x2 grid from 10 given inputs and it keeps making 4x3 grids with duplicate inputs.

However, I am skeptical of their marquee feature: image search. Anyone who has used Nano Banana Pro for a while knows that it strongly overfits on any input images, copy/pasting the subject without changes, which is bad for creativity, and I suspect this implementation behaves the same.

Additionally I have a test prompt which exploits the January 2025 knowledge cutoff:

    Generate a photo of the KPop Demon Hunters performing a concert at Golden Gate Park in their concert outfits.
That still fails even with Grounding with Google Search and Image Search enabled, and more charitable variants of the prompt.

tl;dr: the example images (https://deepmind.google/models/gemini-image/flash/) seem similar to Nano Banana Pro, which is indeed a big quality improvement, but even relative to base Nano Banana it's unclear if it justifies a "2" subtitle, especially given the increased cost.

shostack 8 hours ago||
The pricing changes are interesting. I wonder if at some point they will deprecate the less expensive model to increase their margins.

Original Nano Banana (gemini-2.5-flash-image): $0.039 per image (up to 1024×1024px)

Nano Banana 2 (gemini-3.1-flash-image-preview): $0.045 per 512px image, $0.067 per 1K (1024×1024) image, $0.101 per 2K image, $0.151 per 4K image

Nano Banana Pro (gemini-3-pro-image-preview): $0.134 per 1K/2K image, $0.240 per 4K image

So at the most common 1K resolution, NB2 is ~72% more expensive than the original NB ($0.067 vs $0.039), but still half the price of NB Pro ($0.134).
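The ratios quoted above can be checked directly from the listed per-image prices. A quick sketch using the numbers from this thread (preview pricing, which may change; not an official calculator):

```python
# Per-image prices (USD) as quoted in this thread.
NB1_1K = 0.039                                              # gemini-2.5-flash-image, up to 1024x1024
NB2 = {"512": 0.045, "1k": 0.067, "2k": 0.101, "4k": 0.151}  # gemini-3.1-flash-image-preview
PRO_1K = 0.134                                              # gemini-3-pro-image-preview, 1K/2K

markup_vs_nb1 = NB2["1k"] / NB1_1K - 1   # ~0.72, i.e. ~72% more expensive
share_of_pro = NB2["1k"] / PRO_1K        # 0.5, i.e. half of Pro's price
native_4k = NB2["4k"]                    # one native 4K render
tiled_4k = 4 * NB2["1k"]                 # four separate 1K renders instead

print(f"NB2 vs original NB at 1K: +{markup_vs_nb1:.0%}")       # +72%
print(f"NB2 as a share of Pro at 1K: {share_of_pro:.0%}")      # 50%
print(f"native 4K ${native_4k:.3f} vs 4x 1K ${tiled_4k:.3f}")  # $0.151 vs $0.268
```

So the "~72% more expensive, still half of Pro" summary checks out, and a native 4K render is indeed cheaper than tiling four 1K renders.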

arctic-true 7 hours ago|||
They may be victims of their own success here. At a certain point, if you can consistently make perfect images indistinguishable from reality, you’re done improving. All that’s left to do is make it faster or cheaper or better-aligned - but these aren’t going to show up readily in ways the typical user can understand.
sheept 9 hours ago||
For your knowledge-cutoff test, did failing mean that it generated a generic "KPop demon hunter", or that it rejected the prompt?
minimaxir 9 hours ago||
A generic "KPop demon hunter". Nano Banana 2 at least has fun with it, though.
riteshyadav02 9 hours ago|
Would be interesting to see latency vs quality tradeoffs here. Are they targeting consumer-facing generation speed or prioritizing fidelity for professional workflows?