Posted by davidbarker 8 hours ago
Two prompts I'd consider "interesting" for image-gen testing. It did pretty well.
"A macro close-up photograph of an old watchmaker's hands carefully replacing a tiny gear inside a vintage pocket watch. The watch mechanism is partially submerged in a shallow dish of clear water, causing visible refraction and light caustics across the brass gears. A single drop of water is falling from a pair of steel tweezers, captured mid splash on the water's surface. Reflect the watchmaker's face, slightly distorted, in the curved glass of the watch face. Sharp focus throughout, natural window lighting from the left, shot on 100mm macro lens." - The only major problem I could find at a glance is that the clasps probably don't make sense, and the drop of water inside the watch on the cog doesn't make sense (the cog is mangled into the tweezers).
"A candid photograph taken from behind an elderly woman sitting alone on a park bench in late autumn. She is gently resting one hand on the empty seat beside her, where a man's weathered flat cap and a folded newspaper sit untouched. Fallen golden leaves cover the path ahead. The low afternoon sun casts her long shadow alongside a second, fainter shadow that almost seems to be there, the suggestion of someone sitting next to her, visible only in the light on the ground. Muted, warm color palette, shallow depth of field on the background trees, photojournalistic style." - I don't know why, but it internal-errored twice on this one before getting there.
I use all those fancy image models' editing capabilities for my fast-fashion web shop. I must say: product photography for clothing and accessories is dead. These models are amazing at style transfer and garment transfer.
We'll see how good the full version of Seedream 5.0 will be.
You can argue things like code generation are an extension of the engineer wielding it. Image generation just seems like a net negative overall if it’s used at scale.
Edit: By scale, I mean large corporations putting content in front of millions. I understand the appeal for smaller businesses where they probably weren’t going to pay an artist anyway.
When a company sends an email or a DocuSign, they don't want to pay a courier.
Technology supplements or replaces jobs, often reducing costs. This is no different.
It's an ethical conundrum because we're not paying anyone, but we don't have the money to pay anyone, and it's good enough for our budget.
But we're getting used to the process of changing part of the text in a few seconds, without any artist involved, and for $0.
I guess that soon we'll be able to create voice samples from known personalities for a few dollars, with prices based on the popularity of the artist and some sanity checks based on the artist's preferences.
My thought is that the large corps that could afford it still won't, because it's a cost they don't need to incur. For them it's not even a moral conundrum.
Much like the star-bellied Sneetches: when the quality of some ad format becomes untethered from the cost of production and placement, marketers will flock to some alternative.
YouTube influencers fill[ed] that niche for a while, because content-milling SEO spam and fake reviews are a lot more expensive if you present the results in video form with good production values. (Not sure how long that will be true, since AI is getting better at short-form video.)
This is like the last mile for online presence. The average barber out here doesn't use Squarespace, barely knows how to use Facebook, and doesn't touch GenAI. But they can still cut your hair pretty well; tech-savviness doesn't have a huge connection to business competence out here.
The average person won't notice, and wouldn't care either way.
Things that would take me an hour or so the old way take three minutes with NB.
But I can see this applying to small businesses. Something that some random person would have to spend an hour photoshopping can be done in a few minutes with NB.
Larian Studios was most recently under fire for this [1]. I can see a director going "what would X look like?" and then heading over to the concept artists for a proper rendition if they liked it. I don't think this happens at scale, though. Any large business is just going to get rid of the concept artists.
[1]: https://www.pcgamer.com/games/rpg/baldurs-gate-3-developer-l...
I'm torn on the scale thing. It definitely seems net negative. But I think we collectively underestimate just how deeply sick the existing thing already is. We're repulsed by image gen at scale because it breaks our expectation that images are at least somewhat based on reality, that they reflect the natural world or what we can really expect from a product, from a company, from the future. But that was already a bad expectation: when's the last time you saw a McDonald's meal that looked like the advert? Or a sub-$30 Amazon product that wasn't a complete piece of shit? Advertisements were already actively malicious fantasies designed to exploit the way our brains react to pictures. They were just fantasies that required whole teams of humans doing weird bullshit with lighting and Photoshop, and I'm not sure that's much better. It was already slop. All the grieving we do about the loss of truth, or the extent to which corps will gleefully spray us with mind-breaking waterfalls of outright lies: I think those ships sailed a long time ago. The disgust, the deceit, the rage we feel about genAI slop is the way we should have felt about all commercials since at least the 80s, IMO.
This is a good point. My gut reaction is "well, at least someone was paid to do it and can continue to keep society/the economy going".
I can see the other side, where that's a soulless job. Not sure which is worse: a soulless job where your skills apply, or even fewer jobs in a competitive industry.
You could easily say the same about any time computers or robots or automation have taken a job away. We've been going down this road for decades.
Why can't Google, for example, just call:
Gemini Image = Nano Banana
Gemini Video = Veo
...and not a (botched) fake white/gray grid background of the kind commonly used to visualize transparency?
I guess even Google is running out of GPUs.
My main use case is editing user uploads to enhance their clothing images. A large part of it is preserving logos, graphics, and other technical details. Over time it felt like Nano Banana had gotten worse at this.
I have a test set of graphic t-shirts, and I noticed the model seemingly getting worse on it. That, combined with the price and the terrible experience of their cloud console, got me to migrate off.
EDIT: after significant prompting, it actually solved it. I think it's the first one to do so in my testing.
Pretty close to Gemini 3 Pro Image (aka Nano Banana Pro) in most benchmarks, even without thinking+search, and it even exceeds it in the two most important ones, 'Overall Preference' and 'Visual Quality'. I'm excited about the big jump in Infographics/Factuality (even without thinking+search; I'm surprised that text+image search grounding doesn't make an even bigger dent).