Posted by SweetSoftPillow 1 day ago

Nano Banana image examples(github.com)
541 points | 243 comments | page 2
rimmontrieu 1 day ago|
Impressive examples but for GenAI it always comes down to the fact that you have to cherry pick the best result after so many failed attempts. Right now, it feels like they're pushing the narrative that ExpectedOutput = LLM(Prompt, Input) when it's actually ExpectedOutput = LLM(Prompt, Input) * Takes where Takes can vary from 1 to 100 or more
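
One way to put a rough number on "Takes" is sketched below. It assumes, purely for illustration, that every generation is independent and is acceptable with the same probability p_success; neither assumption comes from the thread, and the function names are hypothetical. Under that model the number of takes follows a geometric distribution with mean 1 / p_success.

```python
import random


def takes_until_acceptable(p_success, rng, max_takes=1000):
    """Count generations until one is accepted.

    Assumes independent takes with a fixed acceptance probability,
    which is a simplification rather than a measured property.
    """
    for take in range(1, max_takes + 1):
        if rng.random() < p_success:
            return take
    return max_takes  # gave up: no acceptable result within the cap


def expected_takes(p_success):
    """Mean of a geometric distribution with success probability p_success."""
    return 1.0 / p_success


if __name__ == "__main__":
    rng = random.Random(0)
    trials = [takes_until_acceptable(0.25, rng) for _ in range(20000)]
    print("simulated mean takes:", sum(trials) / len(trials))
    print("analytic mean takes:", expected_takes(0.25))
```

With an acceptance rate of one in four, the model predicts four takes on average, though individual runs can take far more.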
antiraza 9 hours ago||
Why is that a bad thing, or even a non-expected thing? If you pick up a paintbrush, you don't always nail each stroke on the canvas -- just because it's programmatic doesn't mean it should be like a calculator.

LLMs and image generators are cross pollinating human language and human visual information -- both really fuzzy mediums.

I think learning how to 'use this instrument' and 'finding the perfect brush stroke' are part of how they are supposed to work (at least in their current form). I also don't know that just because they are showing good outputs from the inputs that this is framing the narrative as one-and-done... I think the rest of the owl is kind of implied.

raincole 1 day ago|||
ML researchers have been using Top-5 accuracy for quite a long time, especially when it comes to computer vision.

Of course it's a ridiculous index in most use cases (like in a self-driving car. Your 4th guess is that you need to brake? Cool...). But somehow people in ML normalized it.
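
For readers unfamiliar with the metric, here is a minimal sketch of how top-k accuracy is usually computed: a sample counts as correct if the true label appears anywhere in the model's k highest-ranked guesses. The function name and the toy labels are made up for illustration.

```python
def top_k_accuracy(ranked_predictions, true_labels, k=5):
    """Fraction of samples whose true label is among the first k guesses.

    ranked_predictions: one list of labels per sample, best guess first.
    true_labels: the correct label for each sample, in the same order.
    """
    hits = sum(
        1
        for guesses, truth in zip(ranked_predictions, true_labels)
        if truth in guesses[:k]
    )
    return hits / len(true_labels)


if __name__ == "__main__":
    # Toy data: in the first sample "brake" is only the 4th guess.
    ranked = [
        ["coast", "turn", "honk", "brake", "stop"],
        ["brake", "coast", "turn", "honk", "stop"],
    ]
    truth = ["brake", "brake"]
    print("top-1:", top_k_accuracy(ranked, truth, k=1))
    print("top-5:", top_k_accuracy(ranked, truth, k=5))
```

The toy data scores perfectly at k=5 even though the first sample only gets "brake" as its fourth guess, which is the gap the comment is pointing at.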

vunderba 1 day ago||
That's why I always record the number of rolls it takes to get to an acceptable result on my GenAI Comparison site for each model - it's a broad metric indicating how much you have to fight to steer the model in the right direction.
foobarbecue 1 day ago||
Man, I hate this. It all looks so good, and it's all so incorrect. Take the heart diagram, for example. Lots of words that sort of sound cardiac but aren't ("ventricar," "mittic"), and some labels that ARE cardiac, but are in the wrong place. The scenes generated from topo maps look convincing, but they don't actually follow the topography correctly. I'm not looking forward to when search and rescue people start using this and plan routes that go off cliffs. Most people I know are too gullible to understand that this is a bullshit generator. This stuff is lethal and I'm very worried it will accelerate the rate at which the populace is getting stupider.
zahlman 1 day ago|
> Most people I know are too gullible to understand that this is a bullshit generator.

I'm more worried about the cases that aren't trying to be info diagrams. There's all this "safety" discourse around not letting people generate NSFW, and around image copyrights etc. but nobody talks about the potential to use things like #11 for fraud. "Disinformation" always gets approached from a political angle instead of one of personal gain.

twaldecker 1 day ago||
One thing that couldn't be done is transparent background. The model just generates the pattern in the background. Not real alpha channel transparency. You can even see artifacts in the pattern.
zahlman 1 day ago||
The training data is presumably full of examples of people using the pattern to indicate transparency (and explaining that they do so — like the input for 50!), and much less of people actually creating such images (if the training data even preserves the alpha channel in the first place).

I think a bigger problem is the "artifacts" you describe (worse than that sounds to me).

lifthrasiir 1 day ago||
Yeah, mangled checkerboard patterns are common when prompted to "remove" the background. It can be worked around by generating multiple images with only the background color varying (e.g. black and white) and reconstructing the alpha channel from their difference, as the model generally prefers to just copy and paste when no other prompts override that preference.
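
The black/white workaround can be sketched per pixel as below. This is an illustration rather than lifthrasiir's actual code: it assumes the two renders are pixel-aligned and differ only in background, uses standard "over" compositing (composite = alpha * fg + (1 - alpha) * bg), and the function name recover_rgba is hypothetical.

```python
def recover_rgba(on_black, on_white):
    """Recover (r, g, b, a) for one pixel from two renders of the same image.

    on_black / on_white: (r, g, b) tuples with channels in 0..255, taken from
    the image generated on a black and on a white background respectively.
    """
    # white - black = (1 - alpha) * 255 on every channel, so average the
    # channel differences to smooth out small generation noise.
    diffs = [w - b for b, w in zip(on_black, on_white)]
    mean_diff = sum(diffs) / len(diffs)
    alpha = min(1.0, max(0.0, 1.0 - mean_diff / 255.0))
    if alpha == 0.0:
        # Fully transparent: the foreground colour cannot be recovered.
        return (0, 0, 0, 0)
    # On black the composite is alpha * fg, so divide to undo it.
    fg = tuple(min(255, round(c / alpha)) for c in on_black)
    return fg + (round(alpha * 255),)


if __name__ == "__main__":
    # A pixel of colour (200, 100, 50) at 60% opacity:
    print(recover_rgba((120, 60, 30), (222, 162, 132)))
```

In practice the two generated images will not match exactly, so the recovered alpha is only as good as the model's tendency to copy the foreground unchanged.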
filoeleven 1 day ago||
“Just do more manual work and waste even more energy so you can take yet another manual step and finally get what you wanted.” A real time-saver, that.
lifthrasiir 1 day ago||
Depends on the actual usage, of course. It is indeed not good when you are doing quick editing.
Animats 1 day ago||
I have two friends who are excellent professional graphic artists and I hesitate to send them this.
metaphor 1 day ago||
My wife is a professional graphic artist and I sent it to her without hesitation...if only for the awareness.
kertoip_1 1 day ago|||
I think it might be the same as with programmers. It might look like AI agents can do all the programming, but when you actually try to use them to do things, they quickly turn out to be not so reliable.
raincole 1 day ago|||
Given Case 16, they might switch to a career making scientific diagrams.
SweetSoftPillow 1 day ago||
They better learn it today than tomorrow, even though it might be painful for some who do not like to learn new tools and explore new horizons.
mitthrowaway2 1 day ago|||
Maybe they're better off switching careers? At some point, your customers aren't going to pay you very much to do something that they've become able to do themselves.

There used to be a job people would do, where they'd go around in the morning and wake people up so they could get to work on time. They were called "knockers-up". When the alarm clock was invented, these people didn't lose their jobs to other knockers-up with alarm clocks; they lost their jobs to alarm clocks.

non_aligned 1 day ago||
A lot of technological progress is about moving in the other direction: taking things you can do yourself and having others do it instead.

You can paint your own walls or fix your own plumbing, but people pay others instead. You can cook your food, but you order take-out. It's not hard to sew your own clothes, but...

So no, I don't think it's as simple as that. A lot of people will not want the mental burden of learning a new tool and will have no problem paying someone else to do it. The main thing is that the price structure will change. You won't be able to charge $1,000 for a project that takes you a couple of days. Instead, you will need to charge $20 for stuff you can crank out in 20 minutes with gen AI.

GMoromisato 1 day ago||
I agree with this. And it's not just about saving time/effort--an artist with an AI tool will always create better images than an amateur, just as an artist with a camera will always produce a better picture than me.

That said, I'm pretty sure the market for professional photographers shrank after the digital camera revolution.

AstroBen 1 day ago|||
I don't know if "learning this tool" is gunna help..
mustaphah 1 day ago||
In a side-by-side comparison with GPT-4o [1], they are pretty much on par.

[1] https://github.com/JimmyLv/awesome-nano-banana

thisOtterBeGood 1 day ago|
Wow. GPT-4o is exceptionally biased towards pop culture. Harry Potter, Aragorn, a Sith lord, Elon Musk...
aussiegreenie 18 hours ago||
I am not very good with graphics. Yesterday, I used nano Banana to create an image for the front cover of a report. It took about 5 minutes. Normally, I would have spent at least an hour and still would not have gotten as good an image.

The Gemini models save me about an hour a day.

throwaway2037 1 day ago||
Does anyone else cringe when they see so many examples of sexualised young women? Literally, Case 1/B has a woman lifting up her skirt to reveal her underwear. For an otherwise very impressive model, you are spoiling the PR with this kind of immature content. Sheesh. I guess that confirms it: I am a grumpy old man! I count 26 examples with young women, and 9 examples with men. The only thing missing was "Lena": https://en.wikipedia.org/wiki/Lenna
shermantanktop 1 day ago||
My first reaction was the same, before I even knew what these demos represented. And of course I too am a grumpy old man.
yomismoaqui 1 day ago|||
Sex drives technology (even if we don't like it)

VHS, online payments, video streaming... As the old song says, "the internet is for porn"

GNaLVEre 1 day ago|||
I had to scroll down way too long for someone to point this out. It's messed up how casually racialised all these image gen examples are towards young Asian women.
ants_everywhere 1 day ago|||
wait until you learn what prehistoric sculptors spent their time carving

I read your comment before checking the site and then I saw case one was a child followed by a sexy maid and I thought "oh no dear god" before I realized they weren't combining them into a single image.

AlecSchueler 1 day ago||
> wait until you learn what prehistoric sculptors spent their time carving

Careful not to project your own ideas onto prehistoric sculpture.

ants_everywhere 1 day ago||
the archeological evidence is rather consistent and clear. I'm aware of critiques trying to change the interpretation of what the female figures are for, but nobody denies that they are naked female figures. And the critiques don't seem to have found much purchase among archeologists.
AlecSchueler 1 day ago||
> the archeological evidence is rather consistent and clear.

What are you referring to?

> but nobody denies that they are naked female figures.

No, but the suggestion above that they were the prehistoric equivalent to cartoons of school girls lifting their skirts hasn't been the dominant theory for about thirty years.

> And the critiques don't seem to have found much purchase among archeologists.

This is simply incorrect. They became part of the general archeological discourse as far back as the 1990s and are now a normal part of any such discussion. Multiple theories now coexist and to frame those critical of the original Venus ideas as being somehow more fringe than the fertility/pornography theories is just misleading.

ants_everywhere 1 day ago||
I assume you're referring to the Catherine McCoid decolonizing gender stuff from the 90s? That is still talked about, but I'm not aware of it being taken seriously as a theory.

There are multiple theories yes, but they aren't substantially varied.

We also have a whole lineage of art from the prehistoric age to today and more figures than we did in the 1990s. Art from every period includes nude representations of women. The more recent art (which we are able to say more about) has connections to goddesses and fertility/reproduction/sex. The continuity of art suggests there should be a continuity of explanation. But the McCoid theory handles the oldest art as a special case different in kind from art that came not long after.

Even among the competing hypotheses, they're more closely related than many people realize. This is because religion, sex and fertility were more closely related in the ancient world than they are today. See, for example, temple prostitution.

The one outlier among the current theories I'm aware of is that the figures are supposed to show you what obese people look like. The evidence for that isn't great. For example the 2012 Dixson paper is based on having college students rate the statues for attractiveness, which seems like it's going to tell you nothing useful about the statues. But even they say the statues were about survival and reproduction, e.g.

> They may, instead, have symbolized the hope for survival and for the attainment of a well-nourished (and thus reproductively successful) maturity, during the harshest period of the major glaciation in Europe.

AlecSchueler 19 hours ago||
> I assume you're referring to the Catherine McCoid decolonizing gender stuff from the 90s?

Amongst others.

> That is still talked about, but I'm not aware of it being taken seriously as a theory.

I'm not sure what to say to this because you're essentially arguing that your own ignorance is representative of the reality in the field. You recognise that these questions have been part of the discourse now for a third of a century but at the same time suggest it's all done in jest? I really don't know how to read this.

> We also have a whole lineage of art from the prehistoric age to today

We very much do not. There are many gaps, especially significant ones in pre-history and you're skipping multiple millennia to stretch a connection to temple prostitution, as well as ignoring the very clearly evident variation in the representations of women more recently across geographies.

> Even among the competing hypotheses...

Well we can end it here because the salient point is that pornographic representations of women is no longer the dominant theory and you seem to accept that.

krapp 1 day ago|||
I mean, what do you think the most common application of AI image generation is going to be?
HeartStrings 1 day ago|||
[flagged]
FearNotDaniel 1 day ago||
[flagged]
sunaookami 1 day ago||
>normalising paedophilia these days

These arguments are so tiring, always arguing in bad faith. It's government-level "think of the children" arguments when it's about a simple drawing.

Jackson__ 1 day ago||
I wish open source models would go this route of quality. Instead, every single release since and including flux dev has had some of the worst AI look I've seen so far. Sure these models might produce less mangled bodies, but in terms of actual aesthetics they lag behind even SD1.5 while needing >10x the amount of parameters.
mohsen1 1 day ago||
I'm furnishing a new apartment and Nano Banana has been super useful for placing furniture I want to purchase in rooms to make a judgment if things will work for us or not. Take a picture of the room, feed Nano Banana with that picture and the product picture and ask it to place it in the right location. It can even imagine things at night or even add lamps with lights on. Super useful!
eig 1 day ago|
While I think most of the examples are incredible...

...the technical graphics (especially text) are generally wrong. Case 16 is an annotated heart and the anatomy is nonsensical. Case 28 with the tallest buildings has decent images, but has the wrong names, locations, and years.

vunderba 1 day ago||
Yeah I think some of them are really more proof of concept than anything.

Case 8 Substitute for ControlNet

The two characters in the final image are VERY obviously not in the instructed set of poses.

SweetSoftPillow 1 day ago||
Yes, it's a Gemini Flash model, meaning it's fast and relatively small and cheap, optimized for performance rather than quality. I would not expect mind-blowing capabilities in fine details from this class of models, but still, even in this regard this model is sometimes just surprisingly good.