ChatGPT Images 2.0 - Hacker News

Posted by wahnfrieden 2 days ago

System card: https://deploymentsafety.openai.com/chatgpt-images-2-0/chatg...

1037 points | 963 commentspage 6

szmarczak 2 days ago|

Wow, the difference between AI and non-AI images collapses. I hate the future where I won't be able to tell the difference.

Flere-Imsaho 2 days ago||

I wake up everyday, read the tech news, and usually see some step change in AI or whatever. It's wild to think I'm living through such a massive transformation in my lifetime. The future of tech is going to be so different from when I was born (1980), I guess this is how people born in 1900 felt when they got to see man land on the moon?

> Wow, the difference between AI and non-AI images collapses. I hate the future where I won't be able to tell the difference.

Image generation is now pretty much "solved". Video will be next. Perhaps things will turn out the same as chess: in that even though chess was "solved" by IBM's Deep Blue, we still value humans playing chess. We value "hand made" items (clothes, furniture) over the factory made stuff. We appreciate & value human effort more than machines. Do you prefer a hand-written birthday card or an email?

toraway 2 days ago|||

"Solved" seems a tad overstated if you scroll up to Simonw's Where's Waldo test with deformed faces plus a confabulated target when prompted for an edit to highlight the hidden character with an arrow.

Flere-Imsaho 2 days ago||

It's "solved" in that we have a way forward to reduce the errors down to 0.00001% (a number I just made up). Throwing more compute/time/money at these problems seems to reduce that error number.

abraxas 2 days ago||||

As someone born in 1975 I always felt until the last couple of years that I had been stuck in a long period of stagnation compared to an earlier generation. My grandmother who was born in the 1910s got to witness adoption of electricity, mass transit, radio, television, telephony, jet flights and even space exploration before I was born.

Feels like now is a bit of a catchup after pretty tepid period that was most of my life.

cubefox 2 days ago||

You will likely witness strongly superhuman AI, which dwarfs any changes your grandmother saw.

dag100 2 days ago|||

Chess exists solely for the sake of the humans playing it. Even if machines solved chess, people would rather play chess against a person than a machine because it is a social activity in a way. It's like playing tennis versus a person compared to tennis against a wall.

Photographs, videos, and digital media in general, in contrast, are used for much, much more than just socializing.

gekoxyz 2 days ago||

Well, for some of these images for the first time I can't tell that they are AI generated

rambojohnson 2 days ago||

Just tried it and got six fingers and half a thumb on a simple portrait. Mickey Mouse stuff.

thevinter 2 days ago||

Every time a new image gen comes out I keep saying that it won't get better just to be surprised again and again. Some of the examples are incredible (and incredibly scary. I feel like this is truly the point where understanding if something is AI becomes impossible)

lehmacdj 2 days ago|

So do you think there will be a better image model in a year?

throw310822 2 days ago|||

I'll bite: no I don't think so. If the examples are not cherry-picked and by "image model" we mean just the ability to generate pictures, this looks like parity with human excellence, there isn't much space for further improvement. The images don't just look real, they look tasteful- the model is not just generating a credible image, it's generating one that shows the talent of a good photographer/ designer/ artist.

Vachyas 2 days ago|||

I'm honestly unsure what could be improved at this point.

Consistency? So it fails less often?

Based on the released images, (especially the one "screenshot" of the Mac desktop) I feel like the best images from this model are so visually flawless that the only way to tell they're fake is by reasoning about the content of the image itself (ex. "Apple never made a red iPhone 15, so this image is probably fake" or "Costco prices never end in .96 so this image is probably fake")

thevinter 2 days ago|||

There is definitely room for improvement: https://gist.github.com/simonw/88eecc65698a725d8a9c1c918478a...

Especially when it comes to detailed outputs or non-standard prompts.

I do believe it will get even better - not sure it will happen within a year but I wouldn't be incredibly surprised if it did.

vunderba 2 days ago|||

Yep. “Where’s Waldo” has been a classic challenge for generative models for a while because it requires understanding the entire concept (there’s only one Waldo), while also holding up to scrutiny when you examine any individual, ordinary figure.

I experimented with the concept of procedural generation of Waldo-style scavenger images with Flux models with rather disappointing results. (unsurprisingly).

Vachyas 2 days ago||||

That's a good example, actually.

If you asked me what I expected, since this one has "thinking", it'd be that it would've thought to do something like generate the image without Waldo first, then insert Waldo somewhere into that image as an "edit"

throw310822 2 days ago|||

I wonder if at this point you could just ask the agent to iteratively refine the image in smaller portions.

RobinL 2 days ago||||

I'm been impressed when testing this model today, but it still can't consistently adhere to the following prompt: make me an image of a pizza split into 10 equal slices with space in between the them, to help teach fractions to a child.

It doesn't reliably give you 10 slices, even if you ask it to number them. None of the frontier models seem to be able to get this right

jinushaun 2 days ago||||

Cost? Speed?

vunderba 2 days ago|||

> I'm honestly unsure what could be improved at this point.

That's because you're focusing a little bit too much on visual fidelity. It's still relatively trivial to create a moderately complex prompt and have it fail miserably.

Even SOTA models only scored a 12 out of 15 on my benchmarks, and that was without me deliberately trying to "flex" to break the model.

Here's one I just came up with:

  A Mercator projection of earth where the land/oceans are inverted. (aka land = ocean, and oceans = land)

nickandbro 1 day ago||

I asked it to make me a xkcd comic:

https://chatgpt.com/s/m_69e8cc31dac48191a09bb9c00d5aa3fe

kinda funny, I guess

hersko 1 day ago|

Lol this is pretty funny

baalimago 2 days ago||

"Benchmarks" aside, do anyone actually use these image models for anything?

medlazik 2 days ago||

Look around? It's everywhere. Try talking to a graphic designer looking for a job theses days. Companies didn't wait for these tools to be good to start using them.

razorbeamz 2 days ago|||

Here in Japan every fucking food truck uses them for pictures of their menu, which really pisses me off because it's not representative of their food at all.

sumedh 2 days ago|||

People are using them for creating marketing material for their business.

croisillon 2 days ago||

MAGA to show how terrible Europe is ;)

PDF_Geek 2 days ago||

The free tier for ChatGPT feels pretty much nerfed at this point. I’m barely getting 10 prompts in before it drops me down to the basic model. The restrictions are getting ridiculous. Is anyone else seeing this?

etothet 2 days ago||

I would love to see prompt examples that created the images on the announcement page.

DauntingPear7 2 days ago|

You can by changing the view before the gallery

fsloth 1 day ago||

Do note the images will be sterilized and safe.

"Hey give me a comic of how to create a rocket engine i can build at home"

Unlimited creativity will be shackled by safety.

Still pretty amazing.

codebolt 2 days ago||

Anyone test it out for generating 2D art for games? Getting nano banana to generate consistent sprite sheets was seemingly impossible last time i tried a few months ago.

hersko 1 day ago|

I'm still looking for a free tool to convert images to 3d models well.

franze 2 days ago|

the tragedy of image generating ai is that it is used to massively create what already exists instead of creating something truly unique - we need ai artists - and yeah, they will not be appreciated

franze 2 days ago||

so yeah a smart move of openai would be to sponsor artists - provokant ones, junior ones, with nothing to lose - but that cell in the spreadsheet will be too small to register and will prop. never happen

weezing 1 day ago||

Why would we need AI artists tho?

More comments...