ChatGPT Images 2.0 - Hacker News

Posted by wahnfrieden 1 day ago

ChatGPT Images 2.0 (openai.com)

Livestream: https://openai.com/live/

System card: https://deploymentsafety.openai.com/chatgpt-images-2-0/chatg...

989 points | 880 comments | page 5

fsloth 6 hours ago|

Do note the images will be sterilized and safe.

"Hey give me a comic of how to create a rocket engine i can build at home"

Unlimited creativity will be shackled by safety.

Still pretty amazing.

samiwami 1 day ago||

do they have anything similar to SynthID, or are they just pretending that problem doesn't exist?

I know this is probably mega cherry-picked to look more impressive, but some of the images are terrifyingly realistic. They seem to have put a lot of effort into the lighting.

alextheparrot 1 day ago||

> Integrating an imperceptible, robust, and content-specific watermark

From the system card someone linked elsewhere in the discussion

ai-tamer 23 hours ago||

Zhao et al. 2023 showed any imperceptible watermark is provably removable by generative regeneration: pass the image through an img2img or VAE, the model reconstructs it visually identical but starts from a different latent. Watermark gone. SynthID and similar schemes do hold up well against normal sharing: recompression, crops, color tweaks, Twitter's pipeline. That covers most users. But the asymmetry is stuck — normally a GPU and a bit of motivation should be enough to strip it. Right? Got a tool to share? ;-)

ajam1507 18 hours ago|||

> do they have anything similar to SynthID, or are they just pretending that problem doesn't exist?

At least they aren't pretending that a solution exists.

Legend2440 1 day ago|||

I think we are just going to have to accept that realistic images can be easily fabricated now.

Seeing is not believing anymore, and I don't think SynthID or anything like it can restore that trust in images.

Barbing 14 hours ago|||

It's going to mess up accountability.

Some politician will be recorded doing something & he'll have his people release a thousand photos/videos of him doing crimes. And they'll say, look, it's a smear campaign.

This is just one stupid example, but people will have better schemes.

Also global coordinated releases of fake content and hypertargeted possibly abusive content. Virtual kidnappings will take off, automated & scaled.

userbinator 14 hours ago||

> Some politician will be recorded doing something & he'll have his people release a thousand photos/videos of him doing crimes. And they'll say, look, it's a smear campaign.

And his enemies will do the same, hopefully resulting in less blind trust for everyone in the population, which can only be a good thing.

Barbing 4 hours ago||

I would’ve paused image models for now until we’ve better educated our less-savvy neighbors.

pstuart 21 hours ago|||

Hopefully the arms race will balance out with improved AI image detection, but I can see how that will never be guaranteed to be reliable.

losvedir 18 hours ago|||

I feel like asking the image generators to mark AI images is the wrong way to go about it. It's like trying to maintain a blocklist. It seems better to me to have the major camera manufacturers or cell phones cryptographically sign their images as real.
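A minimal sketch of the sign-at-capture idea, for illustration only. Assumptions: the key name, the function names, and the sample bytes are all hypothetical, and an HMAC from the Python standard library stands in for the signature. A real scheme would more likely use an asymmetric signature held in secure hardware, so that verifiers never hold the secret.

```python
import hashlib
import hmac

# Assumption: a hypothetical per-device secret; in practice it would live in secure hardware.
DEVICE_KEY = b"hypothetical-per-device-secret"

def sign_image(image_bytes: bytes, key: bytes = DEVICE_KEY) -> str:
    """Return a keyed SHA-256 tag over the exact image bytes."""
    return hmac.new(key, image_bytes, hashlib.sha256).hexdigest()

def verify_image(image_bytes: bytes, tag: str, key: bytes = DEVICE_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, image_bytes, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

# Placeholder bytes standing in for a freshly captured photo.
photo = b"\xff\xd8 pretend JPEG bytes straight off the sensor"
tag = sign_image(photo)

print(verify_image(photo, tag))            # True: bytes are untouched
print(verify_image(photo + b"edit", tag))  # False: any change breaks the tag
```

Note that a tag like this only attests that the bytes are unchanged since the device produced them. It says nothing about what was in front of the lens.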

93po 18 hours ago||

I feel like this idea comes up often and in my opinion it doesn't solve anything. Take a picture of an AI image and you've made this approach useless. Which then goes to the argument of "well you'll see it's a picture of a picture" to which I will say there are plenty of ways to make this not appear so, and the ultimate form of this argument is that you can eventually project light directly into the photosensors, or otherwise hack the input between the photosensors and the rest of whatever digital magic that turns light into a JPG on your phone.

daemonologist 17 hours ago||

SynthID survives basic transforms including screenshots/photos, although it can of course be defeated. Even still it helps with the laziest fakes, which there seem to be a lot of - I've seen several quite widespread misinformative images over the past couple months that failed a SynthID check.

Anyways I think approaching the problem from both directions is probably good.

swingboy 21 hours ago||

Maybe a stupid question, but does the SynthID still exist if you screenshot and crop your generated image? What if you screenshot, rotate _just_ a bit, and crop? Or apply some other effect to the image like adjusting the coloring a little bit, adding some blur, etc.

alextheparrot 21 hours ago||

The paper they published last year goes over some of these transformations: https://arxiv.org/pdf/2510.09263

JimsonYang 20 hours ago||

> you can make your own mangas

No you can’t.

You still have the Studio Ghibli look from the video. The issue with generating manga was the quality of the characters; there are multiple programs for placing your frames.

But I am hopeful. If I put in a single frame, can it carry over that style for the next images? It would be game changing if a chat could have its own art style

Oras 21 hours ago||

My test for image models is asking it to create an image showing chess openings. Both this model and Banana pro are so bad at it.

While the image looks nice, the actual details are always wrong, such as showing pawns in wrong locations, missing pawns, etc.

Try it yourself with this prompt: Create a poster to show opening game for Queen's Gambit to teach kids to play chess.

lxgr 21 hours ago||

It almost nailed it for me (two squares have both white and black color). All pieces and the position look correct.

tempaccount5050 21 hours ago||

What move? Whose turn is it? Declined or accepted? Garbage in, garbage out.

bogtap82 21 hours ago|||

In some cases I would agree with this, but image model releases including this one are beginning to incorporate and market the thinking step. It is not a reach at this point to expect the model to take liberties in order to deliver a faithful and accurate representation of your request. A model could still be accurate while navigating your lack of specificity.

timacles 20 hours ago||||

Kasparov vs Karpov ‘87 Olympiad. Move 6

dudul 21 hours ago|||

What do you mean? Parent clearly describes the Queen's Gambit. 1.d4 d5 2.c4 There is no room for ambiguity here.
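For anyone who wants to check a generated poster against the real position, here is a rough sketch that plays 1.d4 d5 2.c4 on a plain 8x8 grid and prints the FEN piece-placement string. The helper names are made up for illustration, and it does no legality checking; it just moves pieces between squares.

```python
# Starting position, row 0 = rank 8; uppercase = White, lowercase = Black.
START = [
    list("rnbqkbnr"),
    list("pppppppp"),
    list("........"),
    list("........"),
    list("........"),
    list("........"),
    list("PPPPPPPP"),
    list("RNBQKBNR"),
]

def square(name):
    """Map a square like 'd4' to (row, col), with row 0 = rank 8."""
    col = ord(name[0]) - ord("a")
    row = 8 - int(name[1])
    return row, col

def apply_moves(moves):
    """Apply (source, destination) moves to a copy of the start position."""
    board = [row[:] for row in START]
    for src, dst in moves:
        r1, c1 = square(src)
        r2, c2 = square(dst)
        board[r2][c2] = board[r1][c1]
        board[r1][c1] = "."
    return board

def placement(board):
    """Return the FEN piece-placement field for the board."""
    ranks = []
    for row in board:
        out, empty = "", 0
        for cell in row:
            if cell == ".":
                empty += 1
            else:
                if empty:
                    out += str(empty)
                    empty = 0
                out += cell
        if empty:
            out += str(empty)
        ranks.append(out)
    return "/".join(ranks)

# 1.d4 d5 2.c4
QUEENS_GAMBIT = [("d2", "d4"), ("d7", "d5"), ("c2", "c4")]
print(placement(apply_moves(QUEENS_GAMBIT)))
# rnbqkbnr/ppp1pppp/8/3p4/2PP4/8/PP2PPPP/RNBQKBNR
```

So the poster should show white pawns on c4 and d4, a black pawn on d5, and everything else on its starting square.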

kuboble 17 hours ago||

King's Indian Defense would be a better prompt, as Queen's Gambit can now also refer to, e.g., a scene from the Netflix series.

nickandbro 7 hours ago||

I asked it to make me an xkcd comic:

https://chatgpt.com/s/m_69e8cc31dac48191a09bb9c00d5aa3fe

kinda funny, I guess

hersko 6 hours ago|

Lol this is pretty funny

RigelKentaurus 23 hours ago||

If every single image on their blog was generated by Images 2.0 (I've no reason to believe that's not the case), then wow, I'm seriously impressed. The fidelity to text, the photorealism, the ability to show the same character in a variety of situations (e.g. the manga art) -- it's all great!

vunderba 18 hours ago||

I decided to run gpt-image-2 on some of the custom comics I’ve come up with over the years to see how well it would do, since some of them are pretty unusual. Overall, I was quite impressed with how faithfully it adhered to the prompts, given that multi-panel stuff has to maintain a sense of continuity.

Was surprised to see it be able to render a decent comic illustrating an unemployed Pac-Man forced to find work as a glorified pie chart in a boardroom of ghosts.

https://mordenstar.com/other/gpt-2-comics

rambojohnson 11 hours ago||

Just tried it and got six fingers and half a thumb on a simple portrait. Mickey Mouse stuff.

mvkel 17 hours ago||

I wonder if this confirms version 1 of some kind of "world model."

It has an unprecedented ability to generate the real thing (for example, a working barcode for a real book)

ghstinda 7 hours ago|

Humans have a new tool to make porn.
