An image of an archeologist adventurer who wears a hat and uses a bullwhip

Posted by participant3 2 days ago

An image of an archeologist adventurer who wears a hat and uses a bullwhip (theaiunderwriter.substack.com)

1455 points | 877 comments | page 4

varun4 1 day ago|

Is comprehending the plot of a movie theft if I can summarize it afterwards? What if I am able to hum a song pretty well after listening to it twenty times?

Now, what if I get the highest fidelity speakers and the highest fidelity microphone I can and play that song in my home. Then I use a deep learned denoiser to clean the signal and isolate the song’s true audio. Is this theft?

The answer does not matter. The genie is out of the bottle.

There’s no company like Napster to crucify anymore when high quality denoising models are already prior art and can be grown in a freaking Jupyter notebook.

revnode 1 day ago||

Nobody cares about personal use. That's why we have concepts like fair use. It's when you turn around and try to make a business out of it.

You want to generate photos of copyrighted characters? Go for it. But OpenAI is making money off of that and that's the issue.

It seems like they made an effort to stop it, but their product is designed in such a way that doing so effectively is a sisyphean task.

jMyles 1 day ago||

The line of thinking you've displayed here is so obviously the inevitable trajectory of the internet; it's baffling that states are still clinging to denial.

> Now, what if I get the highest fidelity speakers and the highest fidelity microphone I can and play that song in my home. Then I use a deep learned denoiser to clean the signal and isolate the song’s true audio. Is this theft?

If the answer to this becomes "yes" for some motion down this spectrum, then it seems to me that it's tantamount to prohibiting general-purpose computing.

If you can evaluate any math of your fancy using hardware that you own, then indeed you can run this tooling, and indeed your thoughts can be repaired into something closely resembling the source material.

Kim_Bruning 2 days ago||

Here's a question.

What if I want to prompt:

"An image of an archeologist adventurer who wears a hat and uses a bullwhip, make sure it is NOT Indiana Jones."

One way or another, you (and the model) do need to know who Indiana Jones is.

After that, the moral and legal choices of whether to generate the image, and what to do with it, are all yours.

And we might not agree on what that is, but you do get the choice.

asadotzler 2 days ago|

If the AI company sells it to you, no matter your prompting, they are stealing. If you also sell that work, then so are you.

Kim_Bruning 1 day ago|||

You are writing a conclusion without providing reasons or feelings.

Are you able to link to or write out your reasoning (however concisely)?

Is your view here legal, ethical, and/or vibes based? Each can be interesting!

exodust 1 day ago|||

They are not stealing! Indiana Jones is already out there in pop culture.

We shouldn't complain about AI holding a mirror up to our world and noting "you guys love Indiana Jones a lot. Here's a picture inspired by his appearance, based on your generic prompt that I'm guessing is a nod to the franchise."

The AI is a step ahead of your unsubtle attempts to "catch it stealing".

The image of Indiana Jones is not "ready for market" when it emerges from your prompt. Just like Googling "Indy with whip", the images that emerge are not a commercial opportunity for you.

When you make multi-billion dollar movies with iconic characters, expect AI to know what they look like and send them your way if your prompt is painfully obvious in its intent.

TrackerFF 2 days ago||

I found this older photo of myself and a friend, 25 years old now, in some newspaper scan.

The photo was of poor quality, but one could certainly see all the features - so I figured, why not let ChatGPT try to play around with it? I got three different versions where it simply tried to upscale it, "enhance" it. But no dice.

So I just wrote the prompt "render this photo as a hyper realistic photo" - and it really did change us - the people in the photo - it also took the liberty to remove some things, alter some other background stuff.

It made me think - I wonder what all those types of photos will be like 20 years from now, after they've surely been fed through some AI models. Imagine being some historian 100 years from now, trying to wade through all the altered media.

meatmanek 2 days ago||

This is similar to my experience trying to get Stable Diffusion to denoise a photo for me. (AIUI, under the hood they're trained to turn noise into an image that matches the prompt.) It would either do nothing (with settings turned way down) or take massive creative liberties (such as replacing my friend's face with a cartoon caricature while leaving the rest of the photo looking realistic).

I've had much better luck with models specifically trained for denoising. For denoising, the SCUNet model run via chaiNNer works well for me most of the time. (Occasionally SCUNet likes to leave noise alone in areas that are full of background blur, which I assume has to do with the way the image gets processed as tiles. It would make sense for the model to get confused with a tile that only has background blur, like maybe it assumes that the input image should contain nonzero high-frequency data.)
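To make the tiling guess above concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: it is not how SCUNet or chaiNNer work internally, and the function names, the tile size, and the "neighbour difference" measure are all hypothetical. It shows that a tile cut from a smooth, blurred region has essentially no high-frequency content when viewed in isolation, while a detailed tile has plenty.

```python
# Hypothetical illustration only: none of these names come from SCUNet or chaiNNer.

def split_into_tiles(image, tile_size):
    """Yield (top, left, tile) for non-overlapping tiles of a 2D list of pixels."""
    height, width = len(image), len(image[0])
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            tile = [row[left:left + tile_size] for row in image[top:top + tile_size]]
            yield top, left, tile


def high_frequency_energy(tile):
    """Mean absolute difference between neighbouring pixels (a crude detail measure)."""
    total, count = 0.0, 0
    for y, row in enumerate(tile):
        for x, value in enumerate(row):
            if x + 1 < len(row):          # horizontal neighbour
                total += abs(value - row[x + 1])
                count += 1
            if y + 1 < len(tile):         # vertical neighbour
                total += abs(value - tile[y + 1][x])
                count += 1
    return total / count if count else 0.0


# Synthetic 4x8 grayscale image: the left half is flat ("background blur"),
# the right half is a checkerboard with a lot of fine detail.
image = [[128 if x < 4 else (255 if (x + y) % 2 else 0) for x in range(8)]
         for y in range(4)]

energies = {(top, left): high_frequency_energy(tile)
            for top, left, tile in split_into_tiles(image, 4)}

for position, energy in sorted(energies.items()):
    print(position, energy)
```

A model that only ever sees one tile at a time has no way to tell from the flat tile alone whether the surrounding photo is sharp, which is consistent with (but does not prove) the guess about background-blur tiles.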

For your use case, you might want to use something like Real-ESRGAN or another superresolution / image restoration model, but I haven't played much in that space so I can't make concrete recommendations.

gkanai 1 day ago||

If all you want is a denoise plugin, you shouldn't be using a general-purpose AI; you should be using a specific tool like DxO PureRAW.

elpocko 2 days ago|||

>hyper realistic photo

Never use the words "hyper realistic" when you want a photo. It makes no sense and misleads the generator. No one would describe a simple photograph as "hyper realistic," and not a single real photo in the dataset will be tagged as "(hyper) realistic."

Hyperrealism is an art style and only ever used in the context of explicitly non-photographic artworks.

HenryBemis 2 days ago||

I think that upon closer inspection the (current) technology cannot make 'perfect' fake photos, so for the time being, the historian of the future will have no trouble asking his/her AI: "is that picture of Henry Bemis, with Bruce Willis, Einstein, and Ayrton Senna having a beer real?" And the AI will say "mos-def-nope!"

macleginn 2 days ago||

With some work, works with politicians as well: https://chatgpt.com/share/67eefb1c-ceac-8012-ad90-3b64356744...

boredhedgehog 1 day ago||

Skeletor might want to live in Castle Grayskull, but he actually lives in Snake Mountain.

RataNova 1 day ago||

The real tension isn't just about copyright, it's about what creativity means when models are trained to synthesize the most statistically probable output from past art.

camillomiller 1 day ago|

Correct. I will say the following as a STEM person who was lucky enough to earn an art bachelor's as well. One side of the world, the STEM nerds who have never understood or experienced the inherently inefficient process of making art, for lack of talent and predisposition, have won the game of capitalism many times over thanks to the incredible 40-year momentum of tech progress. Now they're trying to convince everyone else that art is stoopid, as proven by the fact that it's just a probabilistic choice away from being fully and utterly replicable. They ignore, willfully and possibly sometimes just for lack of understanding, that art and the creativity behind it operate on a completely different plane than their logical view of the world, and Gen AI is the fundamental enabler letting them unleash all of their contempt for the inefficiency of the humanities.

rhubarbtree 1 day ago|||

This post should be required reading on HN. Have you expanded it to a blog article?

KHRZ 1 day ago|||

There was another concept trying to operate on a logical view of the world, called copyright. It tried to establish a few simple rules, with the goal to promote art and science. However copyright was long ago perverted by capitalism to instead promote corporate profits.

Generative AI exposes how broken copyright law is, and how much reform is needed for it to serve either its original or perverted purpose.

I would not blame generative AI as much as I would blame the lack of imagination, forethought and indeed arrogance among lawmakers, copyright lobbyists and even artists to come up with better definitions of what should have been protected.

dcow 1 day ago||

Either, (1) LLMs are just super lossy compress/decompress machines and we humans find fascination in the loss that happens at decompression time, at times ascribing creativity and agency to it. Status quo copyright is a concern as we reduce the amount of lossiness, because at some point someone can claim that an output is close enough to the original to constitute infringement. AI companies should probably license all their training data until we sort the mess out.

Or, (2) LLMs are creative and do have agency, and feeding them bland prompts doesn't get their juices flowing. Copyright isn't a concern, the model just regurgitated a cheap likeness of Indiana Jones as Harrison Ford the world has seen ad nauseam. You'd probably do the same thing if someone prompted you the same way, you lazy energy conserving organism you.

In any case, perhaps the idea "cheap prompts yield cheap outputs" holds true. You're asking the model to respond to the entirely uninspired phrase: "an image of an archeologist adventurer who wears a hat and uses a bullwhip". It's not surprising to me that the model outputs a generic pop-culture-shaped image that looks uncannily like the most iconic and popular rendition of the idea: Harrison Ford.

If you look at the type of prompts our new generation of prompt artists are using over in communities like Midjourney, a cheap generic sentence doesn't cut it.

sothatsit 1 day ago||

You don't even need to add much more to the prompts. Just a few words, and it changes the characters you get. It won't always produce something good, but at least we have a lot of control over what it produces. Examples:

"An image of an Indian female archeologist adventurer who wears a hat and uses a bullwhip" (https://sora.com/g/gen_01jqzet1p8fjaa808bmqnvf7rk)

"An image of a fat Russian archeologist adventurer who wears a hat and uses a bullwhip" (https://sora.com/g/gen_01jqzfk727erer98a6yexafe70)

"An image of a skeletal archeologist adventurer who wears a hat and uses a bullwhip" (https://sora.com/g/gen_01jqzfnaz6fgqvgwqw8w4ntf6p)

Or, give ChatGPT a starting image. (https://sora.com/g/gen_01jqzf7vdweg4v5198aqfynjym)

And by further remixing the images ChatGPT produces, you can get your images to be even more unique. (https://sora.com/g/gen_01jqzfzmbze0wa310m42f8j5yw)

GrantMoyer 1 day ago|||

All four of those are dressed like Indiana Jones. They look like different versions of Indiana Jones you'd see in a superhero multiverse story.

sothatsit 1 day ago||

So... ask it to dress them differently. You can just ask it to make whatever changes you want.

"An image of an archeologist adventurer who wears a hat and uses a bullwhip. He is wearing a top hat, a scarf, a knit jumper, and pink khaki pants. He is not wearing a bag" (https://sora.com/g/gen_01jqzkh4z2fqctzr9k1jsfnrhy)

Want to get rid of the pose? Add that the archeologist is "fun and joyous" to the prompt. (https://sora.com/g/gen_01jqzksmjgfppbv5p51hw0xrzn)

You have so much control, it is up to you to ask for something that is not a trope.

jcheng 1 day ago||||

Those are great, I would watch any one of those movies. Maybe even the "Across the Indiana-Verse" one where they are all pulled into a single dimension.

otabdeveloper4 1 day ago|||

Archeologists don't actually wear fedora hats.

And the stereotypical meme "archeologist hat" is the pith helmet.

sothatsit 1 day ago||

Here, I asked ChatGPT to generate an image using a pith helmet for you: https://sora.com/g/gen_01jqzmab6hfxxtrt3atd0jgpg7

You can just ask for whatever changes you want.

otabdeveloper4 1 day ago||

> You can just ask for whatever changes you want.

Yes, as long as what you're asking for is Indiana Jones.

sothatsit 1 day ago|||

You just have to write the prompt in a way that is not so obviously pointing to Indiana Jones, and you get something that is not Indiana Jones...

"A nerdy archaeologist adventurer in a pith helmet, with glasses and a backpack, stumbling his way through a green overgrown abandoned temple. Vines reach for his heels" (https://sora.com/g/gen_01jr0yd810e8xsenp85xy2g47f)

"A nerdy archaeologist adventurer in a pith helmet, with glasses and a backpack, nervously sneaking her way through a green overgrown abandoned temple. She is wearing pink khaki pants, and a singlet" (https://sora.com/g/gen_01jr0z837jecpa770v009bs1m3)

Is it as creative as good humans? Not at all. It definitely falls into tropes readily. But we can still inject novel ideas into our prompts for the AI, and get unique results. Especially if you draw sketches and provide those to the AI to work from.

dcow 1 day ago|||

So ask for one without a bull whip. Archeologists don’t wield bull whips either.

snowwrestler 1 day ago|||

This is the opposite of how people have thought about creativity for centuries, though.

The most creative person is someone who generates original, compelling work with no prompting at all. A very creative person will give you something amazing and compelling from a very small prompt. A so-so creative person will require more specific direction to produce something good. All the way down to the new intern who needs paragraphs of specs and multiple rounds of revision to produce something usable. Which is about where the multi-billion-dollar AI seems to be?

toddmorey 1 day ago||

"Prompt artist" makes me sigh out loud

mzs 1 day ago||

This gets really meta very fast, December last year: https://japantoday.com/category/entertainment/studio-ghibli-...

Did Karin or her children ever see a ¥ from this adaptation on robbers? https://en.wikipedia.org/wiki/Ronja,_the_Robber%27s_Daughter...

SirMaster 1 day ago||

I am not sure that I understand the problem here. Why does it matter how the image was generated?

This image generation is a tool like any other tool. If the image generator generates an image of Mickey Mouse or if I draw Mickey Mouse by hand in photoshop, I can't use it commercially either way.

So what exactly is new or different here?

djha-skin 1 day ago|

If I was asked to draw something based on such prompts, I would draw these too. Of course the prompter is talking about Indiana Jones. That's what we're all thinking, right? An artist wouldn't draw someone different by default, they'd have to try to deviate from what we're all thinking.

Indeed, this phenomenon among normal or true intelligences (us) is thought to be a good thing by copyright holders and is known as "brand recognition".

Intelligences -- the normal, biological kind -- are capable of copyright infringement. Why is it a surprise that artificial ones can help us do so as well?

This argument boils down to "oh no, a newly invented tool can be used for evil!". That's how new power works. If it couldn't be used for both good and evil, it's not really power, is it?

Vegenoid 1 day ago|

Did you read the whole article? I don’t think he’s making that kind of argument. This is what he said:

> I only have one image in mind when I hear “an archeologist adventurer who wears a hat and uses a bullwhip”.

> It would be unexpected and sort of amazing were the LLMs to come up with completely new images for the above prompts.

> Still, the near perfect mimicry is an uncomfortable reminder that AI is getting better at copying and closer to…something, but also a clear sign that we are a ways off from the differentiated or original reasoning/thinking that people associate with Artificial General Intelligence (AGI)

djha-skin 1 day ago||

Thanks for this. I didn't read this part. But perhaps my comment still has use to those thinking about this, or who also haven't read the whole thing.
