Zuckerberg 'Personally Authorized and Encouraged' Meta's Copyright Infringement

Posted by spankibalt 4 days ago

Zuckerberg 'Personally Authorized and Encouraged' Meta's Copyright Infringement (variety.com)

https://apnews.com/article/meta-mark-zuckerberg-ai-publisher...

489 points | 452 comments

ben_w 4 days ago|

A lot of people would be very pleased if this leads to Zuckerberg getting even the statutory minimum damages ($750?) on each infringement.

The previous infringement case with Anthropic said that while training an AI was transformative and not itself an infringement, pirating works for that purpose still was definitely infringement all by itself. The settlement was $1.5bn, so close to $3k for each of the 500k they pirated, so if Zuckerberg pirated "millions" (plural) it is quite plausible his settlement could be $6bn.
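A rough check of that arithmetic, as a minimal sketch. The settlement size, work count, and statutory minimum are the figures quoted in the comment and are not verified here; reading "millions" as 2,000,000 works is purely an assumption for illustration.

```python
# Back-of-the-envelope check of the settlement arithmetic quoted in the comment.
ANTHROPIC_SETTLEMENT_USD = 1_500_000_000  # $1.5bn, as stated in the comment
ANTHROPIC_WORKS = 500_000                 # ~500k works, as stated in the comment
STATUTORY_MINIMUM_USD = 750               # the commenter's "($750?)" figure, unverified

# Assumption: "millions (plural)" is read as a lower bound of 2,000,000 works.
ASSUMED_META_WORKS = 2_000_000


def per_work_rate(settlement_usd: float, works: int) -> float:
    """Average payout per work implied by a settlement."""
    return settlement_usd / works


def projected_total(rate_usd: float, works: int) -> float:
    """Total if the same per-work rate applied to a different number of works."""
    return rate_usd * works


rate = per_work_rate(ANTHROPIC_SETTLEMENT_USD, ANTHROPIC_WORKS)
at_settlement_rate = projected_total(rate, ASSUMED_META_WORKS)
at_statutory_minimum = projected_total(STATUTORY_MINIMUM_USD, ASSUMED_META_WORKS)

print(f"per work:             ${rate:,.0f}")
print(f"at settlement rate:   ${at_settlement_rate:,.0f}")
print(f"at statutory minimum: ${at_statutory_minimum:,.0f}")
```

Under those assumptions the per-work rate carries over linearly, so the total scales with whatever work count a court or settlement actually uses.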

qingcharles 4 days ago||

What's frustrating is all those kids who got criminal charges for running MP3 sites back in the day [1], and this guy rips off every piece of media in existence and will walk away literally because he's too rich to be charged.

[1] See, e.g. https://en.wikipedia.org/wiki/Oink%27s_Pink_Palace#Legal_pro...

shrubby 3 days ago|||

https://pluralistic.net/2025/04/23/zuckerstreisand/

Cory Doctorow wrote a nice summary of the Zuckerstreisand book by Sarah Wynn-Williams.

"First, Facebook becomes too big to fail.

Then, Facebook becomes too big to jail.

Finally, Facebook becomes too big to care."

gherkinnn 3 days ago|||

> Eventually, [Zuckerberg] manages to sit next to Xi at a dinner where he begs Xi to name his next child. Xi turns him down.

That Mark never fails to deliver.

malshe 3 days ago||

When I read that I felt bad for his then-unborn child, who was already being used by his father to push his nefarious business onto a dictator.

matheusmoreira 3 days ago||||

> When Wynn-Williams give birth to her second child, she hemorrhages, almost dies, and ends up in a coma.

> Afterwards, Kaplan gives her a negative performance review because she was "unresponsive" to his emails and texts while she was dying in an ICU.

Holy shit.

qingcharles 3 days ago||||

Thank you, that was the quote I was thinking of, but couldn't remember.

HackerThemAll 2 days ago||||

I would replace "too big" with "too rich". Other than that, I agree.

AgentME 3 days ago|||

I liked Doctorow better before he cheered for stricter copyright enforcement.

fsflover 3 days ago||

He has always been against hypocrisy.

falsemyrmidon 3 days ago||||

https://en.wikipedia.org/wiki/Capitol_Records%2C_Inc._v._Tho...

24 songs and was at one point $80k per song, almost 20 years ago. Let's let Zuck off with an even 100k per infringement.

cestith 3 days ago||

It was his decision and he conspired with his employees to do it for profit. The statutory maximum IIRC is around $250k per work, on the criminal enforcement side. If the rights holders can show actual damages greater than that they can sue civilly for those damages plus some fixed amount per work.

matheusmoreira 3 days ago||||

Definitely what pisses me off the most. All these "pirates"? Arrested. Why isn't the copyright industry raiding the homes of these tech billionaires then? Why isn't SWAT pointing guns at their faces while the squad seizes all of their computers and equipment? Why aren't these CEOs in cuffs?

chadgpt2 3 days ago||

Because society is based on power structures and the people at the top of power structures generally do not arrest themselves.

nadermx 3 days ago||||

I just don't see why everyone seems to not be cheering that perhaps we are not going to go back to the days where all those kids are going to be re charged. It almost feels like everyone wants to go back to labels carpet bombing students with lawsuits[0]

[0] https://w2.eff.org/IP/P2P/riaa-v-thepeople.html

davkan 3 days ago|||

As someone who’s engaged in private piracy basically my entire life I’ve never even considered venturing into gray areas of licensing when procuring for my company. In fact I’ve done the opposite and rooted it out wherever I’ve found it.

It just seems obvious to me that a profit seeking venture should be held to a higher standard when it comes to infringing on the property rights of other companies and individuals, especially if they seek to enforce their own.

Those kids weren’t hypocritically enforcing their own property rights and making employees sign ndas while downloading shit from tpb.

justacrow 3 days ago||||

Do you think if there was a mass movement of students moving off Spotify and downloading MP3s, they would _not_ be charged today?

The hypocrisy is what has at least me upset

csa 3 days ago||||

> I just don't see why everyone seems to not be cheering that perhaps we are not going to go back to the days where all those kids are going to be re charged. It almost feels like everyone wants to go back to labels carpet bombing students with lawsuits

It’s currently just as bad but in a different way, imho.

The ability for labels (or whoever owns the rights) to wantonly invoke automated DMCA copyright strikes and demonetization on social media channels like YouTube is borderline criminal to me.

Their lobby did a great job getting them more than they deserved (specifically with regards to the facilitation of capricious invoking of DMCA), but the abuse of the rules limits the growth of the creator economy in very unhealthy ways.

themafia 3 days ago||||

False dichotomy. We can obviously have both. We can destroy corporations that rely on copyright to exist and then abuse that system to profit. We can also ignore college students and minor contributory copyright infringement.

The difference in scope here should be obvious.

We can similarly punish drug dealers while not punishing drug users. In fact it's already policy in large parts of the USA.

nadermx 3 days ago||

To quote another user in this thread

"Thats such a non sequitur. This isnt a weed legalisation argument, its "Do we make IP worse for everyone, because you dont like some people benefiting from fair use"."

armada651 3 days ago||

When corporations were posed with this question numerous times in the past, their answer has always been an emphatic "Yes!".

Teever 3 days ago||||

Because the 'perhaps' there is a load-bearing word that is doing a lot of work and it's going to come crashing down sooner or later.

Of course some kids are going to be charged for this kind of shit, it's still a rules for thee but not for me world, the 'not for me' folks are just a hell of a lot more brazen about it.

watwut 3 days ago|||

Because there is no reasonable expectation that we are not going back to those days. In fact, we are more likely to go back to those days than not.

Those students are not Zuckerberg. They will not be treated as Zuckerberg. The legal theories that apply to them don't apply to Zuckerberg and vice versa. They do not have money to mount a defense and if they do, they will be in debt till the end of their lives.

borgai 3 days ago||||

didn't all of this ai stuff happen because they gave away llama? worth it imo

NoMoreNicksLeft 4 days ago||||

What's frustrating is that I don't even consider infringement to be a crime. Why are you all so upset about this, rather than his real crimes?

matheusmoreira 3 days ago|||

I'm a copyright abolitionist. I don't care at all that they're training AIs on copyrighted works. I care a lot that they're not getting relentlessly hunted down by the copyright industry for it like all the "pirates" that came before them. The copyright industry has actually ruined lives by litigating their "infringement" nonsense. It's only fair that they go after this guy as well.

His constant violation of people's privacy is also horrendous and worthy of condemnation, but that's not directly related to the copyright infringement matter. It's a separate issue.

thrance 3 days ago|||

Yes, wanting the law to be applied fairly isn't incompatible with also seeking to change it.

NoMoreNicksLeft 2 days ago||||

"X shouldn't be illegal at all" and "I want this person or company I hate to be ruined for having done X" aren't mutually reasonable positions. Even less so when the person or company you hate has committed real crimes. Grow up.

matheusmoreira 2 days ago||

They are perfectly reasonable positions. "X shouldn't be illegal" implies that it is, in fact, currently illegal. Therefore Facebook and its executives and especially its CEO should absolutely suffer the full consequences of violating those laws, just like all of those people the copyright industry ruthlessly prosecuted.

Anything less than this means it's rules for thee and not for me. Laws cease to have meaning when people realize and internalize the idea that they are just tools of the elite to keep the poors in line instead of proper instruments of justice that apply to everyone equally. That's an extremely dangerous thing for the public to realize and internalize, for obvious reasons.

NoMoreNicksLeft 2 days ago||

>They are perfectly reasonable positions. "X shouldn't be illegal" implies that it is, in fact, currently illegal.

"We were just following the rules" got people justifiably hanged not so many years ago. There must be principle behind what it is you would enforce, or you're not one of the good guys. If you give a shit about "currently illegal", I won't spend any more time listening to or worrying about what you think should be legal.

bombcar 3 days ago|||

If this was guaranteed to result in either Facebook being completely destroyed, or copyright abolished, I’d be ride-or-die for either outcome.

But we all know it’ll be a slap on the wrist for Meta and nothing will change.

ethbr1 4 days ago||||

You get Al Capone on the charge you can make stick.

timcobb 3 days ago||

Right but Al Capone did jail time, here Zuck gets to break and enter into people's homes, take their stuff, then haggle for it after-the-fact, all the while keeping the civilization-domination apparatus that he built using the stuff he stole? That is super not fair. Ordinary people could certainly not get away with that.

ethbr1 3 days ago||

The US justice system doesn't start from fair. It starts from what you can prove to the letter of the law.

And when you're targeting someone / something with unlimited lawyers, you'd better have ironclad evidence that exactly that happened in exactly the way the claim is written.

timcobb 3 days ago||

Okay, sure, but I'm talking about being satisfied. I understand reality and that I may not get the satisfaction I would like. And specifically the example of Al Capone who was, yeah, got for tax evasion, but at least was treated ultimately like the criminal he was.

ethbr1 3 days ago||

I mean, he was sentenced to 11 years and served 7 1/2.

But untreated (at the time, no penicillin) syphilis turned him into a mental pre-teen after his release, so I guess the universe serves some justice where the laws of the land do not.

hsuduebc2 4 days ago||||

I'm kinda upset because on top of his ridiculously amoral and sometimes illegal behavior there are people whose lives were ruined because they shared a few mp3 files. Now this person once again has absolutely no responsibility for his actions, even for something as idiotic as copyright infringement, when others were severely punished.

stubish 4 days ago||||

Let's define more things society doesn't want to happen as not-crimes so we can do more of them.

verisimi 3 days ago||||

Principles and law (that determines 'crime', a legal word) are not the same thing.

qingcharles 4 days ago||||

Why not both?

bix6 4 days ago||||

What are his real crimes?

archagon 3 days ago||||

Because the rich can do it and we can’t.

protocolture 3 days ago||

I do it literally all the time.

cestith 3 days ago||

You pirate other people’s works for profit all the time?

NoMoreNicksLeft 2 days ago||

Since when has Meta profited off of this shit? I don't know why their stock price hasn't cratered yet, but it's not because they're raking in the big cash training LLMs on 40 yr old cookbooks.

cestith 2 days ago||

The Metaverse wasn’t a great success either, but do you think the motive wasn’t profit? Do you think Meta is a public benefit corporation?

NoMoreNicksLeft 2 days ago||

We could have a conversation about that. Or rather, I could have a conversation about it with someone intelligent, who can actually appreciate nuance and detail. You've got some sort of mental caricature of mustache-twirling billionaires, where everything is "profit".

It's pretty clear that Meta wasn't about profit, given that no amount of "sunk cost" could explain what happened. That had more to do with self-aggrandizement and his belief that he was some sort of digital messiah that would get to usher humanity into another world.

cestith 1 day ago||

Surely the board and shareholders were led to believe it was for a profit motive. Nobody puts billions of dollars of stock positions into a worse version of Second Life because they think it’s a world changer. They do it to make more money than putting their investments elsewhere. The company said repeatedly that the Metaverse was where trade and business would increasingly happen in the future.

A bad bet doesn’t mean the motive wasn’t to win the bet.

j-bos 4 days ago|||

It's the increase in emotionality: principles are loosely held, and if tossing them allows a particular goal, they get tossed. Tbc this extends far beyond the current topic and commenters.

_s_a_m_ 3 days ago|||

In a just world he should end up in jail forever for the things he has done.

grebc 4 days ago|||

Nothing will happen to him/Meta while DJT is president.

He bought the best protection around for breaking the law.

dehrmann 4 days ago|||

I'm not sure what Trump's levers are with this since it's a civil matter. There's no DOJ--it's publishers and an individual vs. Meta.

kevin_thibedeau 4 days ago||

He likes sham investigations of attorneys general.

GolfPopper 4 days ago|||

[flagged]

stackghost 4 days ago||

[flagged]

LastTrain 4 days ago||

Psst. The Epstein Files are the distraction…

fzil 4 days ago||

i thought the iran war was distraction from the epstein files. i'm losing all track of all these distractions.

alex1138 3 days ago||

[flagged]

timcobb 3 days ago|||

Okay but... I am very unimpressed by this. How is it that he then gets to still be an AI monopolist/hegemonist? How's that fair? He basically force-acquired all this stuff without asking, now he's haggling for it later. Where are the criminal charges? Where is the deprivation of, if not freedom, then equity assets?

utopiah 3 days ago|||

Here I am, finally cheering for IP lawyers. /$

gloxkiqcza 4 days ago|||

For context, his net worth is ~$220 billion.

azinman2 4 days ago||

And Meta's worth is much more than that. He's not personally paying.

ben_w 4 days ago||

A company being "worth" some amount doesn't mean it has that much money and real property; it means there exist people willing to buy shares, on the margin, at a price which works out like that. One of the common (very rough) approximations is that a business is worth as much as the profit it's expected to make over the next 20 years. But one of the reasons (there are many) that this is only a rough guide, is that if you tried to sell too much of a big company all in one go, it usually depresses the price a lot, and the other way around (trying to buy a whole company) tends to raise the price a lot; both effects are because most people have different ideas about how much any given company is really worth despite that rough guide, and trade their shares at different prices while you're doing it. You may note this is a circular argument, this is indeed part of the problem.

IIRC, Facebook's cash is more like $81-82 billion.

Nevermark 3 days ago|||

Yes it is a different kind of worth, but it is not worth less because of it.

This common argument to not take market cap valuations seriously doesn't hold.

True, Meta as an entire entity is not liquid. A forced sale in entirety would produce a massive reduction in compensation. But that is a highly unlikely and contingent reduction.

It is also true that if you have Meta's equivalent in cash, the value of the cash is likely to drop, while the value of Meta likely to grow, over any appreciable time. In that sense, $X cash is worth much "less" than the $X market cap.

These seeming contradictions are the result of different practical tradeoffs in structures of wealth. Not because market caps reflect misleading or overstated accounting.

butlike 3 days ago||

Would it be accurate to say market cap valuations are intrinsically valuable because they drive people to buy shares by projecting success?

Nevermark 3 days ago|||

Having a market cap? You mean a non-zero market cap?

Or do you mean a greater vs. lesser market cap? As compared to what?

If market cap was intrinsic value underlying itself, the business would be irrelevant. That is a circular “origin” of value that even novice investors would want to sell out of. That doesn’t work.

Success that matters for investors isn’t evidenced by a high market cap. But by a market cap not keeping up with business growth. I.e. shares becoming undervalued. By actual/predicted growth increasing faster than cap, or cap falling faster than actual/predicted downturns.

wasabi991011 3 days ago|||

No, market cap is a representation of the expected future success, but share cost depends on this expectation. Higher expected future success, higher share cost. So, the only reason to buy shares is if you expect the market cap to increase.

(I think, someone please correct me if I'm wrong?)

financetechbro 4 days ago||||

Zuck can just take out loans against his equity. He doesn’t need to sell any of it to benefit from Meta’s “worth”.

litoE 4 days ago||

Plus, the money he borrows is not taxable. If he sold stock he would have to pay taxes before he could spend the income. Sure, he now owes money to someone, but he can refinance those loans again and again, and live tax-free the rest of his life while we, poor working stiffs, pay the taxes that built the airport where he parks the private jet he bought with the money he borrowed.

naniwaduni 4 days ago||

People seem to get the weird idea that borrowing against their stock holdings is some special thing rich people get to do with products that the rest of us don't have access to. It's not. Margin loans are widely available to the tune of ff+1%ish or lower, and while your brokerage's publicly offered rates are probably a ripoff, they're almost certainly negotiable. The bar for access to "institutional" rates is basically 100k, the regulatory requirement for portfolio margin.

Yes, there are specialized products catered to billionaires. But those aren't getting them better rates than someone with a $200k portfolio (Zuck is not conventionally a less risky borrower than the Options Clearing Corporation!). They exist to work around the fact that some borrowers can't just casually liquidate their stock on the open market, let alone at face value. By all accounts these products are more expensive than retail.

Mostly this is an expensive (but maybe still less expensive than taxes, depending on the rate environment—it's more of a no-brainer in ZIRPland) way to diversify out of a single-stock portfolio without selling by adding leverage. At Zuck's age, it's still very unlikely to make sense to borrow instead of sell to spend. He's been known to pay real taxes in the past, they just look small relative to his imputed wealth growth because rich people don't spend a lot relative to their wealth growth because they, quite by definition, have a lot of wealth.

_DeadFred_ 4 days ago|||

I think people take issue with the taxes loophole. They have GAINED from the VALUE of their stocks, but they don't pay taxes on that. It should be law if you realize value from stocks you pay capital gains on those stocks. So if a loan is collateralized by $1,000,000 worth of stock value taxes should be paid on $1,000,000.

grebc 4 days ago|||

I wouldn’t exactly call it a loophole as such. And you can’t just willy-nilly tax loan values.

Any asset a bank is willing to take as collateral has the same issue, it’s just very pronounced in this instance.

If you take your idea at face value, anyone who borrows against their property to renovate/upgrade would be up for tax.

naniwaduni 4 days ago|||

The trouble is that a bank is not lending against the nominal value of the stock as collateral. That number is almost entirely fictional. Taxation of capital gains at time of sale is less a loophole than a reflection of the difficulty of assigning a fair price to assets that are not perfectly liquid.

Also, you'd totally gut retail home equity lending as collateral damage, with disastrous social policy consequences.

bombcar 3 days ago|||

I’ve never seen it explained as to how it’s different in kind from a home equity loan - you still need income from something to pay the loan back (and if you say you pay it from the loan proceeds you’re just donating to a bank with extra steps).

litoE 3 days ago||

It's very simple: if the terms are satisfactory and against an agreed upon collateral (e.g. shares) banks will give you a loan that does not require periodic payments. The interest on the loan does accumulate of course, and is just added to the principal that the borrower owes. The bank is happy as long as the value of the collateral is higher than the current outstanding loan. If the loan is in danger of going "under water" the bank can either liquidate the collateral to pay itself, or the borrower can renegotiate the loan and deposit additional shares.

It's similar to a reverse mortgage. Say Fred and Wilma own a house worth $4MM with no mortgage on it. With a reverse mortgage a bank will lend them $2MM. Fred and Wilma make no payments and continue to live in their house, spending the $2MM while the interest on that loan just increases the amount they owe the bank. After both Fred and Wilma have passed away the house is sold and the proceeds are used to pay back the outstanding loan. If there's still money left over, it goes to their heirs. If the sale comes up short, the bank loses money, which is why these reverse mortgages are typically less than 50% of the value of the house and they typically have higher interest rates than conventional mortgages. From Fred and Wilma's point of view, they can use the value of their house now, while continuing to live in it. They essentially spend their children's inheritance.
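The mechanism described above (no periodic payments, interest added to the principal, the bank only caring whether collateral still covers the balance) can be sketched as below. The $2MM loan and $4MM house come from the example; the 7% interest rate and the growth rates are assumptions chosen only to illustrate the idea, and real loan terms will differ.

```python
from typing import Optional


def years_until_underwater(
    principal: float,
    collateral_value: float,
    annual_rate: float,
    collateral_growth: float = 0.0,
    max_years: int = 100,
) -> Optional[int]:
    """Return the first year in which the loan balance exceeds the collateral.

    No payments are made: each year the unpaid interest is added to the
    balance, and the collateral changes in value by `collateral_growth`.
    Returns None if the loan stays covered for `max_years`.
    """
    balance = principal
    value = collateral_value
    for year in range(1, max_years + 1):
        balance *= 1 + annual_rate      # interest accrues onto the principal
        value *= 1 + collateral_growth  # collateral may appreciate (or not)
        if balance > value:
            return year
    return None


# Fred and Wilma: $2MM borrowed against a $4MM house.
# Assumed 7% interest, house value flat.
flat_house = years_until_underwater(2_000_000, 4_000_000, 0.07)

# Same loan, but the collateral appreciates as fast as the interest accrues.
growing_collateral = years_until_underwater(2_000_000, 4_000_000, 0.07, 0.07)

print(flat_house, growing_collateral)
```

With flat collateral the balance eventually overtakes it, which is why the loan starts at a fraction of the asset's value. If the collateral grows at least as fast as the interest, the loan never goes under water in this model, and that is the case that makes refinancing indefinitely plausible.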

dylan604 4 days ago||||

At the same time, isn't Zuck's worth based on his shares of evilCorp while evilCorp's shares are what you just said. Ergo, the Zuck isn't worth all that either???

ben_w 4 days ago||

Yup. All the headlines following the pattern "${billionaire} {gains|loses} ${x} billion this week" are mostly just fluff, the marginal share price of any given stock wanders all over the place even without forced sales or people trying to buy them out.

There's some interesting exceptions, like how Musk has managed to sell Tesla shares totalling more or less as much as the business itself has made in total lifetime revenue; but even then, Musk's theoretical net worth is very different from how much he could get if he was forced to sell all his shares suddenly.

Owner-CEOs like Musk and Zuckerberg get all the effects of such randomness, but the only examples I can think of such people getting into billion-dollar legal troubles tend to be examples which go on to sink their companies completely, so I'm not sure what impact a fine of "merely" 10% of cash reserves would do to investor confidence as expressed in share price. And this is not the only legal case Meta's facing right now.

ScoobleDoodle 4 days ago||

It doesn't seem to be mostly just fluff to me.

MacKenzie Scott (Jeff Bezos' ex-wife) shows it can be turned into real money. As of December 2025 she had given away $7.1 billion in 2025 charitable donations, and $26.3 billion since 2019.

In reality there is the ability to execute on the shares to turn them into real money.

Jeff Bezos holds less than 10% of Amazon stock himself. Which is a huge amount of money, and a not insignificant amount of which can be turned into "real" money and even with some decline is still a phenomenal amount.

In that same time period the stock valuation has more than doubled.

bombcar 3 days ago|||

It’s real and unreal at the same time - as is true of much non-cash wealth.

You have a house? You can sell it next month for a certain price, sell it tomorrow for a bit less.

You own every house in your town? You can still sell a few for “full price” but liquidation of all of them is going to be a shock to the market.

cestith 3 days ago|||

She is in fact on top of more value in shares than when she started giving away money.

thomastjeffery 4 days ago|||

That's why billionaires use shares as collateral to get loans. It's money once removed, and it continues to be spendable so long as the share price stays high.

I sincerely doubt that Meta's share price would crash as a result of Zuckerberg getting an expensive judgement.

goofy_lemur 3 days ago|||

I would be very pleased if Zuckerberg got away with it. I don't copyright or infringe, but honestly, he was the one guy of the big guys who released everything as open source.

If he did the right thing, then we should all support his choice to use it under fair use.

Freedom means that the state shouldn't punish a public benefactor.

It makes me furious to see programmers fighting against an open source hero.

If it was closed source for Meta profit, I understand.

But they gave it away free, so it infuriates me that people support damages for a public benefactor.

Churches and schools get free money from the government. We need a rule that open AI (not the company, I mean the actuality), can torrent whatever they want because it's for the public good.

Otherwise the rich companies win and can pay their sources and the small guys are screwed.

If Meta has to pay for their training data, they will need to profit from it and won't be able to offer it free.

Nobody in their right mind would ever support the publishers here.

bamboozled 4 days ago|||

There will be not a single consequence for any of this.

nielsbot 4 days ago||

In a just system there would be jail time (if found guilty). Barring that a modest fine. Say, $1T.

LastTrain 4 days ago||

That’ll keep him from even thinking of doing something like that again! /s

jcalvinowens 4 days ago||

I had to block Meta's ASN on my personal cgit server a few weeks ago because they were ignoring robots.txt and torching it. Like hundreds of megabytes of access logs just from them, spread around different network blocks to clearly try and defeat IP based limiting. I couldn't believe it.

dawnerd 3 days ago||

I had to last year too, nonstop crawling, random urls that didn't exist. It looked like they were trying to proxy user queries through to a search endpoint too. The ASN matched so I know it wasn't someone spoofing them.

bflesch 4 days ago|||

IMO ASN-based blocking should be much more common, but unfortunately it is not supported as a first-class configuration option in many common tools.

jcalvinowens 4 days ago|||

Yeah, I don't know how anybody stays sane without it. I have a list of over a thousand ASNs I blackhole at this point...

Mine is a daily bash cronjob that fetches a text-based database and uses grep to build an nftables-apply script with all the IPs for the blocked ASNs. I keep meaning to share it, but it's embarrassingly messy and I haven't had time to clean it up...
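For anyone wanting a starting point, here is a minimal sketch of the same idea in Python rather than bash and grep. It is not the commenter's script. It assumes a tab-separated database whose lines look like `range_start<TAB>range_end<TAB>asn<TAB>...` (a layout some public IP-to-ASN dumps use), and the table name `inet filter` and set name `blocked_asn_v4` are hypothetical. The set is assumed to already exist with type `ipv4_addr` and `flags interval`.

```python
import ipaddress
from typing import Iterable, Iterator, Set


def cidrs_for_asns(db_lines: Iterable[str], blocked_asns: Set[int]) -> Iterator[str]:
    """Yield IPv4 CIDR strings for every address range owned by a blocked ASN.

    Assumed line format (tab-separated): range_start, range_end, asn, ...
    """
    for line in db_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or not fields[2].isdigit():
            continue  # skip malformed or header lines
        if int(fields[2]) not in blocked_asns:
            continue
        first = ipaddress.ip_address(fields[0])
        last = ipaddress.ip_address(fields[1])
        if first.version != 4 or last.version != 4:
            continue  # this sketch only fills an IPv4 set
        # A start/end range may need several CIDR blocks to cover it exactly.
        for net in ipaddress.summarize_address_range(first, last):
            yield str(net)


def nft_script(
    cidrs: Iterable[str],
    table: str = "inet filter",        # hypothetical table name
    set_name: str = "blocked_asn_v4",  # hypothetical set name
) -> str:
    """Render nftables commands that empty the set and refill it."""
    elements = ", ".join(cidrs)
    lines = [f"flush set {table} {set_name}"]
    if elements:
        lines.append(f"add element {table} {set_name} {{ {elements} }}")
    return "\n".join(lines) + "\n"


# Sample data using documentation-reserved addresses and ASNs.
SAMPLE_DB = [
    "192.0.2.0\t192.0.2.255\t64500\tZZ\tEXAMPLE-A",
    "198.51.100.0\t198.51.100.127\t64501\tZZ\tEXAMPLE-B",
    "203.0.113.0\t203.0.113.191\t64500\tZZ\tEXAMPLE-A",
]

blocked = list(cidrs_for_asns(SAMPLE_DB, {64500}))
script = nft_script(blocked)
print(script)
```

The generated text could be written to a file and loaded with `nft -f`; a rule that drops traffic matching the set still has to exist separately in the ruleset.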

noxvilleza 4 days ago|||

It's been a real game of cat and mouse over the last few years. I used to do daily iptables updates to block repeat scrapers on my small niche stats site I run. About 5-6 years ago it became more common to see broader ranges - so I started blocking ASNs which worked great (esp for the regulars like Alibaba, Tencent, compromised DigitalOcean/OVH, ...). In the last 2-3 years though the overall bot traffic has skyrocketed - it's easy to spot bot activity after the fact (no requests to the CDN for static assets, user agent changes from one request to the next, predictable ID enumeration, etc) but not in real time. They're also often using residential-based proxies and Cloudflare bot detection has become pretty bad.
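Two of those after-the-fact signals (no static asset requests, user agent changing between requests) can be sketched roughly as follows. The `Request` record, the asset suffix list, and both thresholds are hypothetical, and grouping by IP is a simplification that residential proxy rotation will partly defeat.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Request(NamedTuple):
    ip: str
    path: str
    user_agent: str


# Assumed set of file types that a real browser would normally fetch.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".svg", ".woff2")


def suspicious_clients(
    requests: Iterable[Request],
    min_requests: int = 5,     # hypothetical threshold: ignore low-volume clients
    max_user_agents: int = 2,  # hypothetical threshold: more than this looks like rotation
) -> Dict[str, List[str]]:
    """Map each flagged client IP to the reasons it looks automated."""
    stats = defaultdict(lambda: {"total": 0, "static": 0, "agents": set()})
    for req in requests:
        entry = stats[req.ip]
        entry["total"] += 1
        path = req.path.split("?", 1)[0].lower()  # ignore any query string
        if path.endswith(STATIC_SUFFIXES):
            entry["static"] += 1
        entry["agents"].add(req.user_agent)

    flagged: Dict[str, List[str]] = {}
    for ip, entry in stats.items():
        if entry["total"] < min_requests:
            continue
        reasons = []
        if entry["static"] == 0:
            reasons.append("no static assets")
        if len(entry["agents"]) > max_user_agents:
            reasons.append("rotating user agents")
        if reasons:
            flagged[ip] = reasons
    return flagged


# Sample traffic using documentation-reserved addresses.
sample = (
    [Request("203.0.113.7", f"/item/{n}", f"agent-{n % 3}") for n in range(6)]
    + [Request("192.0.2.1", p, "browser")
       for p in ("/", "/style.css", "/app.js?v=2", "/item/1", "/item/2")]
)
flagged = suspicious_clients(sample)
print(flagged)
```

Because it needs a full window of requests per client, this kind of check only works on logs in hindsight, which matches the point about real-time detection being the hard part.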

jcalvinowens 3 days ago|||

Arms races suck. I've managed to find a few L7 tricks to catch the residential proxies and serve them an empty 200, but there are obvious trivial workarounds on the other end and if I start talking about them in public they won't last long... I wish I could share :/

pixel_popping 3 days ago|||

Cloudflare is so easy to defeat and almost everyone in the scraping industry is selling solutions that automatically bypass it; hcaptcha solving is also really cheap nowadays.

dlivingston 3 days ago|||

It would still be useful to share as an example and reference point. People can use Claude Code / etc. to re-write it to their specific situation.

Henchman21 3 days ago||||

It would break the internet to make this available to the average person. A large swath would actively choose to block stuff like: all of Meta, Alphabet, Apple, Amazon, etc etc etc.

Anyhoo, now you mention it this is the tack I am going to take in my own network, thanks!

chadgpt2 3 days ago||

Nah, they'd just pay botnet operators a few thousand bucks for proxy services.

walrus01 4 days ago||||

It's a real pain in the ass because in the absence of ASN based blocking, you often have to give something a long list of IP ranges in CIDR notation, and be certain you don't "miss" even one ipv4 /23 or /24 or a crawler will get through.

electronoora 3 days ago|||

[dead]

hsuduebc2 4 days ago|||

Hey, how do you identify them? Is there a service to recognize which of these companies scraped you?

jcalvinowens 3 days ago||

Every few weeks I run my nginx access logs through a script that uses the same textual ASN database to tally them up and spit out a summary report. There are many different sources for periodic textual ASN databases you can parse with UNIXy tools.
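A minimal sketch of that kind of tally, not the commenter's script. It assumes the same hypothetical tab-separated database layout (`range_start<TAB>range_end<TAB>asn<TAB>...`) with non-overlapping IPv4 ranges, and log lines that begin with the client address, as nginx's default formats do.

```python
import bisect
import ipaddress
from collections import Counter
from typing import Iterable, List, Optional, Tuple

Range = Tuple[int, int, int]  # (first address, last address, ASN), addresses as ints


def load_ranges(db_lines: Iterable[str]) -> List[Range]:
    """Parse the ASN database into a list of IPv4 ranges sorted by start."""
    ranges: List[Range] = []
    for line in db_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        try:
            first = int(ipaddress.IPv4Address(fields[0]))
            last = int(ipaddress.IPv4Address(fields[1]))
        except ipaddress.AddressValueError:
            continue  # IPv6 or malformed entries are ignored in this sketch
        ranges.append((first, last, int(fields[2])))
    ranges.sort()
    return ranges


def asn_for(ip: str, ranges: List[Range], starts: List[int]) -> Optional[int]:
    """Binary-search the sorted ranges; return the owning ASN or None."""
    try:
        addr = int(ipaddress.IPv4Address(ip))
    except ipaddress.AddressValueError:
        return None
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0 and ranges[i][0] <= addr <= ranges[i][1]:
        return ranges[i][2]
    return None


def tally_by_asn(log_lines: Iterable[str], ranges: List[Range]) -> Counter:
    """Count requests per ASN; unknown addresses are counted under None."""
    starts = [r[0] for r in ranges]
    counts: Counter = Counter()
    for line in log_lines:
        ip = line.split(" ", 1)[0]  # client address is the first field
        counts[asn_for(ip, ranges, starts)] += 1
    return counts


# Sample data using documentation-reserved addresses and ASNs.
SAMPLE_DB = [
    "192.0.2.0\t192.0.2.255\t64500\tZZ\tEXAMPLE-A",
    "198.51.100.0\t198.51.100.127\t64501\tZZ\tEXAMPLE-B",
]
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2025:00:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "bot"',
    '192.0.2.11 - - [01/Jan/2025:00:00:01 +0000] "GET /a HTTP/1.1" 200 512 "-" "bot"',
    '198.51.100.5 - - [01/Jan/2025:00:00:02 +0000] "GET /b HTTP/1.1" 200 512 "-" "bot"',
    '10.0.0.1 - - [01/Jan/2025:00:00:03 +0000] "GET /c HTTP/1.1" 200 512 "-" "curl"',
]

counts = tally_by_asn(SAMPLE_LOG, load_ranges(SAMPLE_DB))
for asn, hits in counts.most_common():
    print(asn, hits)
```

Sorting the result by count gives the summary report; the heaviest ASNs are the candidates for a blocklist.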

websap 4 days ago||

[flagged]

jesse_dot_id 4 days ago|||

The world would be a much better place if these kinds of engineers had a spine.

websap 4 days ago|||

Yeah they’d have to use it to stand at the back of the unemployment line. Companies don’t care, someone more desperate will take the job.

dlivingston 3 days ago||

Are you one of those engineers building said crawlers, by any chance?

scottyah 4 days ago||||

Some spines are just crooked, and the extra rigidity would hurt more than help.

debo_ 4 days ago|||

"One moment: reticulating spines..."

ttoinou 4 days ago|||

They could even feed 20 kids

glaslong 3 days ago||

All those lawsuits against students who downloaded but didn't even redistribute mp3s. Less than a fair use transformation. Just the file download itself. ... Lesson learned: those students should have stolen millions instead!

watwut 3 days ago||

The real distinguishing criterion is whether you are super rich or not.

butlike 3 days ago|||

That may have been an information shaping campaign. If the end-user can get prosecuted, the discourse turns from a positive to a negative connotation inherently, which helps curb the behavior by the powers-that-be.

glaslong 9 hours ago||

So if we want to run this strategy, I suppose we need to fine Meta out of existence, as an example to the rest.

Steeeve 3 days ago|||

Actually, it was the uploads that were a problem. Not downloads.

spwa4 2 days ago||

You mean the courts found a technical legal explanation for why very large scale copyright infringement by multinationals was legal, but infringement by individuals of multinational owned copyrights was to be HEAVILY punished?

No shit.

Of course, the excuse doesn't even apply: the offense of the tech bosses is not training these models (they had that declared legal the second it became clear only the big companies would be training big models), the offense of all these tech companies is running a piracy site. Taking copyrighted works, storing them, reproducing them and then publishing the results to third parties, in many cases for payment, and organizing this practice knowingly, willingly as a company. Paying others to help them do it. This is the worst copyright offense one can possibly commit. It is what one public prosecutor referred to in the Napster case as "organizing a criminal cartel to violate criminal law on a huge scale".

Tech bosses weren't sued for downloading, in other words, they were sued for uploading. For asking payment for publishing copyrighted works, without any money going to the authors.

When Kim Dotcom did that, in the words of the US Attorney general, this is "charges of criminal copyright infringement, racketeering, and money laundering" (you see, getting paid for criminal activity is money laundering, a charge that was also made against teenagers selling warez cds)

ChatGPT tells me, unaware and unwilling to discuss the INCREDIBLE unfairness, that in the US, first-time offenders can face up to 5 years in prison, while repeat offenders can face up to 10 years PER OFFENCE. ChatGPT is unwilling to discuss it.

The courts are also unwilling to discuss this, but no worries! New technicality: only a public prosecutor gets to ask ...

Dario Amodei wilfully committed large scale copyright infringement, as did all the tech bosses from Musk to Bezos ... and "strangely" nobody in any court even mentioned how much 10 years times 500,000 is, despite systematically threatening that punishment repeatedly in the cases against teenagers.

Note that the law is extremely clear that company management IS NOT shielded if ordering criminal actions (violating criminal law, as opposed to violating a contract). In that case, company management carries full criminal culpability, INCREASED from if they did it themselves. Of course, this is only ever applied for refusing to pay tax or court fees.

If the law were applied alike and fairly to individuals and tech bosses, Amodei would have to be VERY lucky the human race still exists by the time his corpse leaves prison.

strathmeyer 3 days ago||

[dead]

modeless 4 days ago||

Funny how people are suddenly on Elsevier's side. It's clear to me that AI training is transformative fair use under existing law. Maybe this will be the case to prove it.

eloisius 4 days ago||

I find it grating that so many AI boosters try to frame pushing back against the AI industry as a sudden about-face for everyone that spent the last 20 years pushing back against the copyright industry. I’m also in favor of decriminalizing or legalizing small amounts of pot for personal use. That doesn’t mean I’m behind industrialized narcotic production on such a huge scale that it starts to distort the economy, and companies looking for new ways to add methamphetamine to every goddamn product.

protocolture 3 days ago|||

>I find it grating that so many AI boosters try to frame pushing back against the AI industry as a sudden about-face for everyone that spent the last 20 years pushing back against the copyright industry.

What do you think the outcome of tightening fair use is going to be? Do you think it's going to be most effectual against these big evil AI companies we are meant to fear? Or is it going to end up putting more individual creators on the end of Disney's pitchforks?

Like if you support creating a gun to kill a monster, that's great. But you need to understand that weapons rarely only target the person you want them to. And it's unlikely that any bill that specifically targets a certain size or profit margin is going to make it all the way into law without being generalised to the approval of large IP holders.

It's much much (much) better to look at this as an opportunity to erode IP laws for everyone, than to make them worse and hope that your particular enemies are the only ones that are affected.

>That doesn’t mean I’m behind industrialized narcotic production on such a huge scale that it starts to distort the economy, and companies looking for new ways to add methamphetamine to every goddamn product.

That's such a non sequitur. This isn't a weed legalisation argument, it's "Do we make IP worse for everyone, because you don't like some people benefiting from fair use".

citadel_melon 3 days ago||

One could imagine different legal standards for recreational, research, and commercial uses.

warkdarrior 3 days ago||

> One could imagine different legal standards for recreational, research, and commercial uses.

Meta used allegedly stolen copyrighted materials to train a model they shared for free with the whole world. Is this a recreational use?

watwut 3 days ago||

No, it is not recreational use. And no, they are not freely sharing it. It is used to build a monopoly, make honest competition impossible, and plan to charge as much as possible for it.

It is the same playbook every time. We don't have to be naive and pretend Meta is doing something for other people's benefit.

protocolture 3 days ago||

>And no, they are not freely sharing it

Are you unable to access this page?

https://www.llama.com/llama-downloads/

Or this one?

https://lmstudio.ai/models/meta/llama-3.3-70b

>It is used to build a monopoly

How?

>We don't have to be naive and pretend Meta is doing something for other people's benefit.

Meta benefits from the current war of open model competition, but we also benefit from it. In particular, participating in all this makes it hard for them to pull the ladder up when the market changes. They will have to justify why whatever new hotness is better than these existing models already on our hard drives.

butlike 3 days ago||||

Tell me more about these methamphetamine products. Inquiring minds would like to know!

dfxm12 4 days ago||||

It would be disingenuous framing because the argument against copyright stems from a belief that information should be free. Meta does not do things in this spirit. There's no about face needed...

AnthonyMouse 4 days ago||

> It would be disingenuous framing because the argument against copyright stems from a belief that information should be free. Meta does not do things in this spirit.

Don't they? They release the llama model weights, they do things like this:

https://www.opencompute.org/wiki/Open_Rack/SpecsAndDesigns

They also make significant contributions to Linux and are the originators of popular open source projects like zstd and React.

They make their money from selling ads, not selling licenses.

xigoi 3 days ago||

They only released the weights because someone leaked them.

AnthonyMouse 3 days ago||

Someone leaked the llama 1 weights before they were released. That doesn't explain why they would release the subsequent versions except that they wanted to.

2ndorderthought 4 days ago|||

Speaking of AI and meth, have you seen videos of the Palantir CEO Alex Karp? Dude looks like he's regularly getting the same meth shots Hitler used to get.

But I hear you. One of my biggest tells that someone can't be reasoned with is when they resort to whataboutism without any consideration for how 2 situations can actually be different even if there is some commonality. It's a powerful bad faith argument technique. When that style of argument comes up I nod my head and walk away. Some people are just doomed.

chungusamongus 4 days ago||

[flagged]

2ndorderthought 4 days ago||

I am not a copyright maximalist, but I would tell you to be careful of a world where copyright and IP are meaningless. Might as well let any other country/company one shot your entire industry.

chungusamongus 4 days ago||

Slippery slope, false dilemma, etc. What other fallacies do you have in your utility belt, Batman?

2ndorderthought 4 days ago||

How did you know I was Bruce Wayne?

malfist 3 days ago||

Where's my goddamn electric car Bruce?

2ndorderthought 3 days ago||

BYD sells them for super cheap.

nadermx 4 days ago|||

I also find it funny, I said this regarding the other thread and article[0]

'"They then copied those stolen fruits"

How are these fruits "stolen" if they still have what was allegedly stolen?

Dowling v. United States, 473 U.S. 207 (1985): The Supreme Court ruled that the unauthorized sale of phonorecords of copyrighted musical compositions does not constitute "stolen, converted or taken by fraud" goods under the National Stolen Property Act.

And even if, arguendo, sure, it's stolen. The purpose of copyright is "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries".

And you would be hard pressed to prove that LLMs haven't advanced the arts and sciences, so at bare minimum transformative, i.e. fair use.'

[0] https://news.ycombinator.com/item?id=48026207#48029072

Johnny555 4 days ago|||

>How are these fruits "stolen" if they still have what was allegedly stolen?

If you write a book and I take it and embed its knowledge into my product that is so pervasive that no one needs to buy your book any more (and I don't even credit you so no one knows where that knowledge came from), do you really still have what was stolen? And I didn't even buy a copy of your book to copy it.

AnthonyMouse 3 days ago|||

> If you write a book and I take it and embed its knowledge into my product that is so pervasive that no one needs to buy your book any more (and I don't even credit you so no one knows where that knowledge came from), do you really still have what was stolen?

The trouble with this analogy is that it proves too much.

Suppose you write a book, and so does someone else, but they have better marketing than you and then people in the market for that genre buy theirs instead of yours. Let's even stipulate that the existence of their book actually lowers your sales, because people who want that kind of book already bought theirs by the time they find out about yours and then some people don't have time to read or can't afford to buy both.

Notice that we haven't yet said a word about the contents of either book. They could be completely independent and they've never even heard of you or your book -- they "didn't even buy a copy of your book to copy it". All we know is that they're the same genre and the existence of theirs is costing you sales. By that logic all competition would thereby be "stealing", and that can't be right.

Which implies that you don't have a property right to the customers.

jones1618 2 days ago|||

I like your argument, not because it is a good analogy for AI but because it is a good contrast. Copyright isn't a guarantee or magic force field blocking fair competition. It is a permeable buffer against lazy knockoffs and time-boxed so that buffer doesn't choke all future creativity.

People on this thread need to focus on what "derivative" and "fair use" mean and understand both are measured on a somewhat fuzzy spectrum, subject to interpretation.

In a perfectly fair world AIs/MLs could vacuum up all human knowledge, fair and square. (In an ideal world, they would do that adhering to polite opt-in/opt-out agreements with copyright holders. We can dream). Input isn't theft.

On output, two magic genies would stand at the gate, the Derivative Genie and Fair Use Genie, and review anything spat out by the AI/ML. If it crossed agreed-upon thresholds the Genies would bar the gates and issue a stern warning to prompt again (or maybe the AI/ML would auto-adjust the prompt and try again).

So, if your prompt asked for a 300-word poem about thrash metal mosh pit dancing and it spat out a poem where 85% of it matched one of the handful of available mosh pit poems in its database, the Derivative Genie would block the output and raise an alarm.

On the other hand, if you asked for a line by line analysis of a famous mosh pit dancing poem (by name) or maybe asked for a satirical spoof of said poem, the Fair Use Genie would overrule the Derivative Genie and give the output a pass.

That's as fair as this could get, especially if you add one more thing: An Appeals Court (maybe corporate, maybe 3rd party, maybe state run) with a Settlement Pool. If a copyright holder could prove the Genies let pass something they shouldn't, the AI/ML would fix that. If real damage is done, the creator would get a settlement from the pool.

The point is that the Input Genie is out of the bottle. Creators just look foolish trying to squeeze it back in. Better, they should focus on making the output Genies and the Appeals process as effective and fair as possible for everyone.

m4x 3 days ago||||

A better analogy would be that you do original research or work and produce a valuable book. Somebody else looks at your work, decides it has value, and reproduces it in a new book under their name. The new book is cheaper, or easier to find, or for whatever reason displaces your original book created through your own research and investment. Now somebody else is profiting off your creativity or work, without payment or even acknowledgement.

I'm not sure how this plays out legally, but it certainly seems unethical

AnthonyMouse 3 days ago|||

So for example, when Disney sees value in public domain stories like Cinderella, Rapunzel/Tangled or Snow White, and they make movies out of them, profiting from the creativity and work of the Brothers Grimm without paying anything to their estate, or high school plays do Shakespeare, that seems unethical to you?

Would it be fair for Greece to do retroactive term extensions all the way back to Plato and then sue anyone who copies the idea of having a university or uses the Platonic solids or distributes religious texts that incorporate the dualistic theory of the soul?

m4x 3 days ago||

Your examples, as you say, are all public domain. Are all the works we train LLMs on public domain too? Was the original book in my analogy in the public domain? What do you think about training on material that isn't yet in the public domain?

AnthonyMouse 2 days ago||

You're framing this as an ethical question, but copyright term lengths are essentially arbitrary. They're set by the government, as are the boundaries of fair use. At which point you're making a circular argument. That it's bad if it's illegal and that it should be illegal because it's bad. So what happens if someone argues the opposite? That it's not unethical if it's fair use and then it should be fair use because it's not unethical.

m4x 3 hours ago||

I'm not making a circular argument, nor one based on legality. You explicitly changed your example to use "public domain" content, and ignoring the legal specifics of that it's clear that's a separate category of content. Most people have no ethical issue with remixing or using content that has already done the rounds and delivered most of its immediate value to the creator. This is very different to your earlier examples with books, framed as two contemporary pieces of media competing with each other.

Letting companies train LLMs on the "classics" is very different to training on contemporary media where the creator still depends on it.

blks 3 days ago|||

Why are you talking about this case that has nothing to do with the topic at hand? The comment you’re replying to gives a very clear and narrow analogy, and you’re talking about something else.

AnthonyMouse 3 days ago||

How is it something else? It's the same analogy. The problem with it is that the harm from the alleged theft doesn't require any use of the original material in order to happen, since that "harm" is competition rather than expropriation.

The attempt to distinguish them is through copying, but that's the part that isn't depriving anyone of anything.

blks 3 days ago||

The main point here is _using_ copyrighted materials to create a commercial product, that you then sell, that may be used as alternative or substitute for the original materials. You’re missing that point and talking about two independent projects competing.

AnthonyMouse 2 days ago||

Because the competition is the only source of alleged harm, but people can do that even if they don't copy anything. There isn't actually a property right to the customers. You can lose sales to someone else whether they copied anything or not.

blks 2 days ago||

So what if you can lose sales even without crimes being committed? This somehow makes it okay to profit off someone’s work and ignore licenses?

DiogenesKynikos 3 days ago||||

What if I read your book (and a bunch of other books), and use what I learned to write my own book? Have I "stolen" your book?

Facts are not copyrightable. Only your particular way of expressing those facts is copyrightable.

throwawayIche9j 4 days ago|||

Yes. That's not to say that something damaging wasn't done, but nothing was stolen. Stealing/theft requires deprivation of property. It's like receiving a normal nonlethal punch in the face and calling it murder. Murder requires someone dying.

> Theft [...] is the act of taking another person's property or services without that person's permission or consent with the intent to deprive the rightful owner of it. --- https://en.wikipedia.org/wiki/Stealing

KPGv2 3 days ago|||

My God, I can't believe chodes are still playing this "how many angels can you fit on the head of a pin" navel gazing semantic argument. Thirty years at least, it was all you saw on fin de siècle Slashdot from anyone with a six-digit UID. No one cares about your hyper literalist meaning of "theft," that's not the goddamn point. Christ, this place looks like Reddit more and more.

This isn't a court of law. We don't have to talk like lawyers. If you replaced "theft" with "copyright infringement" in the comment you had such a problem with, what meaningfully changes besides we all have about five additional brain cells?

visarga 3 days ago|||

Even the case for copyright infringement is weak. LLMs are not copying machines; we already have copying machines at a much lower price, almost zero, with perfect fidelity, and much faster than generating text probabilistically. So it makes no economic sense to spend billions on training and inference to make a copier. In fact the value of LLMs is where they do not copy but apply knowledge to a new situation.

AnthonyMouse 3 days ago|||

> If you replaced "theft" with "copyright infringement" in the comment you had such a problem with, what meaningfully changes besides we all have about five additional brain cells?

The obvious difference is that copyright is subject to fair use and various other limitations that personal property isn't.

2ndorderthought 3 days ago||

Ever hear of Aaron Swartz?

AnthonyMouse 2 days ago||

Aaron Swartz was charged under the CFAA, which isn't even copyright law, and the prosecution was widely condemned as draconian overreach.

skeeter2020 4 days ago|||

>> Stealing/theft requires deprivation of property

maybe you should look up the definition of property, which is a set of legally recognized rights over a thing, typically including:

* possession (what you're focusing on)

* use

* exclusion

* transfer

The last 3 seem like they have been breached, and legally that's theft.

jasomill 4 days ago|||

Violation of these rights may be criminal without meeting the strict legal definition of theft.

This can even extend to stealing physical property.

Depending on local laws, stealing a car may not actually be theft if the defendant can prove they intended to return it before the owner got home from work, though it would certainly be considered theft in the colloquial sense of the term, and they would still be guilty of a lesser offense like civil and/or criminal conversion.

throwawayIche9j 4 days ago||

> Depending on local laws, stealing a car may not actually be theft if the defendant can prove they intended to return it before the owner got home from work

I doubt there's even one place where the law works like that.

KPGv2 3 days ago||

> I doubt there's even one place where the law works like that.

In a lot of places, that's how it works. A key element of theft is the intent to permanently deprive someone of property.

This is why joyriding isn't classified as auto theft and is instead a lesser offense. It's because joyriding is an intent to temporarily deprive, while GTA is an intent to permanently deprive.

In some jurisdictions (the UK is one), there is a tort called trespass to goods, and an example of this would be "stealing" someone's property to deliver to another location for them to use there. The tort of conversion is similar: interference with someone's property right to treat it as your own (silent as to length of time).

2ndorderthought 3 days ago||

Yeah, in the US, if someone tries to steal your car and you are in it or threatened by it, you can shoot them dead or something like that (IANAL). You may have a day in court, but in many situations no punishment will follow.

throwawayIche9j 4 days ago|||

Theft is not the breach of any property right. It's specifically the deprivation of property without consent. Yes, I have checked the definition in my jurisdiction.

Getting punched in the face also violates rights, yet isn't murder. Murder is specifically about dying.

rustystump 4 days ago|||

You forget that laws are made by people and can change at any time. Interpretations are arbitrary: Roe v. Wade today but not tomorrow.

People seem to think what AI is today is theft. If enough people agree, it will be theft. Big companies don't like this and push the other way. Objectivity doesn't exist here. It is too wiggly.

odo1242 4 days ago|||

You’re splitting hairs over a definition that isn’t relevant here (theft and copyright infringement are different things) to defend something that even you agree is bad.

throwawayIche9j 4 days ago||

It isn't splitting hairs. The damages are completely different in nature.

With theft, the entire damage is the deprivation. It could be an heirloom or some other object that may have been entrusted to you, something that can never be replaced, memorabilia of loved ones. Something that you may have needed in your possession to survive (e.g. a car to go to your job).

With a given copyright violation, the damage is that maybe[1] you made less profit than you could have. The potential for profit is not property. Profit isn't guaranteed.

[1] The loss is not certain, because there's no guarantee that the ones consuming the copyrighted content could have even afforded it.

2ndorderthought 4 days ago||||

Cool cool cool. So all the code and data you send to Anthropic and ChatGPT should be mass distributable to forward other people's arts and science? All your meeting notes with AI summarizers, Slack chats with bots? Might as well put your entire company and all plans for it on GitHub, MIT licensed. I'll take a peek, see if there's anything valuable to me in that. Don't worry, you can keep it all on your GitHub too. It's still yours after all. Copilot will be training on it too though, btw.

IAmLiterallyAB 4 days ago||

That's a privacy violation, not relevant.

2ndorderthought 4 days ago||

No it's not. You exposed that data to an LLM. Should have read the fine print. The laws around that don't make sense to me anymore so therefore I own that stuff now. That's how this works right? You do know ChatGPT etc. can read everything you write, right?

Also social media profile pics. Great way to get faces for deep fake ads. Most people are just 1 phone call away from being voice cloned. Our likeness isn't all that important either if you think about it.

Maybe Meta will clone your writing style and sign into your Meta account and message your friends telling them about this awesome new product. Meta owns the account and you uploaded data to it.

Our_Benefactors 4 days ago||

Literally none of these things are defensible positions, so nobody will take you seriously.

2ndorderthought 4 days ago||

Many of the things I wrote are already happening. The others probably are but haven't been reported yet.

collabs 4 days ago||

I think Anthropic has pledged to not use team and enterprise users' data for training purposes. I don't mind if they do some verification or whatever as long as it doesn't end up in the responses it gives others.

2ndorderthought 3 days ago|||

I have an amazing timeshare for sale and you seem like someone who would really see the opportunity this provides. How are your financials?

HWR_14 3 days ago||||

What Silicon Valley company over a decade old has respected the limitations on using data that they agreed to? At least any valuable data.

KPGv2 3 days ago|||

Yes, yes, and Google pledged "don't be evil".

Don't be naïve. A corporation would tear the flesh from your body if it meant a better quarterly earnings report.

2ndorderthought 3 days ago||

Having seen someone die at work, this is factual. The comments made during and after were eye opening.

albedoa 4 days ago|||

You were swiftly corrected about your misunderstanding under your original comment. Reposting it here, removing the quote farther from its context, and hoping to not be downvoted again is very weird!

nadermx 3 days ago||

I don't see how me quoting the actual complaint the news was about, in both threads, was me being swiftly corrected. If you were to base it on upvotes then this one shows I'm right and you got swiftly corrected here. In both cases it was relevant as both threads were not yet merged and about the same complaint. They held two positions on the front page, and I was adding to the discourse.

protocolture 3 days ago|||

>It's clear to me that AI training is transformative fair use under existing law.

I wouldn't even go that far. It's an entirely new product. It's like the guy who sold you the keyboard demanding royalties for the software you built.

That the person who wrote the book couldn't predict a new use case for the book in training LLMs is irrelevant. The book isn't in the LLM. It's not being sold with the LLM. It's one of billions of tools used to create the LLM.

People try and sell this as the AI companies extracting value from the poor little IP holders like Disney. It's maddening. That content is your cultural heritage. It already belongs to you, just some idiot has been granted a lifetime of exclusive exploitation. An LLM is trained on data you already own. Disney et al. want to exploit the new technology to extract even more money out of stuff created often decades ago.

At absolute worst it's reverse engineering, which was supposed to be fair use protected in the US, but apparently that's been somewhat eroded.

xigoi 3 days ago|||

> The book isn't in the LLM.

An LLM is essentially a lossy compression of the training data. The book absolutely is in there, it’s just mangled to the point of unrecognizability.

protocolture 3 days ago||

The wood tends to have an impression of the hammer that hits it. The book isn't in there, the weights are just shaped by what tools were used to form it.

When large quantities of source material are replicable by prompting, it's a bug, not a feature.

legacynl 2 days ago||

That's just semantics. The wood would be there without the hammer, the LLM wouldn't be here without the copyrighted works it's based on.

protocolture 2 days ago||

No, that's just semantics.

>The LLM wouldn't be here without the copyrighted works

Google wouldn't be here if it hadn't scraped every copyrighted website and used them to form a searchable graph of the internet but we only hear complaints about them when they reproduce entire news articles.

gizajob 3 days ago|||

If my book isn’t in your LLM, then prove it and don’t use my book to train your LLM.

protocolture 3 days ago||

>don’t use my book to train your LLM.

What makes you think you are entitled to tell people what they can and can't do with data they purchased (or otherwise acquired) from you? Extremely honest question. I just can't put myself in your shoes.

Like if I had written anything useful I would be overwhelmingly flattered that my content be considered so worthy for inclusion.

Your profile suggests that you are a philosopher. Did you get into philosophy hoping to exploit the publishing industry to the extent that you can squeeze every cent out of your thoughts, and deny their potential uses downstream?

It's actually crazy how bad things are. I am usually keen on capitalism and exclusivity, but with the whole thing with LLMs, I see people pushing hard to tighten the grip of intellectual property. I see people making 50 cents a month on Kindle Unlimited suddenly shocked that someone's LLM generated output might be ever so slightly influenced by weights ever so slightly influenced by their work, seemingly thinking they might get some big payday out of it.

Give me a tiny little wedge of understanding of your thought process. Your book is right now, doing a greater social good on your behalf than me running around and removing all the trash from my neighborhood, and the benefits of that social good are going to accrue long after you and I are gone. Your work is now going to live on, in a very tiny way, in these systems forever. I am honestly envious.

If anything, I would be trying to get bad writing removed from LLM training data. Things that I don't want to influence others. But as a potentially honest promoter of your work, you want it removed?

What's the number? If not 1:1 exactly what you charge for the book, what do you think is the proper compensation you should receive for slightly influencing training weights?

eloisius 3 days ago|||

> What makes you think you are entitled to tell people what they can and cant do with data they purchased

Hundreds of years of copyright law. I bought a copy of Windows, but I’m not allowed to modify that data with a cracker and sell a bootleg DVD of it.

I should edit to clarify that I’m not a big fan of Lars Ulrich or Disney, but I don’t think we’re going to get a win here for the recreational IP pirates. What’s more likely is that we’ll end up with some Frankenstein law that favors both Mickey Mouse and OpenAI, and you and I will neither get free movies nor the ability to earn a living off of our creative labor.

protocolture 3 days ago||

I mean, the comparable situation would be, being allowed to sell something you created on Windows.

But in the abstract you should absolutely be able to modify and sell Windows.

eloisius 3 days ago||

To continue your analogy, I had to pay for Windows before I was allowed to create something with it, or acquire a license for it under terms they set forth. If AI companies stopped at the public domain, then my argument wouldn't really hold up, but they didn't do that. They acquired everyone's copyrighted works without regard for the license and now they're, in the most charitable interpretation, using them to create derivative works.

And before you give me an analogy about how someone could listen to Pink Floyd and then produce works inspired by their influence yada yada: Someone is a human being with human rights, and if we're going to start pretending that training an LLM is in any way analogous to human consumption and creativity, and not an industrial process that encodes input data into a digital artifact, then let's start by saying LLMs have human rights and cannot be owned by a company that charges for access to them.

protocolture 3 days ago||

>To continue your analogy, I had to pay for Windows before I was allowed to create something with it, or acquire a license for under terms they set forth.

Yep, and so far it looks like the issue with the Meta case is they didn't pay for the book. Not that they used it in training data.

>in the most charitable interpretation, using them to create derivative works.

Yeah in the same way I use a hammer to create a derivative table.

>Someone is a human being with human rights, and if we're going to start pretending that training an LLM is in any way analogous to human consumption and creativity.

I don't care about that. It's simply a tool being built using existing tools. Like using a jigsaw to make a step ladder.

legacynl 2 days ago||

> Yep and so far it looks like the issue with the meta case is they didnt pay for the book. Not that they used it in training data.

Let's not sane-wash what they did here, they didn't just 'forgot to pay for the books', they deliberately and illegally downloaded and used material that wasn't theirs to use.

If you or I did that, we would be jailed or sued into destitution. In a fair world we either should change copyright laws (allowing for anyone to freely pirate all media), or Zuckerberg needs to go to jail.

protocolture 2 days ago||

>Let's not sane-wash what they did here, they didn't just 'forgot to pay for the books', they deliberately and illegally downloaded and used material that wasn't theirs to use.

Yes. Forgot is your word.

But let's face it, there wouldn't be a case to answer if they had paid retail for each book, torn them up, scanned them, and trained on that data.

>Zuckerberg needs to go to jail.

I am comfortable with that but would prefer updating copyright.

gizajob 3 days ago|||

A million dollars please.

It’s called a copyright notice. Same as a license. If you’re running a commercial business you can’t legally just take that piece of work and reuse it. Pick any book off your shelf and pretty well every one of them will have words to the effect of:

All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted by copyright law. For permission requests, write to the publisher, addressed "Attention: Permissions Coordinator," at the address below.

Same as every piece of commercial software has a license which has to be abided by. Same as use of Meta’s service has terms and conditions which HAVE to be agreed to.

So yeah they’re free to break that license but they’re also free to be sued by IP holders for breaking it at scale.

protocolture 3 days ago||

Well, it's not a solved issue in terms of law. But even still, I would have expected you to understand that I wasn't speaking legally.

conception 4 days ago|||

Illegally obtaining copyrighted materials is usually the issue, not the transformation part.

akerl_ 4 days ago||

Looking at the complaint ( https://publishers.org/wp-content/uploads/2026/05/2026-05-05... ), that seems like the part that's got the most solid foundation, especially given that while torrenting the books, they were also seeding to other peers.

The items they call out around training the models (and attempting to claim that each subsequent model generation should count as an additional instance of infringement) seem far less grounded in the current court interpretations of AI training.

King-Aaron 4 days ago|||

Absorb all "our" IP without consent, in doing so remove "our" own source of revenue, and then repackage it as their own product. Not really fair use IMO.

visarga 3 days ago||

How does that work? Is it a kind of infringement without substantial similarity?

King-Aaron 3 days ago||

I find it hard to think of a reasonable analogy. But it's like coming into your house, stealing all your belongings, and then building a new house with all your shit inside and then selling it back to you.

brendoelfrendo 3 days ago|||

I think this completely misses the point... the point is that Meta pirated the media they used to train their model.

I am not a fan of US copyright law, but if I torrented millions of books, I would be facing a felony charge in criminal court and a multi-billion dollar lawsuit in civil court (with statutory damages as high as $150,000 per title in cases of willful infringement).

In my opinion, this has nothing to do with whether or not AI training is transformative and thus fair use, and everything to do with whether or not the laws apply to everyone equally. If Facebook isn't forced to pay billions and elect a sacrificial executive to serve prison time, then I will remain angry.

rvz 4 days ago|||

> It's clear to me that AI training is transformative fair use under existing law. Maybe this will be the case to prove it.

That is not what this case is about. It is more about the illegal violation and piracy of copyrighted content done by Meta for commercial use and Zuck knew they were doing it.

Why did Anthropic settle [0] with a multi-billion dollar payout to authors after commercializing their LLMs that were trained off of copyrighted content that was illegally obtained and kept without the authors' permission?

There's a reason why they (Anthropic) did not want it to go to trial. (Anthropic knew they would lose and it would completely bankrupt them in the hundreds of billions.)

AI boosters will do anything to justify the mass piracy and illegal obtainment of copyrighted material for commercial use (not research), which is not fair use in the US. There is no debate on this. [0]

[0] https://images.assettype.com/theleaflet/2025-09-27/mnuaifvw/...

visarga 3 days ago||

I think copyright is far from being the most important aspect related to AI, it's geopolitical and economic. And even if it were the most important, there is only a case to be made for 1. that copy used to train models and 2. rare or induced regurgitation by targeted prompting.

The original work is not replicated identically. Why would we replicate a work when it can be more easily seen in the original or replaced with alternative options online? We use AI to produce new outputs to new situations. We already have hard drives and networking for plain copying.

whattheheckheck 4 days ago|||

If I could ask for a summary from an LLM vs buy a book, I'd go with the summary. That eats into commercial use, and the Supreme Court sided with Gerald Ford when a newspaper published a small gist of his autobiography because it ate into the sales.

Larrikin 4 days ago|||

Every single Wikipedia article of a book or TV show has this summary. Ford should have lost.

whattheheckheck 1 day ago||

Probably. Educational purpose is a strong component of fair use doctrine.

2ndorderthought 4 days ago|||

Yea nope. I like the full book without any loss of information. Even if I don't want to read the entire book. LLMs love to respond even when something is outside of their training set.

__loam 3 days ago|||

It's not settled law so I'm not sure how that's clear to you.

jacquesm 3 days ago|||

I think both Elsevier and the people that appropriate IP for the purpose of training commercially deployed AIs without the consent of the author(s) should be legal.

stiray 4 days ago|||

It actually depends on the evilness of the company. Elsevier is just less evil than Zuckerberg and Meta, while publishers are even less problematic. I don't think there is anything funny in that.

Or anything to defend on Meta. If they go out of business, humanity profits.

4k0hz 4 days ago|||

Elsevier is shitty to people doing stuff that (imo) should be allowed. Meta is making money doing the same thing and not getting the same shittiness from Elsevier.

Elsevier at least works within the (admittedly broken) system, Meta does not.

blks 3 days ago|||

When you use millions of copyrighted materials to bundle together to produce a commercial product, I wouldn’t call that a fair use. Especially when licensing of such material doesn’t explicitly allow that, the material wasn’t even purchased on consumer markets and your commercial product may be a competitor/analogue to the copyrighted material.

Not even going to all GPL stuff, that in a better world should have screwed all the slop companies

matheusmoreira 3 days ago|||

The enemy of my enemy, and all that.

stackghost 4 days ago|||

I'm not on Elsevier's side, but I still think it's bullshit that giant companies are allowed to do things at a scale that I'd go to prison for.

platevoltage 3 days ago||

That's always going to be true for the Capitalist class.

stackghost 3 days ago||

And yet I continue to rage against the dying of the light.

happytoexplain 4 days ago|||

"Funny" is how dishonest snipes are framed. It's such a common trope of internet quips, it's wearing me out. Can we please try to just format our disagreements without the snideness?

nullsanity 4 days ago|||

[dead]

platevoltage 3 days ago||

Such a garbage take. This is not a parody or a critique. Mark Zuckerberg is not Weird Al Yankovic.

Telaneo 4 days ago||

Looking forward to the personal liability.

I've wondered what the legalese justification is for letting liability evaporate as it does so often with corps. So far the reasons I'm left with are 'shrugs' and 'the relevant provisions (seemingly? apparently?) simply don't apply', neither of which are any good.

I was going to make a joke about how we should attach magnets to Aaron Swartz' corpse, since that'd make for a pretty potent energy source, given how fast he must be spinning. But honestly, I think he would have seen this sort of thing coming, given how his case was handled and how things really haven't gotten any better.

Aurornis 3 days ago||

The handling of Aaron Swartz’s case was a travesty, but he wasn’t indicted for piracy. The charges were for fraud, unlawfully accessing a protected computer, and damaging a computer.

In the years since the basis of the case has been forgotten and replaced with an assumption about piracy, but it was a case about unlawful access.

Telaneo 3 days ago|||

Given Swartz's intent and actions, I mostly don't care about the difference. Someone taking the clean copy of Star Wars (that probably doesn't exist) off Disney's servers and putting that up on the internet for everyone to download is mostly just doing copyright infringement. The computer equivalent of breaking and entering is a sideshow and a means to get to some files and little more (assuming they didn't do much more). The reason I care about breaking and entering is because what usually follows is stealing and a violation of privacy (or in the case of computers, the latter, as well as the computer being turned against my own interests), neither of which is the case when it comes to what Swartz did. The breaking of the door itself isn't some sinful act that should be punishable in and of itself.

The law doesn't see it that way, but it is not the ground truth.

j-bos 3 days ago|||

Yeah, the CFAA is a loaded gun pointed at anyone who's ever touched a computer.

woah 4 days ago||

Alternate reality Aaron Swartz escaped canonization and is now running an AI/crypto startup that pays you to upload training data with his YC alum buddies

Telaneo 4 days ago||

Every now and then, I feel like we live in the worst possible world. Then I realise it could be much worse.

This does not comfort me.

soundworlds 4 days ago||

I should hope that if Zuckerberg isn't severely punished for this, it at least sets a legal precedent for every other person to do the same with impunity.

All the Aaron Swartzes of the future could freely share scientific papers with the world.

agnosticmantis 4 days ago|

Willing to bet they'll lobby for regulatory capture and raise the drawbridge for the little guys.

pessimizer 4 days ago||

Shouldn't this stuff trigger RICO? Why do torrent site operators get led off in cuffs for running operations that usually lose money, but Zuck doesn't?

RICO specifically cites "criminal infringement of a copyright" as laid out in 18 U.S. Code § 2319. If the CEO tells his employees to download hundreds of thousands of works illegally in order to carry out his money-making scheme, how is that not organized crime even if (dubiously) LLM training on the material is fair use?

-----

RICO: https://www.law.cornell.edu/uscode/text/18/part-I/chapter-96

Definitions: https://www.law.cornell.edu/uscode/text/18/1961

> As used in this chapter — (1) “racketeering activity” means (A)[...]; (B) any act which is indictable under any of the following provisions of title 18, United States Code: [...], section 2319 (relating to criminal infringement of a copyright),[...]

18 U.S. Code § 2319 - Criminal infringement of a copyright: https://www.law.cornell.edu/uscode/text/18/2319

-----

edit:

> 18 U.S. Code § 1962 - Prohibited activities

> (c) It shall be unlawful for any person employed by or associated with any enterprise engaged in, or the activities of which affect, interstate or foreign commerce, to conduct or participate, directly or indirectly, in the conduct of such enterprise’s affairs through a pattern of racketeering activity[...].

https://www.law.cornell.edu/uscode/text/18/1962

From the lawsuit:

“Meta — at Zuckerberg’s direction — copied millions of books, journal articles, and other written works without authorization, including those owned or controlled by Plaintiffs and the Class, and then made additional copies of those works to train Llama,” the suit says. “Zuckerberg himself personally authorized and actively encouraged the infringement. Meta also stripped [copyright management information] from the copyrighted works it stole. It did this to conceal its training sources and facilitate their unauthorized use.”

alex1138 4 days ago||

> Meta also stripped [copyright management information] from the copyrighted works it stole. It did this to conceal its training sources and facilitate their unauthorized use.

WTF

stopbulying 4 days ago|||

[dead]

stopbulying 4 days ago||

[dead]

motbus3 4 days ago||

I personally know of a case of an engineer who was told to do something despite all the legal problems because the company had lawyers for a reason.

Telaneo 4 days ago|

I'd love for that to come out during discovery when the lawsuit hits, but it probably never will. Blowing the whistle is also not a great option in this economy, although I wish more people did.

nojvek 3 days ago||

If it wasn’t written down and there’s no recordings of it, hard to prove.

Telaneo 3 days ago||

Given Zuckerberg's previous rounds in court, it's likely to be in an email somewhere. Or maybe he's learned to have the incriminating_shit@facebook.com inbox only keep the latest 30 days or whatever.

28304283409234 4 days ago||

So... "move fast and steal things"?

lm411 4 days ago||

When the AI scrapers were just getting started, that is basically what I thought - their plan was to scrape / suck up everything they possibly could before people realized what was happening and blocked them.

The rate at which they were spidering and scraping was so far beyond what any other supposedly legit spider was doing, it seemed like the logical explanation.

pseudalopex 4 days ago|||

Move fast and break laws.

mil22 4 days ago|||

It started at the top and at the beginning.

vips7L 4 days ago|||

The biggest theft from the working class that has ever happened.

platevoltage 3 days ago|||

In Mark's case, he's still breaking things too.

eowln 3 days ago|||

Steal things? What is this, the “you wouldn’t pirate a car” argument again? I thought we were well over that.

MengerSponge 4 days ago|||

Always Has Been

dmitrygr 4 days ago|

Who will be the first to implement a one-layer three-weight model and add it to BitTorrent? Let it “train” on all downloaded files. That makes it fair use. Am I doing this right?
