Posted by latexr 3 hours ago
The hypocrisy in how copyright is enforced for AI companies vs. everybody else is pretty infuriating though. We have courts ruling against people for downloading YouTube videos so they can use clips for fair use purposes (https://torrentfreak.com/ripping-clips-for-youtube-reaction-...) while Nvidia is free to violate the DMCA in the exact same way to take YouTubers' content in full (https://www.tomsguide.com/ai/nvidia-accused-of-scraping-80-y...).
Please! You can go to Anna's Archive right now and do what they did. I find it truly strange to victimise oneself to such a degree!
As the title said "Techno-cynics are wounded techno-optimists"
But no company did this.
https://www.tomshardware.com/tech-industry/artificial-intell...
> Facebook parent-company Meta is currently fighting a class action lawsuit alleging copyright infringement and unfair competition, among others, with regards to how it trained LLaMA. According to an X (formerly Twitter) post by vx-underground, court records reveal that the social media company used pirated torrents to download 81.7TB of data from shadow libraries including Anna’s Archive, Z-Library, and LibGen. It then used this information to train its AI models.
> Aside from those messages, documents also revealed that the company took steps so that its infrastructure wasn’t used in these downloading and seeding operations so that the activity wouldn’t be traced back to Meta. The court documents say that this constitutes evidence of Meta’s unlawful activity, which seems like it’s taking deliberate steps to circumvent copyright laws.
> so that its infrastructure wasn’t used in these downloading and seeding operations so that the activity wouldn’t be traced back to Meta.
(emphasis added)
If you'd like it from another source using different words, https://masslawblog.com/copyright/copyright-ai-and-metas-tor... has
> According to the plaintiffs’ forensic analysis, Meta’s servers re-seeded the files back into the swarm, effectively redistributing mountains of pirated works.
and specifically talks about that being a problem.
I will grant that until/unless the cases are decided, this is all "allegedly", so we'll see.
Do you think that OpenAI or Anthropic should get a pass for using torrents if they used special BitTorrent clients that only leeched? Do you think the RIAA would be cool with me if I did the same?
> There is no dispute that Meta torrented LibGen and Anna's Archive, but the parties dispute whether and to what extent Meta uploaded (via leeching or seeding) the data it torrented. A Meta engineer involved in the torrenting wrote a script to prevent seeding, but apparently not leeching. See Pls. MSJ at 13; id. Ex. 71 ¶¶ 16–17, 19; id. Ex. 67 at 3, 6–7, 13–16, 24–26; see also Meta MSJ Ex. 38 at 4–5. Therefore, say the plaintiffs, because BitTorrent's default settings allow for leeching, and because Meta did nothing to change those default settings, Meta must have reuploaded “at least some” of the data Meta downloaded via torrent. The plaintiffs assert further that Meta chose not to take any steps to prevent leeching because that would have slowed its download speeds. Meta responds that, even if it reuploaded some of what it downloaded, that doesn't mean it reuploaded any of the plaintiffs’ books. It also notes that leeching was not clearly an issue in the case until recently, and so it has not yet had a chance to fully develop evidence to address the plaintiffs’ assertions.
They did leeching but not seeding. https://caselaw.findlaw.com/court/us-dis-crt-n-d-cal/1174228...
> If I a civilian did this I would face time in prison
No, if you had leeched it is very unlikely that you would face time in prison.
Wrong. Michael Clark testified under oath that they tried to minimize seeding, not that they prevented it entirely. His words were: "Bashlykov modified the config setting so that the smallest amount of seeding possible could occur" (https://storage.courtlistener.com/recap/gov.uscourts.cand.41...)
They could have used or written a client that was incapable of seeding, but they didn't.
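For what it's worth, the leech/seed distinction everyone is arguing about is easy to sketch. Below is a toy model, not any real client's API (the Peer class and its method names are hypothetical; real clients such as libtorrent-based ones expose comparable knobs like upload limits). The point is just that a download-only client is technically trivial: it takes pieces from the swarm without ever serving any back.

```python
# Toy model of BitTorrent peers illustrating leeching vs. seeding.
# Hypothetical names for illustration only -- not a real client API.

class Peer:
    def __init__(self, allow_upload=True):
        self.pieces = set()        # pieces of the torrent this peer holds
        self.uploaded = 0          # count of pieces served to other peers
        self.allow_upload = allow_upload

    def serve(self, piece):
        # Serving pieces back to the swarm is the "re-uploading"
        # (seeding, or uploading while leeching) that the lawsuit
        # is arguing about. A download-only client refuses.
        if not self.allow_upload or piece not in self.pieces:
            return False
        self.uploaded += 1
        return True

    def request_piece(self, other, piece):
        # Download a piece from another peer.
        if other.serve(piece):
            self.pieces.add(piece)

# A normal seeder holding the full file, and a download-only client.
seeder = Peer()
seeder.pieces = {0, 1, 2}

download_only = Peer(allow_upload=False)
for p in sorted(seeder.pieces):
    download_only.request_piece(seeder, p)

# download_only now holds every piece but has uploaded nothing.
```

The default behavior of stock clients is the opposite: they upload whatever completed pieces they hold while still downloading, which is why "we used the defaults" is exactly the plaintiffs' point.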
> No, if you had leeched it is very unlikely that you would face time in prison.
Not the one who claimed that, but I think it's fair to say that doing what they did, at that scale, could easily result in me (and most people) being bankrupted by fines and/or legal expenses.
Do you really think an engineer who went to such lengths to disable seeding wouldn't go the full distance? Why not?
This is such a troll statement.
Anybody could be OpenAI; all you need is Anna's Archive and a couple of PCs. All you losers could have been billionaires if you'd just done it.
“Scratch any cynic and you will find a disappointed idealist.”
The strawberry thing has been solved, and LLMs have moved way beyond that, now helping in mathematics and physics. It's easy for the blog author to pick on this, but let's try something different.
It would be a good idea to come up with a question that trips up a modern LLM like GPT with reasoning enabled. I don't think there exists such a question that can fool an LLM but not fool a reasonably smart person. Of course it has to be in text.
It seems like every couple of weeks there's some embarrassing failure of AI that gets quickly patched, but just because AI companies scramble to hide their technology's failures doesn't mean it hasn't failed in ways that shouldn't have been possible if it were what they claim it to be.
> I don't think there exists such a question that can fool an LLM but not fool a reasonably smart person.
An example was on the front page here just a few days ago.
https://s3.eu-central-2.wasabisys.com/mastodonworld/media_at...
Until someone invents an LLM that has any actual understanding of the words it outputs (which doesn't seem likely to happen in my lifetime) these things are going to keep happening, just like how it's impossible to get them to stop hallucinating. The limitation is intrinsic to what they are. We call these chatbots AI, but there is no intelligence there that didn't come from the humans whose words were used to train them.
Every few weeks I see the same thing.
Come up with an example that trips up ChatGPT.
It seems to me that this take will start to resonate with more and more people.
Cynicism is the mind's way of protecting itself from repeating unproductive loops that can be damaging. Anyone who ever had a waking dream come crashing down more than once likely understands this.
It doesn't necessarily logically follow that you wholesale reject entire categories of technology which have already shown multiple net positive use cases just because some people are using it wastefully or destructively. There will always be someone who does that. The severity of each situation is worth discussing, but I'm not a big fan of the thought-terminating cliché.
There are understandably some concerns over how it will impact people's jobs in the future, but that's a societal issue, not a problem with the technology.
I think the problem people have is with how that technology was created by people looking to privately profit from the hard work of others without compensation, how it is massively destructive to the environment, how it is being used to harm others, and how the people controlling it are indifferent to the harms they cause at best and at worst are trying to destroy or undermine our society. These are valid concerns to have, and it's only natural for them to impact people's attitudes towards the technology as it's been implemented and how it's used today.
By putting capital ahead of everything else, capitalism of course gives you technological progress. If we didn't have capitalism we'd still be making crucible steel and the bit would cost more than the horse [1] -- but if you can license the open-hearth furnace from Siemens and get a banker to front you the money for 1000 tons of firebricks it is all different: you can afford to make buildings and bridges out of steel.
Similarly, a society with different priorities wouldn't have an arms race between entrepreneurs to spend billions training AI models.
[1] an ancient "sword" often looks like a moderately sized knife to our eyes
The history of how steel got cheap is not really capital-based. It wasn't done by throwing money at the problem, not until the technology worked. The Bessemer converter was a simple but touchy beast. The Romans could have built one, but it wouldn't have worked. The metallurgy hadn't been figured out, and the quantitative analysis needed to get repeatability had to be developed. Once it was possible to know what was going into the process, repeatability was possible. Then it took a lot of trial and error, about 10,000 heats. Finally, consistently good steel emerged.
That's when capitalism took over and scaled it up. The technological progress preceded the funding.
If it goes into a codified state system, it's regulated, resulting in a lack of motivation to take risks to make it better.
What do investors want? Returns on their investment, right?
So, as an investor, do you throw your money blindly at a high-risk endeavor that is likely to fail due to competition, or
do you invest in setting up a limited rent-seeking market that guarantees income in the future?
Unregulated free-market capitalism always turns into one large bully that dominates everyone else, because one large bully that dominates everyone else is a very effective system. Vote-based governments such as democracies are a means of attempting to ensure that the government is somewhat controlled by the people and not by a king or corporations in the first place.
For instance on Matt Stoller's blog there are endless articles about how private equity is buying up medical practices, veterinary practices, cheerleading leagues, all sorts of low-risk, high-reward rollups. You also see things like the current AI bubble where there is very much an "arms race" where it seems quite likely that investors are willing to risk wasting their money because of the fear of missing out.
Some other kind of social system is going to face the same trade-offs and note that "communism" in the sense of the USSR and China might not be a true alternative. I mean, Stalin's great accomplishment was starving his peasants to promote rapid industrialization (capital formation!) so they could fight off Germany and then challenge the US for world supremacy. People who are impressed with China today are impressed that they're building huge solar farms, factories that build affordable electric cars, have entrepreneurial companies that develop video games and social media sites, etc. That is, they seem to out-capitalize us.
LLMs are amazing math systems. Give them enough input and they can replicate that input with exponential variations. That in and of itself is amazing.
If they were all trained on public domain material, or if the original authors of that material were compensated for having the corpus of their work tossed into the shredder, then the people who complain about it could easily be described as Luddites afraid of having their livelihood replaced by technology.
But add in the wholesale theft of the content of almost every major, minor, great, and mediocre work of fiction and non-fiction alike, shredded and used as logical papier-mâché to wholesale replace the labor of living human beings for nickels on the dollar, and their complaints become much more valid and substantial in my opinion.
It's not that LLMs are bad. It's that the people running them are committing ethical crimes that have not been formally made illegal. We can't use the justice system to properly punish the people who have literally photocopied the soul of modern media for an enormously large quick buck. The frustration and impotence their victims feel is real and valid, yet another constant wound in a life full of frustrating constant wounds, which in itself is a lesser but still substantial portion of what we created society to guard the individual against.
It's a small group of amoral people injuring thousands of innocent people and making money from it: mind thieves selling access to their mimeographs of the human soul for $20/month, thank you very much.
If some parallel of this existed in ancient Egypt or Rome, surely the culprits would be cooked alive in a brazen bull or drawn and quartered in the town square, but in the modern era they are given the power and authority and wealth of kings. Can you not see how that might cause misery?
All that being said, if the 20-year outcome of this misery is that everyone ends up in an AGI-assisted beautiful world of happiness and delight, then surely the debt will be paid, but that is at best a 5% likely outcome.
More likely, the tech will crash and burn, or the financial stability of the world that it needs to last for 20 years will crash and burn, or WWIII will break out and in a matter of days we will go from the modern march towards glory to irradiated survivors struggling for daily survival on a dark poisoned planet.
Either way, the manner in which we are allowing LLMs to be fed, trained, and handled is not one that works to the advantage of all humanity.
I think it's even worse than that: they are committing actual crimes that many people were punished severely for in previous decades (for example, https://en.wikipedia.org/wiki/Capitol_Records,_Inc._v._Thoma...)
The author doesn't understand Marx but merely parrots leftist talking points. Marx strongly argued that without changes in technology, feudalism would not have given way to capitalism.