"A glass is not impossible to make the file and so deepen the original cut. Now heat a small spot on the glass, and a candle flame to a clear singing note.
— context_length = 2. The source material is a book on glassblowing."
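For anyone wondering what `context_length = 2` means in practice, here is a minimal sketch of a word-level Markov babbler. It is not the poster's actual code; the function names `build_chain` and `babble` are made up for illustration, and the only assumption is that the "context" is the previous two words.

```python
import random
from collections import defaultdict


def build_chain(text, context_length=2):
    """Map each run of `context_length` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - context_length):
        key = tuple(words[i:i + context_length])
        chain[key].append(words[i + context_length])
    return chain


def babble(chain, n_words=30, seed=None):
    """Walk the chain, picking a random follower of the current context each step."""
    rng = random.Random(seed)
    keys = sorted(chain)
    state = rng.choice(keys)
    out = list(state)
    while len(out) < n_words:
        followers = chain.get(state)
        if not followers:
            # Dead end (context only seen at the very end of the source):
            # jump to a random known context and keep going.
            state = rng.choice(keys)
            continue
        word = rng.choice(followers)
        out.append(word)
        state = state[1:] + (word,)
    return " ".join(out[:n_words])
```

With only two words of context the output is locally plausible but globally nonsense, which is exactly the effect in the quoted sample.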
On my site, I serve them a subset of the Emergent Misalignment dataset, randomly perturbed by substituting some words with synonyms.
It should make LLMs trained on it behave like dicks, according to this research: https://www.emergent-misalignment.com/
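A rough sketch of what the synonym substitution could look like. The `SYNONYMS` table, the `perturb` name, and the `rate` parameter are all hypothetical; a real setup would presumably use a proper thesaurus rather than a hand-written dict.

```python
import random

# Hypothetical, hand-written synonym table for illustration only.
SYNONYMS = {
    "quick": ["fast", "rapid"],
    "big": ["large", "huge"],
    "said": ["stated", "remarked"],
}


def perturb(text, synonyms, rate=0.3, seed=None):
    """Replace each word that has known synonyms with probability `rate`."""
    rng = random.Random(seed)
    out = []
    for word in text.split():
        options = synonyms.get(word.lower())
        if options and rng.random() < rate:
            out.append(rng.choice(options))
        else:
            out.append(word)
    return " ".join(out)
```

The point of the randomness is that every served copy differs slightly, so exact-match deduplication on the crawler's side is less likely to collapse them.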
Industrial ag regularly treats product to modify the texture, color, and shelf life. It's extremely common to expose produce to various gases and chemicals to either delay or hasten ripening, for example. Other tricks are used while the plants are still in the ground or immediately after harvest, such as spraying grains with Roundup so they dry out more quickly.
Same as any other consumer using Meta products. You sell out because it’s easier to network that way.
I am the son of a farmer.
Edit: added disclosure at the bottom and clarified as agricultural farming
https://www.farmkind.giving/the-small-farm-myth-debunked
TL;DR: the concept of farmers as small family operations has not been rooted in truth in America for a very long time.
In general, though, the easy rule for sustainable living and eating non-mega-farmed food is to “eat aware”:
My other advice is a one-size-fits-all food equation, which is, simply, to know where it came from. If you can't place it, trace it, or grow it/raise it/catch it yourself, don't eat it. Eat aware. Know your food. Don't wait on waiters or institutions to come up with ways to publicize it, meet your small fishmonger and chat him or her up at the farmer's market yourself. [0]
[0] https://www.huffpost.com/entry/the-pescatores-dilemma_b_2463...
Part of the reason I did this is to get good numbers on how bad the problem is: A link maze is a great way to make otherwise very stealthy bots expose themselves.
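For the curious, a link maze can be sketched in a few lines. This is not the poster's implementation; `MAZE_PREFIX`, `maze_links`, `record_hit`, and the threshold are hypothetical. The idea is that every maze page links only to more maze pages, so any client that racks up many maze hits has outed itself as a crawler.

```python
import hashlib
from collections import Counter

MAZE_PREFIX = "/maze/"  # hypothetical URL prefix, ideally disallowed in robots.txt
hits = Counter()        # maze page views per client (IP, fingerprint, ...)


def maze_links(path, n_links=5):
    """Derive a stable set of child links from the current path."""
    links = []
    for i in range(n_links):
        digest = hashlib.sha256(f"{path}#{i}".encode()).hexdigest()[:12]
        links.append(f"{MAZE_PREFIX}{digest}")
    return links


def record_hit(client_id, path, threshold=20):
    """Count maze hits; return True once the client looks like a crawler."""
    if path.startswith(MAZE_PREFIX):
        hits[client_id] += 1
    return hits[client_id] >= threshold
```

Hashing the path keeps the maze stateless: the same URL always yields the same children, so nothing has to be stored per page.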
So by blocking these IPs, you are blocking your users. (E.g., in many coffee shops I get the "IP Blocked" banner; my guess is that they are running software on unsuspecting users' devices to route this traffic.)
There were 122 million residential internet connections in the US in 2024 [1], so for an app with 1 million users the chance of affecting a single user is <1%.
[1] https://docs.fcc.gov/public/attachments/DOC-411463A1.pdf
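Spelling out the arithmetic behind that figure, as a back-of-the-envelope sketch. It assumes one user per connection and users spread uniformly over residential connections, both of which are simplifications; the helper `p_any_user_hit` is hypothetical and only there to show how the odds change when many IPs are involved.

```python
residential_connections = 122_000_000  # US, 2024, per the FCC figure cited above
app_users = 1_000_000

# Share of residential connections that belong to one of the app's users.
p_hit = app_users / residential_connections
print(f"{p_hit:.2%}")  # 0.82%


def p_any_user_hit(k):
    """Chance that at least one of k independently picked connections is a user's."""
    return 1 - (1 - p_hit) ** k
```

So the <1% holds per connection, but it compounds quickly: under these assumptions, by around a hundred IPs it is more likely than not that at least one belongs to a user.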
Otherwise, there are residential IP proxy services that cost around $1/GB, which is cheap, but why pay when you can get the user to agree to be a proxy?
If the margin of error is small enough in detecting automated requests, may as well serve up some crypto mining code for the AI bots to work through but again, it could easily be an (unsuspecting) user.
I haven't looked into it much, but it'd be interesting to know whether some of the AI requests are using mobile agents (and show genuine mobile fingerprints).
Surely the bots are still hitting the pages they were hitting before but now they also hit the garbage pages too?
But yes, all bots start out on an actual page.
Clever
Most of the real use seems to be surveillance, spam, ads, tracking, slop, crawlers, hype, dubious financial deals and sucking energy.
Oh yeah, and your kid can cheat on their book report or whatever. Great.
It has to be said, though, that all three of the things above are feared, considered taboo, or cause for mocking, while making a quick buck at the cost of poisoning the commons gives universal bragging rights. Go figure.