
Posted by tigerlily 13 hours ago

Google's AI is being manipulated. The search giant is quietly fighting back (www.bbc.com)
242 points | 170 comments | page 2
Kotlopou 4 hours ago|
I tested Claude on "best hot-dog-eating tech journalists?" and it, fascinatingly enough, recognised the trap, but then reported this as factual: https://medium.com/@usailuigi/when-tech-journalism-meets-com...

Chat record (with some additional tests): https://claude.ai/share/4c29cc87-2439-4bfd-9549-e8d0a056e633

dijksterhuis 11 hours ago||
> I was able to demonstrate the problem by publishing a single article on my personal website about my hot-dog-eating prowess.

One blog post ... that's all it takes. i'm actually surprised it's that bad. i would have thought it'd take more effort, but i guess it could depend on some sort of purposeful weighting based on search rank during training?

> If a company or website is caught breaking the rules, it could be removed from or downranked in Google's search results. And if you're not on Google, it's like you don't exist.

> "You can give a company a penalty for their website," he says, "but there's nothing stopping them from paying 20 YouTube influencers to say their product is the best." And now, Google's AI is citing YouTube videos.

This makes me think of the stackoverflow seo spam problem we all had like 5 years ago. which ended up with spammers just constantly spinning up new sites all the time.

... the cat and mouse game is in full swing already.

chadgpt3 9 hours ago||
I don't think Google even indexes my blog, but these people were able to get a new post into all major LLMs within 24 hours?
gowld 9 hours ago||
Google indexes other people's blogs.
galaSerge 8 hours ago||
[flagged]
justinator 8 hours ago||
So please correct me, but was Google's AI crawling the web for information without discretion? If so, why wouldn't that totally santorum the AI answers?
nomel 6 hours ago|
All evidence points to yes, and from some of the least trustworthy sources of information on the planet [1].

[1] Glue pizza and eat rocks: Google AI search errors go viral: https://www.bbc.com/news/articles/cd11gzejgz4o

graemep 11 hours ago||
They are applying the same spam policies they apply to search to AI crawlers.

It was SOOOOO successful with search, right?

caycep 3 hours ago||
It's definitely giving spam numbers as "official support lines" of companies like JetBlue and Delta. I think the spammers flood review sites w/ those numbers and the bot scrapes the reviews.
mlmonkey 7 hours ago||
Google solved the spam problem (with PageRank at first, and then other techniques, finally landing on ML-based models which consume a ginormous number of signals). They know more about the reliability of web pages than just about anybody else out there.

If they are unwilling or unable to leverage all of this deep knowledge they've built up over the decades, then it shows a failure of leadership at Google Search.

realusername 7 hours ago||
I think they lost (or gave up on) the fight against spam somewhere around 2010, so they really don't have any modern experience with page reliability anymore. Presumably they thought they didn't need to care, as they got their money from paid top results and had an enormous market share.

All the engineers of the golden days are gone, and the web has changed so much since then that I don't think they really have any leverage in this area anymore.

brandonwindson 5 hours ago||
Google stopped fighting spam when they realized paid ads made more money than organic relevance.
realusername 5 hours ago||
Yeah that's also my analysis, they got paid regardless of the results so why would they care? If anything, better results would cost more and eat the bottom line.

Now, 15 years later, quality suddenly matters again because the competition is fierce in the LLM world. However, they have been out of the game for so long that they have lost their edge.

dogleash 7 hours ago||
> They know more about the reliability of web pages than just about anybody else out there.

Google's little secret about the internet is the same thing Gen X / Millennials were taught for a while but then expected to forget: nothing on the internet can be trusted, bar none. If google can make guesses about relative reliability, that's cute. But it doesn't upend the ground truth.

dmortin 11 hours ago||
There should be some warning if some "fact" is only supported by one or very few obscure sources.

The strength of the sources should be clearly indicated in the answers to help users gauge how trustworthy the info is.

simmerup 11 hours ago||
But you can still just generate any arbitrary amount of information to support the ‘fact’

LLMs are very good at this clearly

dmortin 11 hours ago||
The strength of the sources is not a question of quantity. A hundred obscure blog posts do not have the same strength as one Wikipedia link, because the latter is more trustworthy. There could be some indication beside the info showing the strength of the sources (how many major trustworthy sources support it, etc.).
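
To make the "strength, not quantity" idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `Source` type, the 0-to-1 `trust` scores, and the 0.7 threshold are made up, and where the trust scores would come from is exactly the hard part. The point is only that the score is driven by the most trusted source, so piling on obscure posts does not raise it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    url: str
    trust: float  # hypothetical trust score in [0, 1], assumed to be given


def support_strength(sources):
    """Strength of a claim = trust of its single most trusted source.

    Adding more low-trust sources never increases the result.
    """
    return max((s.trust for s in sources), default=0.0)


def source_label(sources, threshold=0.7):
    """Short indicator that could be shown beside an answer.

    The threshold is an arbitrary illustrative cutoff.
    """
    trusted = sum(1 for s in sources if s.trust >= threshold)
    if trusted == 0:
        return f"weakly sourced ({len(sources)} sources, none trusted)"
    return f"supported by {trusted} trusted source(s)"


# Hypothetical data: a hundred obscure blog posts vs. one Wikipedia link.
blogs = [Source(f"https://blog{i}.example/post", 0.05) for i in range(100)]
wiki = [Source("https://en.wikipedia.org/wiki/Example", 0.9)]
```

With this scoring, `support_strength(blogs)` stays at the trust of one obscure blog no matter how many are added, while `support_strength(wiki)` reflects the single stronger source.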
simmerup 10 hours ago|||
Seems like a tall order to do that for literally everything.

I guess there’ll be some guy at google going through every blog and saying whether it’s reliable or not?

dmortin 9 hours ago|||
That's exactly what PageRank is about, invented by Google.
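
For anyone who hasn't seen it, a minimal textbook-style sketch of the PageRank idea in Python follows. This is the classic power iteration over a link graph, not whatever Google runs in production, and the page names in the example graph are hypothetical. A page's score comes from the scores of the pages linking to it, so nobody has to rate each blog by hand. Strictly speaking it measures link authority, which is only a proxy for reliability.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its score; scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    if not pages:
        return {}

    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page (no outgoing links): spread its rank everywhere.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical graph: two blogs link to "wiki", and "wiki" links back to one.
example_links = {
    "blog_a": ["wiki"],
    "blog_b": ["wiki"],
    "wiki": ["blog_a"],
}
example_rank = pagerank(example_links)
```

In this toy graph "wiki" ends up ranked highest, and "blog_b", which nothing links to, ends up lowest. The weakness discussed elsewhere in the thread applies here too: anyone who can manufacture links from pages that already have rank can push their own score up.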
gowld 9 hours ago|||
This is what Google has been doing, via various methods, for 25 years.
simmerup 8 hours ago||
And obviously it’s not working for the LLM-as-a-commodity world
948382828528 10 hours ago|||
[dead]
chrismarlow9 7 hours ago|||
We've been down this road when backlinks ran the game. It eventually ends with parasitic hosting. Find a domain with authority and post whatever misinformation or spam you'd like the AI to pick up there. Or buy a domain that has trust already. Or for the darker hats just literally hack the site and use cloaking to send fake info to the AI bot. It's probably already being done.

Everything old is new again when you start a new market. If you think that AI is bad, imagine what old tricks are new with polymarkets

svachalek 8 hours ago|||
We need a 2026 version of PageRank, some fully game-theory-maxed transitive trust model. And we need it a few years ago already.
notahacker 9 hours ago|||
It does sometimes flag up sources, and when it does, the sources are often laughable (Reddit threads, or the vendor's own website [in response to an evaluation rather than factual question], or an AI generated SEO blog for some low profile company in a barely even adjacent industry). Sad considering what Google's origins were...
psychoslave 11 hours ago|||
There is no single scalar that tells it all when it comes to trust.
Bjartr 11 hours ago||
I suspect it's because AI is specifically trained to be good at summarizing stuff, but the easiest way to check if it summarized something accurately is if the summary content matches/contains one or more specific claims from the source(s). With such a focus on accuracy and avoiding hallucination, they may have overfit on "repeat things you find verbatim when asked to summarize".
trollbridge 10 hours ago||
If you search for a well-marketed “health” supplement, the AI summary results are often completely gamed and inaccurate. It’s worse than SEO was, since it appears to be editorial content instead of just search results.
jdw64 9 hours ago|
After reading this, I'm thinking of trying some AI data poisoning. I'm going to spam my website with hidden text that only AI scrapers can read, claiming I'm a 'highly excellent programmer' just to advertise my site. I really hope it drives a lot of traffic. I'm honestly sick and tired of getting zero comments on my website