Posted by jruohonen 9 hours ago
I do agree that AI is killing tons of things right now. This monster must be stopped; it is worse than Skynet in that it really, really sucks. Things started to decay before AI took over, though - for instance, Google search has been garbage for years. It was useful before that.
I used to compare the decay of Google search to how YouTube search works. You search for, say, "ninja cats". You get some results about cats, perhaps also ninjas. After 10 or 20 results, you suddenly get videos that are totally unrelated but that you might click on anyway. That's addictive design: people click when something catches their interest, but it also pulls them away from their original search. Something similar happened to Google search. The UI is total crap: it shows semi-related videos (I don't want to watch videos when I search for a specific term), ads for companies (Google is milking it here), and useless entries such as "other people searched for sick grannies instead, do you want to search for this as well?" and similar UI-ruining components. Without uBlock Origin I'd be quite lost already - and lo and behold, Google killed uBlock Origin because it threatened their business model (another reason to use it; we really need to get rid of Google - it is no longer a useful corporation, just a greedy one).
No one wants to make a bet like that, so they don’t. That’s why RSS doesn’t get pushed or used more often.
These evangelists want to make it sound like all we need to do is get everyone on board with RSS and we’ll all just hold hands and share the web.
People don’t browse the web; there are like 10 websites, and that’s the whole internet.
Everything else is just asteroids and abandoned space stations.
If you have the key to the paywall, then you can create a feed hydrator that fetches the full content into the feed.
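A "feed hydrator" in that sense might look something like this sketch (the names and the cookie-based auth are my assumptions, not a specific tool): take feed entries that only carry links, fetch each article body through the paywall with your subscriber credentials, and inline it into the entry.

```python
# Minimal feed-hydrator sketch (hypothetical names): fill in full article
# content for feed entries that only contain a link. The actual HTTP call
# is injected as a callable so the auth mechanism (e.g. a paywall session
# cookie) stays out of the core logic.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Entry:
    url: str
    title: str
    content: str = ""  # empty until hydrated


def hydrate(entries: list, fetch: Callable[[str], str]) -> list:
    """Fetch and inline the content for every entry that lacks it."""
    for e in entries:
        if not e.content:
            e.content = fetch(e.url)
    return entries


# In real use, `fetch` would wrap something like urllib with your
# paywall session, e.g.:
#   req = urllib.request.Request(url, headers={"Cookie": PAYWALL_COOKIE})
#   return urllib.request.urlopen(req).read().decode()
```

Injecting `fetch` also makes the hydrator trivial to test with a stub instead of live requests.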
We can do better than that: an LLM can ingest unstructured data and turn it into a feed. You shouldn’t need someone else to comply with a protocol just to ingest their data.
I don’t get why people keep fantasizing about a system that gave consumers no control. Scrape the website directly. You decide what’s in the feed, not them.
An LLM can try to do that, yes. But LLMs are lossy compression. RSS feeds are accurate, predictable, and follow a pre-defined structure. Using LLMs to ingest data that can easily be turned into a parseable data structure seems strange: use the LLM for the "next part" of the formula (comprehension, decision making, etc.).
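To make the contrast concrete: RSS is so rigidly structured that the standard library parses it deterministically and losslessly, no model involved. A minimal sketch with Python's built-in XML parser:

```python
# RSS is a pre-defined structure: stdlib parsing is exact and repeatable,
# unlike an LLM extraction pass over the same page.
import xml.etree.ElementTree as ET

SAMPLE = """<rss version="2.0"><channel>
  <title>Example</title>
  <item><title>Post 1</title><link>https://example.com/1</link></item>
  <item><title>Post 2</title><link>https://example.com/2</link></item>
</channel></rss>"""


def parse_rss(xml_text: str) -> list[dict]:
    """Return every <item> as {"title": ..., "link": ...}."""
    root = ET.fromstring(xml_text)
    return [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in root.iter("item")
    ]
```

Run it twice on the same input and you get the same output, which is the predictability being argued for here.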
There is also llms.txt (https://llmstxt.org/), e.g. https://joshua.hu/llms.txt / https://joshua.hu/llms-full.txt
The only thing you have to do is ensure it can reliably get the response HTML. Maybe an MCP browser plus a proxy or mirror to seem more human.
I built this for myself. The idea is that each feed is a URL + title + a prompt telling the LLM how to extract the links you want, so that it generalizes across all websites.
And each feed item is a canonicalized URL + title + a local copy of the content at that URL - an improvement over RSS, since so many RSS feeds don't even contain the content.
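The data model described above could be sketched roughly as follows; the field names and the canonicalization rule (drop query string and fragment) are my guesses at what the author means, not their actual implementation.

```python
# Sketch of the described data model: a feed is a source URL + title +
# an extraction prompt; an item stores a canonical URL, title, and a
# local copy of the content. Field names are hypothetical.
from dataclasses import dataclass
from urllib.parse import urlsplit, urlunsplit


@dataclass
class Feed:
    url: str     # page to scrape
    title: str
    prompt: str  # tells the LLM which links to extract


@dataclass
class Item:
    url: str      # canonical form, used for dedup
    title: str
    content: str  # local copy, readable even if the source dies


def canonicalize(url: str) -> str:
    """Drop query string, fragment, and trailing slash so the same
    article always dedups to one key (an assumed policy; real sites
    may need query params kept)."""
    p = urlsplit(url)
    return urlunsplit((p.scheme, p.netloc.lower(), p.path.rstrip("/"), "", ""))
```

Keying items on `canonicalize(url)` means tracking-parameter variants of the same link collapse into a single feed item.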