
Posted by simedw 3 days ago

Show HN: Spegel, a Terminal Browser That Uses LLMs to Rewrite Webpages (simedw.com)
383 points | 166 comments
clbrmbr 3 days ago|
Suggestion: add a -p option:

    spegel -p "extract only the product reviews" > REVIEWS.md
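Wiring up such a flag could look roughly like this (a hypothetical sketch; Spegel's actual CLI handling may differ, and `build_parser` is an invented name):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI sketch, not Spegel's real argument handling.
    parser = argparse.ArgumentParser(prog="spegel")
    parser.add_argument("url", nargs="?", help="page to fetch and rewrite")
    parser.add_argument(
        "-p", "--prompt",
        help="one-off prompt overriding the default view, e.g. "
             "'extract only the product reviews'",
    )
    return parser

args = build_parser().parse_args(
    ["-p", "extract only the product reviews", "https://example.com"]
)
print(args.prompt)  # → extract only the product reviews
```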
barrenko 2 days ago||
I need this, but for the new forum formats such as Discourse or Discuss or whatever it's called. An eyesore and a brainsore.
__MatrixMan__ 3 days ago||
It would be cool if it were smart enough to figure out whether it was necessary to rewrite the page on every visit. There's a large chunk of the web where one of us could visit once, rewrite to markdown, and then serve the cleaned-up version to each other without requiring a distinct rebuild on each visit.
myfonj 3 days ago||
Each user has distinct needs and distinct prior knowledge about the topic, so even the "raw" super-clean source form will probably end up adjusted differently for most users.

But yes, having some globally shared redundant P2P cache (of the "raw" data), like IPFS (?), could save some processing power and help with availability and data preservation.

__MatrixMan__ 2 days ago||
I imagine it sort of like a microscope. For any chunk of data that people bothered to annotate with prompts re: how it should be rendered you'd end up with two or three "lenses" that you could toggle between. Or, if the existing lenses don't do the trick, you could publish your own and, if your immediate peers find them useful, maybe your transitive peers will end up knowing about them as well.
markstos 2 days ago|||
Cache headers exist for servers to communicate to clients how long it is safe to cache things. The client could be updated to add a cache layer that respects cache headers.
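That layer could be sketched in a few lines, assuming the client keeps a timestamp per fetched page. The header parsing here is deliberately minimal; a real client also has to handle `Expires`, `Vary`, `ETag` revalidation, and the rest of the caching spec:

```python
import re
import time
from typing import Optional

def is_fresh(fetched_at: float, cache_control: str,
             now: Optional[float] = None) -> bool:
    """Rough freshness check from a Cache-Control header (sketch only)."""
    now = time.time() if now is None else now
    # no-store / no-cache mean the saved rendering can't be reused blindly.
    if "no-store" in cache_control or "no-cache" in cache_control:
        return False
    match = re.search(r"max-age=(\d+)", cache_control)
    if not match:
        return False
    return (now - fetched_at) < int(match.group(1))

# A page fetched 100 s ago with max-age=3600 can be reused
# without re-rendering it through the LLM.
print(is_fresh(time.time() - 100, "public, max-age=3600"))  # → True
```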
simedw 3 days ago|||
If the goal is to have a more consistent layout on each visit, I think we could save the last page's markdown and send it to the model as a one-shot example...
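That one-shot idea might look roughly like this; the prompt wording and the `build_prompt` name are invented for illustration, not taken from Spegel's code:

```python
from typing import Optional

def build_prompt(html: str, previous_markdown: Optional[str] = None) -> str:
    """Sketch: feed the previous visit's markdown back to the model as a
    one-shot example so the layout stays consistent between visits."""
    parts = ["Rewrite this page as clean markdown."]
    if previous_markdown:
        parts.append(
            "Match the structure of this earlier rendering of the same page:\n"
            + previous_markdown
        )
    parts.append("HTML:\n" + html)
    return "\n\n".join(parts)
```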
pmxi 3 days ago||
The author says this is for “personalized views using your own prompts.” Though, I suppose it’s still useful to cache the outputs for the default prompt.
__MatrixMan__ 3 days ago||
Or to cache the output for whatever prompt your peers think is most appropriate for that particular site.
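Since the output depends on the prompt as much as on the page, a shared cache would have to key entries on the pair, not the URL alone. A minimal sketch (names are hypothetical):

```python
import hashlib

def cache_key(url: str, prompt: str) -> str:
    """Key a cached rendering on (url, prompt): the same page rendered
    under two different prompts must occupy two different cache slots."""
    digest = hashlib.sha256(f"{url}\x00{prompt}".encode()).hexdigest()
    return digest[:16]
```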
kelsey98765431 3 days ago||
People here are not realizing that html is just the start. If you can render a webpage into a view, you can render any input the model accepts. PDF to this view. Zip file of images to this view. Giant json file into this view. Whatever. The view is the product here, not the html input.
hyperific 3 days ago||
Why not use pandoc to convert html to markdown and have the LLM condense from there?
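The two-stage idea: a deterministic HTML-to-markdown pass first, so the LLM only condenses and can't silently alter the facts it was never shown. The converter below is a deliberately crude stdlib stand-in for pandoc, just to make the pipeline shape concrete:

```python
from html.parser import HTMLParser

class CrudeMarkdown(HTMLParser):
    """Very rough stand-in for `pandoc -f html -t markdown`: deterministic
    conversion first, LLM condensing second. Only handles h1/p/li."""
    def __init__(self) -> None:
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.out.append("# ")
        elif tag == "li":
            self.out.append("- ")

    def handle_endtag(self, tag):
        if tag in ("h1", "p", "li"):
            self.out.append("\n")

    def handle_data(self, data):
        self.out.append(data)

def html_to_markdown(html: str) -> str:
    parser = CrudeMarkdown()
    parser.feed(html)
    return "".join(parser.out)

print(html_to_markdown("<h1>Recipe</h1><li>2 eggs</li>"))
```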
ohadron 3 days ago||
This is a terrific idea and could also have a lot of value with regards to accessibility.
taco_emoji 2 days ago|
The problem, as always, is that LLMs are not deterministic. Accessibility needs to be reliable and predictable above all else.
cheevly 3 days ago||
Very cool! My retired AI agent transformed live webpage content, here's an old video clip of transforming HN to My Little Pony (with some annoying sounds): https://www.youtube.com/watch?v=1_j6cYeByOU. Skip to ~37 seconds for the outcome. I made an open-source standalone Chrome extension as well, it should probably still work for anyone curious: https://github.com/joshgriffith/ChromeGPT
mossTechnician 3 days ago||
Changes Spegel made to the linked recipe's ingredients:

Pounds of lamb become kilograms (more than doubling the quantity of meat), a medium onion turns large, one celery stalk becomes two, six cloves of garlic turn into four, tomato paste vanishes, we lose nearly half a cup of wine, beef stock gets an extra ¾ cup, rosemary is replaced with oregano.

simedw 3 days ago||
Fantastic catch! It led me down a rabbit hole, and I finally found the root cause.

The recipe site was so long that it got truncated before being sent to the LLM. Then, based on the first 8000 characters, Gemini hallucinated the rest of the recipe; it was definitely in its training set.

I have fixed it and pushed a new version of the project. Thanks again, it really highlights how we can never fully trust models.
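The failure mode described above, silent truncation letting the model invent the missing tail, can be guarded against by making the cut explicit to the caller. The 8000-character limit matches the figure mentioned, but the function name and shape here are illustrative, not Spegel's actual fix:

```python
from typing import Tuple

MAX_CHARS = 8000  # illustrative limit, matching the figure above

def prepare_for_llm(text: str, limit: int = MAX_CHARS) -> Tuple[str, bool]:
    """Return the text to send plus a flag saying whether it was cut.
    Silently truncating invites the model to hallucinate the remainder,
    so the caller should warn, chunk, or refuse instead of ignoring it."""
    if len(text) <= limit:
        return text, False
    return text[:limit], True

body, truncated = prepare_for_llm("x" * 10_000)
if truncated:
    print("warning: input truncated; model may hallucinate the rest")
```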

jugglinmike 3 days ago|||
Great catch. I was getting ready to mention the theoretical risk of asking an LLM to be your arbiter of truth; it didn't even occur to me to check the chosen example for correctness. In a way, this blog post is a useful illustration not just of the hazards of LLMs, but also of our collective tendency to eschew verity for novelty.
andrepd 3 days ago|||
> Great catch. I was getting ready to mention the theoretical risk of asking an LLM to be your arbiter of truth; it didn't even occur to me to check the chosen example for correctness.

It's beyond parody at this point. Shit just doesn't work, but this fundamental flaw of LLMs is just waved away or simply not acknowledged at all!

You have an algorithm that rewrites textA to textB (so nice), where textB potentially has no relation to textA (oh no). Were it anything else, this would mean "you don't have an algorithm to rewrite textA to textB", but for gen AI? Apparently this is not a fatal flaw; it's not even a flaw at all!

I should also note that there is no indication that this fundamental flaw can be corrected.

throwawayoldie 2 days ago|||
> the theoretical risk of asking an LLM be your arbiter of truth

"Theoretical"? I think you misspelled "ubiquitous".

orliesaurus 3 days ago|||
oh damn...
achierius 3 days ago||
Did you actually observe this, or is just meant to be illustrative of what could happen?
mossTechnician 3 days ago||
This is what actually happened in the linked article. The recipe is around the text that says

> Sometimes you don't want to read through someone's life story just to get to a recipe... That said, this is a great recipe

I compared the list of ingredients to the screenshot, did a couple unit conversions, and these are the discrepancies I saw.

coder543 3 days ago||
Just a typo note: the flow diagram in the article says "Gemini 2.5 Pro Lite", but there is no such thing.
simedw 3 days ago|
You are right, it's Gemini 2.5 Flash Lite.
adrianpike 3 days ago|
Super neat - I did something similar on a lark to enable useful "web browsing" over 1200 baud packet - I have Starlink back at my camp but might be a few miles away, so as long as I can get line of sight I can Google up stuff, albeit slowly. It worked well, but I never really productionized it beyond some weekend tinkering.