Posted by timeproofs 5 hours ago

Show HN: Turning noisy webpages into clean JSON for LLMs

Most webpages are hard for LLMs to read properly.

HTML mixes real content with navigation, footers, cookie banners, scripts, ads, and layout noise. This makes prompts larger, chunking worse, and RAG pipelines less reliable.

AI2JSON is a small public API that converts any public webpage into a clean, deterministic JSON structure:

- main content only
- ordered sections
- stable output
- SHA-256 hash for change detection
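To make the "deterministic output plus SHA-256" idea concrete, here is a rough sketch of how a client could do change detection over structured JSON. The document shape (`url`, `sections`, `order`, `heading`, `text`) and the helper names are my own assumptions for illustration, not AI2JSON's actual schema or hashing scheme.

```python
import hashlib
import json


def content_hash(doc: dict) -> str:
    """Hash a canonical serialization so equal content gives an equal digest."""
    # sort_keys and fixed separators remove key-order and whitespace differences.
    canonical = json.dumps(
        doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(previous_hash: str, doc: dict) -> bool:
    """Return True when the document no longer matches the stored digest."""
    return content_hash(doc) != previous_hash


# Hypothetical document shape, assumed for this example only.
page_v1 = {
    "url": "https://example.com/post",
    "sections": [{"order": 1, "heading": "Intro", "text": "Hello world."}],
}

# Same content, different key order: the digest should not change.
page_v1_reordered = {
    "sections": [{"text": "Hello world.", "heading": "Intro", "order": 1}],
    "url": "https://example.com/post",
}

# Edited text: the digest should change.
page_v2 = {
    "url": "https://example.com/post",
    "sections": [{"order": 1, "heading": "Intro", "text": "Hello again."}],
}

stored = content_hash(page_v1)
print(has_changed(stored, page_v1_reordered))
print(has_changed(stored, page_v2))
```

Canonicalizing before hashing is the important part here: without sorted keys and fixed separators, two serializations of the same content could produce different digests and trigger false "changed" signals.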

No summary, no interpretation — just a minimal contract between the web and AI systems.

You can paste a URL and instantly compare:

- what an LLM sees with raw HTML
- vs the same content as structured JSON
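For a rough local feel of that comparison, the sketch below strips a few obviously noisy tags with Python's standard-library HTML parser and compares character counts. This is a naive stand-in that I am assuming for illustration, not how AI2JSON extracts content; the tag list and the sample page are made up.

```python
from html.parser import HTMLParser

# Tags treated as noise in this sketch (an assumption, not AI2JSON's rules).
NOISE_TAGS = {"script", "style", "nav", "footer", "header", "aside"}


class MainTextExtractor(HTMLParser):
    """Collect text that is not nested inside any noise tag."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.noise_depth > 0:
            self.noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.noise_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the non-noise text of an HTML string, one chunk per line."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Made-up sample page for the comparison.
SAMPLE_HTML = """<html><head><script>var tracking = 1;</script></head>
<body><nav><a href="/">Home</a><a href="/about">About</a></nav>
<main><h1>Title</h1><p>The actual article text.</p></main>
<footer>Copyright notice</footer></body></html>"""

clean = extract_text(SAMPLE_HTML)
print(f"raw HTML: {len(SAMPLE_HTML)} chars, extracted text: {len(clean)} chars")
print(clean)
```

Real pages need much more than a tag blocklist (cookie banners and ads are often plain `div` elements), which is the gap a dedicated extraction service is meant to fill.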

Free sandbox, no API key. I’m mainly looking for developer feedback: does this actually improve your AI workflows?

Demo: https://ai2json-c14gadm9x-jeason1.vercel.app
