Posted by pmaze 1 day ago

Show HN: I used Claude Code to discover connections between 100 books (trails.pieterma.es)
I think LLMs are overused to summarise and underused to help us read deeper.

I built a system for Claude Code to browse 100 non-fiction books and find interesting connections between them.

I started out with a staged pipeline, chaining together LLM calls to build up a context of the library. I was mainly getting back the insights I had baked into the prompts, and the results weren't particularly surprising.

On a whim, I gave CC access to my debug CLI tools and found that it wiped the floor with that approach. It gave actually interesting results and required very little orchestration in comparison.

One of my favourite trails of excerpts goes from Jobs’ reality distortion field to Theranos’ fake demos, to Thiel on startup cults, to Hoffer on mass-movement charlatans (https://trails.pieterma.es/trail/useful-lies/). A fun tendency is that Claude kept getting distracted by topics of secrecy, conspiracy, and hidden systems - as if the task itself summoned a Foucault’s Pendulum mindset.

Details:

* The books are picked from HN’s favourites (which I collected before: https://hnbooks.pieterma.es/).

* Chunks are indexed by topic using Gemini Flash Lite (a sketch of this step follows the list). The whole library cost about £10.

* Topics are organised into a tree structure using recursive Leiden partitioning and LLM labels (also sketched below). This gives a high-level sense of the themes.

* There are several ways to browse. The most useful are embedding similarity, topic tree siblings, and topics co-occurring within a chunk window (see the query sketch after this list).

* Everything is stored in SQLite and manipulated using a set of CLI tools.
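
Roughly, the per-chunk indexing call looks something like this - a minimal sketch using the google-generativeai package, where the model name, prompt wording, and output format are illustrative rather than the exact production code:

    import google.generativeai as genai

    genai.configure(api_key="YOUR_KEY")
    # "Flash Lite" tier; the exact model name here is a stand-in.
    model = genai.GenerativeModel("gemini-2.0-flash-lite")

    def index_chunk(chunk_text):
        """Ask the model for a short list of topics covered by one chunk."""
        prompt = ("List the 3-6 topics this passage discusses, "
                  "one per line, as short noun phrases:\n\n" + chunk_text)
        response = model.generate_content(prompt)
        return [t.strip() for t in response.text.splitlines() if t.strip()]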
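
The tree-building step is, in spirit, Leiden community detection applied recursively. A minimal sketch with the leidenalg library, assuming an igraph.Graph whose vertices carry a "name" attribute (the topic) and whose edges carry a co-occurrence "weight"; the stopping criteria are illustrative:

    import leidenalg as la

    def recursive_leiden(graph, min_size=8, depth=0, max_depth=4):
        """Partition a topic graph, then recurse into each community
        until clusters are small enough to hand to an LLM for labelling."""
        if graph.vcount() <= min_size or depth >= max_depth:
            return {"topics": graph.vs["name"]}    # leaf: label these topics
        partition = la.find_partition(
            graph, la.ModularityVertexPartition, weights="weight")
        if len(partition) <= 1:                    # no further structure found
            return {"topics": graph.vs["name"]}
        return {"children": [recursive_leiden(sub, min_size, depth + 1, max_depth)
                             for sub in partition.subgraphs()]}

Each leaf then gets its LLM-generated label, which is where the theme names in the tree come from.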

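And the co-occurrence browse mode boils down to a windowed self-join in SQLite. A hedged sketch, assuming a simplified schema where chunk_topics(chunk_id, topic_id) links chunks to topics and chunks(id, book_id, position) orders chunks within each book:

    import sqlite3

    def cooccurring_topics(db_path, topic_id, window=3, limit=20):
        """Rank topics appearing within `window` chunks of a given topic."""
        query = """
            SELECT ct2.topic_id, COUNT(*) AS n
            FROM chunk_topics ct1
            JOIN chunks c1 ON c1.id = ct1.chunk_id
            JOIN chunks c2 ON c2.book_id = c1.book_id
                          AND ABS(c2.position - c1.position) <= ?
            JOIN chunk_topics ct2 ON ct2.chunk_id = c2.id
            WHERE ct1.topic_id = ? AND ct2.topic_id != ?
            GROUP BY ct2.topic_id
            ORDER BY n DESC
            LIMIT ?
        """
        with sqlite3.connect(db_path) as conn:
            return conn.execute(query, (window, topic_id, topic_id, limit)).fetchall()
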
I wrote more about the process here: https://pieterma.es/syntopic-reading-claude/

I’m curious whether this way of reading resonates with anyone else - LLM-mediated or not.

469 points | 140 comments
Aurornis 1 day ago|
It’s interesting how many of the descriptions have a distinct LLM-style voice. Even if you hadn’t posted how it was generated, I would have immediately recognized many of the motifs and patterns as LLM writing style.

The visual style of linking phrases from one section to the next looks neat, but the connections don’t all seem correct. There’s a link from “fictions” to “internal motives” near the top of the first trail, and several other links aren’t obviously right either.

pmaze 1 day ago||
The names & descriptions definitely have that distinct LLM flavour to them, regardless of which model I used. I decided to keep them, but as short as possible. In general, I find the recombination of human-written text to be the main interest.

There are two stages to the linking: first juxtaposing the excerpts, then finding and linking key phrases within them. I find the excerpts themselves often have interesting connections between them, but the key phrases can be a bit out there. The "fictions" to "internal motives" one does gel for me, given the theme of deceiving ourselves about our own motivations.

reedf1 1 day ago||
Well, even the post itself reads to me as AI-generated.
itsangaris 1 day ago||
Surprised that "Seeing Like a State" didn't get included in the "legibility tax" category.
JimmyJamesJames 1 day ago||
I like this initial step and its findings.

#1: Would a larger dataset increase the depth and breadth of insight? (If so, go to #2.)

#2: With the initial top 100, are there key ‘super node’ books that stand out as ones to read due to the breadth they offer? Would a larger dataset identify further ‘super node’ books?

podgorniy 10 hours ago||
Cool stuff. Thanks for conceptualizing, implementing and sharing
amelius 1 day ago||
Makes me wonder: how well could an LLM-based solution score on the Netflix Prize?

https://en.wikipedia.org/wiki/Netflix_Prize

(Are people still trying to improve upon the original winning solution?)

froil 11 hours ago||
Do you have details of the tech stack? Really loved it.
simonw 11 hours ago|
There's a useful write-up of that here: https://pieterma.es/syntopic-reading-claude/#how-its-impleme...
sciences44 1 day ago||
Love the originality here - makes you curious to explore more.

Solid technical execution too. Well done!

dev_l1x_be 1 day ago||
Claude Code is good at arranging random things into categories; with code, configuration, and documentation files it rarely goes down random rabbit holes or hallucinates categories for me.
adsharma 23 hours ago||
This is GraphRAG using SQLite.

Wouldn't it be good if recursive Leiden and Cypher were built into an embedded DB?

That's what I'm looking into with mcp-server-ladybug.

threecheese 22 hours ago|
Where did you come across Leiden partitioning? I’m facing a similar use case and wonder what you’re reading.
fittingopposite 10 hours ago|
Pretty new graph clustering algorithm (published in 2019). The original publication is actually fairly readable: https://www.nature.com/articles/s41598-019-41695-z