Show HN: HomeBrew HN – Generate personal context for content ranking

Posted by azath92 2 days ago

Show HN: HomeBrew HN – Generate personal context for content ranking(www.hackernews.coffee)

TLDR: Build a quick HN profile to see how little context LLMs need to personalise your feed. Rate 30 posts once, get a permanent ranked homepage you can return to.

Our goal was to build a tool that allowed us to test a range of "personal contexts" on a very focused everyday use case for us, reading HN!

We are exploring use of personal context with LLMs, specifically what kind of data, how much, and with how much additional effort on the user’s part was needed to get decent results. The test tool was a bit of fun on its own so we re-skinned it and decided to post it here.

First time posting anything on HN but folks at work encouraged me to drop a link. Keen on feedback or other interesting projects thinking about bootstrapping personal context for LLM workflows!

122 points | 48 comments

gwintrob 2 days ago|

This is a really cool idea. I love that you expose the personal profile as markdown. Reminds me of this article and exposing the system prompt: https://koomen.dev/essays/horseless-carriages/. Well done!

akkartik 2 days ago||

I wish I could paste my profile into a new device without needing to sit through the quiz again.

azath92 2 days ago||

Thats a great point, and a good articl. I think the example they bring up as "good" and what we are leaning into here, is the idea of transparency and agency of being able to see and modify your profile.

The tension we have been finding is that we dont want to require people to "know how to prompt" to get value out of having a profile, hence our ongoing thinking around how to bootstrap good personal profiles from various data sources.

As Koomen notes, a good profile feels like it could be the best weapon against "AI slop" in a case I want something sharp and specific. But getting to that requires knowing how to prompt most of the time.

gwintrob 2 days ago||

Makes sense. Bootstrapping the prompt based on some sample articles is smart.

NitpickLawyer 2 days ago||

Funny, I did the swipe thing and then the first result was this post with a [dive] tag. No idea if cheeky or if it actually got that from my choices, but I had a laugh anyway. Neat PoC!

edit: ooh, I see what the swiping did:

## Analysis of user's tech interest: The user demonstrates a strong interest in advanced technical topics, particularly in the realm of artificial intelligence, machine learning, and low-level systems programming/security (e.g., kernel exploitation). They are drawn to articles that involve practical application, model creation, and deep dives into complex technical architectures. Their interest in "Show HN" articles suggests an appreciation for new, innovative projects, especially those with a technical or AI focus. They show less interest in general hardware announcements (like new microcontrollers), historical tech accounts, or very niche, non-AI/ML/security-related programming topics.

Yeah, that's pretty much spot on. Wonder if there's a way to match that against the topics I actually commented on, but at a glance it's pretty cool!

azath92 2 days ago|

Nice, great to see where it nails it, and has the right feel! Using your own comments directly, or in the first instance allowing users to paste in sections of their commented on or upvoted history is a great next step. Not sure yet whether the flow should be in place of or in addition to doing the survey.

huem0n 2 days ago||

I've been wanting this for a while! As an alpha prototype its great. Skip/Skim/Deep dive feels like the right breakdown to me. Having a different color but same UI feels right too.

Other than quality of life stuff (multiple pages for example), I'd like to see it continually learn.

A few things got miscategorized and I'd love for it to naturally correct that with additional input from me.

azath92 2 days ago|

Hey so cool to hear! the skim/skip/dive mapped to how we use things, but thats sample size of 3 haha.

The idea of having some kind of thumbs up/down on what you see after getting recs, that gets added to your preferences, or being able to do another round of preferences (rather than just re-doing them like we have now) is for sure on our next steps if we continue with. Were not quite sure what the feedback loops will be yet (we did look at adding whole webhistory for example but that felt like a bit much and pretty invasive).

For the miscats, on a meta level what we are generally interested in is whether they come from compression of the preferences into your user profile (essentially if more or better data is the path to better context for such a specific usecase, or whether there is more bang for buck optimizing the various prompts. Keen to hear if its obvious from looking at your profile what was the case.

If we get serious with this evals are a must next step. We are only 2 days in at the moment :)

pxc 2 days ago||

As it is currently written, this is less useful the more niche your interests are. I think for such users, looking at their comment history or upvoted history might be useful instead of or in addition to just sampling recent, popular articles.

In my case, none of the topics I most like to read about and discuss on HN (package management, software freedom, next-gen CLI tools, next-gen shells, philosophy, desktop Linux, functional programming, hacker history, literate programming, Emacs, bitching about common development practices, programming language design, configuration languages) managed to appear in the 30-post sample I used. The profile it wrote for me was pretty good considering that, but definitely not great.

The assessment was also mistaken about my degree of interest in "low level" technical details like binary file formats (in fact it's rather low, although it has gradually increased over time), and my degree of interest in theoretical computer science issues (in fact it's high, but all of the theoretical papers in the sample were about machine learning, which was not an area of academic focus for me).

I do really like the simplicity and customizability of this (exposing the profile as Markdown and making it editable is awesome), and the quality of the results is very good given the tiny input size. But if your primary interests are not super aligned with the mainstream on HN, you won't get a chance to demonstrate that you like them. If users could type a few terms to say what their biggest interests are before running through the samples, this could work even better for people like me.

It would also be interesting if this could work based on article contents and not just headlines. Sometimes I open something and close it immediately, or I open it undecided as to whether I will skim or read closely.

azath92 2 days ago|

This is great to hear in such detial. One of the first cabs of the rank to improve this would be greater user control over what preferences to include, and/or smarter selection of the pool to select. This kind of focused preference is super important, especially as i use hackernews, and clearly for you and i suspect others as well.

In fact I would pose that I have a couple of disparate interests or "profiles" that i would like to have greater control over/support in generating, that are non overlapping sets of topics and types of content. The ability to have greater agency in creating them and managing them is something we are keen to explore.

The article comments one is a toughie, as LLM use skyrockets when you scrape and consume content from the links. It would be awesome to include it, but would likely need to be paid, just from a cost perspective.

Really appreciate the detail here, this makes it easier to turn your examples into a test/eval/feature case.

pxc 2 days ago||

>I have a couple of disparate interests or "profiles" that i would like to have greater control over/support in generating, that are non overlapping sets of topics and types of content

This sounds like a great feature! My appetites for different clusters content certainly vary according to my mood! Perhaps "mood" would actually be a cute-but-clear name for such distinct/multiple profiles. :)

> The article comments one is a toughie, as LLM use skyrockets when you scrape and consume content from the links. It would be awesome to include it, but would likely need to be paid, just from a cost perspective.

Hm. That is a good (and in retrospect, obvious) point. If it makes the feed a lot better, I think it could certainly be worth it for some users. If it only makes a small difference, maybe not. It might be interesting for you to experiment and write about, since what kind of difference it will make isn't obvious (at least to me) up front.

azath92 1 day ago||

yeah we are thinking a lot right now about good language for how to separate the notion of a "profile" as representative of who i am, from the concept of a profile that represents what mode im currently operating in right now. In the context of browsing, mood actually captures it quite well i agree. As we think wider to the different ways we might want a profile to help guide what content we receive or results we get when using other tools, that language isnt quite so clear yet.

We will have to do some combo of much more internal testing, construct evals, or just capture more info about peoples usage coupled with an ability to provide feedback in order to even get a handle of such a nuanced thing as "good" with a tool like this. Likely info capture and user feedback would be a first port of call for a substantive change, internal testing is always ongoing, but such a low sample size.

oulipo 2 days ago||

I like the idea, but for me the displayed rankings were not particularly good, perhaps it needs a bit more data

Also I know that depending on the days / weeks / mood I will want to read different content from HN, so I guess there should still be like 30% of "random articles" in each category just to create some noise

derbOac 1 day ago||

I felt similarly.

The generated page was really off for me — I had read most of the posts it ranked, at least a little, and most recommended as skips were some of my favorite recent submissions, and vice versa with dives.

On the other hand I'm not sure I'd want to use something like this much as something I like about HN are the pleasant surprises. Maybe as a side page or something if I were really in a rush?

azath92 2 days ago||

yeah seeing so many people using this its clear we should add some way for people to indicate when things felt off vs good, so that we can start tweaking the system, maybe with some evals.

We played around with the idea of a "fun" or "random" category, but ultimately didn't include it in this little first demo, as we found it super hard to have it not be just literally random (although that might not be a bad thing as you say)

On the topic of different moods and headspaces, thats one of the things more broadly we are really thinking about outside of this demo, and hadn't really considered for here but should. What different data can we use (in this case maybe just a different survey for different "profiles"), and how can a user manage those different profiles and front pages will be questions to answer.

Id be really interested to know if anyone has done topic grouped or themed frontpages for hackernews, as this would map well to that concept. ill have a look.

password4321 2 days ago||

As far as rating posts: user favorites are public, and you could ask for a copy+paste of a few pages of upvoted stories if someone is not using the favorites feature. The stories that have been commented on are also a pretty strong public signal.

azath92 2 days ago|

this is an angle we honestly didn't think about (we are pretty much long time lurkers) but accessing existing HN content is a great idea! I didn't even know there was a page of upvoted submissions :) It doesn't look like thats available via the API, but a copy paste of some text should work just aswel, all we pass through is titles and urls to the LLM anyway to generate the profile so its much the same.

More generally a next feature we want for ourselves is a way to add just some generic text and "update" the profile with that, rather than generate it fresh exclusively off of the 30 examples. This circles back to us using this as a focus point to think about what data is enough to generate a good user profile, and what good is.

joseda-hg 2 days ago||

Given the nature of the small pool(And the way they naturally exclude / includes topics), I'd strongly prefer if it had some way of adding more than 30 samples, maybe keep track of each set calibration taken and compare?

incomingpain 2 days ago||

Checked it out and it's offering 2 skim, and the rest are skip?

I had an expectation that it'd go through posts and give me stuff i'd be interested in. Like here's 25 posts that would be interesting?

Only frontpage? no second page? No sort by new, which is my preferred.

azath92 2 days ago||

thanks for checking it out and taking the time to give some feedback!

When weve been testing things, we often find that if there wasnt a great match between the options when picking preferences, and whats currently on the front page, that the context it generates will result in a lot of skips (understandably, but not great UX). Right now can try regenerating your context (and going through the process again), or manually editing it to get to different results.

Theres also some work for us to better select the options when picking preferences, or ensuring we always surface some deep dives.

Applying the same process to more pages, or bubling up content from multiple pages, or new is a great idea. cool to hear thats where you would look.

fossa1 2 days ago||

[dead]

wickedsight 2 days ago||

It put this post at the top of my feed, which is cool, because it's incredibly relevant to my interests. I used to work on something similar, but way before LLMs were a thing.

Would you be willing to share some more of the architecture/tech stack?

azath92 2 days ago|

yeah we see that a bit as well, i promise its not hardcoded in haha.

On the LLM side of things we are using Gemini 2.5 flash, mostly for speed, and found it to be reasonably good quality at a vibe level compared to something heavier like claude 4, probably because we've worked hard to keep the task very simple and explicit. But in saying that there are a bunch of comments on quality that really highlight that if we want to get serious about that we should put in some user feedback loops and evals.

Its all in JS/TS, using vercel ai for the LLM calls, storage is local, but in order to really dig into quality we might start saving things, but to do that well we'd have to add auth/users etc. and we wanted to keep it light for a demo. We have been recently exploring langfuse for tracing, and are really liking that, and will probably look at using them for first pass evals when we get to it for this project.

We also talked quite a bit about non-LLM recsys and aside from time to set up and do well, something I really like is the sense of transparency and agency. you can see your profile, and edit it if you like to see the change in your results. I almost think wed lean further into that rather than folding in some trad DS or recsys stuff even if that might make the results better. Just musings at this point though.

wickedsight 2 days ago||

Thanks for the elaborate response! I recently looked at doing something similar and ended up with the option of using vector embeddings. Is that something you've considered?

azath92 1 day ago||

For a first pass having some emb search would require we store and process a lot, which we dont do for this light weight demo. all the posts and recommendations are done on the fly, only looking at the current top page and re-ranking. as some others suggest, if we wanted to expand the recs more broadly than the front page then it might be a great way to run a first cut before running our rec llm on it.

For richer data to build a profile its something we look at a bunch for other projects, which could get folded in here if we decide to make it more persistent.

wasabi991011 2 days ago||

During the rating part, am I supposed to be clicking on the link to decide how to rate it? Or should I be basing myself only on the title?

azath92 2 days ago|

how ever you like! i tend to do it off the title, but you could click into it if you like.

drakonka 2 days ago|

This looks great! At first glance the dive/skim/skip suggestions it offered for me are well judged (I'm now actually diving into the dive ones).

More comments...