Posted by akshaysg 3 days ago
What Haystack does:
-- Builds a clear narrative. Changes in Haystack aren’t just arranged as unordered diffs. Instead, they unfold in a logical order, each paired with an explanation in plain, precise language
-- Focuses attention where it counts. Routine plumbing and refactors are put into skimmable sections so you can spend your time on design and correctness
-- Provides full cross-file context. Every new or changed function/variable is traced across the codebase, showing how it’s used beyond the immediate diff
Here’s a quick demo: https://youtu.be/w5Lq5wBUS-I
If you’d like to give it a spin, head over to haystackeditor.com/review! We set up some demo PRs that you should be able to understand and review even if you’ve never seen the repos before!
We used to work at big companies, where reviewing non-trivial pull requests felt like reading a book with its pages out of order. We would jump and scroll between files, trying to piece together the author’s intent before we could even start reviewing. And, as authors, we would spend time restructuring our own commits just to make them readable. AI has made this even trickier. Today it’s not uncommon for a pull request to contain code the author doesn’t fully understand themselves!
So, we built Haystack to help reviewers spend less time untangling code and more time giving meaningful feedback. We would love to hear about whether it gets the job done for you!
How we got here:
Haystack began as (yet another) VS Code fork where we experimented with visualizing code changes on a canvas. At first, it was a neat way to show how pieces of code worked together. But customers started laying out their entire codebase just to make sense of it. That’s when we realized the deeper problem: understanding a codebase is hard, and engineers need better ways to quickly understand unfamiliar code.
As we kept building, another insight emerged: with AI woven into workflows, engineers don’t always need to master every corner of a codebase to ship features. But in code review, deep and continuous context still matters, especially to separate what’s important to review from plumbing and follow-on changes.
So we pivoted. We took what we’d learned and worked closely with engineers to refine the idea. We started with simple code analysis (using language servers, tree-sitter, etc.) to show how changes relate. Then we added AI to explain and organize those changes and to trace how data moves through a pull request. Finally, we fused the two by empowering AI agents to use static analyses. Step by step, that became the Haystack we’re showing today.
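As a rough illustration of the kind of cross-reference analysis described above — a toy sketch using Python's built-in `ast` module rather than the language servers and tree-sitter tooling Haystack actually uses, with made-up file contents — here is how a function defined in a changed file can be traced to its call sites elsewhere:

```python
import ast

# Two toy "files": one changed in the PR, one elsewhere in the codebase.
changed_file = """
def compute_total(items):
    return sum(i.price for i in items)
"""
other_file = """
def checkout(cart):
    total = compute_total(cart.items)
    return total
"""

# 1. Collect the functions defined in the changed file.
defined = {
    node.name
    for node in ast.walk(ast.parse(changed_file))
    if isinstance(node, ast.FunctionDef)
}

# 2. Find call sites of those functions in the rest of the codebase.
calls = [
    node.func.id
    for node in ast.walk(ast.parse(other_file))
    if isinstance(node, ast.Call)
    and isinstance(node.func, ast.Name)
    and node.func.id in defined
]

print(defined)  # {'compute_total'}
print(calls)    # ['compute_total']
```

A real implementation has to handle methods, imports, aliasing, and multiple languages, which is where language servers earn their keep; the sketch only shows the shape of the idea.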
We’d love to hear your thoughts, feedback, or suggestions!
I like AI on the producing side. Not so much on the consuming side.
I don't want you to send me an AI-generated summary of anything, but if I initiated it looking for answers, then it's much more helpful.
- I'm reviewing the last meeting of a regular meeting cadence to see what we need to discuss.
- I put it in a lookup (vector store, whatever) so I can do things like "what was the thing customer xyz said they needed to integrate against".
Those are pretty useful. But I don't usually read the whole meeting notes.
I think this is probably more broadly true too. AI can generate far more text than we can process, and text treatises on what an AI was prompted to say are pretty useless. But text generated not to be presented to the user, but as a cold store of information paired with good retrieval, can be pretty useful.
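The "cold store plus retrieval" pattern described above can be sketched with a toy bag-of-words index over hypothetical meeting notes (a real setup would use embeddings and a vector database, not whitespace tokens and cosine over word counts):

```python
import math
from collections import Counter

# Hypothetical meeting notes stored for lookup, not for reading end-to-end.
notes = [
    "Customer XYZ said they need to integrate against our billing API.",
    "Team agreed to move the release date to next quarter.",
    "Action item: investigate the flaky deploy pipeline.",
]

def vectorize(text):
    """Bag-of-words term counts (stand-in for a real embedding)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query, docs):
    """Return the stored note most similar to the query."""
    q = vectorize(query)
    return max(docs, key=lambda d: cosine(q, vectorize(d)))

print(search("what did customer xyz need to integrate against", notes))
```

The point survives the toy scale: nobody reads the notes front to back, but a targeted query pulls out exactly the line that matters.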
I think this is about when the app is broken and people are keeping a meeting app open to communicate with each other as they scramble to fix things.
So the limitation here is more about problems not being solved yet rather than how a 'meeting' is organized.
But to be blunt / irreverent, it's the same with Git commit messages or technical documentation; nobody reads them unless they need them, and only the bits that are important to them at that point in time.
You know what really, really, helps while doing code review? Good commit messages, and more generally, good commit practices so that each commit is describing a set of changes which make sense together. If you have that then code review becomes much easier, you just step through each commit in turn and you can see how the code got to be where it is now, rather than Github's default "here's everything, good luck" view.
The other thing that helps? Technical documentation that describes why things are as they are, and what we're trying to achieve with a piece of work.
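The commit-by-commit review flow described above can be sketched from Python (assumes `git` is on PATH; the repo, files, and commit messages are all made up for illustration):

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run a git command in `cwd` and return its stdout."""
    return subprocess.run(["git", *args], cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

with tempfile.TemporaryDirectory() as repo:
    git("init", cwd=repo)
    git("config", "user.email", "dev@example.com", cwd=repo)
    git("config", "user.name", "Dev", cwd=repo)

    # Two commits, each a self-contained logical change.
    for i, msg in enumerate(["Extract pricing helper", "Use helper in checkout"]):
        with open(os.path.join(repo, f"file{i}.py"), "w") as f:
            f.write(f"# change {i}\n")
        git("add", ".", cwd=repo)
        git("commit", "-m", msg, cwd=repo)

    # Review oldest-first, one logical change at a time.
    log_lines = git("log", "--reverse", "--oneline", cwd=repo).splitlines()
    for line in log_lines:
        sha = line.split()[0]
        print(line)
        diff = git("show", "--stat", sha, cwd=repo)  # this commit's focused diff
```

This is exactly the "step through each commit in turn" experience: `git log --reverse` gives the story in reading order, and `git show` scopes each review step to one coherent change.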
Unrelated, but I don't know why I expected the website and editor theme to be hay-yellow and or hay-yellow and black instead of the classic purple on black :)
Yeah originally I thought of using yellow/brown or yellow/black but for some reason I didn't like the color. Plenty of time to go back though!
Could you expound on this? In my experience as a software engineer, a pull request could fall into one of two buckets (assuming it's not trivial):
1. The PR is not organized by the author so it's skimmed and not fully understood because it's so hard to follow along
2. The PR author puts a lot of time into organizing the pull request (crafting each commit, trying to build a narrative, etc.) and the review is thorough, but still not easy
I think organization helps the 1st case and obviates the need for the author to spend so much time crafting the PR in the 2nd case (and eliminates messy updates that need to be carefully slotted in).
Curious to hear how y'all handle pull requests!
I agree with this wholeheartedly if you are in a role that allows you to redefine what a PR is. In almost every organization that I've worked for, the PR is defined several levels above my pay grade and suggesting changes/updates/etc is usually seen as complaining.
> companies don't want to invest in slowing down, only going faster.
I do think this is the way things are going to go moving forward, for better or for worse!
As for other people's PRs? If they don't give a good summary, I ask them to write one.
Exactly; if people can't be bothered to describe (and justify) their work, or if they outsource it to AI that creates something overly wordy and possibly wrong, why should I be bothered to review it?
I think this is a valid part of the "crafting PR" skill that's underappreciated, and part of the goal of Haystack here is to make that part of PR craft effortless.
1. Allow me to step through the code execution paths that have been modified in the pull request, based on the tests that have been modified.
2. Allow me to see the data being handled in variables as I look through the code.
3. Allow me to see code coverage of each part of the code.
4. Show me the full file as I am navigating through the program execution so that I can feel the level of abstraction and notice nearby repetition or code that would benefit from being cleaned up.
Full article: https://dtrejo.com/code-reviews-sad
Not sure if I fully grasp this! We tried to kind of do this in previous iterations (show call graphs all at once) and it gets messy very fast. Could you elaborate on this point in particular?
Starting from the test, allow me to step through the program execution, just like a debugger, to observe variables, surrounding code, and the complete file.
If you read only the covered lines of code in a linear way, you'd miss the refactoring opportunities because you aren't looking at the rest of the file.
  Failed to load resource: net::ERR_BLOCKED_BY_CLIENT

^ I'm not exactly sure what this is about. I think it is https://static.cloudflareinsights.com/beacon.min.js/vcd15cbe... which I would imagine is probably not necessary.

  Uncaught TypeError: Cannot convert undefined or null to object
      at Object.keys (<anonymous>)
      at review/?pr_identifier=xxx/xxx/1974:43:12
These urls seem to be kind of revealing.
In terms of auth: you should get an "unauthenticated" if you're looking at a repo without authentication (or a non-existent repo).
Feedback: Try speeding up your demo animations and resize the mouse cursor to its regular size. My estimate is that if the marketing copy explains what a thing is, what it does, and why it’s useful, then all a visitor wants to see in an image is things going pop, boom, and whoosh.
Code reviews have always been primarily about reviewing other people's code.
And knowing your own code better than other people's code is a real thing?
Not sure if I fully grasp what you mean by dog whistling, but at the end of the day, like another commenter said, Haystack is also pretty helpful for when you're done experimenting with a piece of work and need to see what an AI has generated.
If you install and subscribe to the product, we create a link for you every time you make a pull request. We're working (literally right now!) on making it create a link every time you're assigned a review as well.
We'll also speed up the time in the future (it's pretty slow)!
There's just so much contextual data outside of the code itself that you miss out on. This looks like an improvement over GitHub Copilot-generated summaries, but that's not hard.