Git commands I run before reading any code

Posted by grepsedawk 14 hours ago

Git commands I run before reading any code (piechowski.io)

1609 points | 348 comments | page 3

seba_dos1 12 hours ago|

> If the team squashes every PR into a single commit, this output reflects who merged, not who wrote.

Squash-merge workflows are stupid (you lose information without gaining anything in return as it was easily filterable at retrieval anyway) and only useful as a workaround for people not knowing how to use git, but git stores the author and committer names separately, so it doesn't matter who merged, but rather whether the squashed patchset consisted of commits with multiple authors (and even then you could store it with Co-authored-by trailers, but that's harder to use in such oneliners).
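
The author/committer split is easy to check in a scratch repo. This is only a sketch: it assumes git 2.29 or newer (for `shortlog --group`), and Alice, Bob, and Carol are made-up identities, not anything from the article.

```shell
set -eu
# Scratch repo; Alice, Bob, and Carol are hypothetical identities.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

# Simulate a squash merge: Alice wrote the change, Bob pressed the merge button,
# and Carol is credited through a Co-authored-by trailer.
echo "hello" > greeting.txt
git add greeting.txt
GIT_AUTHOR_NAME=Alice GIT_AUTHOR_EMAIL=alice@example.com \
GIT_COMMITTER_NAME=Bob GIT_COMMITTER_EMAIL=bob@example.com \
  git commit -q -m "Add greeting" -m "Co-authored-by: Carol <carol@example.com>"

# %an is the author name, %cn is the committer name.
names=$(git log -1 --format='%an|%cn')
echo "$names"

# Count commits per author and per Co-authored-by trailer.
credits=$(git shortlog -ns --group=author --group=trailer:co-authored-by HEAD)
echo "$credits"
```

The committer (whoever merged) never shows up in the second listing, which is the point being made: per-author one-liners keep working as long as they group by author rather than committer.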

theshrike79 12 hours ago||

Can you explain to me (an avid squash-merger) what extra information do you gain by having commits that say "argh, let's see if this works", "crap, the CI is failing again, small fix to see if it works", "pushing before leaving for vacation" in the main git history?

With a squash merge one PR is one commit, simple, clean and easy to roll back or cherry-pick to another branch.

seba_dos1 12 hours ago|||

These commits reaching the reviewer are a sign of either not knowing how to use git or not respecting their time. You clean things up and split into logical chunks when you get ready to push into a shared place.

theshrike79 11 hours ago|||

Why would the reviewer look at the commit messages instead of the code?

1. Open PR page in whatever tool you're using

2. Read title + description to see what's up

3. Swap to diff and start reading through the changes

4. Comment and/or approve

I've never heard anyone bothering to read the previous commit messages for a second, why would they care?

jfultz 8 hours ago|||

In some cases, reviewing PR diffs commit-by-commit (and with the logs as the narration of the diff-by-diff story) is a substantial improvement over reviewing the entire PR diff. Concrete examples...

* A method or function that has code you realize needs to be shared...the code may need to be moved and also modified to accommodate its shared purpose. Separating the migration from any substantive modifications allows you to review the migration commit with the assistance of git's diff.colorMoved feature. It becomes easier to understand what changes are due to the migration, and what changes were added for more effective sharing.

* PRs sometimes contain mechanical work that is easy to review in isolation. Added or removed arguments, function renames, etc. No big deal if it's two or three instances, but if it's dozens or hundreds of instances, it's easier for the humans to review all of those consistent changes together, rather than having them mixed in with other things one has to reason about.

* Sometimes a flow of commits can help follow a difficult chain of reasoning. PR developer claims that condition X can never occur, but the code is complex enough that it's difficult to verify. However, by transforming the code in targeted ways that are possible to reason about, the complexity might be reducible to the point where the claim becomes obvious. One frequent example I see of this is of function/method arguments that are actually unnecessary, but it wasn't obvious until after some code transformations.
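
The first bullet (move first, modify second) could look roughly like this in a scratch repo. The file names and the `parse_config_file` function are invented for illustration; only the `--color-moved` option and the `diff.colorMoved` setting come from git itself.

```shell
set -eu
# Scratch repo; app.py, shared.py, and parse_config_file are hypothetical.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

cat > app.py <<'EOF'
def parse_config_file(path):
    return open(path).read().splitlines()

print(parse_config_file("app.cfg"))
EOF
git add app.py
git commit -q -m "Initial app"

# Commit 1: move the function verbatim, change nothing else about it.
cat > shared.py <<'EOF'
def parse_config_file(path):
    return open(path).read().splitlines()
EOF
cat > app.py <<'EOF'
from shared import parse_config_file

print(parse_config_file("app.cfg"))
EOF
git add app.py shared.py
git commit -q -m "Move parse_config_file to shared.py"

# Commit 2: the substantive change, now a small diff in a single file.
cat > shared.py <<'EOF'
def parse_config_file(path, encoding="utf-8"):
    return open(path, encoding=encoding).read().splitlines()
EOF
git commit -q -am "Let callers choose the encoding"

# Review the move on its own; moved lines are colored differently from
# genuinely added or removed ones.
move_diff=$(git show --color=always --color-moved=zebra HEAD~1)
echo "$move_diff"
# To make this the default: git config diff.colorMoved zebra
```

With the move isolated, the second commit touches only `shared.py`, so the reviewer can see exactly what was changed for sharing.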

seba_dos1 10 hours ago||||

Because it's a useful abstraction. If you only look at PRs and don't ever care about commits, why are they even being sent to reviewer in the first place? Just send a diff file.

Having atomic commits lets you actually benefit from having them. Suddenly you don't have to perform weird dances with interconnected PRs with dependencies as "PR too big" is not such a problem anymore as long as commits are digestible; you can have things properly bisectable; you can preserve shared authorship; you can range-diff and have a better view on what and how changed between review passes, and so on...

The unit of change is commit, and PRs group commits you want someone to pull. If you don't want or need any of that, you're just sending a patch file in a needlessly elaborate way.

Anon1096 10 hours ago||

> If you only look at PRs and don't ever care about commits, why are they even being sent to reviewer in the first place? Just send a diff file.

This is in fact what hg does with amending changesets and yes it works far better. Keep PRs small and atomic and you never need to worry about what happens intra-pr. If you need bigger units of work that's what stacking is for.

seba_dos1 9 hours ago||

Stacking is good for expressing dependencies, but isn't helpful when you need to make several distinct changes that aren't necessarily needed unless you take them all in. What's the value in having a separate PR that introduces a framework that you later use in another PR when you may not actually want to merge it if the latter one doesn't end up being merged as well?

A PR is a group of commits, just utilize that when you need it.

ipsento606 10 hours ago|||

>Swap to diff and start reading through the changes

this forces the reviewer to view the entire diff at once, which can greatly increase the cognitive load vs. being able to view diffs of logical units of work

for tiny PRs it may not matter, but for substantial PRs it can matter a lot

croemer 12 hours ago||||

What if the shared place is the place where you run a bunch of CI? Then you push your work early to a branch to see the results, fix them etc.

seba_dos1 12 hours ago|||

You can do whatever you want with stuff nobody else looks at. I do too.

I meant "shared place" as an open review request or a shared branch rather than shared underlying infrastructure. Shared by people's minds.

mr_mitm 12 hours ago|||

You can always force-push a cleaned up version of your branch when you are ready for review, or start a new one and delete the WIP one.

croemer 10 hours ago|||

You can, but you can also just squash-merge in one click. That also keeps people from merging their dozens of fixes, which is what happens if you allow anything but squash merge.

theshrike79 11 hours ago|||

I hate (and fear) force-pushing and "cleaning up" git history as much as other people dislike squash-merging =)

It just feels wrong to force push, destroying stuff that used to be there.

And I don't have the time or energy to bisect through my shitty PR commits and combine them into something clean looking - I can just squash instead.

seba_dos1 10 hours ago||

Nothing is destroyed by a force push. It just overwrites a single pointer, and even keeps its old value in reflog.

Things that aren't referenced by anything anymore will eventually get garbage collected and actually destroyed, but you can just keep a reference somewhere to prevent that from happening if you need. Or even disable garbage collection completely.

Looks like people's fears about git come just from not knowing what it does.
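
A small sketch of that claim, using a local bare repository as a stand-in for the remote. The branch name `rescue` and the file contents are arbitrary; `--force-with-lease` is used instead of a bare `--force` as a safer habit, not because the comment above mentions it.

```shell
set -eu
# A bare repo plays the role of the shared remote; both directories are scratch.
work=$(mktemp -d)
remote=$(mktemp -d)
git init -q --bare "$remote"
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
git remote add origin "$remote"

echo one > notes.txt
git add notes.txt
git commit -q -m "First draft"
echo two >> notes.txt
git commit -q -am "argh, let's see if this works"
git push -q origin main
old=$(git rev-parse HEAD)

# Rewrite the tip commit and force-push the cleaned-up version.
git commit -q --amend -m "Extend notes"
new=$(git rev-parse HEAD)
git push -q --force-with-lease origin main

# The local reflog still remembers where the branch pointed before the rewrite.
previous=$(git rev-parse 'main@{1}')

# Any reference keeps the old commit safe from garbage collection.
git branch rescue "$previous"

remote_tip=$(git --git-dir="$remote" rev-parse refs/heads/main)
echo "old=$old new=$new previous=$previous remote=$remote_tip"
```

Note that the recovery here relies on the local reflog of the person who pushed; the sketch does not assume the remote keeps one.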

Noumenon72 8 hours ago||

You can't use the remote reflog to revert what you force pushed, can you? But I agree that having your local reflog means you're never totally lost. I still just make a branch before major edits so I can go back.

zaphirplane 12 hours ago||||

What are examples of better ones? I don’t get the "let me show the world my work" attitude, and I’m not a fan of large PRs.

duskdozer 11 hours ago||

if you mean better messages, it's not really that. those junk messages should be rewritten and if the commits don't stand alone, merged together with rebase. it's the "logical chunks" the parent mentioned.

it's hard to say fully, but unless a changeset is quite small or otherwise is basically 0% or 100%, there are usually smaller steps.

like kind of contrived but say you have one function that uses a helper. if there's a bug in the function, and it turns out to fix that it makes a lot more sense to change the return type of the helper, you would make commit 1 to change the return type, then commit 2 fix the bug. would these be separate PRs? probably not to me but I guess it depends on your project workflow. keeping them in separate commits even if they're small lets you bisect more easily later on in case there was some unforeseen or untested problem that was introduced, leading you to smaller chunks of code to check for the cause.
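
The bisect payoff can be sketched like this. The six commits and the literal `BUG` marker are stand-ins: in a real project the check passed to `git bisect run` would be a test command, and the commits would be the small logical steps described above.

```shell
set -eu
# Scratch repo with six small commits; the fourth one introduces the regression.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

for i in 1 2 3 4 5 6; do
  if [ "$i" -eq 4 ]; then
    echo "BUG" >> code.txt        # hypothetical regression
  else
    echo "step $i" >> code.txt
  fi
  git add code.txt
  git commit -q -m "change $i"
done
good=$(git rev-list --max-parents=0 HEAD)   # the root commit is known to be fine

# Mark HEAD as bad and the root as good, then let git drive the search.
# The check exits 0 (good) when the marker is absent and 1 (bad) when it is present.
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c '! grep -q BUG code.txt' >/dev/null

# refs/bisect/bad now points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "$culprit"
```

The smaller each commit is, the less code there is left to read once bisect has named the culprit.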

orsorna 11 hours ago||

If the code base is idempotent, I don't think showing commit history is helpful. It also makes rebases more complex than needed down the line. Thus I'd rather squash on merge.

I've never considered how an engineer approaches a problem. As long as I can understand the fundamental change and it passes preflights/CI I don't care if it was scryed from a crystal ball.

This does mean the onus is on the engineer to explain their change in natural language. In their own words of course.

seba_dos1 10 hours ago|||

Commits don't show "how an engineer approaches a problem". Commits are the unit of change that are supposed to go into the final repository, purposefully prepared by the engineer and presented for review. The only thing you do by squashing on merge is to artificially limit the review unit to a single commit to optimize the workflow towards people who don't know how to use git. Personally I don't think it's a good thing to optimize for.

orsorna 7 hours ago||

Preserving commit history pre-merge only seems useful if I had to revert or rebase onto an interstitial commit. This is at odds with treating PRs as atomic changes to the code base.

I might have not stated my position correctly. When I mean "squash on merge", I mean the commit history is fully present in the PR for full scrutiny. Sometimes commits may introduce multiple changes and I can view commit ranges for each set of changes. But it takes the summation of the commits to illustrate the change the engineer is proposing. The summation is an atomic change, thus scrutinizing terms post-merge is meaningless. Squashing preserves the summation but gets rid of the terms.

Versioned releases on main are tagged by these summations, not their component parts.

OkayPhysicist 3 hours ago|||

If you don't care about how the problem was solved, why are you reviewing it at all?

yokoprime 12 hours ago|||

Haha, good luck working with a team with more than 2 people. A good reviewer looks at the end-state and does not care about individual commits. If I'm curious about a specific change I just look at the blame.

tasuki 11 hours ago|||

> A good reviewer looks at the end-state and does not care about individual commits.

Then I must be a bad reviewer. In a past job, I had a colleague who meticulously crafted his commits - his PRs were a joy to review because I could go commit by commit in logical chunks, rather than wading through a single 3k line diff. I tried to do the same for him and hope I succeeded.

theshrike79 11 hours ago|||

And then someone comments on a thing, they change it and force-push another "clean" history on top and all of your work is wasted because the PR is now completely different =)

mgfist 10 hours ago||||

Why are those not just separate PRs? Or if they really needed to be merged at once - they should still be separate PRs but on a feature branch

seba_dos1 10 hours ago||

Why have PRs - groups of commits to pull - then if all you need is a single patch file?

mgfist 9 hours ago|||

You can, but most of us work in Github and having a PR to dump commits to is very easy and convenient. Then, when you get some feedback on your PR, you can dump more commits and it's very easy for the reviewer to see what has changed since the last time they reviewed it.

I feel like what you're arguing is that you should clean up your commits before anyone else sees them. Fair. But you could also clean it up right before merging to main. It's not that different, except the latter is much less annoying, particularly when going back and forth with people.

I know this is a very github centric workflow, but that's where most engineers work now, and it's nice and easy. This wouldn't work for eg: contributing to linux, but that's not what most of us do.

awesome_dude 39 minutes ago|||

This is where the "Trunk based development" people live - I personally believe that commits should be atomic, because git bisect on smaller meaningful commits is a hang of a lot better than a monster 90 file change commit

KptMarchewa 11 hours ago|||

Split the PR rather than force me to wade through your commit history. Use graphite or something else that allows you to stack PRs.

jlokier 8 hours ago||

Why is it "wade through" if there are 10 clearly distinct but dependent commits, but comfortable if it's 10 stacked PRs instead? They are basically the same thing, presented ever so slightly differently.

I think in most teams I've worked with, the majority of developers (> 85%) barely understand what Git is doing or what things mean inside GitHub, have never seen commit history as a graph, have never run something like "git log --oneline --graph --decorate" or "--format", and have never heard of "git range-diff" which is very useful for following commit/PR/unit changes.

Personally I review using "git" itself, so I see the graph structure either way, and there's little difference between stacked PRs, commit chains in a single PR, or even feature branches, from that point of view. Even force-push branch updates aren't difficult to review, because of the reflog and "git range-diff". The differences are mainly in what kinds of behaviour the web-based tooling promotes in the rest of the team, which does matter, and depends on the team.

I agree with you if you're using Graphite instead of GitHub. Having a place to give feedback and/or approval on the individual "units" (commits in a PR, or PRs in a stack) is useful, grouping dependent but distinct changes is useful, and diff'd commit evolution within each unit PR in response to back-and-forth review feedback is useful in some collaborative settings. Though, if you know "git range-diff" and reflog, that shows diff'd commit evolution quite well.

In GitHub, people are confused by stacked PRs both conceptually and due to the GitHub UX around them. Most times when I've posted a stacked PR to a GitHub project, other people didn't realise it was stacked, and occasionally someone else has merged the tip of a stack made by me, and been surprised to see all the dependent PRs merged automatically as a side effect. Usually before they get to reviewing those other PRs :-)

People understand commit sequences in a PR, though I've rarely seen people treat the individual commits as units for review when using GitHub, unfortunately. In the Linux kernel world where Git was born, the PR flow is completely different from GitHub: Their system tends to result in feedback on individual commits. It also encourages better quality feedback, with less nitpicking, and better quality commits.
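
For readers who have not met "git range-diff", here is a rough sketch of comparing two review rounds of the same branch. The branch names `review-v1` and `review-v2` and the file contents are made up; in practice the old round would usually come from the reflog or a saved ref.

```shell
set -eu
# Scratch repo; review-v1 and review-v2 are hypothetical names for two review rounds.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

echo base > README
git add README
git commit -q -m "Initial commit"

# Round 1: two commits sent for review.
git checkout -q -b review-v1
echo "helper" > helper.txt
git add helper.txt
git commit -q -m "Add helper"
for i in 1 2 3 4 5 6 7 8 9 10 11 12; do echo "line $i"; done > app.txt
git add app.txt
git commit -q -m "Use helper"

# Round 2: the second commit is reworked after feedback, the first is untouched.
git checkout -q -b review-v2 review-v1
sed 's/^line 7$/line seven/' app.txt > app.tmp
mv app.tmp app.txt
git commit -q -a --amend -m "Use helper"

# Compare main..review-v1 against main..review-v2, commit by commit.
changes=$(git range-diff main review-v1 review-v2)
echo "$changes"
```

Each line of the summary pairs a commit from the old round with its counterpart in the new one, so a reviewer only needs to re-read the commits that actually changed.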

jfengel 10 hours ago||||

Sometimes I have to go back and fix a bug that appeared during another branch. Having the original commits helps me bisect it.

Not often, but given that it costs me nothing to have it all in my tree, I'd rather have it than not.

hhjinks 12 hours ago||||

You review code not to verify the actual output of the code, but the code itself. For bugs, for maintainability. Commit hygiene is part of that.

seba_dos1 12 hours ago|||

I have no troubles working on big FLOSS projects where reviews usually happen at the commit level :)

theshrike79 11 hours ago||

So if a PR consists of 20 commits, they review every single commit linearly without looking at the end result first?

seba_dos1 10 hours ago||

Yes, and in some projects 20 commits is not even a big PR, more like "regular sized". The LKML's first page is now full of PRs with around 20 commits, here's a random one as an example: https://lore.kernel.org/netdev/20260408121252.2249051-1-dhow...

And here's a slightly smaller one which isn't about "miscellaneous fixes": https://lore.kernel.org/netdev/20260408122027.80303-1-xuanzh...

Some of these commits even get reviewed by different maintainers before being merged, which is common when a patchset touches several subsystems at once.

Aachen 12 hours ago||||

If someone uses git commits like the save function of their editor and doesn't write messages intended for reading by anyone else, it makes sense to want to hide them

For other cases, you lose the information about why things are this way. It's too verbose to //comment on every line with how it came to be this way but on (non-rare in total, but rare per line) occasion it's useful to see what the change was that made the line be like this, or even just who to potentially ask for help (when >1 person worked on a feature branch, which I'd say is common)

seba_dos1 12 hours ago||

> If someone uses git commits like the save function of their editor

I use it like that too and yet the reviewers don't get to see these commits. Git has very powerful tools for manipulating the commit graph that many people just don't bother to learn. Imagine if I sent a patchset to the Linux Kernel Mailing List containing such "fix typo", "please work now", "wtf" patches - my shamelessness has its limits!

Aachen 11 hours ago||

Seems like a lot of extra effort (save, add, commit, come up with some message even if it's a prayer to work now) only to undo it again later and create a patch or alternate history out of the final version. Why bother with the intermediate commits if you're not planning for it to be part of the history?

bguebert 1 hour ago|||

Sometimes it's nice to have a history like that because then maybe you are thinking of trying the thing they tried that wouldn't work and it would save you some time trying it if you can tell from those commits that it didn't work.

seba_dos1 10 hours ago||||

Git is a version control system. It does not care about what it versions.

When I work on something, I commit often and use the commit graph as a undo tool on steroids. I can see what I tried, I can cherry-pick or revert stuff while experimenting, I can leave promising but unfinished stuff to look at later, or I can just commit to have a simple way to send stuff to CI, or a remote backup synced between machines.

Once I'm done working on something, it's time to take a step back, look at the big picture, see how many changes my experiments have actually yielded, separate them, describe and decide whether they go to review together or split in some way, as sometimes working on a single thing requires multiple distinct changes (one PR with multiple commits), but sometimes working in a single session yields fixes for multiple unrelated issues (several PRs). Only then it gets presented to the reviewer.

It just happens that I can do both these distinct jobs with a single tool.

thi2 10 hours ago||||

Because I might want to go back to this current messy state but I don't want to commit it like this (hardcoded test strings, debug logs, cut corners to see if something works, you name it).

I simply commit something like "WIP: testing xy" and if it's working and properly implemented I can squash/rebase/edit the commit message and force push it to my feature branch. Using a Git client like Gitkraken makes this incredibly easy, takes seconds.

This way I can leverage version control without committing bogus states to the final PR.
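
On the command line, one way to get the same effect is fixup commits plus autosquash. This is a sketch of that alternative, not what Gitkraken does internally; the commit subject and file are invented, and `GIT_SEQUENCE_EDITOR=true` simply accepts the generated rebase plan without opening an editor.

```shell
set -eu
# Scratch repo; the feature branch and its contents are hypothetical.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

echo base > README
git add README
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "v1" > feature.txt
git add feature.txt
git commit -q -m "Add user tagging"
target=$(git rev-parse HEAD)

# Work-in-progress snapshots, each marked as a fixup of the real commit.
echo "v2" > feature.txt
git commit -q -a --fixup "$target"
echo "v3" > feature.txt
git commit -q -a --fixup "$target"

# Fold the fixups into their target before anyone reviews the branch.
GIT_SEQUENCE_EDITOR=true GIT_EDITOR=true git rebase -q -i --autosquash main

git log --format=%s main..feature
```

After the rebase the branch holds a single commit with the original message and the final file contents, ready to be force-pushed to the feature branch.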

skydhash 10 hours ago|||

If the team is using a PR workflow, the PR is a working place to produce one single commit. The individual commits are just timestamped changes and comments. Think of it as the equivalent of annotated diff in mailing list conversation.

tasuki 11 hours ago||||

You gain the extra information by having reasonable commit messages rather than the ones you mentioned. To fix CI you force push.

Can you explain to me what an avid squash-merger puts into the commit message of the squashed commit composed of commits "argh, let's see if this works", "crap, the CI is failing again, small fix to see if it works", and "pushing before leaving for vacation" ?

theshrike79 11 hours ago||

The squashed commit from the PR -> main will have a clean title + description that says what was added.

Usually pretty close to what the PR title + description are actually, just without the videos and screenshots.

Example:

feat(ui): Add support for tagging users

* Users can be tagged via the user page

* User tags visible in search results (configurable)

etc..

I don't need to spend extra time cleaning up my git commits and force-pushing on the PR branch, losing context for code reviews etc. Nor does anyone have to see my shitty angry commits when I tried to figure out why Playwright tests ran on my machine and failed in the CI for 10 commits.

BeetleB 7 hours ago||||

Trivial and not too silly example:

Part of new feature you had working in an intermediate commit, but broke somewhere along the way and is not working in your last commit when you squashed.

If you catch it early enough, I suppose it's in your reflog, but otherwise you're screwed.

It sounds like a silly example, but I bet most developers have run into this at some point.

With mercurial/jujutsu, you get the best of both worlds: The "argh, let's see if this works" commits are what I call "microcommits", and the squashed versions are the real/public commits. With jujutsu, you get both. Your log shows only the "real" commits (equivalent of squashing all the commits between that and the prior "real" commit). But if you want to drill down into the microcommits, the information is always there.

Let's acknowledge the reality. Many people use git not just for version control, but for backup ("Let me commit this so I don't lose it"). Let's ensure the VC tool supports both and doesn't force you to pick one over the other.

joshstrange 9 hours ago||||

> "argh, let's see if this works", "crap, the CI is failing again, small fix to see if it works", "pushing before leaving for vacation"

These are all bad commits IMHO. Aside from the CI one, I understand that message. I have commits like that on personal projects but for professional projects I'd be frustrated if people were committing messages like that.

Personally I'm a "one commit" type of guy, I don't like committing things in a broken state even on a side branch unless I have to (to share the code or test a CI). Occasionally I will make multiple commits at the very end to make review easier, or once I have everything working but want to try something different. But I have a bunch of options for saving code that don't involve committing:

- Stash

- Shelve (IDEA)

- Backblaze

- Time Machine

- Local History (IDEA)

The idea of committing WIP before leaving for a vacation just feels so wrong to me.

I once worked for someone who wanted developers to commit code before the end of every day as a safety measure. His reasoning was in case the developer's computer died or similar. I found that silly at the time and still do now. That's what backups are for, I dislike when people use git as a backup like that in a professional setting.

thi2 10 hours ago||||

Why are those commits ending in the PR? Just unprofessional to work like that.

psalaun 7 hours ago|||

git bisect gets more useful because it will pin a smaller set of changes

mcpherrinm 11 hours ago|||

Squash merge is the only reasonable way to use GitHub:

If you update a PR with review feedback, you shouldn’t change existing commits because GitHub’s tools for showing you what has changed since your last review assume you are pushing new commits.

But then you don’t want those multiple commits addressing PR feedback to merge as they’re noise.

So sure, there are workflows with Git that don’t need squashing. But they’re incompatible with GitHub, which is at least where I keep my code today.

Is it perfect? No. But neither is git, and I live in the world I am given.

mgfist 10 hours ago|||

Yes, I think people who are anti squash merge are those who don't work in Github and use a patch based system or something different. If you're sending a patch for linux, yes it makes sense that you want to send one complete, well described patch. But Github's tooling is based around the squash merge. It works well and I don't know anyone in real life who has issues with it.

And to counter some specific points:

* In a github PR, you write the main commit msg and description once per PR, then you tack on as many commits as you want, and everyone knows they're all just pieces of work towards the main goal of the eventually squashed commit

* Forcing a clean up every time you make a new commit is not only annoying extra work, but it also overwrites history that might be important for the review of that PR (but not important for what ends up in main branch).

* When follow up is requested, you can just tack on new commits, and reviewers can easily see what new code was added since their last review. If you had to force overwrite your whole commit chain for the PR, this becomes very annoying and not useful to reviewers.

* In the end, squash merge means you clean up things once, instead of potentially many times

goosejuice 7 hours ago||

Forcing a single commit per PR is the issue imo. It's a lazy solution. Rebase locally into sensible commits that work independently and push with lease. Reviewers can reset to remote if needed.

l72 6 hours ago||||

If your goal here is to have linear history, then just use a merge commit when merging the PR to main and always use `git log --first-parent`. That will only show commits directly on main, and gives you a clean, linear history.

If you want to dig down into the subcommits from a merge, then you still can. This is useful if you are going back and bisecting to find a bug, as those individual commits may hold value.

You can also cherry pick or rollback the single merge commit, as it holds everything under it as a single unit.

This avoids changing history, and importantly, allows stacked PRs to exist cleanly.
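
A quick sketch of what that looks like, with a hypothetical three-commit feature branch merged through a real merge commit (the merge message is invented):

```shell
set -eu
# Scratch repo; the feature branch and merge message are hypothetical.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

echo base > README
git add README
git commit -q -m "Initial commit"

git checkout -q -b feature
for i in 1 2 3; do
  echo "part $i" >> feature.txt
  git add feature.txt
  git commit -q -m "Feature part $i"
done

# Merge with an explicit merge commit instead of squashing or fast-forwarding.
git checkout -q main
git merge -q --no-ff -m "Merge feature (hypothetical PR)" feature

# Top-level view: one entry per merged PR.
top_level=$(git log --first-parent --format=%s)
echo "$top_level"

# Full view: the individual commits are still there when needed.
full=$(git log --format=%s)
echo "$full"
```

The first listing shows only the initial commit and the merge; the second also includes the three feature commits, which stay available for bisecting or blame.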

mcpherrinm 6 hours ago||

Git bisect is one of the important reasons IMO to always squash-merge pull requests: Because the unit of review is the pull request.

I think this is all Github's fault, in the end, but I think we need to get Github to change and until then will keep using squash-merges.

olejorgenb 1 hour ago|||

git bisect --first-parent
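
Spelled out, the option belongs to `git bisect start` and needs git 2.29 or newer. A rough sketch with three hypothetical merged feature branches, where the `BUG` marker again stands in for a failing test:

```shell
set -eu
# Scratch repo; feature-1..3 and the BUG marker are hypothetical.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

echo base > README
git add README
git commit -q -m "Initial commit"
good=$(git rev-parse HEAD)

for n in 1 2 3; do
  git checkout -q -b "feature-$n" main
  echo "setup" > "f$n.txt"
  git add "f$n.txt"
  git commit -q -m "feature-$n: setup"
  # Only feature-2 introduces the regression.
  if [ "$n" -eq 2 ]; then echo "BUG" >> "f$n.txt"; else echo "done" >> "f$n.txt"; fi
  git commit -q -am "feature-$n: finish"
  git checkout -q main
  git merge -q --no-ff -m "Merge feature-$n" "feature-$n"
done

# Bisect over the merges on main only, never descending into the branches.
git bisect start --first-parent HEAD "$good" >/dev/null
git bisect run sh -c '! grep -qs BUG f1.txt f2.txt f3.txt' >/dev/null
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "$culprit"
```

This treats each merge as the unit, like a squash would, while a second bisect without `--first-parent` can still narrow it down to a commit inside the offending branch.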

juped 2 hours ago|||

No.

The cases where bisect fails you are, basically, ones where it lands on a merge that does too much - you now have to manually disentangle the side that did too much to find out exactly what interaction caused the regression. But this is on the rarer side because it's rare for an interaction to be what caused the regression, it's more common that it's a change - which will be in a non-merge commit.

The squash merge workflow means every single commit is a merge that does too much. Bisect can't find anything useful for you by bisection anymore, so you have to get lucky about how much the merge did, unenriched by any of the history that you deleted.

arnorhs 12 hours ago|||

The author is talking about the case where you have coherent commits, probably from multiple PRs/merges, that get merged into a main branch as a single commit.

Yeah, I can imagine it being annoying that squashing in that case wipes the author attribution, when not everybody is doing PRs against the main branch.

However, calling all squash-merge workflows "stupid" without any nuance.. well that's "stupid" :)

seba_dos1 12 hours ago|||

I don't think there's much nuance in the "I don't know --first-parent exists" workflow. Yes, you may sometimes squash-merge a contribution coming from someone who can't use git well when you realize that it will just be simpler for everyone to do that than to demand them to clean their stuff up, but that's pretty much the only time you actually have a good reason to do that.

l72 6 hours ago|||

I really, really wish git changed two defaults:

  * git merge ALWAYS does a merge and git pull ALWAYS does a fast forward.
  * git log --first-parent is the default. Have a git log --deep if you want to go down into branches.

If you use a workflow that always merges a PR with a merge commit, then git log --first-parent gives you a very nice linear history. I feel like if this was the default, so many arguments about squashing or rebasing workflows wouldn't be necessary to get our "linear history", everyone would just be doing merges and be happy with it. You get a clean top level history and you can dig down into the individual commits in a merge if you are bisecting for a bug.

juped 2 hours ago||

I agree.

I set merge.ff = false and alias ff to merge --ff-only. I don't use pull but I do have pull.ff = only set, just in case someday I do.

The graph log and the first-parent log serve different purposes and possibly shouldn't be the same command conceptually; this varies by user preference but the first-parent log is more of a "good default", generally. Merges do say "Merge" at the start, after all.

This is what I advise people to do in consulting engagements, too, it's not one of my personal quirks.
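
As a sketch, those settings and their effect look like this. The config is applied to a scratch repo only (add `--global` to apply it everywhere), and the topic branches are invented for the demonstration.

```shell
set -eu
# Scratch repo; topic-a and topic-b are hypothetical branches.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

# The settings from the comment above, repo-local here.
git config merge.ff false
git config pull.ff only
git config alias.ff 'merge --ff-only'

echo base > README
git add README
git commit -q -m "Initial commit"

# A plain merge now records a merge commit even though a fast-forward was possible.
git checkout -q -b topic-a
echo a > a.txt
git add a.txt
git commit -q -m "Add a"
git checkout -q main
git merge -q --no-edit topic-a
parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "parents of merge: $parents"

# The alias only ever fast-forwards, and refuses when it cannot.
git checkout -q -b topic-b
echo b > b.txt
git add b.txt
git commit -q -m "Add b"
git checkout -q main
git ff topic-b >/dev/null
git log --oneline --graph
```

The explicit `--ff-only` in the alias overrides `merge.ff = false`, which is why the two can coexist.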

skydhash 10 hours ago|||

Do people actually share PR as in different people contributing to the same branch?

Also I can understand not squashing if the contribution comes from outside the organization. But in that case, I would expect a cleaned up history. But if every contribution is from members of the team, who can merge their own PR, squash merge is an easy way to get a clean history. Especially when most PR should be a single commit.

l72 6 hours ago||

We do. If we are building out a feature, none of its code is merged into main until it is complete (if this is a big feature, we milestone into mergeable and releasable units).

The feature is represented by a Story in Jira and a feature branch for that story. Subtasks in jira are created and multiple developers can pick up the different subtasks. There is a personal branch per subtasks, and PRs are put up against the feature branch. Those subtasks are code reviewed, tested, and merged into the feature branch.

In the end, it is the feature branch that is merged (as a single merge commit and complete unit) into main, and may well have had contributions from multiple people.

juped 2 hours ago|||

Somewhat Linux-like. You could probably improve it purely from a git perspective by letting subtask dependencies be many-to-many (the commit graph is a dependency graph), but what you have is probably best for your whole Jira workflow.

skydhash 5 hours ago|||

I get your POV, but I’ve always considered that long-lived branches in the canonical repo (the one in the forge) other than the main one should be directly related to deployable artifacts. Anything else should be short-lived.

There can be experiments on the side that warrant your approach, but the amount of merging back and forth would make this hard to investigate (especially when blaming). I would prefer to have one single commit with a message that describes every contribution.

duskdozer 11 hours ago|||

I think the point is that if you have to squash, the PR-maker was already gitting wrong. They should have "squashed" on their end to one or more smaller, logically coherent commits, and then submitted that result.

skydhash 11 hours ago||

It’s not “having to squash”. The intent was already for a PR to be a single commit. I could squash it on my end and merge by rebasing, but any alteration would then need to be force-pushed. So I don’t bother. I squash-merge when it’s ready and delete the branch.

lamasery 11 hours ago|||

Squash-merge is entirely fine for small PRs. Cleaning up the commits in advance (probably to just squash them to one or two anyway) is extra work, and anything that discourages people from pushing often (to get the code off their local machine) needs to be well-justified. Just review the (smallish!) total outcome of all the commits and squash after review. A few well-placed messages on the commit, attached to relevant lines, are more helpful and less work than cleaning up the commit history of a smallish PR.

For really large PRs, I’m more inclined to agree with you, but those should probably have their own small-PR-and-squash-merge flow that naturally cleans up their git history, anyway.

I categorically disagree that squash-merge is “stupid” but agree there are many ways to skin this cat.

LinXitoW 9 hours ago|||

How does not squash merging deal with the fact that branches disappear when merging? What I mean is that the information "this commit happened in the context of this PR or this overarching goal" goes missing. When you squash, you use the one central unit of information management in Git: the commit.

filcuk 12 hours ago|||

Having the tree easy to filter doesn't matter if it returns hundreds of commits you have to sift through for no reason.

seba_dos1 11 hours ago||

Having the commit graph easy to filter means exactly that you don't have to sift through hundreds of commits for no reason. What else did you think it would mean?

6thbit 4 hours ago||

Calling squash stupid sounds like a case of Dunning-Kruger.

If you've worked on a large team without squashing and without increasing frustration I'd be greatly interested to hear about it.

Cthulhu_ 10 hours ago||

For "what changes the most", in my project it's package.json / lock (because of automatic dependency updates) and translation / localization files; I'd argue that's pretty normal and healthy.

For the "bus factor", there's one guy and then there's me, but I stopped being a primary contributor to this project nearly two years ago, lol.

gherkinnn 13 hours ago||

These are some helpful heuristics, thanks.

This list is also one of many arguments for maintaining good Git discipline.

arthurjj 8 hours ago||

These were interesting but I don't know if they'd work on most or any of the places I've worked. Most places and teams I've worked at have 2-3 small repos per project. Are most places working with monorepos these days?

abustamam 8 hours ago|

I can't speak for most, but the past few places I consulted or worked at used monorepos.

BigTTYGothGF 8 hours ago||

Jesus I've seen what you've done for others and want that for myself.

abustamam 5 hours ago||

ianberdin 2 hours ago||

Well, 70% of my commits are “123”.

alaudet 9 hours ago||

This is good stuff. Why I never think of things like this is beyond me. Thanks

mikaoelitiana 9 hours ago||

I created a small TUI based on the article https://github.com/mikaoelitiana/git-audit

vladsanchez 8 hours ago|

You beat me to it! I envisioned creating some aliases but you exceeded it by building a TUI. Good job Claude! LOL ;)

pscanf 10 hours ago||

I just finished¹ building an experimental tool that tries to figure out if a repo is slopware or not just by looking at its git history (plus some GitHub activity data).

The takeaway from my experiment is that you can really tell a lot by how / when / what people commit, but conclusions are very hard to generalize.

For example, I've also stumbled upon the "merge vs squash" issue, where squashes compress and mostly hide big chunks of history, so drawing conclusions from a squashed commit is basically just wild guessing.

(The author of course has also flagged this. But I just wanted to add my voice: yeah, careful to generalize.)

¹ Nothing is ever finished.

niedbalski 11 hours ago|

Ages ago, Google released an algorithm to identify hotspots in code by using commit messages. https://github.com/niedbalski/python-bugspots
