
Posted by kwantaz 10/27/2024

We shrunk our Javascript monorepo git size (www.jonathancreamer.com)
334 points | 213 comments | page 3
nsonha 10/28/2024|
I think the title misses the "Honey, " part
EDEdDNEdDYFaN 10/27/2024||
better question - does the changelog need to be checked in the first place?
DeathMetal3000 10/27/2024|
They fixed a bug in a tool that is widely used. In what world is questioning why an organization checks in a file you have no context on a “better question”?
jakub_g 10/27/2024||
Paraphrasing meat of the article:

- When multiple files in the repo share the same trailing 16 characters of their path, git may wrongly calculate deltas, mixing up those files. Here, multiple CHANGELOG.md files got mixed up.

- So if those files are big and change often, you end up with massive deltas and inflated repo size.

- There's a new git option (in Microsoft git fork for now) and config to use full file path to calculate those deltas, which fixes the issue when pushing, and locally repacking the repo.

```
git repack -adf --path-walk
git config --global pack.usePathWalk true
```

- According to a screenshot, Chromium repacked in this way shrinks from 100GB to 22GB.

- However AFAIU until GitHub enables it by default, GitHub clones from such repos will still be inflated.
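
To make the first bullet concrete, here is a rough Python model of that kind of name hash. It is a sketch based on my understanding of git's default pack name hash: each step shifts the running value right by 2 bits and puts the next character in the top byte, so anything more than roughly 16 characters from the end is shifted out. The function name `name_hash` and the example paths are made up for illustration, not taken from the article.

```python
def name_hash(path: str) -> int:
    """Toy model of a git-style name hash (assumption, not git's actual code)."""
    h = 0
    for byte in path.encode("utf-8"):
        # Whitespace is skipped, mirroring an isspace() check.
        if byte in b" \t\n\v\f\r":
            continue
        # Older characters drift toward the low bits and eventually fall off;
        # the newest character always lands in the top byte of a 32-bit value.
        h = ((h >> 2) + (byte << 24)) & 0xFFFFFFFF
    return h


# Hypothetical monorepo paths: the first two share "/docs/CHANGELOG.md".
a = "apps/web/docs/CHANGELOG.md"
b = "libs/core/docs/CHANGELOG.md"
c = "libs/core/docs/README.md"

print(name_hash(a) == name_hash(b))  # different files, same hash bucket
print(name_hash(a) == name_hash(c))  # different ending, different hash
```

With a hash like this, unrelated files that end the same way sort next to each other as delta candidates, which is the mix-up described above. Taking the full path into account keeps `a` and `b` apart.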

kreetx 10/27/2024||
I don't think GitHub, or any other git host, will have objections to using it once it's part of mainline git?

Also, thank you for the TLDR!

masklinn 10/27/2024||
> I don't think GitHub, or any other git host, will have objections to using it once it's part of mainline git?

Fixing an existing repository requires a full repack, and for a repository as big as Chromium that still takes more than half a day (56000 seconds is about 15.5 hours). Even if that's an improvement over the previous 3 days, it's a lot of compute.

From my experience of previous attempts, getting GitHub to run a full repack with harsh settings is extremely difficult (possibly because their infrastructure relies on more loosely packed repositories). I tried to get that for $dayjob's primary repository, whose initial checkout had gotten pretty large, and got nowhere.

As of right now, said repository is ~9.5GB on disk on initial clone (full, not partial, excluding working copy). Locally running `repack -adf --window 250` brings it down to ~1.5GB, at the cost of a few hours of CPU.

The repository does have some of the attributes described in TFA, so I'm definitely looking forward to trying these changes out.

leksak 10/27/2024||
Wouldn't a potential workaround be to create a new barebones repository and push the repacked one there? Sure, people will have to change their remote origin but if it solves the problem that might be worth the hassle?
masklinn 10/27/2024||
It breaks the issues, PRs, all the tooling and integration, …

For now we’re getting by with partial clones, and employee machines being imaged with a decently up to date repository.

deskr 10/27/2024|||
> in Microsoft git fork for now

Wait, what? Has MS forked git?

jakub_g 10/27/2024|||
MS has had their fork of git for years, and since then they have contributed many performance features for monorepos to mainline.
keybored 10/29/2024|||
Companies fork Git in order to work on things internally until they are ready to be proposed for inclusion into Git itself. I’m pretty sure that GitHub and GitLab (and others?) do the same thing.

These are not forks-going-their-own-way forks.

jamalaramala 10/27/2024||
Thank you to the AI that summarised the article. ;-)
jimjimjim 10/27/2024||
Did anybody else shudder at "Shrunked"?
tankenmate 10/27/2024||
Shrunken, shrunked ain't no language I ever heard of.
amsterdorn 10/27/2024|||
Honey, I done shrunked them kids
0points 10/27/2024||
English is my third language, also yes.
killingtime74 10/27/2024||
Shrank
tankenmate 10/27/2024||
Would be correct if it is "We shrank", but from my poor memory of the terminology that is the transitive form, shrunken is the intransitive form. But once again from my poor memory.
darraghenright 10/27/2024|||
I've spoken English as my native language for almost five decades and I've never seen/heard the word "shranked" before.

This surely cannot be correct. Even the title of the linked article doesn't use "shranked". What?

forgotpwd16 10/27/2024||
Commonly (since ca. the 19th century), shrank is used as the past tense of shrink, shrunk as the past participle, and shrunken as an adjective. The title of the linked article uses "shrunk" as past tense, and the submitted title was changed to "shrunked" for some reason. "Shranked" was not mentioned anywhere. (But "shrinked" has had some use in the past.)
peutetre 10/27/2024|||
I was in the pool!
dougthesnails 10/27/2024|||
I think I prefer shrunked in this context.
Sparkyte 10/27/2024||
Shrinky dinky
bubblesnort 10/27/2024||
Honey, I shrunk the git!
blumomo 10/27/2024||
[flagged]
mark_and_sweep 10/27/2024||
As a German, I assumed he's talking about poor connection speeds.
blumomo 10/27/2024||
You Germans have slow internet speed? Why’s that?
btilly 10/27/2024||
https://youtu.be/W1ZZ-Yni8Fg?si=493ozTdkEsXJnPpB does a good job of explaining it.
mirekrusin 10/27/2024|||
Size doesn't matter, it's how you use it (no invalid diffs on paths sharing trailing part).
tom_ 10/27/2024||
They're not actually smaller. It just looks like it because they're further away.
AbuAssar 10/27/2024||
the gif memes were very distracting...
dangsux 10/27/2024|
[dead]