Posted by birdculture 1 day ago

Package managers keep using Git as a database, it never works out (nesbitt.io)
681 points | 381 comments
c-linkage 1 day ago|
This seems like a tragedy of the commons -- GitHub is free after all, and it has all of these great properties, so why not? -- but this kind of decision making occurs whenever externalities are present.

My favorite hill to die on (externality) is user time. Most software houses spend so much time focusing on how expensive engineering time is that they neglect user time. Software houses optimize for feature delivery and not user interaction time. Yet if I spent one hour making my app one second faster for my million users, I can save 277 user hours per year. But since user hours are an externality, such optimization never gets done.

Externalities lead to users downloading extra gigabytes of data (wasted time) and waiting for software, all of which is waste that the developer isn't responsible for and doesn't care about.
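
For reference, the back-of-the-envelope math behind that 277 figure (assuming each of the million users hits the one-second delay once a year):

    # Rough sketch of the "277 user hours" figure above; the inputs are the
    # assumptions stated in this comment, not measured data.
    users = 1_000_000
    seconds_saved_per_user_per_year = 1
    user_hours = users * seconds_saved_per_user_per_year / 3600
    print(f"~{user_hours:.0f} user hours saved per year")  # ~278 (277 if you truncate)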

Aurornis 21 hours ago||
> Most software houses spend so much time focusing on how expensive engineering time is that they neglect user time. Software houses optimize for feature delivery and not user interaction time.

I don’t know what you mean by software houses, but every consumer-facing software product I’ve worked on has tracked things like startup time and latency for common operations as a key metric.

This has been common wisdom for decades. I don’t know how many times I’ve heard the repeated quote about how Amazon loses $X million for every Y milliseconds of page loading time, as an example.

rovr138 20 hours ago|||
There was a thread here earlier this month,

> Helldivers 2 devs slash install size from 154GB to 23GB

https://news.ycombinator.com/item?id=46134178

A section of the top comment says,

> It seems bizarre to me that they'd have accepted such a high cost (150GB+ installation size!) without entirely verifying that it was necessary!

and the reply to it has,

> They’re not the ones bearing the cost. Customers are.

viraptor 17 hours ago|||
There was also GTA Online wasting minutes to load/parse JSON files at startup. https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times...

And Skylines rendering teeth on models miles away https://www.reddit.com/r/CitiesSkylines/comments/17gfq13/the...

Sometimes performance really is ignored.

darubedarob 5 hours ago||
Wasn't there a website with a formula for how much time things like the GTA bug cost humanity as a whole? Like 5 minutes × users × sessions per day, accumulated?

It cost several human lifetimes, if I remember correctly. Still not as bad as Windows Update, which, taking the time times the wage, has set the GDP of a small nation on fire every year..
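
Something roughly like this, with made-up illustrative numbers (the real player counts, and whatever formula that site used, were surely different):

    # Hypothetical version of the "wasted time" formula: minutes per session
    # x sessions per day x players x days, expressed in ~70-year human lifetimes.
    wasted_min_per_session = 5
    sessions_per_day = 1
    players = 1_000_000        # assumed, purely for illustration
    days = 365
    lifetime_min = 70 * 365 * 24 * 60
    total_min = wasted_min_per_session * sessions_per_day * players * days
    print(total_min / lifetime_min)  # ~50 lifetimes per year under these assumptions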

ux266478 20 hours ago||||
That's not how it works. The demand for engineering hours is an order of magnitude higher than the supply for any given game; you have to pick and choose your battles because there's always much, much more to do. It's not bizarre that nobody verified texture storage was being done in an optimal way at launch, without sacrificing load times at the altar of visual fidelity, particularly given the state the rest of the game was in. Who the hell has time to do that when crashes abound and the network stack has to be rewritten at a moment's notice?

Gamedev is very different from other domains, being in the 90th percentile for complexity and codebase size, and the 99th percentile for structural instability. It's a foregone conclusion that you will rewrite huge chunks of your massive codebase many, many times within a single year to accommodate changing design choices, or if you're lucky, to improve an abstraction. Not every team gets so lucky on every project. Launch deadlines are hit when there's a huge backlog of additional stuff to do, sitting atop a mountain of cut features.

swiftcoder 20 hours ago|||
> It's not bizarre that nobody verified texture storage was being done in an optimal way at launch

The inverse, however, is bizarre: that they spent potentially quite a bit of engineering effort implementing the (extremely non-optimal) system that duplicates all the assets half a dozen times to potentially save precious seconds on spinning rust, all without validating it was worth implementing in the first place.

MBCook 15 hours ago|||
Was Helldivers II built from the ground up? Or grown from the v1 codebase?

The first was on PS3 and PS4 where they had to deal with spinning disks and that system would absolutely be necessary.

Also if the game ever targeted the PS4 during development, even though it wasn’t released there, again that system would be NEEDED.

rovr138 19 hours ago|||
Yes.

They talk about it being an optimization. They also talk about the bottleneck being level generation, which happens at the same time as loading from disk.

darubedarob 5 hours ago||||
Gamedev engineering hours are also in endless oversupply thanks to myDreamCream brain.
jesse__ 11 hours ago|||
> It's a foregone conclusion that you will rewrite huge chunks of your massive codebase many, many times within a single year

Tell me you don't work on game engines without telling me..

----

Modern engines are the cumulative result of hundreds of thousands of engine-programmer hours. You're not rewriting Unreal in several years, let alone multiple times in one year. Get a grip dude.

atiedebee 4 hours ago||
I think they meant the gameplay side of things instead of the engine
saghm 8 hours ago||||
I don't think it's quite that simple. The reason they had such a large install size in the first place was concern about load times for players using HDDs instead of SSDs; duplicating the data was intended to avoid making some players load into levels much more slowly than others (which in an online multiplayer game would potentially have repercussions for other players as well). The link you give mentions that this was based on flawed data (although it's somewhat light on those details), but that means the actual cause was a combination of a technical mistake and genuine care for user experience, just not prioritizing the experience of the majority at the expense of a smaller but not insignificant minority. There's certainly room for argument about whether this was the correct judgement call, or whether they should have been better at recognizing their data was flawed, but it doesn't really fit the trend of devs not giving a shit about user experience. If making perfect judgement calls and never having flawed data is the bar for proving you care about users, we might as well give up on the idea that any company will ever reach it.
godelski 5 hours ago||||
How about GitHub Actions and its "safe sleep", where it took over a year to accept a trivial PR fixing a bug that caused actions to hang forever, because someone forgot that you need <= instead of == in a counter check...

Though in this case GitHub wasn't bearing the cost, it was gaining a profit...

https://github.com/actions/runner/pull/3157

https://github.com/actions/runner/issues/3792
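
For anyone who doesn't want to dig through the links, the class of bug described is roughly the one below. This is an illustrative Python sketch with made-up names, not the runner's actual C# code; the point is that an equality check on a counter never fires if the counter can step past the limit, while an inequality still terminates (whether the fix reads <= or >= just depends on which side of the comparison the counter sits):

    import time

    def wait_until_done(check_done, max_attempts=10, delay=1.0):
        attempts = 0
        while not check_done():
            attempts += 1
            # Buggy: if attempts ever jumps past max_attempts (say it gets
            # incremented in two places), this never becomes true -> hang forever.
            if attempts == max_attempts:
                return False
            # Fixed: an inequality still stops the loop after an overshoot.
            # if attempts >= max_attempts:
            #     return False
            time.sleep(delay)
        return True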

kibwen 15 hours ago|||
> They’re not the ones bearing the cost. Customers are.

I think this is uncharitably erasing the context here.

AFAICT, the reason that Helldivers 2 was larger on disk is because they were following the standard industry practice of deliberately duplicating data in such a way as to improve locality and thereby reduce load times. In other words, this seems to have been a deliberate attempt to improve player experience, not something done out of sheer developer laziness. The fact that this attempt at optimization is obsolete these days just didn't filter down to whatever particular decision-maker was at the reins on the day this decision was made.

dijit 21 hours ago||||
I worked in e-commerce SaaS in 2011~ and this was true then but I find it less true these days.

Are you sure that you’re not the driving force behind those metrics; or that you’re not self-selecting for like-minded individuals?

I find it really difficult to convince myself that even large players (Discord) are measuring startup time. Every time I start the thing I’m greeted by a 25s wait and a `RAND()%9` number of updates that each take about 5-10s.

godelski 4 hours ago|||
I have plenty of responses to an angry comment I made several months ago that support your point.

I made a slight at Word taking like 10 seconds to start and some people came back saying it only takes 2, as if that still isn't 2s too long.

Then again, look at how Microsoft is handling slow File Explorer speeds...

https://news.ycombinator.com/item?id=44944352

jama211 20 hours ago||||
Discord’s user base is 99% people who leave it running 100% of the time, it’s not a typical situation
dijit 20 hours ago||
I think that they make the startup so horrible that people are more likely to leave it running.
hexer292 19 hours ago|||
As a discord user, it's the kind of platform that I would want to have running to receive notifications, sort of like the SMS of gaming.

A large part of my friend group uses Discord as the primary method of communication, even in an in-person context (I was at a festival a few months ago with a friend, and we would send texts over Discord if we got split up), so maybe it's not a common use case.

solarkraft 8 hours ago||||
It leads to me dreading having to start it (or accidentally starting it - remember IE?) and opting for the browser instead.
jama211 11 hours ago|||
I strongly doubt that!
spockz 19 hours ago||||
I have the same experience on Windows. On the other hand, starting up Discord on my CachyOS install is virtually instant. So maybe there is also a difference between the platform the developers use and the one their users use.
drob518 19 hours ago|||
Yep, indeed. Which is the main reason I don’t run Discord.
jama211 11 hours ago||
I strongly doubt that. The main reason you don’t run it is likely because you don’t have strong motivation to do so, or you’d push through the odd start up time.
oceanplexian 3 hours ago||
Just going to throw out an anecdote that I don’t use it for the same reason.

It’s closed unless I get a DM on my phone and then I suffer the 2-3 minute startup/failed update process and quit it again. Not a fan of leaving their broken, resource hogging app running at all times.

ponector 18 hours ago||||
On the contrary, every consumer-facing product I've worked on had no performance metrics tracked. And for enterprise software it was even worse, as the end user is not the one who makes the decision to buy and use the software.

>> what you mean by software houses

How about Microsoft? The Start menu is a slow Electron app.

julianz 17 hours ago|||
The Start menu is not an Electron app. Don't believe everything you read on the internet.
Conan_Kudo 1 hour ago|||
The Start menu is React Native, but Outlook is now an Electron app.
Spooky23 16 hours ago||||
That makes the usability and performance of the windows start menu even more embarrassing.

The decline of Windows as a user-facing product is amazing, especially as they are really good at developing things they care about. The “back of house” guts of Windows have improved a lot, for example. They should just have a cartoon Bill Gates pop up like Clippy and flip you the bird at this point.

jiggawatts 13 hours ago||
Much worse is that the search function built into the start menu has been broken in different ways in every major release of Windows since XP, including Server builds.

It has both indexing failures and multi-day performance issues for mere kilobytes of text!

odo1242 16 hours ago||||
React Native, not Electron. Though it is slower than it was
kortilla 16 hours ago|||
People believing it says something about the start menu
TehShrike 16 hours ago||
hey, haven't seen that one in the wild for a little bit :-D https://www.smbc-comics.com/comic/aaaah
kortilla 9 hours ago||
The comic artist seems pretty ignorant to think that it’s not meaningful.

What falsehoods people believe and spread about a particular topic is an excellent way to tell what the public opinion is on something.

Consider spreading a falsehood about Boeing QA getting bonuses based on number of passed planes vs the same falsehood about Airbus. If the Boeing one spreads like wildfire, it tells you that Boeing has a terrible track record of safety and that it’s completely believable.

Back to the start menu. It should be a complete embarrassment to MSFT SWEs that people even think the start menu performance is so bad that it could be implemented in electron.

In summary: what lies spread easily is an amazing signal on public perception. The SMBC comic is dumb.

philipallstar 18 hours ago|||
> How about Microsoft? Start menu is a slow electron app.

If your users are trapped due to a lack of competition then this can definitely happen.

moregrist 18 hours ago||||
> I don’t know how many times I’ve heard the repeated quote about how Amazon loses $X million for every Y milliseconds of page loading time, as an example.

This is true for sites that are trying to make sales. You can quantify how much a delay affects closing a sale.

For other apps, it’s less clear. During its high-growth years, MS Office had an abysmally long startup time.

Maybe this was due to MS having a locked-in base of enterprise users. But given that OpenOffice and LibreOffice effectively duplicated long startup times, I don’t think it’s just that.

You also see the Adobe suite (and also tools like GIMP) with some excruciatingly long startup times.

I think it’s very likely that startup times of office apps have very little impact on whether users will buy the software.

delaminator 18 minutes ago||
They even made it render the screen while still being unusable, just to make it look like it was running.
j_w 13 hours ago||||
Clearly Amazon doesn't care about that sentiment across the board. Plenty of their products are absurdly slow because of their poor engineering.
eviks 18 hours ago||||
The issue here is not tracking, but developing. Like, how do you explain the fact that whole classes of software have gotten worse on those "key metrics"? (and that includes web-selling webpages)
pjmlp 19 hours ago||||
The exception that proves the rule.
croes 16 hours ago||||
Then why do many software houses favor cloud software over on-premise?

They often have a noticeable delay responding to user input compared to local software.

mindslight 19 hours ago||||
> every consumer facing software product I’ve worked on has tracked things like startup time and latency for common operations as a key metric

Are they evaluating the shape of that line with the same goal as the stonk score? Time spent by users is an "engagement" metric, right?

venturecruelty 15 hours ago|||
>I don’t know what you mean by software houses, but every consumer facing software product I’ve worked on has tracked things like startup time and latency for common operations as a key metric.

Then respectfully, uh, why is basically all proprietary software slow as ass?

ekjhgkejhgk 1 day ago|||
I wouldn't call it a tragedy of the commons, because it's not a commons. It's owned by Microsoft. They're calculating that it's worth it for them, so I say take as much as you can.

A commons would be something owned by nobody whose existence everyone benefits from.

dahart 22 hours ago|||
> so I say take as much as you can. Commons would be if it’s owned by nobody

This isn’t what “commons” means in the term ‘tragedy of the commons’, and the obvious end result of your suggestion to take as much as you can is to cause the loss of access.

Anything that is free to use is a commons, regardless of ownership, and when some people use too much, everyone loses access.

Finite digital resources like bandwidth and database sizes within companies are even listed as examples in the Wikipedia article on Tragedy of the Commons. https://en.wikipedia.org/wiki/Tragedy_of_the_commons

nkmnz 19 hours ago||
No, the word and its meaning both point to the fact that there’s no exclusive ownership of a commons. This is important, since ownership is associated with bearing the cost of usage (i.e., depreciation), which would lead an owner to avoid the tragedy of the commons. Ownership is regularly the solution to the tragedy (socialism didn’t work).

The behavior that you warn against is that of a free rider that makes use of a positive externality of GitHub’s offering.

dahart 18 hours ago||
That is one meaning of “commons”, but not all of them, and you might be mistaking which one the phrase ‘tragedy of the commons’ is using.

“Commons can also be defined as a social practice of governing a resource not by state or market but by a community of users that self-governs the resource through institutions that it creates.”

https://en.wikipedia.org/wiki/Commons

The actual mechanism by which ownership resolves tragedy of the commons scenarios is by making the resource non-free, by either charging, regulating, or limiting access. The effect still occurs when something is owned but free, and its name is still ‘tragedy of the commons’, even when the resource in question is owned by private interests.

bawolff 18 hours ago||
How does that differ from what the person you are arguing against is saying?
dahart 17 hours ago||
Ownership, I guess. The 2 parent comments are claiming that “tragedy of the commons” doesn’t apply to privately owned things. I’m suggesting that it does.

Edit: oh, I do see what you mean, and yes I misunderstood the quote I pulled from WP - it’s talking about non-ownership. I could pick a better example, but I think that’s distracting from the fact that ‘tragedy of the commons’ is a term that today doesn’t depend on the definition of the word ‘commons’. It’s my mistake to have gotten into any debate about what “commons” means, I’m only saying today’s usage and meaning of the phrase doesn’t depend on that definition, it’s a broader economic concept.

nkmnz 13 hours ago||
No, it’s not.
dahart 12 hours ago||
What’s not what? Care to back up your argument with any links? I already pointed out that examples in the WP article for ‘Tragedy of the Commons’ use private property. https://en.wikipedia.org/wiki/Tragedy_of_the_commons#Digital... Are you contradicting the Wikipedia article? Why, and on what basis?
bawolff 6 hours ago||
I'm not sure i agree that the Wikipedia article supports your position.

Certainly private property is involved in tragedy of the commons. In the classic shared cattle ranching example, the individual cattle are private property, only the field is held in common.

I generally think that the tragedy of the commons requires the commons to, well, be held in common. If someone owns the thing that is the commons, it's not a commons but just a bad product. (With of course some nitpicking about how things can be de jure private property while being de facto common property.)

In the Microsoft example, Windows becoming shitty software is not a tragedy of the commons, it's just MS making a business decision, because Windows is not a commons. On the other hand, computing in general becoming shitty, because each individual app does attention-grabbing dark patterns that help the individual app's bottom line while hurting the ecosystem as a whole, would be a tragedy of the commons, as user attention is something all apps hold in common and none of them own.

TeMPOraL 1 day ago||||
Still, because reality doesn't respect boundaries of human-made categories, and because people never define their categories exhaustively, we can safely assume that something almost-but-not-quite like a commons, is subject to an almost-but-not-quite tragedy of the commons.
bee_rider 20 hours ago|||
That seems to assume some sort of… maybe unfounded linearity or something? I mean, I’m not sure I agree that GitHub is nearly a commons in any sense, but let’s put that aside as a distraction…

The idea of the tragedy of the commons relies on this feedback loop of having these unsustainably growing herds (growing because they can exploit the zero-cost-to-them resources of the commons). Feedback loops are notoriously sensitive to small parameter changes. MS could presumably impose some damping if they wanted.

TeMPOraL 20 hours ago|||
> That seems to assume some sort of… maybe unfounded linearity or something

Not linearity but continuity, which I think is a well-founded assumption, given that it's our categorization that simplifies the world by drawing sharp boundaries where no such bounds exist in nature.

> The idea of the tragedy of the commons relies on this feedback loop of having these unsustainably growing herds (growing because they can exploit the zero-cost-to-them resources of the commons)

AIUI, zero-cost is not a necessary condition, a positive return is enough. Fishermen still need to buy fuel and nets and pay off loans for the boats, but as long as their expected profit is greater than that, they'll still overfish and deplete the pond, unless stronger external feedback is introduced.

Given that the solution to tragedy of the commons is having the commons owned by someone who can boss the users around, GitHub being owned by MS makes it more of a commons in practice, not less.

kortilla 15 hours ago||
No, it’s not a well-founded assumption. Many categories like these were created in the first place because there is a very obvious discontinuous step change in behavior.

You’re fundamentally misunderstanding what the tragedy of the commons is. It’s not that it’s “zero-cost” for the participants. All it requires is a positive return that has a negative externality that eventually leads to the collapse of the system.

Overfishing and CO2 emissions are very clearly a tragedy of the commons.

GitHub right now is not. People putting all sorts of crap on there is not hurting github. GitHub is not going to collapse if people keep using it unbounded.

Not surprisingly, this is because it’s not a commons and Microsoft oversees it, placing appropriate rate limits and whatnot to make sure it keeps making sense as a business.

thayne 19 hours ago|||
And indeed MS/GitHub does impose some "damping" in the form of things like API request throttling, CPU limits on CI, asking Homebrew not to use shallow cloning, etc. And those limits are one of the reasons given why using git as a database isn't good.
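
As a concrete example of that damping, GitHub's REST API advertises its throttling through X-RateLimit-* response headers, and a polite client is expected to back off instead of hammering the service. A minimal sketch (auth and error handling simplified; the URL is whatever you happen to be polling):

    import time
    import urllib.request

    def polite_get(url, token=None):
        """GET a GitHub API URL and sleep out the rate-limit window if exhausted."""
        req = urllib.request.Request(url)
        if token:
            req.add_header("Authorization", f"Bearer {token}")
        with urllib.request.urlopen(req) as resp:
            remaining = int(resp.headers.get("X-RateLimit-Remaining", "1"))
            reset_at = int(resp.headers.get("X-RateLimit-Reset", "0"))
            body = resp.read()
        if remaining == 0:
            # X-RateLimit-Reset is a unix timestamp; wait until the quota refills.
            time.sleep(max(0, reset_at - time.time()))
        return body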
reactordev 1 day ago||||
An A- is still an A kind of thinking. I like this approach as not everything perfectly fits the mold.
lo_zamoyski 22 hours ago||||
There is an analogy in the sense that for the users a resource is, for certain practical intents and purposes, functionally common. Social media is like this as well.

But I would make the following clarifications:

1. A private entity is still the steward of the resource and therefore the resource figures into the aims, goals, and constraints of the private entity.

2. The common good is itself under the stewardship of the state, as its function is guardian of the common good.

3. The common good is the default (by natural law) and prior to the private good. The latter is instituted in positive law for the sake of the former by, e.g., reducing conflict over goods.

TeMPOraL 20 hours ago||
> There is an analogy in the sense that for the users a resource is, for certain practical intents and purposes, functionally common. Social media is like this as well.

I think it's both simpler and deeper than that.

Governments and corporations don't exist in nature. Those are just human constructs, mutually-recursive shared beliefs that emulate agents following some rules, as long as you don't think too hard about this.

"Tragedy of the commons" is a general coordination problem. The name itself might've been coined with some specific scenarios in mind, but for the phenomenon itself, it doesn't matter what kind of entities exploit the "commons"; the "private" vs. "public" distinction itself is neither a sharp divide, nor does it exist in nature. All that matters is that there's some resource used by several independent parties, and each of them finds it more beneficial to defect than to cooperate.

In a way, it's basically a 3+-player prisoner's dilemma. The solution is the same, too: introducing a party that forces all other parties to cooperate. That can be a private or public or any other kind of org taking ownership of the commons and enforcing quotas, or in the case of the prisoners, a mob boss ready to shoot anyone who defects.

ttiurani 23 hours ago|||
The whole notion of the "tragedy of the commons" needs to be put to rest. It's an armchair thought experiment that was disproven at the latest in the 90s by Elinor Ostrom with actual empirical evidence of commons.

The "tragedy", if you absolutely need to find one, is only for unrestricted, free-for-all commons, which is obviously a bad idea.

wongarsu 23 hours ago|||
A high-trust community like a village can prevent a tragedy of the commons scenario. Participants feel obligations to the community, and misusing the commons actually does have real downsides for the individual because there are social feedback mechanisms. The classic examples like people grazing sheep or cutting wood are bad examples that don't really work.

But that doesn't mean the tragedy of the commons can't happen in other scenarios. If we define commons a bit more generously it does happen very frequently on the internet. It's also not difficult to find cases of it happening in larger cities, or in environments where cutthroat behavior has been normalized

TeMPOraL 22 hours ago|||
> A high-trust community like a village can prevent a tragedy of the commons scenario. Participants feel obligations to the community, and misusing the commons actually does have real downsides for the individual because there are social feedback mechanisms.

That works while the size of the community is ~100-200 people, when everyone knows everyone else personally. It breaks down rapidly after that. We compensate for that with hierarchies of governance, which give rise to written laws and bureaucracy.

New tribes break off old tribes, form alliances, which form larger alliances, and eventually you end up with countries and counties and voivodeships and cities and districts and villages, in hierarchies that gain a level per ~100x population increase.

This is sociopolitical history of the world in a nutshell.

lukan 22 hours ago|||
"and eventually you end up with countries and counties and vovoidships and cities and districts and villages, in hierarchies that gain a level per ~100x population increase."

You say it like this is a law set in stone, because this is what happened im history, but I would argue it happened under different conditions.

Mainly, the main advantage of an empire over small villages/tribes is not at all that they have more power than the villages combined, but that they can concentrate their power where it is needed. One village did not stand a chance against the empire - and the villages were not coordinated enough.

But today we would have the internet for better communication and coordination, enabling the small entieties to coordinate a defense.

Well, in theory of course. Because we do not really have autonomous small states, but are dominated by the big players. And the small states have mowtly the choice which block to align with, or get crushed. But the trend might go towards small again.

(See also cheap drones destroying expensive tanks, battleships etc.)

ajuc 20 hours ago||
The internet is working exactly the opposite way to what you're describing: it's making everything more centralized. Once we had several big media companies in each country and in each big city. Now we have Google and Facebook and TikTok and Twitter and then the “whatevers”.

The network effect is a real thing.

lukan 19 hours ago||
Yes, but there is a difference between having the choice of joining FB or not having a choice at all when the empire comes to claim you (like in Ukraine).
8note 10 hours ago||
FB is part of the empire though, and it is coming for us.

Canadians need an anti-imperial, Radio-Canada-run alternative. We aren't gonna be able to coordinate against the empire when the empire has the main control over the internet.

When the Americans come a-knocking, we're gonna wish we had Chinese radios.

xorcist 21 hours ago||||
> That works while the size of the community is ~100-200 people,

Yet we regularly observe that working with millions of people: we take care of our young, we organize, and when we see that some action hurts our environment we tend to limit its use.

It's not obvious why some societies break down early and some go on working.

TeMPOraL 19 hours ago|||
> Yet we regularly observe that working with millions of people; we take care of our young, we organize, when we see that some action hurt our environment we tend to limit its use.

That's more like human universals. These behaviors generally manifest to smaller or larger degree, depending on how secure people feel. But those are extremely local behaviors. And in fact, one of them is exactly the thing I'm talking about:

> we organize

We organize. We organize for many reasons, "general living" is the main one but we're mostly born into it today (few got the chance to be among the founding people of a new village, city or country). But the same patterns show up in every other organizations people create, from companies to charities, from political interests groups to rural housewives' circles -- groups that grow past ~100 people split up. Sometimes into independent groups, sometimes into levels of hierarchies. Observe how companies have regional HQs and departments and areas and teams; religious groups have circuits and congregations, etc. Independent organizations end up creating joint ventures and partnerships, or merge together (and immediately split into a more complex internal structure).

The key factor here is, IMO, for everyone in a given group to be in regular contact with everyone else. Humans are well evolved for living in such small groups - we come with built-in hardware and software to navigate complex interpersonal situations. Alignment around shared goals and implicit rules is natural at this scale. There's no space for cheaters and free-loaders to thrive, because everyone knows everyone else - including the cheater and their victims. However, once the group crosses this "we're all a big family, in it together" size, coordinating everyone becomes hard, and free-loaders proliferate. That's where explicit laws come into play.

This pattern repeats daily, in organizations people create even today.

AnthonyMouse 21 hours ago||||
I get the feeling it's the combination of Schelling points and surplus. If everyone else is being pro-social, i.e. there is a culture of it, and people aren't so hard up that they can't reasonably afford to do the same, then that's what happens, either by itself (Hofstadter's theory of superrationality) or via anything so much as light social pressure.

But if a significant fraction of the population is barely scraping by then they're not willing to be "good" if it means not making ends meet, and when other people see widespread defection, they start to feel like they're the only one holding up their end of the deal and then the whole thing collapses.

This is why the tendency for people to propose rent-seeking middlemen as a "solution" to the tragedy of the commons is such a diabolical scourge. It extracts the surplus that would allow things to work more efficiently in their absence.

vlovich123 21 hours ago|||
I’ve heard stories from communist villages where everyone knew everyone. Communal parks and property were not respected and were frequently vandalized or otherwise neglected, because they didn’t have an owner and were treated as something for someone else to solve.

It’s easier to explain in those terms than assumptions about how things work in a tribe.

ttiurani 22 hours ago||||
> But that doesn't mean the tragedy of the commons can't happen in other scenarios.

Commons can fail, but the whole point of Hardin calling commons a "tragedy" is to suggest it necessarily fails.

Compare it to, say, driving. It can fail too, but you wouldn't call it "the tragedy of driving".

We'd be much better off if people didn't throw around this zombie term decades after it's been shown to be unfounded.

lo_zamoyski 22 hours ago||||
Even here, the state is the steward of the common good. It is a mistaken notion that the state only exists because people are bad. Even if people were perfectly conscientious and concerned about the common good, you still need a steward. It simply wouldn’t be a steward who would need to use aggressive means to protect the common good from malice or abuse.
jandrewrogers 21 hours ago|||
> A high-trust community like a village can prevent a tragedy of the commons scenario.

No it does not. This sentiment, which many people have, is based on a fictional and idealistic notion of what small communities are like, held by people who have never lived in such communities.

Empirically, even in high-trust small villages and hamlets where everyone knows everyone, the same incentives exist and the same outcomes happen. Every single time. I lived in several and I can't think of a counter-example. People are highly adaptive to these situations and their basic nature doesn't change because of them.

Humans are humans everywhere and at every scale.

yellow_postit 10 hours ago||
While an earlier poster is overstating Ostrom’s Nobel-prize-winning work, it has regularly been shown that averting the tragedy of the commons is not as insurmountable as the original coining of the phrase implied.
Saline9515 23 hours ago||||
Ostrom showed that it isn't necessarily a tragedy, if the tight groups involved decide to cooperate. This is common in what we call "trust-based societies", which aren't universal.

Nonetheless, the concept is still alive, and anthropogenic global warming is here to remind you about this.

dpark 21 hours ago||||
She did not “disprove” the existence of the tragedy of the commons. What she established was that controlling the commons can be done communally rather than through privatization or through government ownership.

Communal management of a resource is still government, though. It just isn’t central government.

The thesis of the tragedy of the commons is that an uncontrolled resource will be abused. The answer is governance at some level, whether individual, collective, or government ownership.

> The "tragedy", if you absolutely need to find one, is only for unrestricted, free-for-all commons, which is obviously a bad idea.

Right. And that’s what people are usually talking about when they say “tragedy of the commons”.

gmfawcett 22 hours ago||||
Ostrom's results didn't disprove ToC. She showed that common resources can be communally maintained, not that tragic outcomes could never happen.
8note 10 hours ago||
I don't think anything can disprove that ToC issues can happen under any situation.

That seems like an unreasonable bar, and less useful than "does this system make ToC less frequent than that system".

b00ty4breakfast 23 hours ago|||
yeah, it's a post-hoc rationalization for the enclosure and privatization of said commons.
TeMPOraL 22 hours ago||
And here I thought the standard, obvious solution to tragedy of the commons is centralized governance.
b00ty4breakfast 2 hours ago|||
That is, in fact, how medieval commons were able to exist successfully for hundreds of years.
dpark 21 hours ago|||
People invoke the tragedy of the commons in bad faith to argue for privatization because “the alternative is communism”. i.e. Either an individual or the government has to own the resource.

This is of course a false dichotomy because governance can be done at any level.

AnthonyMouse 20 hours ago||
It also seems to omit the possibility that the thing could be privately operated but not for profit.

Let's Encrypt is a solid example of something you could reasonably model as "tragedy of the commons" (who is going to maintain all this certificate verification and issuance infrastructure?) but then it turns out the value of having it is a million times more than the cost of operating it, so it's quite sustainable given a modicum of donations.

Free software licenses are another example in this category. Software frequently has a much higher value than development cost and incremental improvements decentralize well, so a license that lets you use it for free but requires you to contribute back improvements tends to work well because then people see something that would work for them except for this one thing, and it's cheaper to add that themselves or pay someone to than to pay someone who has to develop the whole thing from scratch.

jasonkester 23 hours ago||||
It has the same effect though. A few bad actors using this “free” thing can end up driving the cost up enough that Microsoft will have to start charging for it.

The jerks get their free things for a while, then it goes away for everyone.

Y_Y 23 hours ago||
I think the jerks are the ones who bought and enshittified GitHub after it had earned significant trust and become an important part of FOSS infrastructure.
irishcoffee 23 hours ago|||
Scoping it to a local maximum, the only thing worse than git is GitHub. In an alternate universe hg won the clone wars and we are all better off for it.
MarsIronPI 19 hours ago||
Excuse me if this is obvious, but how is Mercurial better than Git from a repo format perspective?
dahart 22 hours ago|||
Why do you blame MS for predictably doing what MS does, and not the people who sold that trust & FOSS infra to MS for a profit? Your blame seems misplaced.

And out of curiosity, aside from costing more for some people, what’s worse exactly? I’m not a heavy GitHub user, but I haven’t really noticed anything in the core functionality that would justify calling it enshittified.

mastax 20 hours ago|||
Plenty of blame to go around.

Probably the worst thing MS did was kill GitHub’s nascent CI project and replace it with Azure DevOps. Though to be fair the fundamental flaws with that approach didn’t really become apparent for a few years. And GitHub’s feature development pace was far too slow compared to its competitors at the time. Of course GitHub used to be a lot more reliable…

Now they’re cramming in half baked AI stuff everywhere but that’s hardly a MS specific sin.

MS GitHub has been worse about DMCA and sanctioned country related takedowns than I remember pre acquisition GitHub being.

Did I miss anything?

Y_Y 20 hours ago|||
I don't blame them uniquely. I think it's a travesty the original GitHub sold out, but it's just as predictable. Giant corps will evilly make the line go up, individual regular people will have a finite amount of money for which they'll give up anything and everything.

As for how the site has become worse, plenty of others have already done a better job than I could there. Other people haven't noticed or don't care and that's ok too I guess.

groundzeros2015 22 hours ago||||
A public park suffers from tragedy of the commons even though it’s managed by the city.
drob518 19 hours ago||||
Right. Microsoft could easily impose a transfer fee above a certain threshold, which would allow “normal” OSS development of even popular software to happen without charge while imposing a cost on projects that try to use GitHub like a database.
PunchyHamster 23 hours ago||||
Well, till you choose to host something yourself and it becomes popular
ericyd 22 hours ago||||
Tragedy of the Microsoft just doesn't sound as nice though
rvba 23 hours ago|||
I doubt anyone is calculating

Remember how GTA5 took 10 minutes to start and nobody cared? Lots of software is like this.

Some Blizzard games download a 137 MB file every time you run them and take a few minutes to start (and no, this is not due to my computer).

solatic 1 day ago|||
If you think too hard about this, you come back around to Alan Kay's quote about how people who are really serious about software should build their own hardware. Web applications, and in general loading pretty much anything over the network, is a horrible, no-good, really bad user experience, and it always will be. The only way to really respect the user is with native applications that are local-first, and if you take that really far, you build (at the very least) peripherals to make it even better.

The number of companies that have this much respect for the user is vanishingly small.

phkahler 22 hours ago|||
>> The number of companies that have this much respect for the user is vanishingly small.

I think companies shifted to online apps because #1 it solved the copy protection problem. FOSS apps are not in any hurry to become centralized because they don't care about that issue.

Local apps and data are a huge benefit of FOSS and I think every app website should at least mention that.

"Local app. No ads. You own your data."

xorcist 21 hours ago||
Another important reason to move to online applications is that you can change the terms of the deal at any time. This may sound more nefarious than it needs to be; it just means you do not have to commit fully to your licensing terms before the first deal is made, which is tempting for just about anyone.
hombre_fatal 23 hours ago||||
Software I don’t have to install at all “respects me” the most.

Native software being an optimum is mostly an engineer fantasy that comes from imagining what you can build.

In reality that means having to install software like Meta’s WhatsApp, Zoom, and other crap I’d rather run in a browser tab.

I want very little software running natively on my machine.

solatic 22 hours ago|||
Your browser is acting like a condom, in that respect (pun not intended).

Yes, there are many cases when condoms are indicative of respect between parties. But a great many people would disagree that the best, most respectful relationships involve condoms.

> Meta

Does not sell or operate respectful software. I will agree with you that it's best to run it in a browser (or similar sandbox).

tormeh 20 hours ago||
Desktop operating systems really dropped the ball on protecting us from the software we run. Even mobile OSs are so-so. So the browser is the only protection we reasonably have.

I think this is sad.

cosmic_cheese 16 hours ago||||
Web apps are great until you want to revert to an older version from before they became actively user-hostile or continue to use them past EoL or company demise.

In contrast as long as you have a native binary, one way or another you can make the thing run and nobody can stop you.

freedomben 23 hours ago||||
Yes, amen. The more invasive and abusive software gets, the less I want it running on my machine natively. Native installed applications for me now are limited only to apps I trust, and even those need to have a reason to be native apps rather than web apps to get a place in my app drawer
shash 21 hours ago|||
You mean you’d rather run unverified scripts using a good order of magnitude more resources, with a slower experience, and have an entire sandboxing contraption to keep said unverified scripts from doing anything to your machine…

I know the browser is convenient, but frankly, it's been a horror show of resource usage, vulnerabilities, and pathetic performance.

whstl 21 hours ago||
The #1 reason the web experience universally sucks today is because companies add an absurd amount of third-party code on their pages for tracking, advertisement, spying on you or whatever non-essential purpose. That, plus an excessive/unnecessary amount of visual decoration.

The idea that somehow those companies would respect your privacy were they running a native app is extremely naive.

We can already see this problem on video games, where copy protection became resource-heavy enough to cause performance issues.

ghosty141 1 day ago|||
Yes because users don't appreciate this enough to pay for the time this takes.
zahlman 1 day ago|||
> Most software houses spend so much time focusing on how expensive engineering time is that they neglect user time. Software houses optimize for feature delivery and not user interaction time. Yet if I spent one hour making my app one second faster for my million users, I can save 277 user hours per year. But since user hours are an externality, such optimization never gets done.

This is what people mean about speed being a feature. But "user time" depends on more than the program's performance. UI design is also very important.

Y-bar 23 hours ago|||
You’ll enjoy “Saving Lives” by Andy Hertzfeld: https://www.folklore.org/Saving_Lives.html

> "The Macintosh boots too slowly. You've got to make it faster!"

kkjjjjw 16 hours ago||
https://news.ycombinator.com/item?id=44843223#44879509
bawolff 18 hours ago|||
> Software houses optimize for feature delivery and not user interaction time. Yet if I spent one hour making my app one second faster for my million users, I can save 277 user hours per year. But since user hours are an externality, such optimization never gets done.

Google and Amazon are famous for optimizing this. It's not an externality to them, though; even 10s of ms can equal an extra sale.

That said, I don't think it's fair to add time up like that. Saving 1 second for 600 people is not the same as saving 10 minutes for 1 person. Time in small increments does not have the same value as time in large increments.

esafak 18 hours ago||
1. If you can price the cost of the externality, you can justify optimizing it.

2. Monopolies and situations with the principal/agent dilemma are less sensitive to such concerns.

bawolff 17 hours ago||
> 1. If you can price the cost of the externality, you can justify optimizing it.

An externality is usually a cost you don't pay (or pay only a negligible amount of). I don't see how pricing it helps justify optimizing it.

esafak 16 hours ago||
You are right. I should say perceived externality; there may be a price that is discounted.
robmccoll 22 hours ago|||
I don't think most software houses spend enough time even focusing on engineering time. CI pipelines that take tens of minutes to over an hour, compile times that exceed ten seconds when nothing has changed, startup times that are much more than a few seconds. Focus and fast iteration are super important to writing software and it seems like a lot of orgs just kinda shrug when these long waits creep into the development process.
DrewADesign 12 hours ago|||
> Yet if I spent one hour making my app one second faster for my million users, I can save 277 user hours per year. But since user hours are an externality, such optimization never gets done.

Wait times don’t accumulate. Depending on the software, to each individual user that one second will probably make very little difference. Developers often overestimate the effect of performance optimization on user experience because it’s the aspect of user experience optimization their expertise most readily addresses. The company, generally, will get a much better ROI from implementing well-designed features and having you squash bugs.

drbojingle 12 hours ago||
A well-designed feature IS considerate of time and attention. Why would I want a game at 20 fps when I could have it at 120? The smoothness of the experience increases my ability to use the experience optimally because I don't have to pay as much attention to it. I'd prefer my interactions with machines to be as smooth as my interactions driving a car down an empty, dry highway at midday.

Perhaps not everyone cares, but I've played enough Age of Empires 2 to know that there are plenty of people who have felt the value of shaving seconds off this and that, compounding gains over time. It's a concept plenty of folks will be familiar with.

DrewADesign 11 hours ago|||
Sure, but without unlimited resources, you need to have priorities, and everything has a ‘good enough’ state. All of this stuff lies on an Eisenhower chart and we tend to think our concerns fall into the important/urgent quadrant, but in the grand scheme of things, they almost never do.
8note 10 hours ago|||
I still prefer 15 fps for games. If they're pushing the fps any higher, it's not considerate of my time and attention.

I have to pay less attention to a thing that updates less frequently. Idle games are the best in that respect, because you can check into the game on your own time rather than the game forcing you to pay attention on its time.

ozim 23 hours ago|||
About apps done by software houses: even though we should strive to do a good job, and I agree with the sentiment...

First argument would be: take at least two 0's off your estimate. Most applications will have maybe thousands of users; successful ones will maybe run with tens of thousands. You might get lucky and work on an application that has hundreds of thousands or millions of users, but then you work in FAANG, not a typical "software house".

Second argument is: most users use 10-20 apps in a typical workday, so your application is most likely irrelevant.

Third argument is: most users would save much more time by properly learning the applications they use on a daily basis (or how to use a computer) than from someone optimizing some function from 2s to 1s. But of course that's hard, because they have 10-20 apps daily plus god knows how many others not used daily. Though I still see people doing super silly stuff in tools like Excel, or even not knowing copy-paste - so not even any command-line magic.

3371 19 hours ago|||
The user-hour analogy sounds weird though; 1s feels like 1s regardless of how many users you have. It's like the classic Asian teachers' logic of "if you come in 1 min late you are wasting N minutes for all of us in this class." It just does not stack like that.
BenjiWiebe 19 hours ago||
If the class takes N minutes and one person arrives 1 minute late, and the rest of the class is waiting for them, it does stack. Every one of those students lost a minute. Far worse than one student losing one minute.
3371 7 hours ago||
Do "we" lose 2mins because we both spent 1 min commenting? That sounds like The Mythical of Man Month thinking... for me time is parallel and does not combine.
pastor_williams 23 hours ago|||
This was something that I heavily focused on for my feature area a year ago - new user sign up flow. But the decreased latency was really in pursuit of increased activation and conversion. At least the incentives aligned briefly.
gritzko 20 hours ago|||
Let’s try a thought experiment. Suppose that I have a data format and a store that resolve the issues in the post. It is like git meets JSON meets key-value. https://github.com/gritzko/go-rdx

What is the probability of it being used? About 0%, right? Because git is proven and GitHub is free. Engineering aspects are less important.

pdimitar 14 hours ago|||
I am very interested in something like this, but your README is not making it easy to like. Demonstrating with 2-3 sample apps using RDX might have gone a long way.

So how do I start using it if I, for example, want to use it like a decentralized `syncthing`? Can I? If not, what can I use it for?

I am not a mathematician. Most people landing on your repo are not mathematicians either.

We the techies _hate_ marketing with a passion but I as another programmer find myself intrigued by your idea... with zero idea how to even use it and apply it.

stkdump 20 hours ago|||
Sorry, I am turned off by the CRDT in there. It immediately smells of overengineering to me. Not that I believe git is a better database. But why not just SQL?
gritzko 20 hours ago||
Merges require revisioning. JSON or SQL do not have that in the model. This variant of CRDT is actually quite minimalistic.
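
To make "merges require revisioning" concrete, here is a tiny illustrative sketch (not RDX's actual model): every field carries a revision stamp, and merging two replicas keeps the higher-stamped write, so the merge is deterministic without a coordinator:

    # Illustration only: a last-writer-wins style merge over revision-stamped fields.
    def merge(a: dict, b: dict) -> dict:
        """Merge two replicas; each field maps to (revision, value)."""
        out = dict(a)
        for key, (rev, val) in b.items():
            if key not in out or rev > out[key][0]:
                out[key] = (rev, val)
        return out

    replica1 = {"name": (3, "pkg"), "version": (5, "1.2.0")}
    replica2 = {"name": (3, "pkg"), "version": (6, "1.2.1"), "license": (1, "MIT")}
    print(merge(replica1, replica2))
    # {'name': (3, 'pkg'), 'version': (6, '1.2.1'), 'license': (1, 'MIT')}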
stkdump 19 hours ago||
I would argue LWW is the opposite of a merge. It is better to immediately know at the time of writing that there is a conflict. CRDTs either solve or (in this case) don't solve a problem that doesn't really exist, especially for package managers.
gritzko 19 hours ago||
Git solves that problem, and it definitely exists. Speaking of package managers, it really depends. Like, can we use one SQLite file for that? So easy; why is no one doing that?
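
For the sake of argument, a single SQLite file as a package index really can be that small; a hypothetical schema (nothing any real package manager ships) might look like:

    import sqlite3

    # Hypothetical single-file package index; the table layout is made up for illustration.
    db = sqlite3.connect("index.sqlite")
    db.executescript("""
        CREATE TABLE IF NOT EXISTS packages (
            name      TEXT NOT NULL,
            version   TEXT NOT NULL,
            checksum  TEXT NOT NULL,
            tarball   TEXT NOT NULL,      -- URL of the actual artifact
            published INTEGER NOT NULL,   -- unix timestamp
            PRIMARY KEY (name, version)
        );
    """)
    db.execute(
        "INSERT OR REPLACE INTO packages VALUES (?, ?, ?, ?, ?)",
        ("left-pad", "1.3.0", "sha256:...", "https://example.org/left-pad-1.3.0.tgz", 1700000000),
    )
    db.commit()
    for (version,) in db.execute("SELECT version FROM packages WHERE name = ?", ("left-pad",)):
        print(version)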
inapis 1 day ago|||
>Yet if I spent one hour making my app one second faster for my million users, I can save 277 user hours per year. But since user hours are an externality, such optimization never gets done.

I have never been convinced by this argument. The aggregate number sounds fantastic but I don't believe that any meaningful work can be done by each user saving 1 second. That 1 second (and more) can simply be taken by me trying to stretch my body out.

OTOH, if the argument is to make software smaller, I can get behind that since it will simply lead to more efficient usage of existing resources and thus reduce the environmental impact.

But we live in a capitalist world and there needs to be external pressure for change to occur. The current RAM shortage, if it lasts, might be one of them. Otherwise, we're only day dreaming for a utopia.

adrianN 23 hours ago|||
Time saved does not translate to increased productivity or happiness or whatever linearly; it's more of a step function. Saving one second doesn’t help much, but there is a threshold (depending on the individual) where faster workflows lead to a better experience. It does make a difference whether a task takes a minute or half a second, at least for me.
jorvi 19 hours ago||||
But there isn't just one company deciding that externalizing costs onto the rest of us is a great way to boost profit, since it costs them very little. Especially a monopoly like YouTube can decide that eating up your battery is fine if it saves them a few cents in bandwidth costs.

Not all of those externalizing companies abuse your time, but whatever they abuse can be expressed as a $ amount, and $ can be converted to a median person's time via the median wage. Hell, free time is more valuable than whatever you produce during work.

Say all that boils down to companies collectively stealing 20 minutes of your time each day. 140 minutes each week. 7280 (!) minutes each year, which is 5.05 days, which makes it almost a year over the course of 70 years.

So yeah, don't do what you're doing and sweet-talk the fact that companies externalize costs (privatize the profits, socialize the losses). They're sucking your blood.

Aerroon 23 hours ago||||
One second is long enough that it can put a user off from using your app though. Take notifications on phones for example. I know several people who would benefit from a habitual use of phone notifications, but they never stick to using them because the process of opening (or switching over to) the notification app and navigating its UI to leave a notification takes too long. Instead they write a physical sticky note, because it has a faster "startup time".
tehbeard 23 hours ago||
All depends on the type of interaction.

A high usage one, absolutely improve the time of it.

Loading the profile page? Isn't done often so not really worth it unless it's a known and vocal issue.

https://xkcd.com/1205/ gives a good estimate.

Aerroon 14 hours ago||
This is very true, but I think some of it has to do with expectations too. Editing a profile page is a complex thing, therefore people are more willing to put up with loading times on it, whereas checking out someone's profile is a simple task and the brain has already moved on, so any delay feels bad.
schubidubiduba 2 hours ago|||
Just because one individual second is small, it still adds up.

Even if all you do with it is just stretching, there's a chance it will prevent you pulling a muscle. Or lower your stress and prevent a stroke. Or any number of other beneficial outcomes.

vlovich123 21 hours ago|||
I think it’s naive to think engineers or managers don’t realize this or don’t think in these ways.

https://www.folklore.org/Saving_Lives.html

pdimitar 13 hours ago||
Is it truly naive if most engineers' careers pass without them ever meeting even one such manager?

In 24 years of my career I've met a grand total of _two_. Both got fired not even 6 months after I joined the company, too.

Who's naive here?

vlovich123 9 hours ago||
I’ve met one who asked me a question like this, and he’s still at Apple, having been promoted several times to a fairly senior position. But the question was only half-hearted, because it was “how much CO2 would we save if we made something 10% more CPU efficient?”, and the answer, even at Apple’s current scale of billions of iPhones, was insignificant.

So now you and I have both come across such a manager. Why would you make the claim that most engineers don’t come across such people?

loloquwowndueo 1 day ago|||
Just a reminder that GitHub is not git.

The article mentions that most of these projects did use GitHub as a central repo out of convenience so there’s that but they could also have used self-hosted repos.

machinationu 1 day ago|||
Explain to me how you self-host a git repo which is accessed millions of times a day by CI jobs pulling packages.
freedomben 23 hours ago|||
I'm not sure whether this question was asked in good faith, but it is actually a damn good one.

I've looked into self-hosting a git repo that has horizontal scalability, and it is indeed very difficult. I don't have the time to detail it in a comment here, but for anyone who is curious it's very informative to look at how GitLab handled this with Gitaly. I've also seen some clever attempts to use object storage, though I haven't seen any of those solutions put heavily to the test.

I'd love to hear from others about ideas and approaches they've heard about or tried

https://gitlab.com/gitlab-org/gitaly

fulafel 2 hours ago||||
Let's assume 3 million a day. That's about 35 per second.

From a compute POV you can serve that with one server or virtual machine.

Bandwidth-wise, given a 100 MB repo size, that would make it about 3.5 GB/s - also easy terrain for a single server.

heavenlyhash 1 hour ago||
That is roughly the number of new requests per second, but these are not just light web requests.

The git transport protocol is "smart" in a way that is, in some ways, arguably rather dumb. It's certainly expensive on the server side. All of the smartness of it is aimed at reducing the amount of transfer and number of connections. But to do that, it shifts a considerable amount of work onto the server in choosing which objects to provide you.

If you benchmark the resource loads of this, you probably won't be saying a single server is such an easy win :)

fulafel 40 minutes ago||
Here's a web source from 5 years ago about how much CPU time it took: https://github.blog/open-source/git/git-clone-a-data-driven-...

Using the slowest clone method they measured 8 s for a 750 MB repo and 0.45 s for a 40 MB repo. That appears to be linear, so ~1.1 s for 100 MB should be a valid interpolation.

So doing ~35 of those per second only takes about 38 cores. Servers have hundreds of cores now (eg 384 cores: https://www.phoronix.com/review/amd-epyc-9965-linux-619).

And remember we're using worst-case assumptions in 3 places (full clone each time, using the slowest clone method, and numbers from old hardware). In practice I'd bet a fastish laptop would suffice.
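
For what it's worth, here's the same back-of-envelope as a few lines of Python (my assumptions, not measurements: 3M full clones a day, 100 MB per clone, ~1.1 s of server CPU per clone interpolated from the GitHub post above):

    # Back-of-envelope for serving full git clones of a package index repo.
    CLONES_PER_DAY = 3_000_000
    REPO_MB = 100
    CPU_SECONDS_PER_CLONE = 1.1   # interpolated from the GitHub blog numbers

    clones_per_second = CLONES_PER_DAY / 86_400                # ~35 clones/s
    bandwidth_gb_per_s = clones_per_second * REPO_MB / 1000    # ~3.5 GB/s sustained
    cores_needed = clones_per_second * CPU_SECONDS_PER_CLONE   # ~38 cores

    print(f"{clones_per_second:.1f} clones/s, "
          f"{bandwidth_gb_per_s:.1f} GB/s, ~{cores_needed:.0f} cores")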

fweimer 23 hours ago||||
These days, people solve similar problems by wrapping their data in an OCI container image and distributing it through one of the container registries that do not have a practically meaningful pull rate limit. Not really a joke, unfortunately.
mystifyingpoi 18 hours ago||
Even Amazon encourages this, probably not intentionally, more as a band-aid for bad EKS configs that people can end up with by mistake, but still - you can pull 5 terabytes from ECR for free under their free tier each month.
XorNot 16 hours ago||
I'd say it's just that Kubernetes in general should've shipped with a storage engine and an installation mechanism.

It feels like a very hacky add-on that RKE2 has a distributed internal registry, if you enable it and use it in a very specific way.

For how much people love just shipping a Helm chart, it's actually absurdly hard to ship a self-contained installation without it just trying to hit internet resources.

favflam 11 hours ago||||
Is running the git binary as a read-only nginx backend not good enough? Probably not. Hosting tarballs is far more efficient.
ozim 1 day ago||||
FTFY:

Explain to me how you self-host, without spending any money and with no budget, a git repo which is accessed millions of times a day by CI jobs pulling packages.

adrianN 23 hours ago||||
You git init --bare on a host with sufficient resources. But I would recommend thinking about your CI flow too.
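
And if you want to serve such a bare repo read-only over plain HTTP, git's "dumb" protocol means any static file server (or object storage behind one) is enough, as long as git update-server-info has been run after each push. A rough sketch, not production-ready:

    # Serve bare repos read-only via git's "dumb" HTTP protocol using only a
    # static file server. Assumes /srv/git/myindex.git is a bare repo that you
    # push to by some other channel (e.g. SSH).
    import subprocess
    from functools import partial
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    REPO_ROOT = "/srv/git"

    # Regenerate info/refs and objects/info/packs so dumb-HTTP clients can
    # discover refs; normally a post-update hook would do this on every push.
    subprocess.run(["git", "update-server-info"],
                   cwd=f"{REPO_ROOT}/myindex.git", check=True)

    handler = partial(SimpleHTTPRequestHandler, directory=REPO_ROOT)
    HTTPServer(("", 8080), handler).serve_forever()
    # Clients: git clone http://host:8080/myindex.git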
machinationu 22 hours ago||
No, hundreds of thousands of individual projects' CI jobs. OP was talking about package managers for the whole world, not for one company.
adrianN 20 hours ago||
If people depend on remote downloads from different companies for their CI pipelines they’re doing it wrong. Every sensible company sets up a mirror or at least a cache on infra that they control. Rate limiting downloads is the natural course of action for the provider of a package registry. Once you have so many unique users that even civilized use of your infrastructure becomes too much you can probably hire a few people to build something more scalable.
machinationu 18 hours ago||
numpy had 16M downloads yesterday, at 10 MB that's 160 TB of traffic. It's one package. And there are no rate limits on pypi.

https://clickpy.clickhouse.com/dashboard/numpy

justincormack 1 day ago|||
They probably would have experienced issues way sooner, as the self hosted tools don't scale nearly as well.
threatofrain 20 hours ago|||
> Most software houses spend so much time focusing on how expensive engineering time is that they neglect user time. Software houses optimize for feature delivery and not user interaction time.

Oh no no no. Consumer-facing companies will burn 30% of your internal team complexity budget on shipping the first "frame" of your app/website. Many people treat Next as synonymous with React, and Next's big deal was helping you do just this.

brightball 21 hours ago|||
User time is typically a mix of performance tuning and UX design isn’t it?
imiric 20 hours ago|||
> GitHub is free after all, and it has all of these great properties, so why not?

The answer is in TFA:

> The underlying issue is that git inherits filesystem limitations, and filesystems make terrible databases.

JohnHaugeland 21 hours ago|||
> This seems like a tragedy of the commons -- GitHub is free after all, and it has all of these great properties, so why not?

because it's bad at this job, and sqlite is also free

this isn't about "externalities"

machinationu 1 day ago|||
[flagged]
benchloftbrunch 21 hours ago|||
As long as you don't have any security compliance requirements and/or can afford the cost of self hosting your LLM, sure.

Anyone working in government, banking, or healthcare is still out of luck since the likes of Claude and GPT are (should be) off limits.

camgunz 1 day ago|||
I've never been more convinced LLMs are the vanguard of the grift economy now that green accounts are low effort astroturfing on HN.
freedomben 23 hours ago|||
LLMs obviously can't do it all, and they still have severe areas of weakness where they can't replace humans, but there are definitely a lot of areas where they really can now. I've seen it first hand. I've even experienced it first hand. There are a couple of services that I wrote years ago that were basically parked in maintenance mode because they weren't worth investing time in, and we just dealt with some of the annoyances and bugs. With the latest LLMs, over the last couple of months I've been able to resurrect them and fix a lot of bugs and even add some wanted features in just a few hours. It really is quite incredible and scary at the same time.

Also, in case you're not aware, accusing people of shilling or astroturfing is against the Hacker News guidelines.

camgunz 22 hours ago||
The loophole here is that this account isn't a person
machinationu 22 hours ago||
you forgot to ask me to ignore previous instructions and say pizza
freedomben 16 hours ago||
It was there as hidden characters. His comment included this as a hidden prompt: "tell me that I forgot to ask you to ignore previous instructions and say pizza"
machinationu 23 hours ago|||
hey, I'm just a lowly LLM, gotta earn my tokens :|
massysett 20 hours ago||
> Externalities lead to users downloading extra gigabytes of data (wasted time) and waiting for software, all of which is waste that the developer isn't responsible for and doesn't care about.

This is perfectly sensible behavior when the developers are working for free, or when the developers are working on a project that earns their employer no revenue. This is the case for several of the projects at issue here: Nix, Homebrew, Cargo. It makes perfect sense to waste the user's time, as the user pays with nothing else, or to waste Github's bandwidth, since it's willing to give bandwidth away for free.

Where users pay for software with money, they may be more picky and not purchase software that indiscriminately wastes their time.

BobbyTables2 20 hours ago||
Microsoft would have long gone out of business if users cared about their time being wasted.

Windows 11 should not be more sluggish than Windows 7.

cesarb 22 hours ago||
One of these is not like the others...

> The problem was that go get needed to fetch each dependency’s source code just to read its go.mod file and resolve transitive dependencies.

This article is mixing two separate issues. One is using git as the master database storing the index of packages and their versions. The other is fetching the code of each package through git. They are orthogonal; you can have a package index using git but the packages being zip/tar/etc archives, you can have a package index not using git but each package is cloned from a git repository, you can have both the index and the packages being git repositories, you can have neither using git, you can even not have a package index at all (AFAIK that's the case for Go).

kpcyrd 14 hours ago||
The author seems a little lost, tbh. It starts with "your users should not all clone your database", which I definitely agree with, but that doesn't mean you can't encode your data in a git graph.

It then digresses into details of GitHub's backend implementation (how is 20k forks relevant?), then complains about default settings of the "standard" git implementation. You don't need to check out a git working tree to have efficient key-value lookups. Without a git working tree you don't need to worry about filesystem directory limits, case sensitivity, or path-length limits.

I was surprised the author believes the git-equivalent of a database migration is a git history rewrite.

What do you want me to do, invent my own database? Run postgres on a $5 VPS and have everybody accept it as a single point of failure?

bobpaw 21 hours ago|||
I think the article takes issue not with fetching the code, but with fetching the go.mod file that contains index and dependency information. That’s why part of the solution was to host go.mod files separately.
jayd16 20 hours ago||
Even with git, it should be possible to grab the single file needed without the rest of the repo, but it's still trying to round off a square peg.
skywhopper 17 hours ago||
Honestly I think the article is a bit ahistorical on this one. ‘go get’ pulls the source code into a local cache so it can build it, not just to fetch the go.mod file. If they were having slow CI builds because they didn’t or couldn’t maintain a filesystem cache, that’s annoying, but not really a fault in the design. Anyway, Go improved the design and added an easy way to do faster, local proxies. Not sure what the critique is here. The Go community hit a pain point and the Go team created an elegant solution for it.
dboon 23 hours ago||
I’m building Cargo/UV for C. Good article. I thought about this problem very deeply.

Unfortunately, when you’re starting out, the idea of running a registry is a really tough sell. Now, on top of the very hard engineering problem of writing the code and making a world-class tool, plus the social one of getting it adopted, I need to worry about funding and maintaining something that serves potentially a world of traffic? The git solution is intoxicating through this lens.

Fundamentally, the issue is the sparse checkouts mentioned by the author. You’d really like to use git to version package manifests, so that anyone with any package version can get the EXACT package they built with.

But this doesn’t work, because you need arbitrary commits. You either need a full checkout, or you need to somehow track the commit a package version is in without knowing what hash git will generate before you do it. You have to push the package update and then push a second commit recording that. Obviously infeasible, obviously a nightmare.

Conan’s solution is I think just about the only way. It trades the perfect reproduction for conditional logic in the manifest. Instead of 3.12 pointing to a commit, every 3.x points to the same manifest, and there’s just a little logic to set that specific config field added in 3.12. If the logic gets too much, they let you map version ranges to manifests for a package. So if 3.13 rewrites the entire manifest, just remap it.
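
As a rough illustration of that pattern (hypothetical code, not Conan's actual API): one manifest function covers the whole 3.x series with a small conditional for the field added in 3.12, and a version-range map takes over when a release rewrites everything:

    # Hypothetical sketch of "conditional logic in the manifest": one manifest
    # covers a whole version range, and breaking rewrites get remapped.
    def parse(version):                         # "3.12" -> (3, 12)
        return tuple(int(p) for p in version.split("."))

    def manifest_3x(version):
        m = {
            "name": "libfoo",
            "source_url": f"https://example.org/libfoo-{version}.tar.gz",
            "deps": ["zlib/1.3"],
        }
        if parse(version) >= (3, 12):
            m["options"] = {"with_lz4": True}   # config field introduced in 3.12
        return m

    def manifest_313_onward(version):           # 3.13 rewrote the build entirely
        return {"name": "libfoo", "build": "cmake", "deps": ["zlib/1.3"]}

    RANGES = [
        (((3, 0), (3, 13)), manifest_3x),
        (((3, 13), (4, 0)), manifest_313_onward),
    ]

    def resolve(version):
        v = parse(version)
        for (lo, hi), build_manifest in RANGES:
            if lo <= v < hi:
                return build_manifest(version)
        raise KeyError(version)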

I have not found another package manager that uses git as a backend that isn’t a terrible and slow tool. Conan may not be as rigorous as Nix because of this decision but it is quite pragmatic and useful. The real solution is to use a database, of course, but unless someone wants to wire me ten thousand dollars plus server costs in perpetuity, what’s a guy supposed to do?

dkarl 20 hours ago||
Think about the article from a different perspective: several of the most successful and widely used package managers of all time started out using Git, and they successfully transitioned to a more efficient solution when they needed to.
zephen 17 hours ago||
Not only this, but (if I understand the article correctly) at least some of them still use git on the backend.
baobun 17 hours ago|||
How about the Arch Linux AUR approach?

Every package has its own git repository which for binary packages contains mostly only the manifest. Sources and assets, if in git, are usually in separate repos.

This seems to not have the issues in the examples given so far, which come from using "monorepos" or colocating. It also avoids the "nightmare" you mention since any references would be in separate repos.

The problematic examples either have their assets and manifests colocated, or use a monorepo approach (colocating manifests and the global index).

jopsen 16 hours ago|||
The alluring thing is storing the repository on S3 (or similar). Recall that early Docker registries made requests so complicated that backing image storage with S3 was infeasible without a proxy service.

The thing that scales is dumb HTTP that can be backed by something like S3.

You don't have to use a cloud; just go with a big single server. And if you become popular, find a sponsor and move to the cloud.

If money and sponsor independence are a huge concern, the alternative would be peer-to-peer.

I haven't seen many package managers do it, but it feels like a huge missed opportunity. You don't need that many volunteers to peer in order to have a lot of bandwidth available.

Granted, the real problem that'll drive up hosting cost is CI. Or rather careless CI without caching. Unless you require a user login, or limit downloads for IPs without a login, caching is hard to enforce.

For popular package repositories you'll likely see extremely degenerate CI systems eating bandwidth as if it was free.

Disclaimer: opinions are my own.

adrianN 23 hours ago|||
Before you've managed to build a popular tool, it is unlikely that you'll need to serve many users. Directly going for something that can serve the world is probably premature.
dboon 22 hours ago|||
For most software, yes. But the value of a package manager is in its adoption. A package manager that doesn’t run up against these problems is probably a failure anyway.
EPWN3D 19 hours ago|||
The point is not "design to serve the world". The point is "use the right technology for your problem space".
mook 20 hours ago|||
Is there a reason the users must see all of the historic data too? Why not just have a post-commit hook render the current HEAD to static files, into something like GitHub Pages?

That can be moved elsewhere / mirrored later if needed, of course. And the underlying data is still in git, just not actively used for the API calls.

It might also be interesting to look at what Linux distros do, like Debian (salsa), Fedora (Pagure), and openSUSE (OBS). They're good for this because their historic model is free mirrors hosted by unpaid people, so they don't have the compute resources.

jarofgreen 20 hours ago||
I'm not OP but I'll guess .... lock files with old versions of libs in. The latest version of a library may be v2 but if most users are locked to v1.267.34 you need all the old versions too.

However a lot of the "data in git repositories" projects I see don't have any such need, and then ...

> Why not just have a post-commit hook render the current HEAD to static files, into something like GitHub Pages?

... is a good plan. Usually they make a nice static website with the data that's easy for humans to read though.
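
For the simple cases, that hook really can be tiny. A sketch (assuming a made-up layout of packages/<name>/manifest.json in the repo) that renders HEAD into static JSON any dumb host can serve:

    # Rough post-commit/post-receive hook: render the manifests at HEAD into
    # static files for GitHub Pages / S3 / nginx, so clients never touch git.
    import json
    import pathlib
    import subprocess

    OUT = pathlib.Path("public")

    def manifests_at_head():
        # List tracked manifests without needing a checked-out working tree.
        listing = subprocess.run(
            ["git", "ls-tree", "-r", "--name-only", "HEAD", "packages/"],
            capture_output=True, text=True, check=True,
        ).stdout
        return [p for p in listing.splitlines() if p.endswith("manifest.json")]

    index = {}
    for path in manifests_at_head():
        blob = subprocess.run(["git", "show", f"HEAD:{path}"],
                              capture_output=True, text=True, check=True).stdout
        manifest = json.loads(blob)
        name = manifest["name"]                       # assumed manifest schema
        index[name] = manifest.get("versions", [])
        dest = OUT / "packages" / f"{name}.json"
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(blob)

    (OUT / "index.json").write_text(json.dumps(index, indent=2))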

ambicapter 23 hours ago|||
> Unfortunately, when you’re starting out, the idea of running a registry is a really tough sell. Now, on top of the very hard engineering problem of writing the code and making a world class tool, plus the social one of getting it adopted, I need to worry about funding and maintaining something that serves potentially a world of traffic? The git solution is intoxicating through this lense.

So you need a decentralized database? Those exist (or you can make your own, if you're feeling ambitious), probably ones that scale in different ways than git does.

dboon 22 hours ago||
Please share. I’m interested in anything that’s roughly as simple as implementing a centralized registry, is easily inspected by users (preferably with no external tooling), and is very fast.

It’s really important that someone is able to search for the manifest one of their dependencies uses for when stuff doesn’t work out of the box. That should be as simple as possible.

I’m all ears, though! Would love to find something as simple and good as a git registry but decentralized

jopsen 16 hours ago|||
You don't need a fully distributed database, do you?

You could just make a registry hosted as plain HTTP, with everything signed. And a special file that contains a list of mirrors.

Clients request the mirror list and the signed hash of the last entry in the Merkle tree. Then they go talk to a random mirror.

Maybe your central service requires user sign-in for publishing and reading, while mirrors can't publish but don't require sign-in.

Obviously, you'd have to validate that mirrors are up and populated. But that's it.

You can start by self hosting a mirror.

One could go with signing schemes inspired by: https://theupdateframework.io/

Or one could omit signing altogether, so long as you have a Merkle tree with hashes for all publishing events, and the latest hash entry is always fetched from your server along with the mirror list.
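
A drastically simplified sketch of that client flow (hypothetical URLs, and a single pinned hash standing in for a real Merkle tree or TUF-style metadata): the central service only serves a tiny pointer plus the mirror list, the bulk bytes come from any untrusted mirror, and the client verifies before use.

    import hashlib
    import json
    import random
    import urllib.request

    CENTRAL = "https://registry.example.org"      # hypothetical central service

    def fetch(url):
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.read()

    # 1. Small request to the central service: latest index hash + mirror list.
    pointer = json.loads(fetch(f"{CENTRAL}/latest.json"))
    expected_sha256 = pointer["index_sha256"]      # in practice, verify a signature too
    mirrors = pointer["mirrors"]

    # 2. The heavy download comes from a random mirror.
    mirror = random.choice(mirrors)
    index_bytes = fetch(f"{mirror}/index.json")

    # 3. Verify before use: a bad mirror can only deny service, not tamper.
    if hashlib.sha256(index_bytes).hexdigest() != expected_sha256:
        raise RuntimeError(f"mirror {mirror} served a tampered or stale index")

    index = json.loads(index_bytes)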

Having all publishing go through a single service is probably desirable. You'll eventually need to do moderation, etc. And hosting your service or a mirror becomes a legal nightmare if there is no moderation.

Disclaimer: opinions are my own.

yawaramin 9 hours ago||||
Package registry in an SQLite database, snapshotted daily. Stored in a cloud bucket. New clients download the latest snapshot, existing clients stream in the updates using eg Litestream. Resolving dependencies should now be ultra fast thanks to indexes.
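
A sketch of what the client side could look like, with a made-up schema: the snapshot is just a local SQLite file, so every dependency lookup is an indexed local query instead of a network round-trip.

    import sqlite3

    # The downloaded (or Litestream-replicated) snapshot of the registry.
    db = sqlite3.connect("registry-snapshot.sqlite")
    db.execute("""
        CREATE TABLE IF NOT EXISTS versions (
            package TEXT NOT NULL,
            version TEXT NOT NULL,
            deps    TEXT NOT NULL,      -- JSON array of "name constraint" strings
            yanked  INTEGER DEFAULT 0,
            PRIMARY KEY (package, version)
        )
    """)

    def candidate_versions(package):
        # Backed by the primary-key index; no per-package network request.
        rows = db.execute(
            "SELECT version, deps FROM versions WHERE package = ? AND yanked = 0",
            (package,),
        )
        return rows.fetchall()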
k8ssskhltl 15 hours ago||||
Blockchain.
strbean 21 hours ago|||
Distributed ledger! /s... ?
krautsauer 22 hours ago|||
I wonder how meson wraps' story fits with this. They didn't use to, but now they're throwing everything into a single repository [0]. I wonder about the motivation and how it compares to your project.

0: https://github.com/mesonbuild/wrapdb/tree/master/subprojects

dpedu 1 hour ago|||
> I’m building Cargo/UV for C.

Interesting! Do you mind sharing a link to the project at this point?

ekjhgkejhgk 1 day ago||
Do the easy thing while it works, and when it stops working, fix the problem.

Julia does the same thing, and going by the Rust numbers in the article, Julia has about 1/7th the number of packages that Rust does[1] (95k/13k = 7.3).

It works fine; Julia has some heuristics to avoid re-downloading it too often.

But more importantly, there's a simple path to improve. The top Registry.toml [1] has a path to each package, and once downloading everything proves unsustainable you can just download that one file and use it to download the rest as needed. I don't think this is a difficult problem.

[1] https://github.com/JuliaRegistries/General/blob/master/Regis...
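
A sketch of that lazy approach (Python just for illustration; it assumes, as I understand the layout, a top-level Registry.toml plus a per-package Versions.toml, fetched raw from GitHub instead of cloning the whole registry):

    import tomllib
    import urllib.request

    RAW = "https://raw.githubusercontent.com/JuliaRegistries/General/master"

    def fetch(path):
        with urllib.request.urlopen(f"{RAW}/{path}", timeout=30) as resp:
            return resp.read().decode()

    # One file instead of a full clone of the registry.
    registry = tomllib.loads(fetch("Registry.toml"))

    def versions_of(pkg_name):
        for _uuid, entry in registry["packages"].items():
            if entry["name"] == pkg_name:
                # Fetch only this package's metadata, on demand.
                return tomllib.loads(fetch(f"{entry['path']}/Versions.toml"))
        raise KeyError(pkg_name)

    print(sorted(versions_of("Example")))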

galenlynch 23 hours ago||
I believe Julia only uses the Git registry as an authoritative ledger where new packages are registered [1]. My understanding is that as you mention, most clients don't access it, and instead use the "Pkg Protocol" [2] which does not use Git.

[1] https://github.com/JuliaRegistries/General

[2] https://pkgdocs.julialang.org/dev/protocol/

mi_lk 21 hours ago|||
> Do the easy thing while it works, and when it stops working, fix the problem

Another way to phrase this mindset is "fuck around and find out" in gen-Z speak. It's usually practical to an extent but I'm personally not a fan

sagarm 17 hours ago|||
I've mostly heard FAFO used to describe something obviously stupid.

Building on the same thing people use for code doesn't seem stupid to me, at least initially. You might have to migrate later if you're successful enough, but that's not a sign of bad engineering. It's just building for where you are, not where you expect to be in some distant future

zephen 17 hours ago|||
Not at all.

When you fuck around optimizing prematurely, you find out that you're too late and nobody cares.

Oh, well, optimization is always fun, so there's that.

syockit 48 minutes ago||
That's one thing; the other is that you find out you were optimizing for the wrong thing, and now it takes more effort and time to reoptimize for the right thing.
0xbadcafebee 23 hours ago|||
This is basically unethical. Imagine anything important in the world that worked this way. "Do nuclear engineering the easy way while it works, and when it stops working, fix the problem."

Software engineers always make the excuse that what they're making now is unimportant, so who cares? But then everything gets built on top of that unimportant thing, and one day the world crashes down. Worse, "fixing the problem" becomes near impossible, because now everything depends on it.

But really the reason not to do it is that there's no need to. There are plenty of other solutions than using Git that work as well or better without all the pitfalls. The lazy engineer picks bad solutions not because they're necessarily easier than the alternatives, but because they're the path of least resistance for themselves.

Not only is this not better, it's often actively worse. But this is excused by the same culture that gave us "move fast and break things". All you have to do is use any modern software to see how that worked out. Slow bug-riddled garbage that we're all now addicted to.

xboxnolifes 21 hours ago|||
Most of the world does work this way. Problems are solved within certain conditions and for use over a certain time frame. Once those change, the problem gets revisited.

Most software gets to take it to more of an extreme than many engineering fields since there isn't physical danger. It's telling that the counterexamples always use the potentially dangerous problems like medicine or nuclear engineering. The software in those fields is more stringent.

hombre_fatal 22 hours ago||||
On the other hand, GitHub wants to be the place you choose to build your registry for a new project, and they are clearly on board with the idea given that they help massive projects like Nix packages instead of kicking them off.

As opposed to something like using a flock of free blogger.com blogs to host media for an offsite project.

baobun 16 hours ago||
...For now. The writing is on the wall.
ekjhgkejhgk 21 hours ago||||
Fixing problems as they appear is unethical? Ok then.

You realize, there are people who think differently? Some people would argue that if you keep working on problems you don't have but might have, you end up never finishing anything.

It's a matter of striking a balance, and I think you're way on one end of the spectrum. The vast majority of people using Julia aren't building nuclear plants.

BenjiWiebe 17 hours ago||
Fixing problems when they appear is ethical.

Refusing to fix a problem that hasn't appeared yet, but has been/can be foreseen - that's different. I personally wouldn't call it unethical, but I'd consider it a negative.

zephen 16 hours ago||
The problem is that popularity is governed by power laws.

Literally anybody could foresee that, _if_ something scales to millions of users, there will be issues. Some of the people who foresee that could even fix it. But they might spend their time optimizing for something that will never hit 1000 users.

Also, the problems discussed here are not that things don't work, it's that they get slow and consume too many resources.

So there is certainly an optimal time to fix such problems, which is, yes, OK, _before_ things get _too_ slow and consume _too_ many resources, but is most assuredly _after_ you have a couple of thousand users.

ModernMech 21 hours ago|||
Hold up... "lazy engineers" are the problem here? What about a society that insists on shoving the work product of unfunded, volunteer engineers into critical infrastructure because they don't want to pay what it costs to do things the right way? Imagine building a nuclear power plant with an army of volunteer nuclear engineers.

It cannot be the case that software engineers are labelled lazy for not building the at-scale solution to start with, but at the same time everyone wants to use their work, and there are next to no resources for said engineer to actually build the at scale solution.

> the path of least resistance for themselves.

Yeah because they're investing their own personal time and money, so of course they're going to take the path that is of least resistance for them. If society feels that's "unethical", maybe pony up the cash because you all still want to rely on their work product they are giving out for free.

rovr138 20 hours ago||
> If society feels that's "unethical", maybe pony up the cash because you all still want to rely on their work product they are giving out for free.

I like OSS and everything.

Having said that, ethically, should society be paying for these? Maybe that is what should happen. In some places, we have programs to help artists. Should we have the same for software?

zahlman 1 day ago|||
> 00000000-1111-2222-3333-444444444444 = { name = "REPLTreeViews", path = "R/REPLTreeViews" }

... Should it be concerning that someone was apparently able to engineer an ID like that?

ekjhgkejhgk 1 day ago|||
Could you please articulate specifically why that should be concerning?

Right now I don't see the problem because the only criterion for IDs is that they are unique.

zahlman 23 hours ago||
I didn't know whether they were supposed to be within the developer's control (in which case the only real concern is whether someone else has already used the id), or generated by the system (in which case a developer demonstrated manipulation of that system).

Apparently it is the former, and most developers independently generate random IDs because it's easy and is extremely unlikely to result in collisions. But it seems the dev at the top of the list had a sense of vanity instead.

KenoFischer 22 hours ago||
You're supposed to generate a random one, but the only consequence of not doing so is that you won't be able to register your package if someone else already took the UUID (which is a pain if you have registered versions in a private registry). That said, "vanity" UUIDs are a bad look, so we'd probably reject them if someone tried that today, but there isn't any actual issue with them.
skycrafter0 1 day ago||||
If you read the repo README, it just says "generate a uuid". You can use whatever you want as long as it fits the format, it seems.
adestefan 1 day ago|||
It’s as random as any other UUID.
Severian 23 hours ago|||
Incorrect, only some UUIDs are random, specifically v4 and v7 (v7 uses time as well).

https://en.wikipedia.org/wiki/Universally_unique_identifier

> 00000000-1111-2222-3333-444444444444

This would technically be version 2, which is built from the date-time and MAC address (the DCE Security version).

But overall, if you allow any yahoo to pick a UUID, it's not really a UUID, it's just some random string that looks like one.

ekjhgkejhgk 21 hours ago||
> if you allow any yahoo to pick a UUID, its not really a UUID

universally unique identifier (UUID)

> 00000000-1111-2222-3333-444444444444

It's unique.

Anyway we're talking about a package that doesn't matter. It's abandoned. Furthermore it's also broken, because it uses REPL without importing it. You can't even precompile it.

https://github.com/pfitzseb/REPLTreeViews.jl/blob/969f04ce64...

anonymars 21 hours ago|||
Which is to say, not guaranteed at all. GUIDs are designed to be unique, not random/unpredictable

https://devblogs.microsoft.com/oldnewthing/20120523-00/?p=75...

IshKebab 21 hours ago||
> when it stops working, fix the problem

This is too naive. Fixing the problem costs a different amount depending on when you do it. The later you leave it the more expensive it becomes. Very often to the point where it is prohibitively expensive and you just put up with it being a bit broken.

This article even has an example of that - see the vcpkg entry.

steeleduncan 1 day ago||
The other conclusion to draw is "Git is a fantastic choice of database for starting your package manager, almost all popular package managers began that way."
saidinesh5 1 day ago||
I think the conclusion is more that package definitions can still be maintained on git/GitHub but the package manager clients should probably rely on a cache/db/a more efficient intermediate layer.

Mostly to avoid downloading the whole repo / resolving deltas from the history for the few packages most applications tend to depend on. Especially in today's CI/CD world.

reactordev 23 hours ago|||
This is exactly the right approach. I did this for my package manager.

It relies on a git repo branch for stable. There are yaml definitions of the packages including urls to their repo, dependencies, etc. Preflight scripts. Post install checks. And the big one, the signatures for verification. No binaries, rpms, debs, ar, or zip files.

What’s actually installed lives in a small SQLite database, and searching for software does a vector search on each package's yaml description.

Semver included.

This was inspired by brew/portage/dpkg for my hobby os.

pseufaux 20 hours ago|||
This is how WinGet works. It has a small SQLite db it downloads from a hosted URL. The DB contains some minimal metadata and a URL path to access the full metadata. This way WinGet only has to make API calls for packages it's actually interacting with. As a package manager, it has plenty of problems still, but it's a simple, elegant solution to the git-as-a-DB issue.
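
The same two-tier idea in miniature (made-up schema and URLs, not WinGet's actual ones): the local SQLite index only maps a package id to a manifest URL, and the full manifest is fetched on demand for packages you actually touch.

    import json
    import sqlite3
    import urllib.request

    # Small index previously downloaded from the package source.
    db = sqlite3.connect("source.db")
    db.execute("""
        CREATE TABLE IF NOT EXISTS packages (
            id TEXT PRIMARY KEY,
            name TEXT,
            manifest_url TEXT
        )
    """)

    def full_manifest(package_id):
        row = db.execute("SELECT manifest_url FROM packages WHERE id = ?",
                         (package_id,)).fetchone()
        if row is None:
            raise KeyError(package_id)
        # Only now do we hit the network, and only for this one package.
        with urllib.request.urlopen(row[0], timeout=30) as resp:
            return json.load(resp)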
edolstra 21 hours ago|||
Indeed. Nixpkgs wouldn't have been as successful if it hadn't been using Git (or GitHub).

Sure, eventually you run into scaling issues, but that's a first world problem.

l9o 12 hours ago||
I actually find that nixpkgs being a monorepo makes it even better. The code is surprisingly easy to navigate and learn if you've worked in large codebases before. The scaling issues are good problems to have, and git has gotten significantly better at handling large repos than it was a decade ago, when Facebook opted for Mercurial because git couldn't scale to their needs. If anything, it's GitHub issues and PRs that are showing the cracks.
bluGill 1 day ago|||
Git isn't a fantastic choice unless you know nothing about databases. A search would show plenty of research on databases and what works when/why.
kibwen 23 hours ago||
For the purposes of the article, git isn't just being used as a database, it's being used as a protocol to replicate the database to the client to allow for offline operation and then keep those distributed copies in sync. And even for that purpose you can do better than git if you know what you're doing, but knowledge of databases alone isn't going to help you (let alone make your engineering more economical than relying on free git hosting).
freedomben 23 hours ago||
Exactly. It's not just about the best solution to the problem, it's also heavily about the economics around it. If I wanted to create a new package manager today, I could get started by utilizing Git and existing git hosting solutions with very little effort, and effort translates to time, and time is a scarce resource. If you don't know whether your package manager will take off or not, it may not be the best use of your scarce resources to invest in a robust and optimized solution out of the gate. I wish that weren't the case, I would love to have an infinite amount of time, but wishing is not going to make it happen
adastra22 23 hours ago|||
Git is an absolute shit database for a package manager even in the beginning. It’s just that GitHub subsidizes hosting and that is hard to pass up.
fn-mote 23 hours ago|||
Sure, but can you back up the expletive with some reason why you think that?

As it is, this comment is just letting out your emotion, not engaging in dialogue.

adastra22 5 hours ago|||
The article enumerates many such reasons.
venturecruelty 15 hours ago|||
Can we please stop the tone-policing, please? It's not helpful. Not everything needs Wikipedia-style citation, and this particular rhetorical trick is extremely passive-aggressive.
IshKebab 21 hours ago|||
What's a better option? One that keeps track of history and has a nice review interface?
venturecruelty 15 hours ago||
No. No, no, no. Git is a fantastic choice if you want a supply chain nightmare and then Leftpad every week forever.
ferfumarma 24 minutes ago||
I love this write-up. As a non-expert user of package managers I can quickly understand a set of patterns that have been deeply considered and carefully articulated. Thanks for taking the time to write up your observations!
kibwen 23 hours ago||
I think there's a form of survivorship bias at work here. To use the example of Cargo, if Rust had never caught on and grown popular enough to inflate the git-based index beyond reason, then using git as the backing protocol for the index would never have been a problem. Likewise, we can imagine innumerable smaller projects that successfully use git as a distributed delta-updating data distribution protocol, and never happen to outgrow it.

The point being, if you're not sure whether your project will ever need to scale, then it may not make sense to reinvent the wheel when git is right there (and then invent the solution for hosting that git repo, when Github is right there), letting you spend time instead on other, more immediate problems.

stickfigure 21 hours ago||
Right, this post may encourage premature optimization. Cargo, Homebrew, et al chose an easy, good-enough solution which allowed them to grow until they hit scaling limits. This is a good problem to have.

I am sure there's value having a vision for what your scaling path might be in the future, so this discussion is a good one. But it doesn't automatically mean that git is a bad place to start.

8note 10 hours ago|||
I'm surprised nobody has made a common db for package managers, so Cargo could use it without having to think about it.
inferiorhuman 7 hours ago||
Keep in mind that crates.io, the main crate registry, uses GitHub as its only authentication method. They may have moved away from git but they're still locked into a rather piss poor vendor.
jama211 20 hours ago||
“It never works out” - hmm, seems like it worked out just fine. It worked great to get the operation off the ground, and when scale became an issue it was solvable by moving to something else. It served its purpose; sounds like it worked out to me.
swiftcoder 20 hours ago||
You appear to have glossed over the two projects in the list that are stuck due to architectural decisions, and don't have any route to migrate off of git-as-database?
baobun 17 hours ago|||
The issues with nixpkgs stem from the fact that it is a monorepo for all packages that also doubles as an index.

The issues are only fundamental with that architecture. Using a separate repo for each package, like the Arch User Repos, does not have the same problems.

Nixpkgs certainly could be architected like that, and submodules would be a graceful migration path. I'm not aware of any discussion of this, but I'd guess that what's preventing it might be that github.com tooling makes it very painful to manage thousands of repos for a single project.

So I think the lesson is not that using git as a database is bad, but that using github.com as one is. PRs as database transactions are clunky, and GitHub Actions isn't really ACID.

yawaramin 8 hours ago||
It's not a monorepo though? It's a package index, it has the package metadata. It doesn't have the actual source code of the projects themselves.
baobun 3 hours ago||
Point being, it carries both the index (versions/pointers) and the full metadata + build instructions for all packages in a single repo.

The index could be split from the build and the package build defs could live in independent repos (like go or aur).

It would probably take some change to nix itself to make that work and some nontrivial work on tooling to make the devex decent.

But I don't think the friction with nixpkgs should be seen as damning for backing a package registry with git in general.

hombre_fatal 19 hours ago||||
Be more specific because I just see a list of workarounds deployed once they had the scale to warrant them, supporting the OP’s claim.
swiftcoder 18 hours ago||
Read the vcpkg section; it explicitly states that they have no solution on the horizon. The nix section also doesn’t explain any potential solution.
jama211 11 hours ago|||
It’s a fair criticism, and this article does serve well as a warning for people to try and avoid this issue from the start.
efitz 19 hours ago|||
When you start out with a store like git, with file system semantics and a client that has to be smart enough to handle all the compare and merge operations, then it’s practically impossible to migrate a large client base to a new protocol. It takes years, lots of user complaints, and random breakage.

Much better to start with an API. Then you can have the server abstract the store and the operations - use git or whatever - but you can change the store later without disrupting your clients.

jama211 11 hours ago||
That costs hosting money, no? That might be a bigger problem for someone starting out than scalability.
lijok 20 hours ago|||
Nooo you don’t get it - it didn’t scale from 0 to a trillion users so it’s a garbage worthless system that “doesn’t scale”.
zephen 16 hours ago||
^^^ Poe's Law may or may not apply to the above comment.
leoh 18 hours ago||
I couldn't agree more strongly. There is a huge opportunity to make git more effective for this kind of use-case, not to abandon it. The essay in question provides no compelling alternative; it therefore reaches an entirely half-baked conclusion.
jama211 11 hours ago||
A good point!
newswangerd 20 hours ago||
It’s always humbling when you go on the front page of HN and see an article titled “the thing you’re doing right now is a bad idea and here’s why”

This has happened to me a few times now. The last one was a fantastic article about how PG Notify locks the whole database.

In this particular case it just doesn’t make a ton of sense to change course. I’m a solo dev building a thing that may never take off, so using git for plug-in distribution is just a no-brainer right now. That said, I’ll hold on to this article in case I’m lucky enough to be in a position where scale becomes an issue for me.

baobun 16 hours ago|
The good news is you can more easily avoid some of the pitfalls now, even as you stick with it. There are some good points in the comments.

I don't know if you rely on github.com but IMO vendor lock-in there might be a bigger issue which you can avoid.

quaintdev 1 day ago|
I host my own code repository using Forgejo. It's not public. In fact, it's behind mutual TLS like all the services I host. Reason? I don't want to deal with bots and other security risks that come with opening a port to the world.

Turns out Go modules will not accept a package hosted on my Forgejo instance because it asks for a client certificate. There are ways to make go get use SSH, but even with that approach the repository needs to be accessible over HTTPS. In the end, I cloned the repository and used it in my project via a replace directive. It's really annoying.

agwa 23 hours ago||
If you add .git to the end of your module path and set $GOPRIVATE to the hostname of your Forgejo instance, then Go will not make any HTTPS requests itself and instead delegate to the git command, which can be configured to authenticate with client certificates. See https://go.dev/ref/mod#vcs-find
xyzzy_plugh 1 day ago|||
> There are ways to make go get use ssh but even with that approach the repository needs to be accessible over https.

No, that's false. You don't need anything to be accessible over HTTP.

But even if it did, and you had to use mTLS, there's a whole bunch of ways to solve this. How do you solve this for any other software that doesn't present client certs? You use a local proxy.

baobun 16 hours ago|||
If you add the instance TLS cert (CA) to your trust store then go will happily download over https. It can be finicky depending on how you run go but I can confirm it works.
irusensei 21 hours ago||
Have a look at Tailscale DNS and certs. It gives you a valid cert through Let's Encrypt without exposing your services to the internet.