Posted by salkahfi 3 hours ago
Since yesterday, several colleagues and I have noticed that the pull request lists on the website are incomplete across many repositories. For example, https://github.com/gap-system/gap/pulls says "Pull requests 78" in the tab list, but the PR list view reports "35 open". (The number 78 is correct, as confirmed by e.g. `gh pr list`.)
And that despite <https://www.githubstatus.com> reporting "all systems operational".
Their support acknowledged the issue, but has been silent since then, and the status page still shows nothing other than the potentially-related issue on the 27th. It looks like it has been resolved on some repositories in the meantime, but I still have the issue across multiple orgs and repositories.
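For anyone who wants to check the same mismatch on their own repositories, here is a minimal sketch that counts open PRs through the `gh` CLI and compares the result with the number shown in the web list. It assumes `gh` is installed and authenticated. The function names and the `limit` of 1000 are my own choices for illustration, not anything from GitHub. The explicit `--limit` matters because `gh pr list` only returns 30 results by default.

```python
import json
import subprocess


def count_prs(gh_json: str) -> int:
    """Count entries in the JSON array printed by `gh pr list --json number`."""
    return len(json.loads(gh_json))


def open_pr_count(repo: str, limit: int = 1000) -> int:
    """Ask the gh CLI for the open PRs of `repo` (e.g. "gap-system/gap").

    `limit` is an assumed upper bound; raise it for repositories with
    more open PRs than that.
    """
    result = subprocess.run(
        ["gh", "pr", "list", "--repo", repo, "--state", "open",
         "--limit", str(limit), "--json", "number"],
        capture_output=True, text=True, check=True,
    )
    return count_prs(result.stdout)


def missing_from_web_list(cli_count: int, web_count: int) -> int:
    """How many PRs the CLI reports that the web list does not show."""
    return cli_count - web_count
```

Calling `open_pr_count("gap-system/gap")` and passing the result to `missing_from_web_list` together with the "N open" figure read off the web page gives the size of the gap. With the numbers quoted above, `missing_from_web_list(78, 35)` is 43.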
Surely this is a scaling hack where they use "estimation" queries that return "kind of right" results instead of 100% correct data, since that puts less load on the infrastructure. Not necessarily a bug so much as a shit choice from a product perspective.
Sorry, but I don't think there is any way this can be classified as "not actually a bug".
I think I found the issue.
I’m sure survivor bias is at play here, but when I look through the older code bases - especially the data model - it’s an entirely different world than the newer stuff, and it’s clear which of the two was written by people who understand systems.
I understand the rapid growth (because of AI agents), but if such a critical software service becomes unstable, isn't it time to migrate? I'm thinking about self-hosting GitLab.
Right way to think about this:
> If things we need/see as critical for our work are hosted on a platform with really bad reliability, it's time for us to migrate
My internet connection at home is really shit: almost every week there is a multi-hour outage for some reason, not to mention that whenever La Liga games are on TV, anything using Cloudflare is unavailable. So I've had to spend extra energy and time setting things up so that I can still work whenever this happens.
Leopard, meet face.
Too little too late, yesterday was the straw that broke the camel’s back for us and we’ve started a migration to a self-hosted GitLab.
* we had to resolve a variety of bottlenecks that appeared faster than expected from moving webhooks to a different backend (out of MySQL)
  * redesigning user session cache to redoing authentication and authorization flows to substantially reduce database load.
* we accelerated parts of migrating performance or scale sensitive code out of Ruby monolith into Go.
I'd like to know what database backend they migrated to. I was also surprised to read that the migration from Ruby to a more performant language had not already been completed. I assume this is because it is a large code base with many moving parts, etc.
The unlabeled graphs don't help the credibility case. When you are already in the hole on trust, shipping a post that requires readers to assume favorable baselines is exactly the wrong move.