
Posted by MattIPv4 8 hours ago

GitHub is once again down (www.githubstatus.com)
345 points | 179 comments
zelphirkalt 7 hours ago|
Man, a while ago I thought: "It happens often, alright, but every 2 weeks? Sounds like a slight exaggeration." But it really is every 2 weeks, isn't it? If I imagine in a previous job anything production being down every 2 weeks ... phew, would have had to have a few hard talks and course corrections.
genewitch 7 hours ago|
I once fixed a site that went down several times a year with two t1.micro instances in the same region as the majority of the traffic. Instantly solved the problem for what, $20/month?

Another site was constantly getting DDoSed by Russians who were mad we took down their scams on forums. That had to go through Verisign back then; not sure who they're using now. They may have enough aggregate pipe that it doesn't matter at this point.

jeppester 7 hours ago||
At this rate it's only a matter of time before a "GitHub is up" parody site reaches the top of HN
ahstilde 8 hours ago||
github is at one nine, basically: https://news.ycombinator.com/item?id=47428035
msandford 7 hours ago||
I once worked at a place with more micro services than engineers. We joked about "we have as many 8s of uptime as you need!"
0x3f 7 hours ago|||
> I once worked at a place with more micro services than engineers.

Currently consulting somewhere with 30 services per engineer. I cannot convince them this is hell. Maybe that makes it my personal hell.

KaiserPro 7 hours ago|||
"It's like family here!"

In that every night you're playing murder mystery, and it's never fun.

0x3f 4 hours ago||
I would never trust my family with system design either.
NooneAtAll3 5 hours ago||||
as a person that never touched webdev, I have a question

how is such service sprawl different from the unix "small programs that do one thing only" culture?

why is it usually/historically seen as nice in the unix case, while in the web case it makes stuff worse?

0x3f 4 hours ago||
There are so many failure modes in microservices that just can't happen with a local binary. Inter-service communication over the network is a big one, with a failure rate orders of magnitude higher than calling into a binary on the same machine. Then you have to do deploys, monitoring, etc. across the whole platform.

You will basically need to employ solutions for problems only caused by your microservices arch. E.g. take reading the logs for a single request. In a monolith, just read the logs. For the many-service approach, you need to work out how you're going to correlate that request across them all.

Even the aforementioned network failures require a lot of design, and there's no standardization. Does the calling service retry? Does the callee have a durable queue and pick back up? What happens if a call/message gets 'too old'?

Also, from the other end, command line utils are typically made by entirely different people with entirely different philosophies/paradigms, so the encapsulation makes sense. That's not true when you're the one writing all the services, especially not at small-to-mid-size companies.

Plus, you already can do the single-concern thing in a monolith, just with modules/interfaces/etc.
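The log-correlation pain above is easy to make concrete with a tiny shell sketch. Every name here (the services, the log paths, the request_id field) is hypothetical, made up purely for illustration:

```shell
# Fake logs for two "services" that handled the same request.
# In a real system these files live on different hosts/pods.
mkdir -p /tmp/svc-auth /tmp/svc-billing
echo "request_id=req-7f3a msg=token validated" > /tmp/svc-auth/app.log
echo "request_id=req-7f3a msg=card charged"    > /tmp/svc-billing/app.log

# In a monolith this is one grep against one file. With many
# services you fan out across all of them, and it only works if
# every service propagates and logs the same id in the first place:
for svc in svc-auth svc-billing; do
  grep "request_id=req-7f3a" "/tmp/$svc/app.log"
done
```

And that id propagation is itself a thing you have to design, build, and enforce across every service before the grep is even possible.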

msandford 7 hours ago|||
Oooof that's rough.

One strategy to convince them is to get someone less technical than you to sit with you while you try to trace a single errored HTTP request from start to finish to diagnose the problem. If they see it takes half a day to check every call to every internal endpoint to fully satisfy one request, sometimes that can help.

Also sometimes they just think "this is a bunch of nerd stuff, why are you involving me?!" So it's not foolproof.

0x3f 7 hours ago||
Oh, my non-technical boss agrees with me already. It's actually the engineers who've convinced themselves it's a good setup. Nice guys but very unwilling to change. Seems they're quite happy to have become 'experts' in this mess over the last 5-10 years. Almost like they're in retirement mode.

The real solution is probably to leave, but the market sucks at the moment. At least AI makes the 10-repos-per-tiny-feature thing easier.

anotherjesse 6 hours ago||||
We pride ourselves on 9 5s!
the_real_cher 7 hours ago|||
seven nines? That's nothing, bro, we got twelve eights!
nuker 5 hours ago||
I have Royal flush :)
rdtsc 7 hours ago|||
From five nines to nine fives
Imustaskforhelp 8 hours ago||
9%? /s (though to be honest, I genuinely wouldn't be surprised if things got that bad at this point either)
abound 7 hours ago|||
Unironically, I think 9% uptime would be "one-tenth of a nine".
brookst 7 hours ago||
Are you saying 9.999% isn’t four nines?
munchler 7 hours ago|||
Can’t tell if this is intended as humor, but I LOL’ed.
the_real_cher 7 hours ago|||
It unarguably is.
mememememememo 7 hours ago|||
90% would be one 9 following the sequence back.

99.99

99.90

99.00

90.00

jrm4 7 hours ago||
Do your part; remind people that Github is not git. Git is decentralizable and people should know this.
mayhemducks 6 hours ago||
Does anyone else ever think "that code I just pushed into my repo just took down all of github..." whenever it goes down around the same time you sync your changes?
gchamonlive 6 hours ago||
Just moved a project of mine to GitLab. I created this very simple component with Codex that keeps a mirror updated on GitHub for me, so I can focus development on GitLab.

https://gitlab.com/gabriel.chamon/ci-components/-/tree/main/...
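For anyone who doesn't want to click through, the underlying mechanism is roughly a `--mirror` clone plus `git push --mirror`. Here's a minimal runnable sketch of the idea (not the linked component itself); all paths are local placeholders, where in real use the source would be the GitLab URL and `github` the GitHub push URL:

```shell
# Placeholder "source" repo with one commit (stands in for GitLab).
git init -q /tmp/demo-src
git -C /tmp/demo-src -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Placeholder bare repo standing in for the GitHub mirror.
git init -q --bare /tmp/demo-github.git

# Mirror-clone the source, add the mirror as a remote, push all refs.
git clone -q --mirror /tmp/demo-src /tmp/demo-mirror.git
git -C /tmp/demo-mirror.git remote add github /tmp/demo-github.git
git -C /tmp/demo-mirror.git push -q --mirror github

# Re-run these two (e.g. from scheduled CI) to keep the mirror fresh:
git -C /tmp/demo-mirror.git fetch -q --prune origin
git -C /tmp/demo-mirror.git push -q --mirror github
```

The `--mirror` flags are what make this a true mirror: all branches and tags get copied, and refs deleted upstream get deleted on the mirror too.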

corvad 8 hours ago||
And this is why I self host a lot of my Git stack with Gerrit...
mememememememo 7 hours ago|
Or just make sure you git fetch repos into $other-place.

That helps with Git itself, though not so much with issues etc.
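One concrete reading of that suggestion, as a runnable fetch-only sketch; the local paths below are placeholders standing in for a github.com URL and `$other-place`:

```shell
# Placeholder "upstream" repo with one commit (stands in for GitHub).
git init -q /tmp/hn-demo-source
git -C /tmp/hn-demo-source -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "snapshot"

# First run: mirror-clone into the backup location ($other-place).
git clone -q --mirror /tmp/hn-demo-source /tmp/hn-demo-backup.git

# Every run after that (cron-able): fetch updated refs, prune deleted ones.
git -C /tmp/hn-demo-backup.git fetch -q --prune
```

Since it never pushes anywhere, this is purely a read-side safety net: when the hosted side goes down, your refs are still local.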

corvad 7 hours ago||
Yeah, I think Git mirrors especially can go a long way toward maintaining availability, and also toward taking load off the main infra.
MattIPv4 8 hours ago||
Hitting 500s when trying to push branches and create PRs.
belter 8 hours ago|
[dead]
sc__ 7 hours ago||
Microslop
steeleduncan 8 hours ago|
What has changed at GitHub to cause this?
smartmic 7 hours ago||
AIpocalypse. Eaten too much Copilot dog food.
bartread 7 hours ago|||
Perhaps even AIslopalypse.
KaiserPro 7 hours ago|||
Looking at the status page, it's not one long outage but lots of little ones; microslops, if you will.
zahlman 7 hours ago|||
I've been using "slopocalypse". People already know AI is responsible, but slop existed before — e.g. conventionally generated SEO spam. It's just... so much worse now.
bartread 6 hours ago||
"Slopocalypse": yeah, I like that. Easier to pronounce too.

At any rate, it seems like GitHub is back up now, so we'll see how long that lasts.

adzm 6 hours ago|||
Weird Al needs to capitalize on this whole AI/Al thing
pixelesque 7 hours ago|||
Possibly a combination of moving infrastructure to Azure, and also a significant increase in the number of PRs and commits due to Vibe-coding?
cyanydeez 7 hours ago||
Perhaps staff cuts having long tails? https://www.itpro.com/software/microsoft/microsoft-layoffs-h...
pera 7 hours ago|||
Microsoft Makes AI Mandatory For Employees

https://www.forbes.com/sites/bernardmarr/2025/07/08/microsof...

voidfunc 8 hours ago|||
Azure
altairprime 7 hours ago||
> Azure

To explain this one-word comment for those unfamiliar, see previously:

GitHub will prioritize migrating to Azure over feature development (5 months ago) https://news.ycombinator.com/item?id=45517173

In particular:

> GitHub has recently seen more outages, in part because its central data center in Virginia is indeed resource-constrained and running into scaling issues. AI agents are part of the problem here. But it’s our understanding that some GitHub employees are concerned about this migration because GitHub’s MySQL clusters, which form the backbone of the service and run on bare metal servers, won’t easily make the move to Azure and lead to even more outages going forward.

0xbadcafebee 5 hours ago||
Age-old lesson: change the tires on the moving vehicle that is your business when it's a Geo Metro, not when it's a freight train.

I'm sure the people with the purse strings didn't care, though, and just wanted to funnel the GH userbase into Azure until the wheels fell off, then write off the BU. Bought for $7.5B, it used to make $250M but now makes $2B, so they could offload it and make a profit. I wonder who'll buy it. Probably Google, Amazon, IBM, Oracle, or a hedge fund. They could choose not to sell it, but it'll end up a write-off if the userbase jumps ship.

yoyohello13 7 hours ago|||
Vibe coding features.
staticassertion 7 hours ago|||
I assume this is all of the pains of going from "GHA is sorta kinda on Azure", which was a bad state, to "GHA is going full Azure", which is a painful state to get to but presumably simplifies things.
dec0dedab0de 7 hours ago||
You never go full Azure
qudat 7 hours ago|||
Their primary goal in the last year was to move to Azure. Any massive infra migration is going to cause issues.
seneca 7 hours ago||
> Any massive infra migration is going to cause issues.

What? No, no it's not. The entire discipline of infrastructure and systems engineering is dedicated to doing these sorts of things. There are well-worn paths to making stable changes. I've done a dozen massive infrastructure migrations, some at companies bigger than GitHub, and I've never once come close to this sort of instability.

This is a botched infrastructure migration, onto a frankly inferior platform, not something that just happens to everyone.

the_real_cher 7 hours ago|||
A.I. but that acronym can mean a number of things.

Artificial intelligence, Azure integration, many other things.

paxys 7 hours ago||
Senior engineers/leaders getting tired of Microsoft's shit and leaving.