
Posted by codesuki 1 day ago

GitHub Actions is slowly killing engineering teams (www.iankduncan.com)
367 points | 196 comments | page 3
apothegm 1 day ago||
This is roughly how I feel about cloudformation. May we please have terraform back? Ansible, even?
anttiharju 1 day ago||
I think CDK is the one to use nowadays. Infrastructure as real code.
staticassertion 1 day ago||
The worst part about CDK is, by far, that it's still backed by Cloudformation.
anttiharju 1 day ago||
What pains are you experiencing? CDK has far exceeded Ansible and Terraform in my experience.
kortex 1 day ago|||
Hooo boy where do I begin? Dependency deadlocks are the big one - you try to share resource attributes (eg ARN) from one stack to another. You remove the consumer and go to deploy again. The producer sees no more dependency so it prunes the export. But it can't delete the export, cause the consumer still needs it. You can't deploy the consumer, because the producer has to deploy first sequentially. And if you can't delete the consumer (eg your company mandates a CI pipeline deploy for everything) you gotta go bug Ops on slack, wait for someone who has the right perms to delete it, then redeploy.

You can't actually read real values from Parameters/exports (you get a token placeholder) so you can't store JSON then read it back and decode (unless in same stack, which is almost pointless). You can do some hacks with Fn:: though.

Deploying certain resources that have names specified (vs generated) often breaks because it has to create the new resource before destroying the old one, which it can't, because the name conflicts (it's the same name...cause it's the same construct).

It's wildly powerful though, which is great. But we have basically had to create our own internal library to solve what should be non-problems in an IaC system.
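For anyone who hasn't hit this: CDK wires cross-stack references through CloudFormation exports under the hood. A hand-written sketch of roughly what it synthesizes (resource and export names here are illustrative) looks like:

```yaml
# Producer stack: CDK emits an Output with an Export for the shared attribute
Outputs:
  SharedQueueArn:
    Value: !GetAtt WorkQueue.Arn
    Export:
      Name: producer-SharedQueueArn

# Consumer stack: the cross-stack reference becomes an Fn::ImportValue
Resources:
  Worker:
    Type: AWS::Lambda::Function
    Properties:
      Environment:
        Variables:
          QUEUE_ARN: !ImportValue producer-SharedQueueArn
```

CloudFormation refuses to delete or modify an export while any deployed stack still imports it, which is exactly the deadlock: once the consumer's code is gone, CDK prunes the Output from the producer's template, but the still-deployed consumer blocks the update.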

Would be hilarious if my coworker stumbled upon this. I know he reads hn and this has been my absolute crusade this quarter.

SamuelAdams 19 hours ago|||
> The producer sees no more dependency so it prunes the export. But it can't delete the export, cause the consumer still needs it. You can't deploy the consumer, because the producer has to deploy first sequentially. And if you can't delete the consumer (eg your company mandates a CI pipeline deploy for everything) you gotta go bug Ops on slack, wait for someone who has the right perms to delete it, then redeploy.

This is a tricky issue. Here is how we fixed it:

Assume you have a stack with the ConstructID of `foo-bar`, and that it exports resources consumed by `charlie`.

Update the stack ConstructID to a new value, e.g. `foo-bar-2`. Then, at the very end of your CI, add a `cdk destroy foo-bar` to delete the original stack. This forces a new deployment of your stack, which creates fresh exports. `charlie` then updates to reference the new stack, and the original `foo-bar` stack can be safely destroyed once `charlie` successfully updates.

The real conundrum is with data - you typically want any data stacks (Dynamo, RDS, etc) to be in their own stack at the very beginning of your dependency tree. That way any revised stacks can be cleanly destroyed and recreated without impacting your data.
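In CI terms, the rename-and-destroy dance would be something like this (stack names hypothetical, matching the example; these commands need a real CDK app and AWS credentials, so this is just a sketch):

```shell
# deploy the renamed producer first; it creates a fresh set of exports
cdk deploy foo-bar-2 --require-approval never

# redeploy the consumer so it imports from foo-bar-2 instead of foo-bar
cdk deploy charlie --require-approval never

# the old exports are now unreferenced, so the delete succeeds
cdk destroy foo-bar --force
```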

otterley 1 day ago|||
> Dependency deadlocks are the big one - you try to share resource attributes (eg ARN) from one stack to another. You remove the consumer and go to deploy again. The producer sees no more dependency so it prunes the export.

I’m a little puzzled. How are you getting dependency deadlocks if you’re not creating circular dependencies?

Also, exports in CloudFormation are explicit. I don’t see how this automatic pruning would occur.

> Deploying certain resources that have names specified (vs generated) often breaks

CDK tries to prevent this antipattern from happening by default. You have to explicitly make it name something. The best practice is to use tags to name things, not resource names.

staticassertion 1 day ago|||
I'll just echo the other poster with "deadlocks". It's obscene how slow CF is, and the fact that its failure modes often leave you in a state that feels extremely dangerous. I've had to contact AWS Support before due to CF locking up in an irrecoverable way due to cycles.
Storment33 19 hours ago|||
Ansible is CaC (Config as Code), not IaC (Infrastructure as Code); they're for different things.
bigstrat2003 1 day ago||
Why not just use Terraform, if you prefer that?
apothegm 23 hours ago||
Because my employer has already standardized on CF?
sinatra 6 hours ago||
Oh. So when you say “May we please have terraform back?” You mean “May we please have terraform back at my employer?” Why are you posting such an employer specific request on a public forum?
apothegm 4 hours ago||
Because it was meant as a rhetorical device, not a literal request.
robinhood 17 hours ago||
The article might be true for private companies, but as an OSS developer with one popular project and many smaller ones, having free access to a CI that, yes, sucks balls in terms of UX (ohhh the horrible clicking on a failed job and never being able to come back reliably), but which still works and is still pretty fast for the price I pay (i.e. $0), is great. I think it's a net positive for the OSS community.
eightys3v3n 6 hours ago|
Buildkite also seems to have a free option but I have no concept of how the value compares to the free option for GitHub Actions.
simianwords 1 day ago||
What I find hardest about CI offerings is that each one has a unique DSL that inevitably has edge cases that you may only find out once you’ve tried it.

You might face that many times using Gitlab CI. Random things don’t work the way you think it should and the worst part is you must learn their stupid custom DSL.

Not only that, there’s no way to debug the maze of CI pipelines but I imagine it’s a hard thing to achieve. How would I be able to locally run CI that also interacts with other projects CI like calling downstream pipelines?

anon7000 1 day ago|
That’s the nice thing about buildkite. Generate the pipeline in whatever language you want and upload as JSON or yaml.
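A minimal sketch of that workflow in Python (the step shape follows Buildkite's documented pipeline schema; the package names are made up):

```python
import json

def make_pipeline(packages):
    # one command step per package; "label" and "command" are
    # standard Buildkite step keys
    steps = [
        {"label": f"test {pkg}", "command": f"make -C {pkg} test"}
        for pkg in packages
    ]
    return json.dumps({"steps": steps}, indent=2)

print(make_pipeline(["api", "web"]))
```

The generated JSON would then be piped to `buildkite-agent pipeline upload` from an initial bootstrap step.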
direwolf20 22 hours ago|||
JSON or YAML imply a Buildkite DSL, as there's no standard JSON or YAML format for build scripts.
nickkell 20 hours ago||
I assume by DSL they mean some custom templating language built on top, for things like iterating and if-conditions. If it's plain JSON/YAML you can produce that using any language you wish.
direwolf20 19 hours ago||
I don't think you understood it yet: the JSON or YAML is a DSL
simianwords 1 day ago|||
But do you provide SDKs in the languages? I mean even in gitlab I could technically generate YAML in python but what I needed was an SDK that understood the domain.
mFixman 22 hours ago||
Good place to ask: I'm not comfortable with NPM-style `uses: randomAuthor/some-normal-action@1` for actions that should be included by default, like bumping version tags or uploading a file to the releases.

What's the accepted way to copy these into your own repo so you can make sure attackers won't update the script to leak your private repo and steal your `GITHUB_TOKEN`?

Arbortheus 22 hours ago|
There are two solutions GitHub Actions people will tell you about. Both are fundamentally flawed, because "GitHub Actions Has a Package Manager, and It Might Be the Worst" [1].

One thing people will say is to pin the commit SHA, so don't do "uses: randomAuthor/some-normal-action@v1", instead do "uses: randomAuthor/some-normal-action@e20fd1d81c3f403df57f5f06e2aa9653a6a60763". Alternatively, just fork the action into your own GitHub account and import that instead.
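Concretely, the two forms differ only in what follows the `@` (action name and SHA taken from the example above):

```yaml
steps:
  # tag reference: mutable, the author (or an attacker) can move it
  - uses: randomAuthor/some-normal-action@v1
  # full commit SHA: immutable, but note the transitive-dependency caveat
  - uses: randomAuthor/some-normal-action@e20fd1d81c3f403df57f5f06e2aa9653a6a60763
```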

However, neither of these "solutions" work, because they do not pin the transitive dependencies.

Suppose I pin the action at a SHA or fork it, but that action still imports "tj-actions/changed-files". In that case, you would have still been pwned in the "tj-actions/changed-files" incident [2].

The only way to be sure is to manually traverse the dependency hierarchy, forking each action as you go down the "tree" and updating every action to only depend on code you control.

In other package managers, this is solved with a lockfile - go.sum, yarn.lock, ...

[1] https://nesbitt.io/2025/12/06/github-actions-package-manager...

[2] https://unit42.paloaltonetworks.com/github-actions-supply-ch...

jpeeler 18 hours ago||
I work in a monorepo at work, which of course increases complexity and build time due to more work being done. But I keep wondering whether, even with better CI options that properly handle dependencies, solving the problem at that level is aiming too low.

Currently evaluating using moonrepo.dev to attempt to efficiently build our code. What I've noticed is (aside from Bazel) it seems a lot of monorepo tools only support a subset of languages nicely. So it's hard to evaluate fairly as language support limits one's options. I found https://monorepo.tools to be helpful in learning about a lot of projects I didn't know about.

0xbadcafebee 1 day ago||
Personally I like Drone more than Buildkite. It's as close to a perfect CI system as I've seen; just complex enough to do everything I need, with a design so stripped-down it can't be simpler. I occasionally check on WoodpeckerCI to see if it's reached parity with Drone. Now that AI coding is a thing, hopefully that'll happen soon
uzername 12 hours ago||
I use a CI/CD tool called Vela. (No relation to the k8s tool also called Vela.) It's mostly Docker all the way down. Reminds me of Bitbucket Pipelines. Maybe worth checking out if GHA is just too opaque.
fmjrey 1 day ago||
Nice write-up, but now I'm wondering what Nix offers in that space.

I've never used nix or nixos but a quick search led me to nixops, and then realized v4 is entirely being rewritten in rust.

I'm surprised they chose rust for glue code, and not a more dynamic and expressive language that could make things less rigid and easier to amend.

In the clojure world BigConfig [0], which I never used, would be my next stop in the build/integrate/deploy story, regardless of tech stack. It integrates workflow and templating with the full power of a dynamic language to compose various setups, from dot/yaml/tf/etc files to ops control planes (see their blog).

[0] https://bigconfig.it/

cdaringe 16 hours ago|
Dynamic flow building is something I've long wanted. We externalized it to a separate service so that our dumb CI could pull tasks on many parallel workers after an initial centralized planning step. Each worker does: while (GET /build/123/task) run $task.cmd

Very helpful for a monster repo with giant task graph
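That worker loop could be sketched in Python like this (the HTTP task endpoint is replaced with an in-memory stub, since the real planning service is internal; task names and commands are made up):

```python
import subprocess

def fetch_task(queue):
    # stand-in for `GET /build/<id>/task`: the real service would hand out
    # the next ready task from the centrally planned graph, or nothing
    return queue.pop(0) if queue else None

def run_worker(queue):
    # each parallel worker loops: claim a task, run its command, repeat
    # until the service has nothing left to hand out
    results = []
    while (task := fetch_task(queue)) is not None:
        proc = subprocess.run(task["cmd"], shell=True,
                              capture_output=True, text=True)
        results.append((task["name"], proc.returncode))
    return results

print(run_worker([{"name": "lint", "cmd": "echo lint ok"},
                  {"name": "test", "cmd": "echo test ok"}]))
```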
