Posted by foenix 4 days ago
If you can ship your hypothesis along with an effectively unaltered version of prod, the ability to test things without breaking other things becomes much more feasible. I've never been in a real business scenario where I wasn't able to negotiate a brief experimental window during live business hours for at least one client.
A good decom/cleanup strategy definitely helps
Personally I've also had a lot of success requiring "expiration" dates for all flags; once a date passes, the flag emits a highly visible warning metric. You can always bump it another month to defer cleanup, but people eventually get sick of doing that and remove the flag for good. Make it annoying, so the cleanup is an improvement, and it happens pretty automatically.
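A minimal sketch of what that could look like, assuming a hypothetical in-process flag registry (the flag name, dates, and `emit_metric` hook are all made up for illustration):

```python
import datetime

# Hypothetical flag registry: every flag must declare an expiration date.
FLAGS = {
    # flag name -> (enabled, expiration date)
    "new_checkout_flow": (True, datetime.date(2024, 6, 1)),
}

def flag_enabled(name, today=None, emit_metric=print):
    """Return the flag's state; nag loudly if the flag is past its expiry."""
    today = today or datetime.date.today()
    enabled, expires = FLAGS[name]
    if today > expires:
        # Still honor the flag, but make the staleness impossible to miss
        # on a dashboard. Bumping `expires` defers the nagging by a month.
        emit_metric(f"WARNING flag_expired name={name} expired={expires}")
    return enabled
```

The point is that an expired flag keeps working, so nothing breaks on the expiry date; it just gets steadily more annoying until someone deletes it.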
Another issue I've run into a few times is a feature flag that starts as a simple thing but, as new features get added, evolves into a complex bifurcation of logic with many code paths depending on it, which can add crippling complexity to whatever you're developing.
If you work on fifty feature toggles a year, one of them is going to go wrong. If your team is doing a few hundred, you’re gonna have oopsies.
Most of the problematic cases are ones where the code is set up so that the old path and the new one can't bypass each other cleanly. They get tangled up, and sometimes the toggle gets implemented inverted, making it difficult to remove the old path without breaking the new one.
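One way to avoid the tangle is to keep the two paths as whole, swappable alternatives chosen at a single branch point, so that retiring the old path is one deletion. A toy sketch (all function names are hypothetical):

```python
# Old and new behavior live in separate, complete implementations,
# instead of flag checks interleaved throughout the logic.

def compute_total_old(items):
    # Legacy behavior: ignore discounts entirely.
    return sum(price for price, _discount in items)

def compute_total_new(items):
    # New behavior: apply per-item discounts.
    return sum(price - discount for price, discount in items)

def compute_total(items, use_new_pricing):
    # The single branch point. Removing the toggle later means deleting
    # compute_total_old and this one conditional, nothing else.
    impl = compute_total_new if use_new_pricing else compute_total_old
    return impl(items)
```

When the flag check is instead scattered inside the shared logic, the two paths can no longer be removed independently, which is exactly the tangle described above.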
I also like recording and replaying production traffic, so you can do your tee-testing in an environment that doesn't affect latency for production, but that's not quite the same thing.
Rare to see though. I don't think being able to write code automatically means you can write decent tests. Skill needs to be developed.
That says more about you and the care you put into quality assurance than anything else, really.
Is ipsento606 working at such a place? I don't know, and neither do you. Why do you jump to the conclusion that it's their personal failing?
Something to realize is that every codebase is legacy. My best new feature implementations are always several commits that do no-op refactorings, with no changes to tests even with good coverage (or adding tests before the refactoring for better coverage), then one short and sweet commit with just the behavior change.
Mikado is more of a get out of jail card for getting trapped in a “top down refactor” which is an oxymoron.
It is not very useful in giving you confidence your changes would not cause unexpected side effects, which is usually the main problem working with legacy code.
If you want confidence when working with legacy code, your best bet is the strangler fig pattern: find the boundaries of the module you want to work on, rewrite the module (or clone it and make your changes), run both at the same time in shadow mode, monitor and verify that your new module behaves the same as the old one, then switch over and eventually delete the old module.
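The shadow-mode step can be sketched as a small wrapper: serve the old result, run the new implementation on the side, and record any divergence. This is a hypothetical sketch, not any particular library's API:

```python
import logging

def shadowed(old_impl, new_impl, on_mismatch=logging.warning):
    """Wrap two implementations: production sees old_impl's result,
    while new_impl runs in the shadow and mismatches are reported."""
    def call(*args, **kwargs):
        old_result = old_impl(*args, **kwargs)
        try:
            new_result = new_impl(*args, **kwargs)
            if new_result != old_result:
                on_mismatch(f"shadow mismatch: old={old_result!r} new={new_result!r}")
        except Exception as exc:
            # The new module must never take production down while shadowed.
            on_mismatch(f"shadow error: {exc!r}")
        return old_result
    return call
```

Once the mismatch rate sits at zero for long enough, flipping the switch is just returning `new_result` instead, and the old module can go away.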
Refactoring is generally useful for annealing code enough that you can reshape it into separate concerns. But when the work hardening has been going on far too long there usually seems like there’s no way to get from A->D without just picking a day when you feel invincible, getting high on caffeine, putting on your uptempo playlist and telling people not to even look at you until you file your +1012 -872 commit.
I used to be able to do those before lunch. I also found myself to be the new maintainer of that code afterward. That doesn't work when you're the lead and people need you to help them brainstorm getting unblocked or figure out weird bugs (especially when calling your code). All the plates fall at that point.
It was less than six months after I figured out the workaround that I learned the term Mikado, possibly when trying to google if anyone else had figured out what I had figured out. I still like my elevator pitch better than theirs:
Work on your “top down” refactor until you realize you’ve found yet another whole call tree you need to fix, and feel overwhelmed/want to smash your keyboard. This is the Last Straw. Go away from your keyboard until you calm down. Then come back, stash all your existing changes, and just fix the Last Straw.
For me I find that I’m always that meme of the guy giving up just before he finds diamonds in the mine. The Last Straw is always 1-4 changes from the bottom of the pile of suck, and then when you start to try to propagate that change back up the call stack, you find 75% of that other code you wrote is not needed, and you just need to add an argument or a little conditional block here and there. So you can use your IDE’s local history to cherry pick a couple of the bits you already wrote on the way down that are relevant, and dump the rest.
But you have to put that code aside to fight the Sunk Cost Fallacy that’s going to make you want to submit that +1012 instead of the +274 that is all you really needed. And by the way is easier to add more features to in the next sprint.
I've spent blood, sweat, tears and restless evenings scrolling and ctrl-f-ing huge build and test logs to finally accomplish the task.
But let's take a step back.
So they assign you to get that done. You're supposed to be careful, courageous and precise while making those changes without regression. There's very little up-to-date documentation on the design or architecture, let alone any rationale for the design choices. You're supposed to come up with methods like Mikado, TDD, shadowing, or anything else that gets the job done.
Is this even fair to ask? Suppose you ask a contractor to refactor a house with old-style plumbing and electricity. Would they do it Mikado style, or would they say: look, we're going to tear things down and rebuild from the ground up? You need to be willing to pay for a designer, an architect, new materials and a set of specialized contractors.
So why do we as sw engineers put up with the assignment? Are we rewarded so much more than the project manager of that house who subcontracts the work to many people to tear down and rebuild?
Does the project manager get paid more by the hour to refactor a house than to build one?
In other words, I'd reword this to: use the Mikado method to understand large codebases, or to get a first glimpse of how things are connected and wired up. But to say it allows for _safe_ changes is a bit of a stretch.
Working with old code is tough, no real magic to work around that.
Then by definition you have the smallest, safest step you can take. It would be the leaf nodes on your graph?
Of course, working in a legacy codebase is also torture.
Software development is a hyper-rational endeavor, so we don't often talk about feelings. This article also does not talk much about feelings.
Reading between the lines, it looks like reverting the code is supposed to affect how you feel about the work. Knowing that failure is an explicit option can help to set an expectation; however, without a mature understanding of failure, that expectation may just be misery.
With a mature understanding of failure, the possibility of a forced rollback should help you "let go" of those changes. It's like starting a day of painting or drawing with one that you force yourself to throw away; or a writing session with a silly page.
----
If someone thinks that they are giving you good advice, but it sounds terrible, then maybe they are expecting you to do some more work to realize the value of that advice.
If you are giving someone advice and they push back, maybe you are implying some extra work or expectations that you have not actually said out loud.
Advice is plagued by the tacit knowledge problem.
Maybe the software crashes when you write 42 in some field, and you're able to tell it's due to a missing division-by-zero check deep down in the code base. Your gut tells you you should add the check, but who knows if something relies on this bug somehow; plus you've never heard of anyone having issues with values other than 42.
At this point you decide to hard code the behavior you want for the value 42 specifically. It's nasty and it only makes the code base more complex, but at least you're not breaking anything.
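A toy sketch of that kind of fix (the function names and the fallback value are hypothetical, purely to illustrate the shape of the workaround):

```python
def risky_computation(value):
    # Stand-in for the legacy code path that, somewhere deep down,
    # effectively divides by (value - 42) with no guard.
    return 1000 // (value - 42)

def safe_computation(value):
    if value == 42:
        # Known crash for exactly this input. Rather than touch the
        # legacy path (something might rely on its behavior), return
        # a documented fallback for this one value.
        return 0
    return risky_computation(value)
```

Every other input flows through the legacy path untouched, which is the whole point: the fix is narrow, ugly, and provably changes nothing except the one input that crashed.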
Does anyone have experience with this mindset of embracing the mess?
(seriously though, this book has answers for you: Working Effectively with Legacy Code, by Michael Feathers)
Do you really know all of the expected behavior you're hardcoding in? What happens if your hardcoded behavior is just incorrect enough that it breaks something somewhere else? How can you be sure that your test for that specific value is even correct?
I think the better approach is to let things break naturally and open a bug with your findings. You'd be surprised how often someone else knows exactly what's going on and can fix it correctly. Your hacks are not just pouring gasoline onto the fire, but drilling an oil well directly underneath it that will keep it burning for a long time.