Posted by yagizdegirmenci 22 hours ago
However, it's still better to recognise a problem, so you can at least look into ways of improving the situation.
"I really appreciated this piece, as designing good metrics is a problem I think about in my day job a lot. My approach to thinking about this is similar in a lot of ways, but my thought process for getting there is different enough that I wanted to throw it out there as food for thought.
One school of thought (https://www.simplilearn.com/tutorials/itil-tutorial/measurem...) I have trained in is that metrics are useful to people in 4 ways:
1. Direct activities to achieve goals
2. Intervene in trends that are having negative impacts
3. Justify that a particular course of action is warranted
4. Validate that a decision that was made was warranted
My interpretation of Goodhart’s Law has always centered more on the lifespan of metrics for these purposes. The chief warning is that regardless of the metric used, sooner or later it will become useless as a decision aid. I often work with people who treat metrics as a “do it right the first time, so you won’t ever have to worry about it again” problem. This is the wrong mentality, and Goodhart’s Law is a useful way to reach many folks with this mindset.
The implication is that the goal is not to find the “right” metrics, but instead to find the most useful metrics to support the decisions that are most critical at the moment. After all, once you pick a metric, 1 of 3 things will happen:
1. The metric will improve until it reaches a point where you are not improving it anymore, at which point it provides no more new information.
2. The metric doesn’t improve at all, which means you’ve picked something you aren’t capable of influencing, and it is therefore useless.
3. The metric gets worse, which means there is feedback that swamps whatever you are doing to improve it.
Thus, if we are using metrics to improve decision making, we’re always going to need to replace metrics with new ones relevant to our goals. If we are going to have to do that anyway, we might as well be regularly assessing our metrics for ones that serve our purposes more effectively. Thus, a regular cadence of reviewing the metrics in use, deprecating ones that are no longer useful, and introducing new metrics relevant to the decisions now at hand is crucial for ongoing success.
One other important point: for many people, the purpose of metrics is not to make things better. It is instead to show that they are doing a good job and to persuade others to do what they want. Metrics that show this are useful, and those that don’t are not. In this case, of course, a metric may indeed be useful “forever” if it serves these ends. The implication is that some level of psychological safety is needed for metric use to be aligned with supporting the mission rather than with making people look good."
A jaded interpretation of data science is that its purpose is to find evidence supporting predetermined decisions, which is unfair to all involved. Having the capability to always generate new internal tools for Just-In-Time Reporting (JITR) would be nice, ideally reproducible ones.
This encourages ad hoc and scrappy starts, which can be iterated on as formulas in source control. Instead of a gold standard of a handful of metrics, we are empowered to draw conclusions from all data in context.
One good book on the positive impact of a metric that everyone on a team or in an organization understands is "The Great Game of Business" by Jack Stack: https://www.amazon.com/Great-Game-Business-Expanded-Updated-... I reviewed it at https://www.skmurphy.com/blog/2010/03/19/the-business-is-eve...
Here is a quote to give you a flavor of his philosophy:
"A business should be run like an aquarium, where everybody can see what's going on--what's going in, what's moving around, what's coming out. That's the only way to make sure people understand what you're doing, and why, and have some input into deciding where you are going. Then, when the unexpected happens, they know how to react and react quickly."
Jack Stack in "The Great Game of Business."
https://commoncog.com/the-amazon-weekly-business-review/
Over the past year, Roger and I have been talking about the difficulty of spreading these ideas. The WBR works, but as the essay shows, it is an interlocking set of processes that solves for a bunch of socio-technical problems. It is not easy to get companies to adopt such large changes.
As a companion to the essay, here is a sequence of cases about companies putting these ideas to practice:
https://commoncog.com/c/concepts/data-driven/
The common thread in all these essays is that they don’t stop at high-falutin’ (or conceptual) recommendations, but actually dive into real-world application and practice. Yes, it’s nice to say “let’s have a re-evaluation date.” But what does it actually look like to get folks to do that at scale?
Well, the WBR is one way that works in practice, at scale, and with some success in multiple companies. And we keep finding nuances in our own practice: https://x.com/ejames_c/status/1849648179337371816
Reality has a lot of detail. It’s nice to quote books about goals. It’s a different thing entirely to achieve them in practice with a real business.
As to Jack Stack's book, I think the genius of his approach is communicating simple decision rules to the folks on the front line instead of trying to establish a complex model at the executive level that can become more removed from day-to-day realities. In my experience, which involves working in a variety of roles in startups and multi-billion dollar businesses over the better part of five decades, simple rules updated based on your best judgment risk "extinction by instinct" but outperform the "analysis paralysis" that comes from trying to develop overly complex models.
Reasonable men may differ.
My two questions (a) and (b) were not rhetorical. Let’s get concrete.
a) You are advising a company to “check back after a certain period”. After the certain period, they come back to you with the following graph:
https://commoncog.com/content/images/2024/01/prospect_calls_...
“How did we do? Did we improve?”
How do you answer? Notice that this is a problem regardless of whether you are a big company or a small company.
b) 3 months later, your client comes back and asks: “we are having trouble with customer support. How do we know that it’s not related to this change we made?” With your superior experience working with hundreds of startups, you are able to tell them if it is or isn’t after some investigation. Your client asks you: “how can we do that for ourselves without calling on you every time we see something weird?”
How do you answer?
(My answers are in the WBR essay and the essay that comes immediately before that, natch)
It is a common excuse to wave away these ideas with “oh, these are big company solutions, not applicable to small businesses.” But a) I have applied these ideas to my own small business and doubled revenue; also b) in 1992 Donald Wheeler applied these methods to a small Japanese night club and then wrote a whole book about the results: https://www.amazon.sg/Spc-Esquire-Club-Donald-Wheeler/dp/094...
Wheeler wanted to prove (and I wanted to verify) that ‘tools to understand how your business ACTUALLY works’ are uniformly applicable regardless of company size.
If anyone reading this is interested in being able to answer both questions confidently, I recommend reading my essays to start with (there’s enough in front of the paywall to be useful) and then jumping straight to Wheeler. I recommend Understanding Variation, which was originally developed as a 1993 presentation to managers at DuPont (which means it is light on statistics).
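For the curious, the core tool Wheeler teaches for question (a) is the process behaviour chart (XmR chart): derive "natural process limits" from the metric's own week-to-week variation, and treat only points outside those limits as signals rather than noise. Here is a minimal sketch in Python; the weekly call counts are made-up illustration data, not from the linked graph:

```python
def xmr_limits(values):
    """Compute natural process limits for an XmR (individuals) chart.

    Limits are mean +/- 2.66 * (average moving range), where 2.66 is
    the standard XmR scaling constant.
    """
    mean = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    avg_mr = sum(moving_ranges) / len(moving_ranges)
    return mean - 2.66 * avg_mr, mean + 2.66 * avg_mr, mean

# Hypothetical weekly prospect-call counts from before the change.
baseline = [52, 48, 55, 50, 47, 53, 49, 51]
lower, upper, mean = xmr_limits(baseline)

# A post-change week is a signal only if it falls outside the limits;
# anything inside them is routine variation and not evidence of change.
new_week = 68
if new_week > upper or new_week < lower:
    print(f"signal: {new_week} outside [{lower:.1f}, {upper:.1f}]")
else:
    print(f"noise: {new_week} within [{lower:.1f}, {upper:.1f}]")
```

This is exactly the answer to "did we improve?": if the new weeks stay inside the limits, the graph shows noise, not improvement, no matter how encouraging it looks.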
- Use not one, but many metrics (article mentioned 600)
- Recognize that some metrics you control directly (input metrics) and others you care about but can’t control directly (output metrics).
- Constantly refine metrics and your causal model between inputs and outputs. (Article mentions weekly 60-90min reviews)
Edit: a crucial part is that all consumers of these metrics (all of leadership) are in these reviews.
Is Goodhart's Law as useful as you think?
https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline...
This is wrong, and the wrongness of it undermines the whole piece, I think:
- A fourth way people respond is to oppose the choice of target and/or metric; to question its value and lobby to change it.
- A fifth way people respond is to oppose the whole idea of incentives on the basis of metrics (perhaps by citing Goodhart's Law... which is a use of Goodhart's Law).
Goodhart's Law is useful not just because it reminds us that making a metric a target may incentivize behavior that makes THAT metric a poor indicator of a good system, but also because choosing ANY metric as a target changes everyone's relationship with ALL metrics-- it spells the end of inquiry and the beginning of what might be called compliance anxiety.
Your proposed fourth and fifth behaviours, on the other hand, are neither of these. Most importantly, they are transient (at least ideally). Either the workforce and the management come to an agreement and metrics continue (or are discontinued), or they don’t and the business stays in limbo. That is an emergency (or some word with lower impact; an incident?). There isn’t a covert resistance by some teams specifically working against the metric, lowering it while also hiding themselves from notice.
I am bemused that you deride them, given that they are, in fact, how I have responded to metrics in technical projects since I first developed a metrics program for Borland in ‘93. (I championed inquiry metrics and opposed control metrics.)