Principles for Cybersecurity Metrics
“For every metric, there should be another ‘paired’ metric that
addresses adverse consequences of the first metric.” - Andy Grove
We talk a lot about the need for good metrics in cybersecurity. However, I think we are in danger of becoming too obsessed with finding the perfect set of metrics for all contexts.
In most lines of work, a variety of metrics are developed and used to great effect, then simply discarded in favor of new ones once their utility has declined. That said, we do need some core set of metrics that can serve as a basis of comparison, as we have in accounting, safety, quality, health and many other fields. But even those are insufficient to fully describe a situation. For example, most businesses that follow GAAP accounting standards also use many other metrics to measure the financial health of their business. Similarly, most financial institutions don't rely solely on VaR (value at risk) to describe their trading risk; rather, they use hundreds of other measures that have more precision and context and are better at capturing their actual risks under many different scenarios.
You will need a combination of your own metrics and, ideally, some standardized ones. Most importantly, though, you will need a set of guiding principles to help you construct and manage your metrics and related outcomes. There is plenty of material about what makes good metrics including a body of research, books and conferences (e.g. MetriCon). Even with all this, developing, selecting and actively using the right metrics is still hard. Here are some principles I’ve used for many years to guide such work:
1. Outcome Bias. Metrics should, mostly, be oriented toward measuring outcomes / outputs as opposed to simply inputs. For example: measure the effect of an added resource, not only the amount of resource added.
2. Longevity and Accuracy. Metrics should survive the test of time, with measurement biased toward long-term effect. Additionally, meta-metrics (metrics about metrics) should focus on the accuracy and completeness of the underlying metrics data, augmented with back-testing.
3. Focus on Purpose not Rewards. The more measurement serves results directly, rather than acting as a signal to shape human behavior toward those results, the better. Extrinsic rewards or punishments should not be directly linked to a specific metric, to avoid gaming of that metric. Yes, it's true, you get what you measure, and that can be a big problem if you only get what you measure. This is especially problematic because your measurements can never fully describe your organization's goals.
4. Utility Focus. Count what counts, not what you can count. Is what is being measured a proxy for what you really want to know? Don't just report on the things you can currently measure. A highly effective technique is to show on your dashboards the metric you need but can't yet obtain, leaving it blank and marked in whatever way you use to signal high risk. This attracts questions, which often become the impetus to go do the work to get that data. Sometimes your biggest risk is the risk that you can't measure what you need to be measuring.
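The blank-but-flagged technique above can be sketched in a few lines. This is a minimal illustration, not any particular dashboard product; the metric names and the "[HIGH RISK: no data]" marker are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Metric:
    name: str
    value: Optional[float]  # None means "needed but not yet obtainable"

def render_row(m: Metric) -> str:
    """Render one dashboard row; a missing metric is shown blank and
    flagged with the high-risk marker rather than silently omitted."""
    if m.value is None:
        return f"{m.name:<30} {'':>8}  [HIGH RISK: no data]"
    return f"{m.name:<30} {m.value:>8.1f}"

rows = [
    Metric("patch latency (days)", 4.2),
    Metric("attacker dwell time (days)", None),  # hypothetical unmeasured metric
]
for m in rows:
    print(render_row(m))
```

The point of rendering the empty row at all is that its absence would hide the gap; its flagged presence invites the questions that fund the data collection.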
5. Efficiency. Is the usefulness of more metrics for a given area worth the effort and cost of collection? Can metrics be obtained as a by-product (“digital exhaust”) of existing processes rather than being explicitly collected?
6. Expertise Proximity. Are the proposed metrics developed by, or agreed (within reason) with, the people with expertise in the domain being measured? Bias toward bottom-up metrics based on observed good practice, which is then sustained.
7. Assume Gaming. Even the best measurements will be subject to (possibly inadvertent) corruption of goals. Plan for it. Pair metrics with other metrics to monitor for such gaming or unintended consequences.
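The pairing idea from the Grove epigraph can be reduced to a simple check: if the primary metric improves while its paired counter-metric degrades, investigate. A minimal sketch, assuming deltas are normalized so that a positive value always means "improved":

```python
def gaming_suspected(primary_delta: float, paired_delta: float) -> bool:
    """Flag for review when the primary metric improves while its
    paired counter-metric worsens: a classic symptom of goal corruption.
    Both deltas are assumed normalized so positive means improvement."""
    return primary_delta > 0 and paired_delta < 0

# e.g. tickets closed per week is up 15%, but reopen rate worsened 8%
suspicious = gaming_suspected(primary_delta=0.15, paired_delta=-0.08)
print(suspicious)
```

The check is deliberately crude; in practice you would add thresholds and trend smoothing, but even this form makes the pairing explicit rather than leaving it to ad hoc review.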
8. Directional Consistency. Make sure metrics are at least directionally consistent even if they are not as accurate as you'd like. If a metric indicates things are getting better, things should, of course, actually be getting better.
9. Use a Metrics Taxonomy. Develop a taxonomy to shape the type of metrics you collect so you are rigorous in using them correctly. For example:
Baseline Metrics: basic expected controls conformance, for example: patching, identity/access, configuration assurance, etc.
Asset / Service Risk Oriented Metrics: measurement of risk to specific groups of assets / business services in relation to attacker motivation and capability, for example: layers of complementary defense from attacker to target, degree of pressure on controls, dwell time of [possibly simulated] attacks etc.
Capability Improvement: measurement of overall efficiency and organizational capability (people, process, tools), for example: SDLC integration levels, skills density, system stagnancy, blast radius for major incident scenarios etc.
Commercial Outcomes: what direct or adjacent benefits are expected in addition to loss avoidance, risk reduction or other security outcomes, for example: increased customer sign-up rates due to seamless authentication processes, increased collaboration in support of customers, reduction in employee overhead / time spent managing controls etc.
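One way to make such a taxonomy operational is to encode the categories and check that your metrics catalog covers each one. A sketch under stated assumptions: the category names follow the list above, and the example metrics in the catalog are hypothetical placeholders.

```python
from enum import Enum

class MetricCategory(Enum):
    BASELINE = "baseline"                 # expected controls conformance
    ASSET_RISK = "asset_service_risk"     # risk to asset groups / services
    CAPABILITY = "capability_improvement" # people, process, tools
    COMMERCIAL = "commercial_outcomes"    # direct or adjacent benefits

# hypothetical catalog keyed by category, so periodic reviews can
# verify every category actually has live metrics behind it
catalog = {
    MetricCategory.BASELINE: ["patch latency", "config drift"],
    MetricCategory.ASSET_RISK: ["dwell time of simulated attacks"],
    MetricCategory.CAPABILITY: ["SDLC integration level"],
    MetricCategory.COMMERCIAL: ["sign-up conversion after auth change"],
}

uncovered = [c for c in MetricCategory if not catalog.get(c)]
print("categories without metrics:", [c.value for c in uncovered])
```

Keeping the taxonomy in code like this turns "are we rigorous about metric types?" into a checkable property instead of a review-meeting question.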
Bottom line: don't let the perfect be the enemy of the good in selecting and managing outcomes with metrics. By all means strive for standards of comparison, but don't be afraid of having your own metrics that suit your environment. Just make sure to follow some principles of construction and apply them with ruthless consistency.