- Phil Venables
Risk = Hazard + Outrage
Updated: Aug 15, 2021
There are four major insights that, above all others, have influenced my approach to security and risk management over the past decades. Two were, I think, my own, although, to be fair, they were developed from the fusion of many ideas in the field and were likely conceived in parallel by others, possibly expressed in different ways, so I’m not claiming absolute originality.
1. Attackers have bosses and budgets too
From this insight comes the notion that you can apply economic forces to shift attacker behavior. You can manage your own trade-offs, weighing sufficient defenses against knowledge of not just the capabilities of the threat actors but also their motivation to use all of those capabilities on you - or not. From this also comes the ability to counter many assumptions, such as the claim that attackers only have to be right once but defenders have to be right all the time. Knowing attackers have constraints means you can just as easily recast this as: attackers have to be unseen all the time, yet defenders only have to spot one part of the attack and then act to destroy the attacker’s efforts.
2. We need secure products, not just security products
Yes, this is obvious now. However, when many of us started talking about this 25+ years ago it seemed insightful. This thinking has given us much of what we discuss today about embedding security as early as possible into the design process (so-called shift left).
The next two insights were not mine:
3. Reflections on trusting trust
Ken Thompson’s Turing Award Lecture is ever more relevant today when we consider software supply chain attacks.
4. Risk = Hazard + Outrage
This is the core concept of Peter Sandman's large body of work. It was developed mainly as a risk communication approach but, for me, it is an important part of enterprise risk management and informs how to balance logic and perception in practical ways.
Let’s further explore Risk = Hazard + Outrage. I’m not going to even attempt to effectively summarize Peter’s extensive body of work but I’ll dip into it here as to why I think it is important for how we manage modern day enterprise, technology or cybersecurity risk.
As the equation states, Risk is a function of Hazard and Outrage (in fact some of Peter’s latest discussions start to restate this as Risk = f(Hazard, Outrage) to placate some criticism of the original formulation's simplicity). Most traditional risk analysis methods, whether quantitative or qualitative, focus extensively on hazards. That is, they focus on risk as an impact or consequence of various factors of threat and vulnerability. It’s a purely mechanical view that says we should care more or less about mitigating something based on its operational or other impact on the system or organization. Occasionally, these methods consider reputation or brand impact, which is a form of hazard consequence, but rarely do they move beyond pure hazard. This is fine in theory, but those of you who have done the job of CISO, Chief Risk Officer, Chief Compliance Officer or similar know that a big chunk of the reality of day-to-day risk management is as much about Outrage as Hazard. Let’s work through some examples:
Case 1: Risk Acceptance Regret
I knew an organization some time ago that had identified an issue with one of its suppliers not encrypting laptops to protect some employee data on those laptops. This was when encrypting laptops properly needed third party software and a large effort to deploy, manage and support that. Now, of course, this is less of an issue (secure products, not security products). The security team recommended that the vendor be forced to implement third-party full disk encryption software at some expense - support costs included. But the relatively small amount of data at risk, the limited sensitivity of the data in real terms and a longer term plan for the vendor to change its business process to remove data from laptops meant that it was decided to temporarily accept the risk of this situation. This looked at the world purely from a Risk = Hazard perspective. The security team didn’t agree, but management decided from a cost/risk perspective that it was acceptable. As is the way of things, shortly after this decision, one of the laptops was lost and it had to be concluded the data was exposed. Now, the risk having actually been realized, outrage kicked in, despite the actual impact being quite limited. Management decided to no longer accept the risk, decided to fund emergency measures at the vendor to fix their laptop fleet and also to ask the security team why they, management, were permitted to accept this risk (yes, "why did you let me make that decision").
Risk = Hazard led to one decision; in hindsight, Risk = Hazard + Outrage led to a different one. I know, you could argue that the incident in hindsight caused a more pointed assessment of the Hazard, but I've also known plenty of organizations that even then still forget the Outrage and make the prior decision again. This is why, after an incident, it's important to move quickly to capitalize on the new momentum. I know this organization subsequently applied the Outrage element consistently in future decisions and, as a result, accepted a lot less risk, revisited accepted risks more regularly and, for the risks it did accept, took additional measures to reduce the scope of exposure and so the potential Outrage.
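To make the flip in Case 1 concrete, here is a rough sketch of the accept-or-mitigate decision under each lens. The function name, the simple cost comparison and all of the figures are hypothetical assumptions chosen for illustration; they are not the organization's actual numbers or method.

```python
def should_mitigate(
    mitigation_cost: int,
    expected_hazard_loss: int,
    expected_outrage_cost: int = 0,
) -> bool:
    """Mitigate when the expected cost of the risk exceeds the cost of fixing it.

    Leaving expected_outrage_cost at 0 gives the Risk = Hazard view;
    supplying it gives the Risk = Hazard + Outrage view.
    """
    return expected_hazard_loss + expected_outrage_cost > mitigation_cost


# Hypothetical figures for the vendor laptop encryption decision.
MITIGATION_COST = 50_000       # assumed cost of deploying encryption at the vendor
EXPECTED_HAZARD_LOSS = 10_000  # assumed loss from a small, low-sensitivity exposure
EXPECTED_OUTRAGE_COST = 75_000 # assumed cost of emergency fixes, escalations, explanations

hazard_only_decision = should_mitigate(MITIGATION_COST, EXPECTED_HAZARD_LOSS)
with_outrage_decision = should_mitigate(
    MITIGATION_COST, EXPECTED_HAZARD_LOSS, EXPECTED_OUTRAGE_COST
)

print("Hazard only -> mitigate?", hazard_only_decision)
print("Hazard + Outrage -> mitigate?", with_outrage_decision)
```

With these assumed numbers the hazard-only view accepts the risk and the combined view funds the fix, which mirrors the before and after of the lost laptop.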
Case 2: Scope Regret
Another organization, at around the same time, had a related issue that I think most organizations deal with in their third-party risk programs: criticality ranking. Specifically, ranking vendors to determine which ones get which depth of review. This organization had 10,000+ vendors and, correctly (Risk = Hazard), ranked these vendors into various tiers. The higher risk tiers were dominated by vendors that held the organization’s most sensitive internal or customer data or whose operations otherwise created a significant resiliency dependency. The vendors in the higher risk tiers were subject to deep reviews, were directed to mitigate discovered risk and were required to build sustainable risk management and security programs. The lower tier vendors were essentially not inspected beyond a very simple questionnaire. This was working well. Then a coincidental set of security incidents occurred at various low tier vendors in a relatively short time frame, which alarmed (outraged) management. This was despite the almost negligible consequences to business operations or customer trust. The third-party risk assessment team was directed to radically expand the depth and breadth of their reviews under management’s lens of Risk = Outrage. Now, the problem here is that the security team was unprepared for the consequences and did not adequately prepare management for what was bound to occur. The consequence was extensive reviews of all vendors almost irrespective of risk tier which, because the lower risk vendors had (often appropriately) fewer controls, generated thousands of issues to triage and then either mitigate at significant cost or risk accept. Now the next level of Outrage kicked in: this was too much cost vs. the actual hazard. After much wrangling, a new method of risk tiering was cast that deliberately used a form of Risk = Hazard + Outrage. This was to not only Hazard rank the vendor list but also to have business unit leadership pick, from the wider list, vendors that had not made the cut on Hazard but at which they would not want to see an issue (Outrage).
This didn’t increase the scope much and, more usefully, created some conversations about the correctness of the risk ranking methodology. It also uncovered a useful finding: management were including certain vendors on Outrage criteria because they knew what they were strategically planning for that vendor before it had even been formally considered. So Outrage identification became a Hazard predictor. But ultimately, management’s Outrage additions were for the services/vendors about which those business units simply never wanted to have to explain even minor or inconsequential events to customers or regulators, because of other trust factors (external Outrage).
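The revised tiering in Case 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: the tier numbers, the cut-off for deep review, the vendor names and the function itself are all hypothetical, not the organization's actual process.

```python
from typing import Dict, Iterable, List


def select_review_scope(
    hazard_tier: Dict[str, int],
    outrage_picks: Iterable[str],
    deep_review_tiers: frozenset = frozenset({1, 2}),  # assumed cut-off
) -> Dict[str, List[str]]:
    """Map each vendor selected for deep review to the reason(s) it was selected."""
    scope: Dict[str, List[str]] = {}

    # Step 1: Hazard ranking. Vendors in the top tiers are always in scope.
    for vendor, tier in hazard_tier.items():
        if tier in deep_review_tiers:
            scope.setdefault(vendor, []).append("hazard")

    # Step 2: Outrage additions. Business unit leadership picks vendors from
    # the wider list; picks that are not known vendors are ignored.
    for vendor in sorted(outrage_picks):
        if vendor in hazard_tier:
            scope.setdefault(vendor, []).append("outrage")

    return scope


# Hypothetical vendor list: tier 1 is the highest hazard, tier 4 the lowest.
hazard_tier = {
    "payments-processor": 1,
    "hr-data-host": 2,
    "marketing-mailer": 3,
    "office-catering": 4,
}
outrage_picks = {"marketing-mailer", "hr-data-host"}

scope = select_review_scope(hazard_tier, outrage_picks)
for vendor, reasons in scope.items():
    print(vendor, reasons)
```

Recording the reason for inclusion is a deliberate choice in this sketch: a vendor that appears on Outrage alone is a prompt to ask whether the Hazard ranking missed something.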
Case 3: Optics of Security
One of my favorite examples of Risk = Hazard + Outrage, which I know many organizations have gone through, is storage encryption in the enterprise. Again, this is before cloud environments gave you this by default (secure products, not security products). In on-premises environments over the past decade or so, many organizations have had to face the practical choice of whether to encrypt all data at rest. Now, most security professionals know (Risk = Hazard) that encrypting disks en masse doesn’t really help much. When an attacker has penetrated an organization and is operating with privilege, the encryption system is happily decrypting everything for the attacker anyway. There are some situations in which disk encryption is effective, but not many of the mass data exfiltration attacks. So why do so many organizations push to do full disk encryption in the data center? Well, whether they realize it or not, it’s because they are in fact following Risk = Hazard + Outrage. The correct way to mitigate the risk is to use tokenization or field encryption and many layers of segmentation and privileged access controls at the O/S, database and application layers. This takes significant effort and is highly situation specific, and so takes time to implement - many years in a complex organization. In the interim, many organizations correctly (from an Outrage perspective) implement quick and cheap full disk encryption. Why? Well, if you happen to be unfortunate enough to suffer a data breach, you will be asked: was the data encrypted? If you answer no (or "well, kind of, but not completely") then that will be used as an emblematic part of your failure. If you answer yes, all the data was encrypted, then you get to communicate a deeper story and not suffer as much Outrage. This isn’t necessarily logical, but the world we live in isn’t always logical and so the Risk = Hazard + Outrage formulation is important.
Case 4: Vulnerability Inversion
I’ve seen this in many organizations’ vulnerability management programs. They apply tremendous rigor to fix high risk vulnerabilities in critical systems (Risk = Hazard). This is a correct application of resources. The problem, and I’ve watched many organizations fall into this trap, is that if there is an incident with a relatively minor impact from an exploit of a vulnerability on a lower criticality system, then customers, regulators or other stakeholders don’t say or think, “Oh good, that wasn’t much impact, I’m glad the organization is focusing on the most important things.” No, they often say or think something like, “If the organization can’t do the simple lower risk items well, how can it be trusted to do the more difficult higher risk items? I’m now worried.” This is the full Risk = Hazard + Outrage lens.
Incidentally, paying attention to this can lead you to a better place, as it forces you to think about the economics of vulnerability triage, response and patching, and to focus less on triage and more on improving the resolution process so you can deal more effectively with more vulnerabilities (raise the baseline by reducing the cost of control). Another organization I knew, as early as 2004, simply chose to patch everything on Windows continuously because it was, in the end, cheaper than applying the effort to make a decision on every patch. Another economic corollary of this effect is that, when you take into account the time and effort spent communicating your Hazard decision to placate Outrage, you might as well just spend that effort on not having to deal with that broader impact. Essentially: Cost of Dealing with Risk = Cost of Dealing with Hazard + Cost of Dealing with Outrage
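The cost equation can be worked through with a small sketch comparing "triage every patch" against "patch everything". The function names and every figure below are hypothetical assumptions used only to show the arithmetic, with effort measured in minutes; they do not describe the 2004 organization's actual costs.

```python
def triage_strategy_cost(
    n_patches: int,
    n_fixed: int,
    triage_minutes: int,
    patch_minutes: int,
    outrage_minutes: int,
) -> int:
    """Cost of Dealing with Risk = Cost of Hazard + Cost of Outrage.

    Hazard cost: triage every patch, then apply only the ones judged important.
    Outrage cost: effort spent explaining and defending what was left unpatched.
    """
    hazard_cost = n_patches * triage_minutes + n_fixed * patch_minutes
    return hazard_cost + outrage_minutes


def patch_everything_cost(n_patches: int, patch_minutes: int) -> int:
    """No per-patch decision and, by assumption, no residual Outrage to manage."""
    return n_patches * patch_minutes


# Hypothetical workload: 1,000 patches, of which triage selects 400 to apply.
triage_hazard_only = triage_strategy_cost(1000, 400, 30, 60, outrage_minutes=0)
triage_with_outrage = triage_strategy_cost(1000, 400, 30, 60, outrage_minutes=12_000)
patch_all = patch_everything_cost(1000, 60)

print("Triage, hazard only:      ", triage_hazard_only)
print("Triage, hazard + outrage: ", triage_with_outrage)
print("Patch everything:         ", patch_all)
```

Under these assumed numbers, triage looks cheaper until the Outrage term is added, at which point patching everything wins. Reducing the per-patch cost shifts the balance further in the same direction.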
Returning to Peter Sandman's work more precisely: he identified three useful intersections of differing degrees of Hazard and Outrage.
Precaution Advocacy: High Hazard, Low Outrage. This is where the risk team see a significant risk, but people don’t care (yet). This requires a different communication style and approach.
Outrage Management: Low Hazard, High Outrage. This is where the risk is well managed but to the outsider it doesn’t seem so, or where outrage has been manufactured for publicity effect.
Crisis Communication: High Hazard, High Outrage. This is where your crisis communication plans come into effect, but they need to be balanced to manage the outrage as well as the hazard, because hazard reduction often does not automatically reduce outrage.
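The three intersections above amount to a simple lookup, sketched below. Reducing Hazard and Outrage to high/low booleans and returning None for the low/low case (which is not one of the three intersections listed here) are simplifying assumptions of this sketch.

```python
from enum import Enum
from typing import Optional


class CommunicationMode(Enum):
    PRECAUTION_ADVOCACY = "precaution advocacy"
    OUTRAGE_MANAGEMENT = "outrage management"
    CRISIS_COMMUNICATION = "crisis communication"


def communication_mode(high_hazard: bool, high_outrage: bool) -> Optional[CommunicationMode]:
    """Pick the communication approach for a given Hazard/Outrage combination."""
    if high_hazard and high_outrage:
        return CommunicationMode.CRISIS_COMMUNICATION
    if high_hazard:
        return CommunicationMode.PRECAUTION_ADVOCACY
    if high_outrage:
        return CommunicationMode.OUTRAGE_MANAGEMENT
    return None  # low hazard, low outrage: not covered by the three intersections


# A well-managed risk that outsiders are nonetheless upset about.
mode = communication_mode(high_hazard=False, high_outrage=True)
print(mode.value if mode else "no special mode")
```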
Incidentally, the occasional disconnect between risk views espoused by Boards, CEOs, CIOs and CISOs (as covered in this article) becomes less puzzling when you look at this as not only a problem of different perceptions of Hazard but also as different perceptions of potential Outrage.
Bottom line: whenever you make a risk decision, don’t just focus on the Hazard; also look at the potential Outrage. Sometimes the cost and impact of outrage management might tip the economic balance toward mitigating the risk where hazard alone doesn’t. Even if it doesn’t change your work much, you will find some of your thinking and some of your approaches change when you adopt an additional outrage mindset. Another technique to counter the post-event outrage of “how could this possibly happen!” is to bring that hindsight into foresight using pre-mortems.