- Phil Venables
The Uncanny Valley of Security - Updated
Updated: Dec 5, 2022
Since I first wrote this post 2 years ago I keep seeing it reinforced. The basic premise is that, sometimes, advanced levels of security can look like poor security. Recently Dino Dai Zovi tweeted something which serves to illustrate this:
Specifically, on the face of it, the number of 0-days found and resolved can be naively interpreted as a bad thing, in the sense that there are vulnerabilities and they’ve been exploited. But looking at it another way (many of us would argue the correct way) this means that more 0-days are being discovered in use and are being burned. Any counterargument that compares the relative number of 0-days in one product vs. another would have to take in many factors, such as usage and therefore the degree of focus from attackers, and the extent to which those vendors are hunting for exploitation, fixing the vulnerabilities and getting those fixes out. A vendor who claims fewer vulnerabilities in their products also needs to be able to show that their products are coming under the same assault and the same self-scrutiny, and must recognize that a relative absence of evidence of 0-day exploitation is not necessarily evidence of absence - would they even know if they’re not aggressively looking for exploitation?
So, this is another example of the uncanny valley of security. Let's revisit this.
The uncanny valley is a famous term in robotics. It is used to describe how we accept robots that don’t attempt to look too human, but as they approach a near life-like appearance we are repulsed by them. The diagram below illustrates this. Similar issues appear in animation: who remembers The Polar Express?
We can also experience a similar uncanny valley when faced with mismatches in our privacy expectations. This, and other work, asserts that uncanny valleys appear when something is expected to be perfect but isn’t quite there yet:
This is all relevant to many parts of risk and security. I wrote about the “don’t fire me chart” as one example of such an uncanny valley. In particular, where teams get so good at identifying issues that it leads management to conclude things are getting worse. This is one of many reasons why environments with no measurement / no transparency usually look (paradoxically) better than those that diligently uncover issues - at least until a blow up occurs. The secret to sustained success in many disciplines is to push past the uncanny valley. The chart below is a risk-oriented generalization of the uncanny valley chart. The vertical axis is the perception of what is good - remember this is all about perception, not necessarily an absolute logic of reality. The horizontal axis is the degree of transparency created in the risk program, or the true effectiveness, depending on which scenario we are looking at.
Pick any scenario. You start moving along, creating risk transparency and improving effectiveness. In the early stages you are perceived to be doing well. Then at some point in this journey you hit a moment of stall, disillusionment, or discover a deeper and much harder problem that is in the way. Now, the perception of how good things are doesn’t just decrease, but actually collapses. The latter pattern is common. You discover issues and appear to be doing well to resolve them only to uncover some deeper, wider and more fundamental issues that management (or others) never knew existed. You appear to have “created” this problem and so positive perception might collapse. You have entered the uncanny valley. You then have to abandon the work, be abandoned, or push through - assuming you managed to mentally prepare people for this moment. Let’s explore some illustrative examples, using 3 sequential bullets:
Things are going well.
The uncanny valley.
When you’ve pushed through.
1. Improving a Security Program (the original don’t fire me chart)
You are finding and fixing issues, and are perceived to be making great progress.
You put in place more in-depth risk measurement and reveal substantially more issues than were imagined. Management now has way more issues than ever. The excess of issues is perceived as being created by you. You’re in the valley.
You push through and bring the new (actual) reality of issues under deeper and more sustainable control. You are seen as the savior.
2. Regulatory Issues Reporting
You put in place a means to identify and report specific issues to your regulators. For example, finding and reporting occasional privacy transgressions. You are well regarded for establishing this.
However, the regulators see you as an outlier, as the only company in that market reporting such issues. The reality is you’re the only company even looking for them - but you still stand out negatively. You’re in the valley.
You explain to the regulator that this is the case and encourage them to look at the capabilities of all organizations to find and report such issues. They, predictably, discover no-one else was reporting because no-one else was looking. You’ve now gained some significant kudos.
3. Degree of Coverage in Certifications e.g. SOC1 / SOC2
You develop your certifications' organization scope and range of controls to represent a wide view of risk - what you believe is right for your customers. You get some exceptions, none of them serious, but it’s still not a “clean” report.
Leadership (informed by customers) hears your main competitors have zero exceptions and, indeed, have never had any in their reports. You look bad by comparison - you can’t believe they’re clean as you know they all have actual issues. You’re in the valley.
You read the reports of your competitors and you (and your leadership) realize their scope is tiny and the list of controls reviewed in no way represents sufficient risk mitigation. Your customers and theirs are starting to realize this to your benefit. You’ve pushed through and are getting clean reports. Your competitors, having had to expand their scope, are now seeing exceptions and in some cases "qualified" reports.
4. Efficiency and Zero Based Budgets
You develop a budget that masks a lot of the detail of what your team does but there’s limited understanding of what is reasonable so you're doing ok for now.
You then develop a full and detailed zero based budget that reveals, inevitably, a number of inefficiencies in how you’ve grown, some wasted projects and duplication of a few things that just need cancelling. Leadership thinks you might have somehow duped them into this overspending. You’re in the valley.
You move very quickly to re-assign people and budget, trim some failing projects, and focus on demonstrating the degree of efficiency you’ve achieved. You start doing some benchmarking against other organizations based on productivity, not just raw budget. You are seen by leadership as having grown efficiently and are a trusted custodian of increased budget.
5. Incident Reporting and Analysis
Your incident (or near-miss) response process is not transparent. It only shares that an event occurred and was resolved. There is limited root cause analysis or learning. But, it is still perceived as better than the situation was before you put in place such a process to find, triage and resolve incidents.
You start to add detail and root cause analysis to the reports and uncover large numbers of thematic issues, especially as you go deeper down the 5 Whys. The situation looks out of control (see item 1) as you pile issues upon issues. You’re in the valley.
You continue with this process and use your 80/20 to find the 20% (or so) of underlying issues that represent the root cause of the majority of the events. You fast track some fixes to reduce incident levels. You’ve sustainably turned it around.
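The 80/20 step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it presumes each incident record has already been tagged with a single root cause label, and the function name `pareto_causes`, the `coverage` parameter and the sample labels are hypothetical, not taken from any particular incident tooling.

```python
from collections import Counter


def pareto_causes(root_causes, coverage=0.8):
    """Pick the most frequent root causes until they cover `coverage` of incidents.

    `root_causes` is a list with one (hypothetical) root cause label per incident.
    Returns (selected, share): the chosen (cause, count) pairs, most frequent
    first, and the fraction of all incidents those causes account for.
    """
    counts = Counter(root_causes)
    total = sum(counts.values())
    selected = []
    covered = 0
    for cause, n in counts.most_common():
        if covered >= coverage * total:
            break  # the causes picked so far already explain enough incidents
        selected.append((cause, n))
        covered += n
    share = (covered / total) if total else 0.0
    return selected, share


# Made-up example: 20 incidents spread across 5 root causes.
incidents = (
    ["expired certificate"] * 9
    + ["unreviewed config change"] * 8
    + ["disk full", "dns outage", "bad deploy"]
)
top_causes, share = pareto_causes(incidents)
for cause, n in top_causes:
    print(f"{cause}: {n} incidents")
print(f"share of incidents covered: {share:.0%}")
```

In this made-up data two of the five causes account for most of the incidents, which is the kind of short list you would fast track. Real incident data is rarely this tidy, and many incidents have more than one contributing cause, so treat the output as a prompt for discussion rather than a ranking to follow blindly.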
6. Software Security
You start off your software security program focusing on the most critical types of vulnerabilities in your most exposed business services, for example: OWASP Top 10 on your Internet facing applications. Predictably, you find a number of issues but get them fixed pretty quickly.
You then launch a much wider program of work conducting deeper code reviews, design reviews, more aggressive tests and the use of automated tooling, not just on your Internet facing applications but across your entire software estate. You’re now drowning in a myriad of issues, there’s a sense of crisis, your auditors have handed you a bunch of findings based on your self-discoveries and management are not pleased. You’re in the valley.
You partner more closely with your development and developer tooling teams to integrate vulnerability discovery earlier in the lifecycle, you identify many of the common vulnerability patterns and provide tools/toolkits and APIs to mitigate whole classes of issues. These even provide adjacent benefits such as increased reliability.
7. Access Management
You start a basic program of identifying critical assets and business services. You create a centralized view of all employees’ access in those environments and provide a workflow driven report back to supervisors/managers to review and decommission some access. You’re making great progress and have not only reduced risk, but you’ve also cleared up some long-lived audit issues as well.
You extend your access review system to all of your applications, databases, user managed data repositories and anything else you can find. In addition, you get ambitious and start overlaying business rules to identify what you believe might be “toxic combinations” of access. Now you are dealing with large numbers of issues, supervisors/managers are overwhelmed with what they need to review, and most systems don’t even adequately describe the privileges people are being asked to review. There's no role or team grouping to make administration easier. People perceive the situation to be out of control. You’re in the valley.
You start instituting some basic rules and roles such as automatically flagging access to be removed on job transfers. Then you apply some analytics to focus people’s attention on outliers of excess privilege within groups. In parallel you deploy improved workflow systems to manage privilege according to roles and attributes which make most access control decisions automatic. You’ve not only got this under control but have also delivered the adjacent benefit of reduced onboarding time for new personnel.
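One way to picture the "outliers of excess privilege within groups" analytics is the sketch below. It makes several assumptions that are not in the text above: access is available as a simple mapping of user to a set of entitlement names, each user belongs to exactly one group, and an outlier is anyone holding more than `factor` times the median number of entitlements in their group. The names `excess_privilege_outliers`, `entitlements`, `groups` and `factor` are all hypothetical.

```python
from statistics import median


def excess_privilege_outliers(entitlements, groups, factor=2.0):
    """Flag users holding far more entitlements than is typical for their group.

    entitlements: dict of user -> set of entitlement names (assumed input shape)
    groups:       dict of user -> group name (assumed: one group per user)
    factor:       how many times the group median counts as "excess"
    Returns a list of (user, group, entitlement_count, group_median) tuples.
    """
    members = {}
    for user, group in groups.items():
        members.setdefault(group, []).append(user)

    flagged = []
    for group, users in members.items():
        counts = {u: len(entitlements.get(u, set())) for u in users}
        group_median = median(counts.values())
        for user in sorted(users):
            if counts[user] > factor * group_median:
                flagged.append((user, group, counts[user], group_median))
    return flagged


# Made-up example data.
entitlements = {
    "alice": {"read"},
    "bob": {"read", "write"},
    "carol": {"read", "write", "admin", "deploy", "billing", "audit"},
    "dave": {"read"},
    "erin": {"read"},
}
groups = {
    "alice": "payments",
    "bob": "payments",
    "carol": "payments",
    "dave": "support",
    "erin": "support",
}
for user, group, count, typical in excess_privilege_outliers(entitlements, groups):
    print(f"{user} ({group}): {count} entitlements vs. group median {typical}")
```

Counting entitlements is a crude proxy, since one administrative privilege can matter more than dozens of read-only ones. The point of the sketch is the shape of the approach: compare people to their peers so reviewers look at a handful of outliers instead of every line of access.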
8. Security Incidents
Security incidents go down as many of your basic controls and hygiene processes take hold. You're pleased with this even though this is what was expected of you.
Now that many of the basic incidents are no longer happening, the incidents that do remain start to look increasingly weird. In fact many look so unusual that people naively think they should have been easy to stop and are incredulous they weren't. You're in the valley.
You increase transparency of what incidents are being thwarted and show comparisons with available industry data to illustrate the correct base rate. You also involve some senior leadership in your new blameless post-mortem process so they see the effectiveness of your team's work.
Managing the Valley
There are some good practices across all scenarios to avoid the valley or cross the valley quickly:
Plan to avoid the valley. As you’ve seen in the examples above, there’s an obvious common pattern that you could have anticipated the valley and adopted the end solution as your approach from the beginning. In other words, being careful on scope, providing tools and support, integrating carefully into business and technology workflows and so on. But, this is often an idealized situation. In reality you may have to solve problems tactically, either because of wider deadlines, risk priorities, audit or regulatory findings or other driving factors. These layers of tactical work can soon put you in the valley. Often the people who compel you to the quick fix will be the ones who are most aghast when you’re in the valley even though you didn’t get the time to phase the strategy correctly. Try and challenge this up-front.
Foretell the valley. The next option is to pre-communicate to your leadership or other sponsors that the valley is there, it’s coming and what it will look like when you’re in it. Often, this communication may help to alleviate some of the drivers that will otherwise put you in it. If not, then at least it may help assuage the revulsion when you get there.
Cross the valley quickly. Ready yourself for the pain. Perhaps even use it to your advantage to secure the funding or prioritization for the next set of tactics that will push you toward your strategic goals. As the line often attributed to Churchill goes, "If you're going through hell, keep going."
Stop short. Finally, perhaps in some cases it might even be wise to stop short of the end goal - before you enter the valley. Some problems might not be ready to solve.
Bottom line: There's a common theme here. When you bring transparency to find and fix issues you look worse than those who don’t even try, even though your risk, in the end, is less than theirs. When you don’t prepare your leadership or customers for this you don’t just get a decrease in perceived trust, you get a collapse of trust. You’re in the uncanny valley. To make it through to the other side you need either to be moving with enough momentum or to have prepared people for this.