This is the first of a two-part post, centered on systems, computer science and engineering research challenges. Part 2, “Carbon,” coming in a couple of weeks, will discuss human, behavioral and economic challenges. This post is from a talk I gave at the University of York (UK, not Canada) nearly 10 years ago. As with some of my other recent posts where I’ve gone back through past talks and publications, I am surprised (perhaps disappointed) that many of the challenges remain consistent. However, as we see in other scientific and engineering disciplines, the persistence of long-term challenges is not necessarily a bad thing. It is perhaps a sign that we’re on to something worth the focus and long-term effort.
I think there is cause for optimism here. The past decade has seen a significant increase in academic and applied engineering research in information/cybersecurity. There are many fine research programs at universities all around the world, as well as some excellent private-sector funded research programs such as at Google Research.
There have also been some great studies on the directions needed in research. I was privileged to be a part of the team that developed the National Academies report on this in 2017, which correctly, in my view, posed the following four major themes of research direction.
The content below, from the talk I gave, drops a level of detail below those themes. I know there will be gaps in it, and I’d welcome suggestions for additions.
Many of our issues can be simply stated: we need to specify and codify our security requirements as a set of rules to be enforced and monitored, and we need to map those rules to clearly stated policy goals derived from a compilation of risk analyses, laws, regulations and opportunities. The challenge is one of modeling, translating and applying those rules to massively distributed, complex environments of people, objects/data and systems that often appear to behave organically. All of the research challenges below stem from one premise: that security is an emergent property of a complex system rather than something that can simply be designed in. The fostering of such emergence is, by definition, a dynamic process in need of observation, positive and negative feedback loops, and multiple levels of abstraction.
1. Distributed Policy Specification and Enforcement
Real-world systems are massive aggregations of multiple components (software, data, network, storage, compute, etc.) that need to work in harmony across multiple policy enforcement and decision points to achieve a desired security posture. Configuration of this environment is often too dependent on skilled security and other personnel applying multiple layers of translation from human-postulated objectives to machine-readable policy. We need an integrated modeling, policy management, rule distribution and enforcement framework.
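To make the translation problem concrete, here is a minimal sketch of the idea: a single human-stated objective compiled into machine-readable rules and fanned out to per-layer enforcement points. All names, fields and the policy format here are hypothetical illustrations, not any particular framework.

```python
# Hypothetical objective, stated once at the human/business level.
OBJECTIVE = {"goal": "only-finance-can-read-payroll", "resource": "payroll-db"}

def compile_objective(obj):
    """Translate one policy objective into rules for each enforcement layer."""
    return {
        "network":  {"allow_from": ["finance-subnet"], "resource": obj["resource"]},
        "database": {"grant": "SELECT", "role": "finance", "table": obj["resource"]},
        "audit":    {"log_access_to": obj["resource"]},
    }

def enforce(rules, layer, request):
    """A single policy decision point: default-deny unless a rule matches."""
    rule = rules.get(layer, {})
    return request.get("source") in rule.get("allow_from", [])

rules = compile_objective(OBJECTIVE)
print(enforce(rules, "network", {"source": "finance-subnet"}))  # True
print(enforce(rules, "network", {"source": "guest-wifi"}))      # False
```

The point of the sketch is the shape of the pipeline: one authoritative statement of intent, mechanically compiled and distributed, rather than hand-configured at each layer by skilled personnel.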
2. Federated Monitoring and Policy Verification
The problem above is further complicated when distributed systems extend across the entire supply chain. We need to be able to make security decisions based on the apparent trustworthiness of an element beyond our immediate policy control, e.g. a vendor’s system. We need to do this without necessarily obtaining transparency over the full detail that is typically available only to the element’s own enforcement and monitoring. Thus a trustworthy federated monitoring and policy enforcement verification approach (say, a reliable technical attestation framework) is needed, one that can also preserve the necessary privacy of the related entity’s environment and that of its other agents or customers.
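A toy sketch of the attestation idea, much simplified from any real framework: the remote party proves its configuration matches a known-good measurement by sharing only a keyed digest, never the configuration itself, which is what preserves the privacy of its environment. The key provisioning and configuration strings below are assumptions for illustration.

```python
import hashlib
import hmac

SHARED_KEY = b"provisioned-attestation-key"  # assumed pre-provisioned out of band

def measure(config: bytes) -> str:
    """Attester side: produce a keyed measurement of local state."""
    return hmac.new(SHARED_KEY, config, hashlib.sha256).hexdigest()

def verify(reported: str, known_good: str) -> bool:
    """Verifier side: compare against the expected measurement only."""
    return hmac.compare_digest(reported, known_good)

vendor_config = b"os=hardened-v2;patch=2024-01"
known_good = measure(vendor_config)  # baseline recorded at onboarding

print(verify(measure(vendor_config), known_good))  # True
print(verify(measure(b"os=tampered"), known_good))  # False
```

Real attestation additionally needs hardware roots of trust, freshness (nonces) and revocation; the sketch only shows the privacy-preserving shape of the exchange.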
3. Data Level Entitlement Policy Enforcement : Rules, Roles, Rights, Requests
Access to data, and the metadata that describes it, is often codified under a complex array of rules, roles, attributes, rights and request workflows. Policy can be explicitly defined or derived from the surrounding organizational context. Work is needed to develop more tools to manage, visualize and verify this, and to provide a management environment usable by business risk management or other non-technical personnel.
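How rules, roles, rights and request workflows combine into one decision can be sketched as follows; the policy entries, role names and approval model are hypothetical.

```python
# Hypothetical entitlement policy: role + right, with an optional
# request-workflow step (approval) before the right is granted.
POLICY = [
    {"role": "analyst", "right": "read",   "requires_approval": False},
    {"role": "analyst", "right": "export", "requires_approval": True},
]

def entitled(role, right, approvals):
    """Combine role, right and workflow state into a single decision."""
    for rule in POLICY:
        if rule["role"] == role and rule["right"] == right:
            return (not rule["requires_approval"]) or ("manager" in approvals)
    return False  # default deny: no matching rule

print(entitled("analyst", "read", []))             # True
print(entitled("analyst", "export", []))           # False: approval pending
print(entitled("analyst", "export", ["manager"]))  # True
```

Even this tiny example hints at why visualization and verification tools matter: real policies have thousands of such rules, plus attributes and derived context, interacting in ways non-technical owners need to be able to see.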
4. Distributed Interoperable Data Protection
Data is created all the time and flows between organizations. Data is typically protected in the channel (in motion), in storage (at rest), and now even in use. Much progress has been made here, but there are still challenges in enforcing policy rights once information has left the security boundary of the originating organization. Digital/enterprise rights management software has been useful but needs more work on policy protocol interoperability and transport/serialization interoperability, to facilitate sharing and control across heterogeneous environments, as well as on linking those rights controls into trusted computing stacks.
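The core rights-management idea, that usage policy travels with the data itself, can be sketched as a self-describing envelope. This is an unencrypted, simplified stand-in for real rights-management serialization; the field names and policy vocabulary are invented for illustration, and real systems bind the payload cryptographically rather than relying on the receiver's goodwill.

```python
import base64
import json

def package(payload: str, policy: dict) -> str:
    """Originator side: attach the usage policy to the data before it leaves."""
    envelope = {"policy": policy,
                "payload": base64.b64encode(payload.encode()).decode()}
    return json.dumps(envelope)

def open_envelope(blob: str, requester_org: str):
    """Receiver side: enforce the originator's policy before releasing data."""
    env = json.loads(blob)
    if requester_org not in env["policy"]["allowed_orgs"]:
        return None  # policy denies this organization
    return base64.b64decode(env["payload"]).decode()

blob = package("quarterly results",
               {"allowed_orgs": ["partner-a"], "expires": "2025-01-01"})
print(open_envelope(blob, "partner-a"))  # quarterly results
print(open_envelope(blob, "partner-b"))  # None
```

The interoperability challenge in the text is exactly about this envelope: two organizations can only honor each other's policies if they agree on the policy language and the serialization format.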
5. Predictably Secure Systems and Service Development
We need to further improve the security and reliability of the systems all organizations produce so as to resist ever more sophisticated attacks. This is not just about code analysis tools, penetration testing, fuzzing or other automated or systems-assisted reviews; it is also about a more fundamental integration of security and reliability objectives across the whole software lifecycle. This is another space where massive progress has been made in integrating security into developer tooling, testing and deployment mechanisms. There have also been significant practical advances in formal verification at larger scale. Tying this together in the years to come, especially for critical systems, will be important.
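As one small illustration of security folded into the development loop, here is a toy fuzz-style test: randomized inputs are thrown at a parser and an invariant is asserted, instead of hand-picking cases. The parser and its format are hypothetical; real fuzzing tooling is far more sophisticated, but the shape is the same.

```python
import random

def parse_length_prefixed(data: bytes):
    """Return the payload described by a 1-byte length prefix, or None."""
    if not data:
        return None
    n = data[0]
    if len(data) - 1 < n:
        return None  # the bug class fuzzing often finds: short/truncated input
    return data[1:1 + n]

random.seed(0)  # reproducible randomized inputs
for _ in range(1000):
    blob = bytes(random.randrange(256) for _ in range(random.randrange(8)))
    out = parse_length_prefixed(blob)
    # Invariant: never return more bytes than were actually supplied.
    assert out is None or len(out) <= len(blob) - 1
```

Running checks like this on every commit is one concrete form of "integration into the whole software lifecycle" rather than a one-off review.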
6. Software, Behavior, Protocol and Zone Least Privilege with Dynamic Adjustment
[Note: if I were writing this fresh today, I’d probably use the phrase “zero trust”]
We are moving, and in many cases have moved, to a world where we can no longer sufficiently find, detect and stop “bad stuff” [software, behavior, anomalous protocol communications, flows]. Rather, we need to keep moving toward permitting only known “good stuff”. This is relatively straightforward for new, simply constructed (even at-scale) environments, but is more difficult for large-scale environments that have evolved over time and need to have this approach retroactively applied. Research and tools are needed to help profile, monitor, abstract and enable a move from block-list to allow-list approaches across an array of problems: usable and scalable default-deny, protocol/access least privilege, and effective zone/enclave-based defense-in-depth at finer granularity and at multiple levels of abstraction.
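The block-list versus allow-list contrast can be shown in a few lines, using process launches as a stand-in (the names are invented). The allow-list profile is the part that is hard to build retroactively, and is what the profiling and monitoring tools mentioned above would help construct.

```python
BLOCK_LIST = {"known-malware.exe"}             # what we know is bad
ALLOW_LIST = {"payroll-app", "backup-agent"}   # what we know is good

def block_list_decision(program):
    """Deny only what is known bad: anything unknown slips through."""
    return program not in BLOCK_LIST

def allow_list_decision(program):
    """Default-deny: only known-good programs may run."""
    return program in ALLOW_LIST

novel_threat = "never-seen-before.exe"
print(block_list_decision(novel_threat))  # True: permitted, slips through
print(allow_list_decision(novel_threat))  # False: denied by default
```

The same contrast applies at every level the section names: software, behavior, protocol flows and network zones.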
7. Massive Scale Anomaly Detection and Behavioral Analytics
The complexity of systems, the fast evolution of attacks, and the increasing inherent risk of many digitized systems mean we have to do more monitoring for bad behaviors or earlier warning signs of attacks. This means increasing the coverage of our sensory apparatus as well as ingesting the digital exhaust (e.g. logs) of our entire environment. Developing models for how to fuse, manage, analyze and make sense of the signals that come from this is a hard problem in need of further research.
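At its simplest, anomaly detection over that digital exhaust means comparing a signal to a learned baseline. A toy sketch, with illustrative numbers and a threshold chosen only for the example, flagging an hourly log-event count that deviates sharply from recent normal operations:

```python
import statistics

# Illustrative baseline: log events per hour during normal operations.
baseline = [120, 118, 125, 130, 122, 119, 127, 124]
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def anomalous(count, z_threshold=3.0):
    """Flag counts more than z_threshold standard deviations from baseline."""
    return abs(count - mean) / stdev > z_threshold

print(anomalous(123))  # False: within normal variation
print(anomalous(480))  # True: a burst worth investigating
```

The research problem is everything this sketch leaves out: fusing thousands of such signals across heterogeneous sources, handling drifting baselines, and distinguishing true early-warning signs from benign change at massive scale.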
Bottom line: across the research community and with engineers/practitioners, we have made and continue to make enormous strides on these problems. When I first wrote this list there was research progress, but little of it was making it into the field. Now, on each of these points, there are large-scale deployments of specific controls built on concepts such as zero trust, service meshes, higher-assurance trustworthy computing, default encryption, sandboxing/enclaves, hardware-assisted security, policy languages, and security integration into software development, testing and deployment management. With these, together with the notion of controls as code and the gradual application of declarative policy languages that drive the configuration of much of modern computing, you can’t help but feel we’re headed in the right direction.