Career Development: 13 Formative Moments (Part 2)

Phil Venables
Oct 21, 2023

The skills for your role and your leadership style build up throughout your career. But I’ve found, personally and in talking to others, that there are also significant formative moments that cause a big leap forward in expertise or outlook. These moments could come from good coaching or, more often, the brutal feedback of life in the real world.

It is interesting to discuss what people’s formative moments are, as everyone is different. Across the previous post and this one are my Top 13 (I like the number 13) that have shaped my worldview and behavior on many things. The nature of them being formative means that they happened earlier in my career and so, given I’ve been around a long time, some of the situations may read like a history lesson. That will either be amusing or depressing since, while many things have improved, the underlying macro challenges of security remain the same.

In this post I’ll cover the final 6:

8. Ask For Forgiveness

There is a cliche that it’s easier to ask for forgiveness than for permission. This is often true although I wouldn’t recommend trying it in all circumstances and environments. For example, I’m writing this on a plane at 36,000 feet and I really don’t want the pilot veering off course to save some fuel and then asking air traffic control for forgiveness.

In mission critical environments there is naturally going to be a lot of valid reluctance to make changes, but a permission-centric approach might turn out to be overly cautious and drive more risk in the long-run vs. the operating risk such caution is seeking to avoid. In some cases the permission culture might also push back against customer or user impacts, potential cost issues or certain types of reputation issues. This might be right but you’ve still got to try and not let that hold things back. There are three main ways to overcome this:

Experiments. Run some small scale changes that can be easily reversed or have limited permanent impact if they don’t go to plan. Then with the (positive) results of the experiment in hand you can likely get the permission needed for the next step. Naturally, it's a matter of judgment what you experiment with, how you contain it and whether you take the risk of no permission (and then forgiveness) or the approach of telling people what you’re doing and giving them a few days to intervene before you take silence as approval.
Skunk Works. For many larger organizations there can be enough slack in the system (if you look hard enough) to deploy a few people on a small scale. This could be enough to build a proof of concept to illustrate the possibility that something is worthy of a bigger investment.
Hold Back Freed Up Resources. In many parts of a security or risk program there are always opportunities to economize or create efficiencies either in terms of money, people or other capabilities. If you’re in an environment where it is required to return that to the central budget pot then hold some back anyway for your experiments or skunk works activities.

Now, the absolutely crucial thing to do if you want forgiveness is to operate with 100% integrity and good intent. Doing some reckless off-book stuff is not good in any circumstance. There’s a difference between being a crazy maverick vs. being an “intrapreneur”.

My favorite experience of this was in a prior job when we were tackling another tricky identity and access problem (so many of my examples are related to identity & access as I think it has caused me the most grief over the years). The goal was basically to create a single master record of who had access to what over the entire company. This remains a challenge today. But back in the early 2000s I proposed the build-out of a team to create this capability, when there was no good solution to buy. I was told by the CIO at the time that this was not a priority, would likely cost too much and would not be achievable because it was too complex. Well, you know what happened next: we did it anyway. We allocated a small team by taking them off some other things and built a proof of concept data warehouse that sucked privilege tables from some key infrastructure systems and a whole range of business applications. For that scope, at least, we could answer the question of who has access to what. We were then able to start the process of improving privilege descriptors so the answers to the question of who has access to what actually had some plain-language meaning. Incidentally, this was at the same time we were integrating a whole bunch of HR, identity and directory systems so the “who” part of that question also had some meaning along with the “what” part.

Not long after this, in 2004/2005, the whole Sarbanes-Oxley Act started to kick into gear with most public companies having to do a much better job of reporting and validating who had what access over what parts of critical accounting and financial reporting systems. We were asked to demonstrate this and the same CIO who had not really wanted to do this was actually quite happy when this requirement was met by the system we’d built behind the scenes. I vaguely remember him taking some credit for the foresight. I don’t really know if this was genuine forgetfulness of his original opposition or a bit of gloss. I didn’t press the matter and had a pretty good time in budgets thereafter.
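For readers who want something concrete, the "who has access to what" warehouse can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the system we built: the source system names, accounts, privilege codes, employee IDs and descriptor text below are all hypothetical. It shows the three ingredients from the story: raw privilege extracts per system, a mapping from accounts to a single identity (the "who"), and plain-language privilege descriptors (the "what").

```python
from collections import defaultdict

# Hypothetical privilege extracts, one per source system: (account, raw privilege code).
SOURCE_EXTRACTS = {
    "unix_infra": [("jsmith", "grp_dba"), ("akhan", "grp_ops")],
    "payments_app": [("JSMITH01", "ROLE_17"), ("mlee", "ROLE_03")],
}

# Hypothetical mapping of per-system accounts to one HR identity (the "who").
ACCOUNT_TO_PERSON = {
    ("unix_infra", "jsmith"): "E1001",
    ("payments_app", "JSMITH01"): "E1001",
    ("unix_infra", "akhan"): "E1002",
    ("payments_app", "mlee"): "E1003",
}

# Hypothetical plain-language descriptors for raw privilege codes (the "what").
# ROLE_03 is deliberately left undescribed to show the gap being surfaced.
PRIVILEGE_DESCRIPTIONS = {
    ("unix_infra", "grp_dba"): "Database administrator on production hosts",
    ("unix_infra", "grp_ops"): "Operations shell access",
    ("payments_app", "ROLE_17"): "Approve outgoing payments",
}


def build_access_inventory(extracts, account_to_person, descriptions):
    """Merge per-system privilege extracts into one record keyed by person."""
    inventory = defaultdict(list)
    for system, rows in extracts.items():
        for account, privilege in rows:
            # Accounts with no known owner are kept visible rather than dropped.
            person = account_to_person.get((system, account), "UNMAPPED")
            meaning = descriptions.get(
                (system, privilege), f"(undescribed) {privilege}"
            )
            inventory[person].append((system, meaning))
    return dict(inventory)


inventory = build_access_inventory(
    SOURCE_EXTRACTS, ACCOUNT_TO_PERSON, PRIVILEGE_DESCRIPTIONS
)
for person, grants in sorted(inventory.items()):
    for system, meaning in grants:
        print(f"{person}: {system}: {meaning}")
```

The design choice worth noting is that unmapped accounts and undescribed privileges are surfaced rather than silently skipped, since those gaps are exactly the work the story describes.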

9. Start All Over Again

Sometimes you just have to scrap the thing and start over. I’ve seen countless projects, programs, systems and software that grew organically beyond their original design intent. This resulted in them having a lot of issues. Many failures are the result of a team clinging on to such systems until a crisis point is reached. Then they do a rebuild.

Early in my career I had a painful, but formative, experience of this. I was working on process control software for British Coal (the old nationalized UK coal mining company) and as part of this I needed to build a circuit board that was essentially a combination of a network and bus-tap to monitor layer 7 traffic to do protocol decoding and diagnostics. I was more of a software guy who knew some hardware than the other way round so, despite some help, I was on shaky ground. This was all proprietary stuff, there was little off-the-shelf hardware to do this more advanced decoding and so I just had to dive in. The obvious mistake, in hindsight, was to work iteratively without thinking about the overall end goals. So my version 0.1 board did a basic link layer tap. Then I had to add buffering because the processor couldn’t decode fast enough to keep up with the throughput. Then I realized the architecture was wrong for this so I had to add some additional components to branch the traffic to try and do some multi-processing. None of this worked that well because it wasn’t designed well from the outset and I couldn’t reliably get the software to compensate for this, at least not in the time frames I needed.

I then got some advice from a much more experienced colleague, peppered with expletive-based questioning of my sanity, parenthood and general value as a human being (British Coal Research was not too different from the actual mines). The advice was basically: start again, knowing what you know now, build the right thing and make better use of different types of hardware. After a night of not sleeping and generally being quite depressed it became clear that the right architecture was to build a very simple but effective high speed tap with a larger dual memory buffer and to actually use an off-the-shelf DEC network co-processing board that, while not designed exactly for this, could be adapted. So starting again was effective.

But, here’s the rub: I don’t think that final, much better design would have been made without the error of the first build, because that led to a deeper understanding of the problem. The only thing I would change is probably to declare failure quicker. You might ask why not seek broader help or copy other designs but this was the late 1980s and there was no World-Wide-Web, very limited bulletin boards and a limited selection of component and circuit board catalogs. You were often on your own.

10. Governance is a Tool

Everyone loves a committee. No, of course, they don’t. But we still need them for various reasons from policy decisions to oversight of programs and risk management. The trick with committees is to remember what they truly are, which is a tool not a goal. They are a “root of authority” which lets you, as the security or risk leader, get things done.

Over the past decades, in the formation and use of various risk committees and councils in various companies, one consistent pattern has emerged: the existence of an effective committee makes people question whether that committee is in fact needed. This paradox is exactly why you need to keep it in place. In most organizations I’ve worked in it has gone something like this: you form a committee to provide executive level oversight, to drive risk reduction and to be a place where risk decisions are taken. These decisions are whether to prioritize the remediation of some risk or to live with the potential downside for some period of time - or more likely something in between. What often happens then, if you have the right senior executives on the committee and the right commitment to get security right, is that after a couple of cases of the committee not wanting to do much risk acceptance the word gets out. Teams start to raise their claimed need for risk acceptance and the security team tries to work with them to come up with some balanced remediation. But then, if the team really feels like they can’t progress, they ask for a risk acceptance. Security then informs them: sure, we can help curate that request but it’s going to the “committee”, which is made up of these executives, and they’d be happy to review your prioritization challenges. Then, pause, and in all but the very difficult cases the team decides to do the remediation.

You see where this is going: the risk committee starts to see fewer issues and questions what it is there for. So, to counter this, you have to keep showing the results of risk remediation happening without their involvement but because of their existence. Then shift the risk committee to be more theme-based, drawing on analysis of what’s going on in the risk ledger, to look at more severe scenarios and to do some incident / close-call analysis so the risk committee can hold the security team more deeply accountable.

Governance is a tool not an end in itself. Committees are a root of authority whose existence should make other work easier but not actually do the work itself.

11. Hit the Streets

Get down on the ground and see what’s going on.

I once worked for a European bank and when I started there I decided to go on a world-tour to all their major offices. I was then astounded to discover that none of my predecessors (or even their leadership team) had ever actually done this. This, hitting the streets, had some immediate positive effects in that I saw first-hand what the problems were, I got to understand the business much better by getting out to branches, corporate offices, trading floors and a whole range of subsidiary businesses. Best of all, I immediately built some very useful relationships. This last part was perhaps inevitable because many of these places hadn’t seen much of anyone from any discipline from headquarters. Consequently they started to see security as the team that cared.

But, the vital discovery was a whole series of issues where policies were communicated and mostly known but not enforced. To be fair to people this was largely because they were policies with no associated solutions. No one really had a chance of implementing them. When I probed the various headquarters teams on this there was surprise. Someone literally said, “we are staggered that these policies aren’t being followed, we send the policy memos to each office and they go into the policy binders and those are the instructions to follow.”

So, two things here. First, they had no concept that people don’t necessarily follow instructions on memos. Second, they also didn’t know that when the policy memos turned up from head office they often got filed in binders by administrative assistants and the people who could act on them often didn’t even know there was something new. Worse, when it was all thankfully digitized (in Lotus Notes, so there’s another thing) it was even more silent in its updates because no one operationalized policy conformance and ongoing monitoring.

I bet you know what happened next. Yes, we started monitoring for policy conformance in various ways. Then, all of a sudden, I was the guy that just made security massively worse despite me being hired to make it better. This was the formative experience among many when I realized you need to communicate that you’re about to shatter a prior state of blissful ignorance with the hard realities of measured truth - and go through the uncanny valley.

I survived this, with the help of some allies and some nimble footwork across the Board and the Executive Committees. However, all was not well in general. I eventually left that particular organization because we later got a Board mandate to move even quicker on a broad scale security uplift. We were in the odd position of having to push back on this because we knew the “clock speed” of the organization couldn’t adapt to such significant change so quickly and so pervasively. We knew, because we hit the streets, that these things had to be layered and that you couldn’t go from A to Z directly but really did have to go A, B, C….Z. The real kicker though was when the Board “authorized” a very large budget to do this but failed to direct business units to actually fund it, so we were left with an unfunded Board mandate to then have to convince business units to fund what we didn’t think was a good idea. Fortunately (see Build Your Future Network Now), an old boss and an old colleague now in different organizations were keen for me to join them in the US and the rest, as they say, is history.

So, hit the streets and see what is happening on the ground and remember people mostly don’t read policy instructions - you need solutions not policies.

12. Pay Attention to Scale

I remember looking at some patching stats in a prior role when we had become actually quite good at end-point patching. There was no stopping that train every cycle but there was always a degree of failure that needed constant follow up. At the scale we had there was always a long tail of weird issues, whether it was a machine in a bad state, temporarily isolated, a bad prior patch that wasn’t detected that held up the new patch and so on. The issue is that at sufficient scale even 0.1% can be a lot of machines. My formative moment of truly understanding this (it often takes visceral reality as opposed to a theoretical appreciation) was one unpatched Windows server that was a particularly powerful machine. It had multiple high capacity network connections and dedicated network hardware, it had missed a patch cycle, and other compounding factors had stopped the continuous monitoring from detecting that. Then, another configuration oddity caused a variant of the SQL Slammer worm (this was a while ago) to come across a non-Internet external network connection and infect that machine. Now, it was largely stopped at that point because there was nothing else to infect, but the nature of this machine caused a huge spike in traffic which took down the adjacent network switch fabric and thrashed some WAN routers causing, in effect, an internal DoS event which went on until the server could be shut down using other means. We were 99.9% patched but that didn’t matter because the 0.1% was a beast.

So, remember, 5 x 9’s is great until you hit a certain scale and then there’s still plenty that can go wrong. The answer here, of course, is not always a merciless death-march to 100% but rather architectural defense in depth so that the small number of unresolved issues can’t cause significant impact.
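The arithmetic is worth doing once for your own environment. Here is a rough sketch, where the fleet sizes and coverage rates are purely illustrative assumptions rather than figures from the incident above, showing how many machines are left in the tail at a given patch coverage.

```python
def unpatched_count(fleet_size: int, patched_fraction: float) -> int:
    """Return the approximate number of machines left unpatched."""
    return round(fleet_size * (1.0 - patched_fraction))


# Illustrative fleet sizes and coverage levels (assumptions, not real data).
for fleet_size in (1_000, 100_000, 1_000_000):
    for patched_fraction in (0.999, 0.99999):
        remaining = unpatched_count(fleet_size, patched_fraction)
        print(
            f"fleet={fleet_size:>9,} coverage={patched_fraction:.3%} "
            f"unpatched={remaining:,}"
        )
```

The count is only half the story, as the server above showed: a tail of one machine can still be the one that matters, which is why the architecture around the residue counts for more than the percentage.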

13. Security of Fall Back Systems

Make sure your failover systems and processes are just as secure as the primary. This is an interesting topic because it’s one that falls between the cracks of cybersecurity, technology risk, business operating risk and other domains, and it is a case that truly requires the security team to understand business processes.

This was stamped in my memory because of an attack I was told about against a Bank I didn’t actually work at (good old peer to peer information sharing even before ISACs existed). They had a primary payments system which was, for its time, exquisite. It had end to end encryption, per transaction message authentication codes (MACs) generated by the system and additional MACs generated by payment operator hand held authentication calculators. Now, here’s where it gets interesting. The fail-over system was an old manual messaging system where payment messages were hand-written and then a manually calculated authenticator was written on it (what they used to call Telex test key codes). This was then sent by fax, other messaging systems or even actually Telex (told you this would be a history lesson). The problem was these codes were very easy to break, even without getting the code books (which in some banks were orderable stationery items anyway - not joking!). So, no prize for guessing what the attackers did. They cut the exterior network cables to the primary system (and its backup) to force a failover to the system that they could break.
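For anyone who hasn’t worked with them, a per transaction MAC is easy to illustrate. The sketch below uses HMAC-SHA256 from the Python standard library as a modern stand-in; it is an assumption for illustration only and not the algorithm that bank used (I wasn’t told which one it was). The key and message contents are hypothetical.

```python
import hashlib
import hmac


def transaction_mac(key: bytes, message: bytes) -> str:
    """Compute an authentication code binding the message to a shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Check a received tag using a constant-time comparison."""
    return hmac.compare_digest(transaction_mac(key, message), tag)


# Hypothetical shared key and payment instruction.
key = b"hypothetical-shared-secret"
original = b"PAY 1000.00 GBP TO ACCT 12345678"
tampered = b"PAY 9000.00 GBP TO ACCT 12345678"

tag = transaction_mac(key, original)
print(verify(key, original, tag))  # genuine message is accepted
print(verify(key, tampered, tag))  # altered amount is rejected
```

None of that strength helps if the process can be forced onto a path that doesn’t use it, which is the whole lesson of this story.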

Bottom line: throughout our careers we encounter situations that shape our views and behaviors for the rest of our professional lives. These formative moments, which might not seem so at the time, come to define us. It’s useful to write them down, reflect on how they have affected you and those around you, and evaluate whether they help or hinder you. A moment that locks in a behavior or view that is now counterproductive because the world has changed is something to be extinguished. Share your moments for the benefit of others.
