The evolving intersection of risk and resilience

Published on

June 24, 2026

A board decision lens for rebalancing prevention, containment, and resilience, with questions for deciding what to prevent, what to contain, and what to withstand.

Authors’ note

This article brings together two connected perspectives. David Ferbrache approaches the issue through operational resilience, crisis experience, and board-level risk governance. James Hanbury approaches it through cyber risk quantification and the use of loss models to support investment, appetite, and assurance decisions. Our shared argument is that the intersection of risk and resilience is now a board-level investment question: where should organisations reduce the likelihood of an event, where should they limit escalation into severe loss, and where should they strengthen the ability to preserve critical value when disruption still occurs.

1. Classical risk management approaches are being tested

We have come to expect exceptional levels of availability and responsiveness from service providers in our always-on digital world. At the same time, organisations are more interconnected and interdependent than ever before, and increasingly reliant on technology for all aspects of their operations, from highly optimised logistics and supply chains, through complex data transformation and analytics, to AI-enabled business processes.

The failure modes of such systems have become harder to predict and the potential impacts of any system failure can cascade in ways which are also unpredictable. Cyber attacks add the complexity of an intelligent and determined adversary to the picture and have become a mainstream cause of operational downtime. The impact of such attacks can now cause significant economic drag and broader supply chain disruption.

In a world of complex and uncertain geopolitics we can expect increasing hostile activity by States as previously accepted norms of international behaviour are challenged. National governments have become concerned over the ability of critical infrastructure providers to absorb shocks and demonstrate resilience to a broad range of disruptive scenarios. Is it sufficient to rely on classical approaches to operational risk management in this world?

2. Boards need to understand how severe loss forms

In many board and executive discussions, cyber risk is still framed primarily through preventive questions: which threats are rising, where key control weaknesses sit, and how defensive investment is progressing. Those are legitimate concerns, but they do not answer the questions that drive loss in major incidents: how quickly the event can be contained, how far business value could fall during disruption, and how the organisation regains enough operating capacity to stabilise the situation.

That gap is partly structural. Different practitioner communities have developed around different parts of the problem. Security teams have focused on understanding threat actors, attack paths, and strength of controls. Operational resilience, crisis management, service continuity, and technology recovery teams have focused on sustaining operations through disruption and restoring them afterwards. Legal, regulatory, and communications teams have focused on managing the consequences and defending brand and reputation. The issue is not that one group is more mature than another. The issue is that the severity of the event is heavily influenced by their collective performance.

This becomes even more important in tightly coupled environments, which now describes most organisations. Critical services depend on interconnected technology rather than discrete systems. Key suppliers support multiple processes at once, so a failure in one relationship can degrade several services together. Identity now sits so centrally in modern estates that compromise can disrupt critical access, administration, recovery, and the safe re-establishment of control at the same time. Restoration is rarely a single technical exercise. It is a sequence of dependencies, hand-offs, prioritisation choices, and business trade-offs made under pressure. That is why severe losses often emerge through cascading failure, delay, and recovery difficulty, not the initial event alone. The consequences of choices made in haste can play out over days, weeks and even months.

This is where the concept of a recovery curve can be useful because it helps describe the operational side of severity in a way leaders can use. It helps answer two practical questions: how far does the organisation’s ability to deliver value fall, and how quickly does it recover enough of that value?

A strong curve restores most critical value quickly, even if full restoration takes longer. A weak curve leaves core services impaired for too long, extends the period of financial loss, and creates more room for secondary consequences to accumulate. The acceptable shape will vary by organisation, industry, and service, and should reflect who is harmed when that service is disrupted. This could include customers, patients, citizens, market participants, counterparties, suppliers, or the wider community. In regulated sectors, that judgement will also be shaped by formal expectations about how much harm to essential services can be tolerated. The key difference is whether disruption remains absorbable or develops into an event that affects annual performance, strategic delivery, or confidence in the institution.

Figure 1: How resilience changes the recovery path of operational disruption

Illustrative example of a recovery curve showing how far critical value falls after disruption and how quickly it returns. Stronger resilience reduces the initial drop, raises the degraded operating floor, and shortens time below minimum viable service.

The curve also forces a more precise resilience discussion. Leaders should ask which part of the curve an investment changes: the initial drop, the degraded operating level, the time below minimum viable service, the recovery gradient, the long tail, or some combination of them all.
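To make those parts of the curve concrete, here is a minimal sketch of how a recovery curve might be summarised numerically. Everything in it is an illustrative assumption rather than a method from this article: the `curve_metrics` function, the 0.6 minimum viable service level, the linear interpolation between points, and the two example curves are all hypothetical.

```python
def curve_metrics(points, minimum_viable=0.6):
    """Summarise a recovery curve.

    points: (day, service_level) pairs sorted by day, where service_level
    is a fraction of normal value delivery (1.0 = normal service).
    Levels are assumed to change linearly between points.
    """
    value_days_lost = 0.0
    days_below_minimum = 0.0
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        dt = t1 - t0
        # Area between normal service (1.0) and the curve on this segment.
        value_days_lost += dt * ((1.0 - v0) + (1.0 - v1)) / 2.0
        if v0 < minimum_viable and v1 < minimum_viable:
            days_below_minimum += dt
        elif v0 < minimum_viable or v1 < minimum_viable:
            # The segment crosses the threshold: count only the part below it.
            low, high = min(v0, v1), max(v0, v1)
            days_below_minimum += dt * (minimum_viable - low) / (high - low)
    return {
        "degraded_floor": min(level for _, level in points),
        "days_below_minimum": days_below_minimum,
        "value_days_lost": value_days_lost,
    }


# Hypothetical curves: (day, fraction of normal value delivered).
weak_curve = [(0, 1.0), (1, 0.2), (10, 0.2), (30, 1.0)]
strong_curve = [(0, 1.0), (1, 0.5), (3, 0.5), (10, 1.0)]

weak_metrics = curve_metrics(weak_curve)
strong_metrics = curve_metrics(strong_curve)
print("weak:  ", weak_metrics)
print("strong:", strong_metrics)
```

With these made-up inputs, the weak curve loses roughly 15.6 value-days and spends about 19.5 days below minimum viable service, against roughly 3.0 and 3.6 for the strong curve. An investment case can then be tested by asking which of the three numbers it is expected to move.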

Ransomware is a useful example because it consolidates several versions of the problem into a single scenario. It tests whether the spread of malware can be limited, whether recovery works at scale, whether business workarounds preserve enough service to keep operating and whether decisions are taken quickly enough to prevent secondary damage. It also increasingly carries confidentiality and legal dimensions through data theft, extortion, regulatory exposure, and claims risk, while demanding that the impacted organisation maintain the confidence of its stakeholders. That combination shows why the upper end of the loss profile is changing. Some events can be operationally contained and still produce a long and expensive tail through litigation, enforcement, contractual consequences, and loss of trust. Many organisations still do not model those drivers explicitly enough when they assess scenarios, set investment priorities, or discuss downside exposure at senior level.

This is why organisations can feel well governed and still be materially exposed. They may have active committees, established frameworks, extensive reporting, and substantial control investment, yet still lack a clear basis for deciding how severe loss is actually created. How much should be spent reducing the chance of an event, and how much should be spent reducing the scale of harm once it occurs? Which scenarios can be reshaped materially through containment, restoration, and degraded operation, and which require greater weight on prevention and assurance because restoration does little to repair the damage once triggered? Where that trade-off remains implicit, funding will usually follow organisational boundaries, established disciplines, or the latest pressure rather than a disciplined view of how severe losses are actually created.

These are the decisions we now need to make explicit and ultimately those which will challenge the executive and board in the midst of a crisis.

3. Loss curves reveal which investment levers to use

Once the decision is framed properly, the next requirement is a tool that makes it easier to evaluate. The loss exceedance curve (LEC) does that well because it measures exposure against an unambiguous financial figure. That creates a unifying lens for a wide range of operational and other loss dimensions that organisations worry about, including disruption, response costs, recovery costs, legal consequences, and wider financial harm. It also shifts the discussion onto a threshold: how much loss, how much operational damage, or how much interruption to critical value delivery would be unacceptable? Once that line is defined, it becomes easier to judge which form of investment does most to reduce the chance of crossing it.

The LEC separates two different ways of changing the outcome. One is to reduce the probability of a severe event. That shifts the curve down by lowering the frequency with which losses exceed a given level. Prevention and detection do that by reducing the chance that the organisation enters severe loss territory at all.

The other is to reduce the scale of loss once an event has occurred. That shifts the curve left by reducing the financial consequences associated with realised incidents. Containment, restoration, fallback arrangements, and the ability to sustain critical services in degraded conditions do that by compressing the loss profile once disruption is underway.

Figure 2 illustrates that interaction. It shows how the intersection of risk and resilience can be understood through two investment levers acting on the same loss profile: one reduces the frequency with which severe losses occur, and the other reduces their scale.

Figure 2: The intersection of risk and resilience through the loss exceedance curve

Illustrative example of how prevention, detection, containment, and restoration can reduce the frequency and scale of severe loss.
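The two levers can be sketched with a small simulation. This is a simplified illustration under stated assumptions, not the model behind Figure 2: incidents are assumed to arrive as a Poisson process, individual losses are assumed to be lognormal, and every parameter value (two events per year, a median loss of 1.0, a 50% reduction from each lever) is hypothetical. The function names are likewise invented for this example.

```python
import math
import random


def simulate_years(n_years, events_per_year, median_loss, sigma, seed=1):
    """Return one list of event losses per simulated year."""
    rng = random.Random(seed)
    mu = math.log(median_loss)
    years = []
    for _ in range(n_years):
        losses = []
        # Poisson arrivals: exponential gaps between events within one year.
        t = rng.expovariate(events_per_year)
        while t < 1.0:
            losses.append(rng.lognormvariate(mu, sigma))
            t += rng.expovariate(events_per_year)
        years.append(losses)
    return years


def thin_events(years, keep_fraction, seed=2):
    """Prevention lever: each event is avoided with probability 1 - keep_fraction."""
    rng = random.Random(seed)
    return [[x for x in losses if rng.random() < keep_fraction] for losses in years]


def scale_losses(years, severity_factor):
    """Resilience lever: every realised loss is scaled by severity_factor."""
    return [[x * severity_factor for x in losses] for losses in years]


def exceedance_probability(years, threshold):
    """Share of simulated years whose total loss exceeds the threshold."""
    return sum(1 for losses in years if sum(losses) > threshold) / len(years)


def loss_at_probability(years, annual_probability):
    """Annual loss exceeded in roughly annual_probability of simulated years."""
    totals = sorted((sum(losses) for losses in years), reverse=True)
    index = max(0, int(annual_probability * len(totals)) - 1)
    return totals[index]


baseline = simulate_years(20_000, events_per_year=2.0, median_loss=1.0, sigma=1.0)
prevented = thin_events(baseline, keep_fraction=0.5)      # curve moves down
contained = scale_losses(baseline, severity_factor=0.5)   # curve moves left

threshold = 10.0  # hypothetical "unacceptable loss" line, same units as losses
for name, years in [("baseline", baseline), ("prevention", prevented), ("resilience", contained)]:
    print(name,
          "P(annual loss > threshold):", exceedance_probability(years, threshold),
          "1-in-20-year loss:", loss_at_probability(years, 0.05))
```

Both variants reuse the same simulated incidents as the baseline, so the differences reflect the levers and not sampling noise. Reading the output at a fixed threshold shows the downward shift; reading it at a fixed probability shows the leftward shift.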

The LEC and the recovery curve described in the previous section reinforce each other. The LEC shows how often losses exceed a threshold. The recovery curve explains how those losses are created operationally once disruption begins. A weak recovery curve extends the period in which business value is impaired, increases the opportunity for secondary consequences to accumulate, and pushes losses further into the tail. A stronger recovery curve restores enough critical value more quickly and limits the scale of loss that follows.

This is important because the operational mechanics underneath resilience are complex. Recovery performance depends on decisions about architecture, segregation, backup design, recovery sequencing, service prioritisation, fallback arrangements, third-party dependencies, and the ability to operate safely in degraded conditions. Those are difficult engineering and operating questions. The value of the LEC and recovery curve is that they give leaders a better way to frame them. They help test which capabilities are likely to reduce severe losses, where the recovery curve needs to improve, and whether the operational design supports that outcome.

Used together, they improve the quality of the trade-off. Leaders can ask whether a proposed investment is more likely to reduce the frequency of severe events, reduce the losses associated with them, or both. They can also ask whether the claimed effect is evidenced. An improvement in resilience should be observable in tested recovery performance, faster containment, more credible degraded operation, or a shorter path back to critical value delivery.

There are, however, important boundaries to this logic. Some outcomes are difficult to repair once triggered and, in extreme cases, effectively unrecoverable. This is more likely where confidentiality or integrity failures carry lasting legal, regulatory, or trust consequences. In those cases, restoration has limited value on its own because service recovery does not repair the full loss. The investment case shifts accordingly toward preventing the event and demonstrating that the relevant controls are operating as intended.

The converse can also be true. Where there are no cost-effective means of reducing the likelihood of an event occurring, the event is essentially outside our control. In such circumstances the investment case shifts towards mitigating the impact and ensuring resilience. In a world of evolving geopolitics, resilience against low-probability but high-impact events perpetrated by hostile states is becoming increasingly important.

This article is called The Evolving Intersection of Risk and Resilience because prevention and resilience can no longer be treated as separate board conversations. Loss curves reveal which investment lever is most relevant for each scenario: reducing the likelihood of the event, limiting its escalation into severe loss, or preserving critical value when disruption still occurs. The board task is to use that evidence to fund the lever that most reduces the chance of intolerable loss.

4. Cyber exposes where risk and resilience converge

Cyber complicates these trade-offs because the tail of the loss profile is harder to understand than it first appears. In many other operational risks, the relationship between cause, disruption, and loss appears relatively stable. Cyber is harder because both the event path and the consequence profile are more variable. The scale of loss depends on how compromise begins, how far it spreads, which dependencies are affected, whether containment and recovery are effective, and what consequences follow once customers, regulators, counterparties, and claimants react.

Jurisdiction adds another layer of uncertainty to how consequences unfold. The same event can produce very different outcomes depending on where customers, claimants, regulators, and counterparties sit. Litigation appetite, regulatory posture, contractual frameworks, and assumptions about recoverable harm vary materially across markets. That is one reason cyber losses can look very different even when the underlying technical event looks broadly similar.

DigiNotar remains a useful boundary case for the “unrecoverable” point. In 2011, the compromise of the Dutch certificate authority led to the fraudulent issuing of certificates and a wider loss of trust in its services. ENISA described the incident as an attack on the foundations of secure electronic communications [1], and DigiNotar subsequently filed for bankruptcy after the breach and the withdrawal of trust by major relying parties.

The example should not be overextended. Most organisations are not certificate authorities, and most incidents do not make the business model unrecoverable. Its value is that it shows the boundary condition. In some scenarios, organisations can recover systems but not fully recover trust. That is why the treatment strategy cannot be based on restoration alone.

This is what makes the framework from the previous section necessary in cyber, but harder to apply. Leaders still need to decide when the greatest reduction in severe loss will come from pushing the curve down, shifting it left, or doing both. The difficulty is that cyber losses do not build through a single mechanism. In some scenarios, severity is driven mainly by spread, containment, recovery speed, and the organisation’s ability to sustain critical services while systems remain impaired. In others, a substantial part of the tail sits in legal exposure, regulatory response, contractual consequence, and loss of trust that can continue long after operations are restored. Those drivers of loss are often owned by different teams, evidenced through different disciplines, and funded through different parts of the organisation. Some scenarios are therefore highly sensitive to recovery performance. Others can be operationally contained and still remain financially severe.

Once this is accepted, the next question is how leaders should make that trade-off explicitly, scenario by scenario, in a way that drives funding, evidence, and accountability.

5. The board task is to rebalance prevention and resilience

We frequently focus on reducing the likelihood of events occurring but pay insufficient attention to the options for reducing impact. We need to make this trade space more explicit and, for those scenarios which could cause intolerable outcomes, ask what the best approach to addressing risk and resilience is.

Should we seek to move the LEC down and reduce the likelihood of occurrence if the scenario is within our control, or should we move the LEC left and reduce the impact through greater resilience and more rapid recovery?

Which of these options individually, or in concert, gives us the most cost-effective reduction of the risk of causing intolerable harm or impact?
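One way to frame that cost-effectiveness question is to compare how much each option reduces the annual probability of breaching an intolerable-loss threshold per unit of spend. The sketch below does this analytically under assumptions that are ours for illustration only: Poisson event arrivals, lognormal single-event losses, and hypothetical costs and effect sizes for each option. It looks only at whether any single event breaches the threshold, not at accumulated annual loss.

```python
import math


def severity_exceedance(threshold, median_loss, sigma):
    """P(single-event loss > threshold) for a lognormal severity."""
    z = (math.log(threshold) - math.log(median_loss)) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def annual_breach_probability(events_per_year, threshold, median_loss, sigma):
    """P(at least one event in a year exceeds the threshold), Poisson arrivals."""
    severe_rate = events_per_year * severity_exceedance(threshold, median_loss, sigma)
    return 1.0 - math.exp(-severe_rate)


def rank_options(options, events_per_year, threshold, median_loss, sigma):
    """Rank options by reduction in breach probability per unit of cost."""
    base = annual_breach_probability(events_per_year, threshold, median_loss, sigma)
    ranked = []
    for opt in options:
        p = annual_breach_probability(
            events_per_year * opt["frequency_factor"],  # prevention: fewer events
            threshold,
            median_loss * opt["severity_factor"],       # resilience: smaller losses
            sigma,
        )
        ranked.append((opt["name"], p, (base - p) / opt["cost"]))
    return sorted(ranked, key=lambda item: item[2], reverse=True)


# All figures are hypothetical and in the same arbitrary currency units.
options = [
    {"name": "prevention", "cost": 1.0, "frequency_factor": 0.5, "severity_factor": 1.0},
    {"name": "resilience", "cost": 1.5, "frequency_factor": 1.0, "severity_factor": 0.5},
    {"name": "both", "cost": 2.5, "frequency_factor": 0.5, "severity_factor": 0.5},
]

ranked = rank_options(options, events_per_year=0.5, threshold=20.0,
                      median_loss=2.0, sigma=1.2)
for name, probability, reduction_per_cost in ranked:
    print(f"{name}: breach probability {probability:.4f}, "
          f"reduction per unit cost {reduction_per_cost:.4f}")
```

The ranking is only as good as the assumed costs and effect sizes, which is why the evidence behind each claimed effect matters as much as the arithmetic. For scenarios where restoration cannot repair the damage, the severity factor would stay close to 1.0 however much is spent on recovery.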

Perhaps we should ask ourselves what we really mean by cyber risk, and which aspects of that risk we should treat in each way. For operational disruption scenarios, should we focus more on availability, continuity, restoration, and recovery, where resilience engineering gives high leverage? And for trust or legal sanction scenarios, should we continue to focus on prevention and assurance?

Reaching an answer to these questions also requires us to bring together diverse experts from the security and operational resilience communities, who often view the world through very different lenses and live in different worlds. Without shared decision fora and common scenarios, these trade-offs remain implicit and often unresolved, and investment decisions remain sub-optimal.

So when we talk about cyber risk in future, perhaps executives and board risk committees should challenge themselves over whether they should really be focussing on likelihood reduction (down), impact reduction (left) or whether there is something about the scenario which means it is unrecoverable or beyond our control and the choice is made for us.

Ultimately, we strive for a cost-effective way of reducing the risk of intolerable harm and, in doing so, aim for a resilient organisation. Perhaps risk and resilience aren’t all that different after all.

Author

James Hanbury

Global Lead Director, Co-founder

James is the co-founder and Global Lead Director of CRI. He has spent over a decade working with cyber and risk teams, helping them bring more structure and clarity to how cyber risk is measured and communicated. James began building the earliest versions of CRI's models back in 2016, using Excel to explore how organisations could approach cyber risk in a more decision-focused way. That work has since grown into a SaaS-enabled capability now used by clients around the world. Based in London, James continues to work closely with CRI's clients and partners, focusing on how to make cyber risk quantification useful, explainable, and easier to adopt in practice.

Guest Author

David Ferbrache OBE

Managing Director

Beyond Blue Ltd

David is an award-winning UK and international cyber security expert and the Managing Director of Beyond Blue, with over 30 years of cyber and information security expertise. David has held senior roles as the Head of Cyber and Space for the Ministry of Defence, KPMG’s Global Head of Cyber Future, and the Chair of the Scottish National Cyber Resilience Advisory Board, guiding the implementation of the country’s cyber resilience strategy.
