Rarely a day goes by that we don't see news about the poor state of affairs in cybersecurity. With data breaches at Target, the U.S. Office of Personnel Management, Sony, Disney, Yahoo!, Equifax, and Marriott, the drumroll continues unabated. We are now in a world where it's a matter of when, not if, an organization is compromised by a cyber-attack.
Most of us think of cybersecurity as a series of controls (tools and knobs) that an organization has to implement, and it seems perplexing why cyber-defenders in the situations mentioned here failed to take the necessary steps to protect themselves. Our focus on addressing cybersecurity challenges has been around inventing new controls (or enhancing existing ones) and implementing them correctly in the enterprise. This is an inadequate view.
In this Viewpoint, we show why cybersecurity is a very difficult problem. The enterprise attack surface is massive and growing rapidly. There are practically unlimited permutations and combinations of methods by which an adversary can attack and compromise our networks. There is a big gap between our current tools and methods, and what is needed to get ahead of cyber-adversaries.
The Enterprise Attack Surface
In order to better understand the nature and structure of the enterprise attack surface, let's take a quick look at the abstract picture of the attack surface shown in Figure 1. On the x-axis we have the different parts of the enterprise's extended network where things can go wrong from a cybersecurity standpoint. On the y-axis we have the specific ways in which these things can go wrong, also known as attack vectors or breach methods.
The x-axis includes the organization's traditional infrastructure (servers, databases, switches, routers, and so forth), applications (standard and custom), endpoints (managed, un-managed, mobile and fixed, IoTs, industrial controllers, and so forth), and cloud apps (sanctioned and unsanctioned).
At the right end of the x-axis, we have the organization's third-party vendors. The x-axis effectively repeats itself recursively in the organization's supply chain, where each third-party vendor is an entity with an x-axis and attack surface just like that of the organization, and this brings risk into the enterprise network because of certain trust relationships. The ellipses on the x-axis indicate that these categories of assets are large sets. It is quite difficult for most organizations to even enumerate their x-axis with accuracy.
On the y-axis, we have the different methods of attack, starting from simple things like weak and default passwords, reused passwords, and passwords stored incorrectly on disk or transmitted in the clear, and moving on to more complex things like phishing, social engineering, and unpatched software. Further down the y-axis, we have zero-day vulnerabilities: security bugs that are "unknown" until they are first used by an adversary. There are quite literally hundreds of items on the y-axis in dozens of categories.
Each point in this x-y graph represents one way by which adversaries can compromise an enterprise asset. Note that each point is a vector, not just a single number. To see this, consider the highlighted point of Figure 1: line-of-business apps (x-axis) and shared passwords (y-axis). This is the idea that perhaps an enterprise employee's password for a personal account (for example, for Yahoo! or LinkedIn) is the same as their password for one of their enterprise app accounts. So, if Yahoo! or LinkedIn is breached and the passwords are stolen (and were not properly hashed on disk), then the enterprise has a problem,11 perhaps one million enterprise app accounts with reused passwords, all easy ways for the adversary to get unauthorized access (Figure 2).
Most cyber-defenders have no idea what this "Password Sharing Risk Vector" looks like for their business. The Verizon Data Breach Investigations Report9 claims more than 80% of breaches involve password issues at some stage of the breach.
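To make the idea of a per-app "Password Sharing Risk Vector" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method used by any product mentioned in this Viewpoint. The names `fingerprint` and `password_reuse_vector` are hypothetical, and the sketch assumes the enterprise can compute a comparison fingerprint at a moment when the password is legitimately available (for example, at login or password change) and has access to a set of fingerprints from known third-party breaches.

```python
import hashlib
from collections import defaultdict


def fingerprint(password: str) -> str:
    """Hypothetical comparison key: SHA-1 hex digest of the password.

    Assumption: used only to match against a breach corpus, never as
    the way passwords are stored.
    """
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def password_reuse_vector(accounts, breached_fingerprints):
    """Return {app: (exposed_accounts, total_accounts)}.

    accounts: iterable of (app, user, password_fingerprint) tuples.
    breached_fingerprints: set of fingerprints seen in third-party breaches.
    """
    exposed = defaultdict(int)
    total = defaultdict(int)
    for app, _user, fp in accounts:
        total[app] += 1
        if fp in breached_fingerprints:
            exposed[app] += 1
    return {app: (exposed[app], total[app]) for app in total}


# Made-up example data: two enterprise apps, one breached password.
breached = {fingerprint("hunter2")}
accounts = [
    ("crm", "alice", fingerprint("hunter2")),
    ("crm", "bob", fingerprint("S3cure!pass")),
    ("payroll", "alice", fingerprint("hunter2")),
]
print(password_reuse_vector(accounts, breached))
```

The result is one entry per enterprise app rather than a single number, which is the sense in which a point on the attack surface is a vector.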
This gigantic x-y plot is the enterprise attack surface. In a typical breach, adversaries use some point on this attack surface to compromise an (Internet-facing) asset. Other points are then used to move laterally across the enterprise, compromise some valuable asset, and then to exfiltrate data or do some damage. Figure 3 shows how the Equifax breach8 unfolded. As you can imagine, the number of items on both of these axes grows as we adopt new technologies in the digital transformation of our businesses.
What Is Our Breach Risk?
For a CISO or CIO, the likely top-of-mind question is: at what points on this attack surface is the enterprise at risk? What does the risk heat map look like, that is, what is Risk = Likelihood × Impact at each point of the enterprise attack surface?
We must consider the points of the attack surface where we have vulnerabilities, for example, unpatched software. We also must factor in exposure due to usage: a device with unpatched Internet Explorer is not necessarily a critical risk if the default browser of the user is Chrome. We should prioritize real threats, considering what is currently fashionable with (or possible for) adversaries, and not waste time worrying about theoretical issues when we have numerous open security issues that pose a clear and present danger.
Furthermore, we must take mitigating controls into account, that is, the enterprise's investments in security controls like firewalls, anti-phishing systems, EDR, and so forth. We want credit for the points on the attack surface where the enterprise has successfully mitigated risk, provided the control is working.
Finally, not every point on the x-axis is equally important. We have to distinguish between critical or important assets and those which are not so important and estimate business impact of a compromised asset.
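The factors discussed above can be combined into a per-point risk score in many ways. The sketch below shows one simple possibility in Python and should be read as an assumption, not as the calculation behind Figure 4: every factor is a number between 0 and 1, likelihood is taken to be the product of vulnerability, exposure, threat, and the fraction of risk left over after mitigating controls, and impact is a business-criticality weight in arbitrary units. The class name `SurfacePoint` and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurfacePoint:
    """One (asset, attack vector) point on the attack surface."""
    asset: str
    attack_vector: str
    vulnerability: float          # 0..1, how exploitable the weakness is
    exposure: float               # 0..1, how much usage exposes the weakness
    threat: float                 # 0..1, how actively adversaries use this vector
    control_effectiveness: float  # 0..1, share of risk removed by working controls
    impact: float                 # business criticality, arbitrary units

    def __post_init__(self):
        for name in ("vulnerability", "exposure", "threat", "control_effectiveness"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


def likelihood(p: SurfacePoint) -> float:
    # Assumed multiplicative model; real factors are rarely independent.
    return p.vulnerability * p.exposure * p.threat * (1.0 - p.control_effectiveness)


def risk(p: SurfacePoint) -> float:
    # Risk = Likelihood x Impact
    return likelihood(p) * p.impact


# Same unpatched browser on two laptops; only the usage-based exposure differs.
rarely_used = SurfacePoint("laptop-17", "unpatched IE", 1.0, 0.05, 0.5, 0.0, 10.0)
default_browser = SurfacePoint("laptop-42", "unpatched IE", 1.0, 1.0, 0.5, 0.0, 10.0)
print(risk(rarely_used) < risk(default_browser))
```

Evaluating `risk` for every asset and attack vector pair yields the cells of a heat map. The multiplicative form is only a convenient starting point, chosen so that a fully effective control or zero exposure drives the score to zero.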
The resulting computation and risk heat map factoring in vulnerabilities, exposure, threats, mitigating controls and business criticality might look like Figure 4.b This risk calculation is not simple. To accurately understand an enterprise's security posture and breach risk, we need to (repeatedly) solve a hyper-dimensional math problem over the tens (or hundreds) of thousands of assets and 100+ attack vectors. Attacks need to be modeled as a chain of probabilistic events, starting from the compromise of some asset exposed to the Internet, followed by attack propagation to compromise additional valuable assets.
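As a toy illustration of modeling an attack as a chain of probabilistic events, the Python sketch below represents the network as a directed graph whose edges carry an assumed probability that an adversary succeeds in moving from one asset to the next. It finds the single most likely path from an entry point to a valuable asset using a max-product variant of Dijkstra's algorithm. The function name, the graph, and the probabilities are all made up for illustration, and multiplying step probabilities assumes the steps are independent. The result is the probability of one path, not the total probability of a breach.

```python
import heapq


def most_likely_attack_path(graph, entry, target):
    """Return (probability, path) of the most likely attack chain.

    graph: {node: {neighbor: probability_of_successful_step}}
    Works because every step probability is <= 1, so extending a path
    can never make it more likely.
    """
    best = {entry: 1.0}
    prev = {}
    heap = [(-1.0, entry)]  # max-heap via negated probabilities
    while heap:
        neg_p, node = heapq.heappop(heap)
        p = -neg_p
        if p < best.get(node, 0.0):
            continue  # stale heap entry
        if node == target:
            break
        for nxt, step_p in graph.get(node, {}).items():
            cand = p * step_p
            if cand > best.get(nxt, 0.0):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (-cand, nxt))
    if target not in best:
        return 0.0, []
    path = [target]
    while path[-1] != entry:
        path.append(prev[path[-1]])
    path.reverse()
    return best[target], path


# Hypothetical network: compromise of an Internet-facing asset, then lateral movement.
graph = {
    "internet": {"web_portal": 0.3, "vpn_gateway": 0.05},
    "web_portal": {"app_server": 0.5},
    "vpn_gateway": {"app_server": 0.9},
    "app_server": {"customer_db": 0.4},
}
prob, path = most_likely_attack_path(graph, "internet", "customer_db")
print(path, prob)
```

Even this simplified model has to be re-evaluated whenever an asset, a vulnerability, or a control changes, which hints at why the full calculation is repeated continuously.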
In order to improve cybersecurity posture and decrease breach risk, we must reason about what actions will bring about the greatest reduction of breach risk for the enterprise. This also requires calculating cyber-resilience, the ability of an enterprise to limit the impact of cyber-attacks.2 Analyzing and improving enterprise cybersecurity posture is not a human-scale problem anymore. Plugging some numbers into Figure 4, for an organization with a thousand employees, there are over 10 million time-varying signals that must be analyzed to accurately predict breach risk. For an organization with 100,000 employees, we must incorporate several hundred billion time-varying signals in the risk calculation.
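Reasoning about which actions bring the greatest reduction in breach risk can be sketched, in its simplest form, as a ranking problem. The Python example below is a hedged illustration: it assumes that each candidate action already comes with an estimated risk reduction and a cost, which in practice are the hard parts to obtain, and it ranks actions by risk reduction per unit of cost. The `Action` type, the `prioritize` function, and the numbers are hypothetical.

```python
from typing import List, NamedTuple


class Action(NamedTuple):
    name: str
    risk_reduction: float  # estimated drop in breach risk, same units as risk
    cost: float            # estimated effort or spend, must be positive


def prioritize(actions: List[Action]) -> List[Action]:
    """Rank actions by estimated risk reduction per unit cost, best first."""
    for action in actions:
        if action.cost <= 0:
            raise ValueError(f"cost must be positive for {action.name!r}")
    return sorted(actions, key=lambda a: a.risk_reduction / a.cost, reverse=True)


# Made-up candidate actions and estimates.
candidates = [
    Action("patch web portal", 40.0, 4.0),
    Action("enforce MFA on enterprise apps", 90.0, 3.0),
    Action("replace perimeter firewall", 50.0, 25.0),
]
for action in prioritize(candidates):
    print(action.name)
```

A ranking like this treats actions as independent. When one action changes the value of another (for example, patching an asset that a new control would also protect), the estimates would need to be recomputed after each choice.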
Cybersecurity Practice Today
Traditional methods such as vulnerability assessment (for example, with Qualys, Rapid7, or Tenable) and penetration testing are only able to analyze a small fraction of the attack surface of Figure 1. These legacy methods produce output that is voluminous, unprioritized, and often irrelevant (for example, asking to patch an IE CVE on a laptop where the default browser is Chrome). Most organizations are unable to keep their systems patched and free of known vulnerabilities.
Due to a lack of a viable proactive strategy, much effort and money go into detecting and reacting to cybersecurity events. This is the investment to set up and operate Security Information and Event Management systems (SIEMs) and Security Operations Centers (SOCs). Logs and alerts from enterprise systems are collected and analyzed for indicators of compromise. False positives are a huge challenge, and attackers slip through routinely. Recent studies peg the average dwell time of undetected attackers in the enterprise at approximately 200 days.5
Some security teams run cybersecurity checklists aligned against frameworks such as NIST 1.1,6 CIS 100, or SOC 2. This is an important step in the right direction, but it is inadequate. Compliance checklists don't work in cybersecurity because of the scale of the underlying math, our constantly changing software, and a dynamic adversary.1,3 In hundreds of conversations with CIOs and CISOs of the Fortune 1000 over the course of the last few years, it has become clear to this author that the vast majority of organizations do not have an accurate (much less real-time) inventory of their assets; they cannot enumerate the x-axis of Figure 1. Moreover, many important security attributes of assets (y-axis in Figure 1), such as reused passwords, are omitted from the cybersecurity checklist because they are deemed too difficult to measure.4 More generally, individual security practitioners routinely make decisions to accept critical risk factors on their checklists because IT has not figured out how to mitigate these factors and still keep the environment operational. All this leads to a systematic buildup of risk that the CIO or CISO is not aware of. Cybersecurity checklists lull you into a false sense of security.
Ultimately, a poor understanding of the massive attack surface results in waste, frustration, and anxiety. Most discussions on cybersecurity posture and risk between the board of directors and C-suite execs are based on gut and incomplete data. Organizations are unable to answer simple questions such as "What is the risk to our intellectual property from cyber-attacks?" We spend a lot of money on security tools, but don't know what we are getting in return in terms of reduction in breach risk. New products are routinely launched without much thought about cybersecurity. In spite of millions of dollars of annual security spending, most enterprises are just one bad click, one reused password, or a single unpatched system away from a major breach.
It is useful to note the huge gap between the requirements for cybersecurity professionals who can understand and address the challenges of a practically unlimited attack surface (Figure 4), and the education and training being offered in university computer science programs. A recent study7 noted that none of the top 10 CS undergraduate programs require a cybersecurity course in order to graduate. While a small (but increasing) number of professional master's degree cybersecurity programs are now being offered by top 40 CS departments, these tend to focus on a handful of basic elements of computer security, particularly crypto, and on how to secure a small number of points on the attack surface of Figure 1 using existing tools. Some programs teach incident response and forensics.
Our cybersecurity training programs do not consider cyber-insecurity as a network-wide (probabilistic) risk optimization problem. We are not teaching our future technologists how to design and create cyber-resilient distributed systems, or how machine learning, automation, and data visualization can serve as powerful tools to understand and mitigate cyber-risk.
Call to Action
As a discipline, CS must start thinking of cybersecurity as a probabilistic risk optimization problem. This author's organization has done some work on being able to discover and quantify cybersecurity posture in the spirit of this Viewpoint. We have developed a system that makes continuous observations of the extended enterprise from multiple vantage points including network, endpoint, configuration, and logs. This data is analyzed by an ensemble of machine learning models to surface inventory, business impact, breach likelihood, and cyber risk. The system also provides a prioritized set of possible mitigating actions to reduce risk along with simulation tools to estimate the pro-forma ROI of contemplated mitigating actions. Early experience with this system at many Fortune 1000 organizations tells a bittersweet story of both despair (red heatmaps) and hope (we can now measure, so we will improve).
Much research needs to be done on understanding the principles of cyber-resilient distributed systems design and the feasibility of bolting cyber-resilience-enhancing controls onto legacy systems. The work done on developing zero-trust frameworks like BeyondCorp10 is a good beginning.
CS educators should reevaluate course curricula in order to better prepare students for the cybersecurity realities highlighted in this Viewpoint.
References
1. Bailey, K. Why compliance does not equal security; http://bit.ly/2H3LymD
2. Banga, G. Balbix Blog: What is cyber-resilience? (2017); http://bit.ly/2Sr0XCD
3. Banga, G. Cybersecurity 101 for the C-suite and board members; http://bit.ly/39hA0YS
4. Das, A. et al. The tangled web of password reuse. NDSS Symposium 2014; http://bit.ly/3bivr2o
5. IBM. Cost of a Data Breach Study by Ponemon (2018); https://ibm.co/374AK1Z
6. NIST. Framework for Improving Critical Infrastructure Cybersecurity (2018); http://bit.ly/3biv9Zm
7. Syed, S. CloudPassage blog: U.S. universities get "F" for cybersecurity education (2016); http://bit.ly/2OBIkLr
8. U.S. Government Accountability Office. Report to Congressional Requesters: Actions Taken by Equifax and Federal Agencies in Response to the 2017 Breach; http://bit.ly/2Sv2DLH
9. Verizon. Data Breach Investigations Report (2017); https://vz.to/2Stn2R4
10. Ward, R. and Beyer, B. BeyondCorp: A new approach to enterprise security. ;login: 39, 6 (June 2014); http://bit.ly/2Oxo3GE
11. Wikipedia. 2012 LinkedIn hack (2012); http://bit.ly/2SqnLmf
b. Figure 4 was generated using real data from a Fortune 1000 customer of Balbix.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2020 ACM, Inc.