
Common Weakness Scoring System (CWSS™)

The MITRE Corporation
Copyright © 2011
cwe.mitre.org/cwss/

CWSS version: 0.8

Document version: 0.8

Revision Date: June 27, 2011

Project Coordinator:

Bob Martin (MITRE)

Document Editor:

Steve Christey (MITRE)

Introduction

When a security analysis of a software application is performed, such as when using an automated code auditing tool, developers often face hundreds or thousands of individual bug reports for weaknesses that are discovered in their code. In certain circumstances, a software weakness can lead to an exploitable vulnerability. For example, a buffer overflow vulnerability might arise from a weakness in which the programmer does not properly validate the length of an input buffer. This weakness only contributes to a vulnerability if the input can be influenced by a malicious party, and if that malicious input can be copied to an output buffer that is smaller than the input.

Due to the high volume of reported weaknesses, developers are forced into a situation in which they must prioritize which issues they should investigate and fix first. Similarly, when assessing design and architecture choices and their weaknesses, there needs to be a method for prioritizing them relative to each other and with the other issues of the application. Finally, software consumers want to know what they should worry about the most, and what to ask for to get a more secure product from their vendors and suppliers.

Further complicating the problem, the importance of a weakness may vary depending on business or mission needs, the kinds of technologies in use, and the threat environment.

In short, people need to be able to reason and communicate about the relative importance of different weaknesses. While various scoring methods are used today, they are either ad hoc or inappropriate for application to the still-imprecise evaluation of software security.

The Common Weakness Scoring System (CWSS) provides a mechanism for scoring weaknesses in a consistent, flexible, open manner while accommodating context for the various business domains. It is a collaborative, community-based effort that is addressing the needs of its stakeholders across government, academia, and industry. CWSS is a part of the Common Weakness Enumeration (CWE) project, co-sponsored by the Software Assurance program in the National Cyber Security Division (NCSD) of the US Department of Homeland Security (DHS).

CWSS:

  • provides a common framework for prioritizing security errors ("weaknesses") that are discovered in software applications
  • provides a quantitative measurement of the unfixed weaknesses that are present within a software application
  • can be used by developers to prioritize unfixed weaknesses within their own software
  • in conjunction with the Common Weakness Risk Analysis Framework (CWRAF), can be used by consumers to identify the most important weaknesses for their business domains, in order to inform their acquisition and protection activities as one part of the larger process of achieving software assurance.

Table of Contents
  • Stakeholders
  • CWSS Design Considerations
  • Scoring Methods within CWSS
  • CWSS 0.6 Scoring for Targeted Software
    • Scoring
    • CWSS Metric Groups
  • Supporting Uncertainty and Flexibility Within Factors
  • Base Finding Metric Group
    • Technical Impact (TI)
    • Acquired Privilege (AP)
    • Acquired Privilege Layer (AL)
    • Internal Control Effectiveness (IC)
    • Finding Confidence (FC)
  • Attack Surface Metric Group
    • Required Privilege (RP)
    • Required Privilege Layer (RL)
    • Access Vector (AV)
    • Authentication Strength (AS)
    • Authentication Instances (AI)
    • Level of Interaction (IN)
    • Deployment Scope (SC)
  • Environmental Metric Group
    • Business Impact (BI)
    • Likelihood of Discovery (DI)
    • Likelihood of Exploit (EX)
    • External Control Effectiveness (EC)
    • Remediation Effort (RE)
    • Prevalence (P)
  • CWSS Score Formula
    • Base Finding Subscore
    • Attack Surface Subscore
    • Environmental Subscore
    • Additional Features of the Formula
  • CWSS Vectors, Scoring, and Score Portability
    • Example 1: Business-critical application
    • Example 2: Wiki with limited business criticality
    • Other Approaches to CWSS Score Portability
  • Considerations for CWSS beyond 0.6
    • Current Limitations of the Scoring Method
    • Community Review and Validation of Factors
    • Additional CWSS Factors
    • Constituency-focused Scoring
    • Impact of CWSS and CWE Changes to Factors and Subscores
  • Future Activities
  • Community Participation in CWSS
  • Appendix A: CVSS
    • CVSS in a Software Assurance Context
    • Adaptations of CVSS
    • Comparison of CWSS Factors with CVSS
    • Other Differences between CVSS and CWSS
  • Appendix B: Other Scoring Methods
    • 2008 CWSS Kickoff Meeting
    • 2010 SANS/CWE Top 25
    • 2010 OWASP Top Ten
    • Other Models
  • Appendix C: Generalized Scoring Approaches
    • Prevalence Assessment
  • Appendix D: Aggregated Scoring Methods: Measuring Weakness Surface
  • Change Log

Stakeholders

To be most effective, CWSS supports multiple usage scenarios by different stakeholders who all have an interest in a consistent scoring system for prioritizing software weaknesses that could introduce risks to products, systems, networks and services. Some of the primary stakeholders are listed below.

Stakeholder | Description

Software developers

often operate within limited time frames, due to release cycles and limited resources. As a result, they are unable to investigate and fix every reported weakness. They may choose to concentrate on the worst problems or on those that are easiest to fix. In the case of automated weakness findings, they might choose to focus on the findings that are least likely to be false positives.

Software development managers

create strategies for prioritizing and removing entire classes of weaknesses from the entire code base, or at least the portion that is deemed to be most at risk, by defining custom "Top-N" lists. They must understand the security implications of integrating third-party software, which may contain its own weaknesses. They may need to support distinct security requirements and prioritization for each product line.

Software acquirers

want to obtain third-party software with a reasonable level of assurance that the software provider has performed due diligence in removing or avoiding weaknesses that are most critical to the acquirer's business and mission. Related stakeholders include CIOs, CSOs, system administrators, and end users of the software.

Code analysis vendors and consultants

want to provide a consistent, community-vetted scoring mechanism for different customers.

Evaluators of code analysis capabilities

evaluate the capabilities of code analysis techniques (e.g., NIST SAMATE). They could use a consistent weakness scoring mechanism to support sampling of reported findings, as well as understanding the severity of these findings without depending on ad hoc scoring methods that may vary widely by tool/technique.

Other stakeholders may include vulnerability researchers, advocates of secure development, and compliance-based analysts (e.g., PCI DSS).

CWSS Design Considerations

For CWSS to be most effective to its stakeholders, several aspects of the problem area must be considered when designing the framework and metrics. Some of these considerations might not be resolved until several revisions of CWSS have been released and tested.

  • CWSS scoring will need to account for incomplete information throughout much of the lifecycle of a reported weakness. First, scoring may be necessary before the weakness is even known to contribute to a vulnerability, e.g. in the initial output from an automated code scanning tool. Second, the entity (human or machine) assigning the initial CWSS score might not have complete information available, e.g. the expected operating environment. Finally, some factors in the CWSS score might rely on trend information (such as frequency of occurrence) that is only estimated due to lack of sufficient statistical data. For example, the 2010 CWE Top 25 relied on survey results, because very few sources had prevalence data at the same level of detail as the weaknesses being considered for the list. Incomplete information is a challenge for CVSS scoring, and it is expected to be an even greater challenge for CWSS.
  • It is assumed that portions of CWSS scores can be automatically generated. For example, some factors may be dependent on the type of weakness being scored; potentially, the resulting subscores could be derived from CWE data. As another example, a web script might only be accessible by an administrator, so all weaknesses may be interpreted in light of this required privilege.
  • CWSS should be scalable. Some usage scenarios may require the scoring of thousands of weaknesses, such as defect reports from an automated code scanning tool. When such a high volume is encountered, there are too many issues to analyze manually. As a result, automated scoring must be supported.
  • The potential CWSS stakeholders, their needs, and associated use cases should be analyzed to understand their individual requirements. This might require support for multiple scoring techniques or methods.
  • Associated metrics must balance usability with completeness, i.e., they cannot be too complex.
  • Environmental conditions and business/mission priorities should impact how scores are generated and interpreted.
  • CWSS should be automatable and flexible wherever possible, but support human input as well.

Scoring Methods within CWSS

The stakeholder community is collaborating with MITRE to investigate several different scoring methods that might need to be supported within the CWSS framework.

Method | Notes
Targeted

Score individual weaknesses that are discovered in the design or implementation of a specific ("targeted") software package, e.g. a buffer overflow in the username of an authentication routine in line 1234 of vuln.c in an FTP server package. Automated tools and software security consultants use targeted methods when evaluating the security of a software package in terms of the weaknesses that are contained within the package.

Generalized

Score classes of weaknesses independent of any particular software package, in order to prioritize them relative to each other (e.g. "buffer overflows are higher priority than memory leaks"). This approach is used by the CWE/SANS Top 25, OWASP Top Ten, and similar efforts, but also by some automated code scanners. The generalized scores could vary significantly from the targeted scores that would result from a full analysis of the individual occurrences of the weakness class within a specific software package. For example, while the class of buffer overflows remains very important to many developers, individual buffer overflow bugs might be considered less important if they cannot be directly triggered by an attacker and their impact is reduced due to OS-level protection mechanisms such as ASLR.

Context-adjusted

Modify scores in accordance with the needs of a specific analytical context that may integrate business/mission priorities, threat environments, risk tolerance, etc. These needs are captured using vignettes that link inherent characteristics of weaknesses with higher-level business considerations. This method could be applied to both targeted and generalized scoring.

Aggregated

Combine the results of multiple, lower-level weakness scores to produce a single, overall score (or "grade"). While aggregation might be most applicable to the targeted method, it could also be used in generalized scoring, as occurred in the 2010 CWE/SANS Top 25.

The current focus for CWSS is on the Targeted scoring method and a framework for context-adjusted scoring. Methods for aggregated scoring will follow. Generalized scoring is being developed separately, primarily as part of the 2011 Top 25 and CWRAF.

CWSS 0.6 Scoring for Targeted Software

In CWSS 0.6, the score for each reported weakness (finding) is calculated using 18 different factors in three metric groups.

Scoring

In CWSS 0.6, the score for a weakness, or a weakness bug report ("finding"), is calculated using 18 different factors, across three metric groups:

  • the Base Finding group, which captures the inherent risk of the weakness, confidence in the accuracy of the finding, and strength of controls.
  • the Attack Surface group, which captures the barriers that an attacker must cross in order to exploit the weakness.
  • the Environmental group, which includes factors that may be specific to a particular operational context, such as business impact, likelihood of exploit, and existence of external controls.

[Figure: The three CWSS metric groups and the factors in each group. (A larger picture is available.)]

CWSS Metric Groups

CWSS can be used in cases where there is little information at first, but the quality of information can improve over time. It is anticipated that in many use-cases, the CWSS score for an individual weakness finding may change frequently, as more information is discovered. Different entities may evaluate separate factors at different points in time.

As such, every CWSS factor effectively has "environmental" or "temporal" characteristics, so it is not particularly useful to adopt the same types of metric groups as are used in CVSS.

Metric Group | Factors
Base Finding Group | Technical Impact (TI), Acquired Privilege (AP), Acquired Privilege Layer (AL), Internal Control Effectiveness (IC), Finding Confidence (FC)
Attack Surface Group | Required Privilege (RP), Required Privilege Layer (RL), Access Vector (AV), Authentication Strength (AS), Authentication Instances (AI), Level of Interaction (IN), Deployment Scope (SC)
Environmental Group | Business Impact (BI), Likelihood of Discovery (DI), Likelihood of Exploit (EX), External Control Effectiveness (EC), Remediation Effort (RE), Prevalence (P)
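For tooling that needs to track these factors programmatically, the grouping above can be captured in a small data structure. The following Python sketch simply restates the table; the structure, names, and choice of language are illustrative assumptions and are not part of the CWSS specification:

    # Illustrative sketch: the group names and factor abbreviations come from the
    # CWSS tables above; the Python representation itself is an assumption, not
    # part of the CWSS specification.
    CWSS_METRIC_GROUPS = {
        "Base Finding": ["TI", "AP", "AL", "IC", "FC"],
        "Attack Surface": ["RP", "RL", "AV", "AS", "AI", "IN", "SC"],
        "Environmental": ["BI", "DI", "EX", "EC", "RE", "P"],
    }

    # CWSS 0.6 scores each finding using all 18 factors across the three groups.
    assert sum(len(factors) for factors in CWSS_METRIC_GROUPS.values()) == 18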

Supporting Uncertainty and Flexibility Within Factors

Most factors have the following values in common:

Value | Usage
Unknown

The entity calculating the score does not have enough information to provide a value for the factor. This can be a signal for further investigation. For example, an automated code scanner might be able to find certain weaknesses, but be unable to detect whether any authentication mechanisms are in place. The use of "Unknown" emphasizes that the score is incomplete or estimated, and further analysis may be necessary. This makes it easier to model incomplete information, and for the Business Value Context to influence final scores that were generated using incomplete information. The weight for this value is 0.5 for all factors, which generally produces a lower score; the addition of new information (i.e., changing some factors from "Unknown" to another value) will then adjust the score upward or downward based on the new information.

Not Applicable

The factor is being explicitly ignored in the score calculation. This effectively allows the Business Value Context to dictate whether a factor is relevant to the final score. For example, a customer-focused CWSS scoring method might ignore the remediation effort, and a high-assurance environment might require investigation of all reported findings, even if there is low confidence in their accuracy. For a set of weakness findings for an individual software package, it is expected that all findings would have the same "Not Applicable" value for the factor that is being ignored.

Quantified

The factor can be weighted using a quantified, continuous range of 0.0 through 1.0, instead of the factor's defined set of discrete values. Not all factors are quantifiable in this way, but it allows for additional customization of the metric.

Default

The factor's weight can be set to a default value. Labeling the factor as a default allows for investigation and possible modification at a later time.
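To illustrate how these common values might be resolved into numeric weights by a scoring tool, the following Python sketch applies only the behaviors stated above: Unknown weighs 0.5, Not Applicable weighs 1.0 so that the factor does not alter the score, Quantified accepts a continuous value from 0.0 through 1.0, and Default falls back to the factor's defined default weight. The helper name and structure are hypothetical, not part of CWSS:

    def resolve_weight(value, factor_weights, quantified=None):
        """Map a factor's selected value to a numeric weight (illustrative sketch).

        factor_weights holds the factor's defined discrete values and weights,
        including its "Default" entry, as listed in the CWSS factor tables.
        """
        if value == "Unknown":
            return 0.5              # incomplete information; generally lowers the score
        if value == "Not Applicable":
            return 1.0              # factor is explicitly ignored in the calculation
        if value == "Quantified":
            if quantified is None or not 0.0 <= quantified <= 1.0:
                raise ValueError("Quantified factors need a value in [0.0, 1.0]")
            return quantified       # custom, continuous weight
        return factor_weights[value]  # a defined discrete value, including "Default"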

Base Finding Metric Group

The Base Finding metric group consists of the following factors:

  • Technical Impact (TI)
  • Acquired Privilege (AP)
  • Acquired Privilege Layer (AL)
  • Internal Control Effectiveness (IC)
  • Finding Confidence (FC)

The combination of values from Technical Impact, Acquired Privilege, and Acquired Privilege Layer gives the user some expressive power. For example, the user can characterize "High" Technical Impact with "Administrator" privilege at the "Application" layer.

Technical Impact (TI)

Technical Impact is the potential result that can be produced by the weakness, assuming that the weakness can be successfully reached and exploited. This is expressed in terms that are more fine-grained than confidentiality, integrity, and availability.

Value | Code | Weight | Description
Critical | C | 1.0 | Complete control over the software, the data it processes, and the environment in which it runs (e.g. the host system), to the point where operations cannot take place.
High | H | 0.9 |
Medium | M | 0.6 |
Low | L | 0.3 |
None | N | 0.0 |
Default | D | 0.6 | The Default weight is the median of the weights for Critical, High, Medium, Low, and None.
Unknown | Unk | 0.5 |
Not Applicable | NA | 1.0 | This factor might not be applicable in an environment with high assurance requirements; the user might want to investigate every weakness finding of interest, regardless of its potential impact.
Quantified | Q | | This factor could be quantified with custom weights.

If this set of values is not precise enough, CWSS users can use their own Quantified methods to derive a subscore. One such method involves using the Common Weakness Risk Analysis Framework (CWRAF) to define a vignette and a Technical Impact Scorecard. The Impact weight is calculated using vignette-specific Importance ratings for different technical impacts that could arise from exploitation of the weakness, such as modification of sensitive data, gaining privileges, resource consumption, etc.
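Continuing the illustrative Python sketch from earlier, the discrete Technical Impact weights from the table above can be encoded as a simple lookup; when a CWRAF vignette and Technical Impact Scorecard are in use, a Quantified subscore would replace this lookup. The dictionary below merely restates the table and is not itself part of CWSS:

    # Weights restated from the Technical Impact (TI) table above.
    TECHNICAL_IMPACT_WEIGHTS = {
        "Critical": 1.0,
        "High": 0.9,
        "Medium": 0.6,
        "Low": 0.3,
        "None": 0.0,
        "Default": 0.6,  # median of the Critical..None weights
    }

    # Reusing the hypothetical resolve_weight() helper sketched earlier:
    ti_weight = resolve_weight("High", TECHNICAL_IMPACT_WEIGHTS)  # 0.9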

Acquired Privilege (AP)

The Acquired Privilege identifies the type of privileges that are obtained by an entity who can successfully exploit the weakness. In some cases, the acquired privileges may be the same as the required privileges, which implies either (1) "horizontal" privilege escalation (e.g. from one unprivileged user to another), or (2) privilege escalation within a sandbox, such as an FTP-only user who can escape to the shell.

Notice that the values are the same as those for Required Privilege, but the weights are different.

The acronym "RUGNAP" can serve as a mnemonic for remembering the key values ("Regular User", "Guest", "None", "Admin", "Partially-Privileged").

Value | Code | Weight | Description
Administrator | A | 1.0 | The entity has administrator, root, SYSTEM, or equivalent privileges that imply full control over the software or the underlying OS.
Partially-Privileged User | P | 0.9 | The entity is a valid user with some special privileges, but not enough privileges to be equivalent to an administrator. For example, a user might have privileges to make backups, but not to modify the software's configuration or install updates.
Regular User | RU | 0.7 | The entity is a regular user who has no special privileges.
Guest | G | 0.6 | The entity acquires limited or "guest" privileges that can significantly restrict allowable activities. This could happen in an environment that uses strong privilege separation.
None | N | 0.1 | No extra privileges are acquired.
Default | D | 0.7 | Median of the weights for None, Guest, Regular User, Partially-Privileged User, and Administrator.
Unknown | Unk | 0.5 |
Not Applicable | NA | 1.0 | This factor might not be applicable in an environment with high assurance requirements that wants strict enforcement of privilege separation, even between already-privileged users.

Note that this factor cannot be quantified.

Acquired Privilege Layer (AL)

The Acquired Privilege Layer identifies the operational layer to which the entity gains access if the weakness can be successfully exploited.

A mnemonic for this factor is "SANE" (System, Application, Network, Enterprise).

Value | Code | Weight | Description
Application | A | 1.0 | The entity must have access to an affected application.
System | S | 0.9 | The entity must have access to, or control of, a system or physical host.
Network | N | 0.7 | The entity must have access to or from the network.
Enterprise | E | 1.0 | The entity must have access to a critical piece of enterprise infrastructure, such as a router, switch, DNS, domain controller, firewall, identity server, etc.
Default | D | 0.9 | Median of the weights for the SANE values (System, Application, Network, Enterprise).
Unknown | Unk | 0.5 |
Not Applicable | NA | 1.0 | This factor might not be applicable in an environment with high assurance requirements that wants strict enforcement of privilege separation, even between already-privileged users.

Note that this factor cannot be quantified.

Internal Control Effectiveness (IC)

An Internal Control is a control, protection mechanism, or mitigation that has been explicitly built into the software (whether through architecture, design, or implementation). Internal Control Effectiveness measures the ability of the control to prevent an attacker from exploiting the weakness. For example, an input validation routine that restricts input length to 15 characters might be moderately effective against XSS attacks by reducing the size of the XSS exploit that can be attempted.
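As a concrete illustration of the kind of internal control described above, the sketch below shows a hypothetical length-restricting validation routine (the function name and limit are invented for illustration, not taken from CWSS or any particular product). Such a control does not eliminate an underlying XSS weakness, but it constrains the payloads an attacker can attempt, which is the sort of partial protection the values in the following table are meant to capture.

    # Hypothetical internal control: a validation routine that caps input length.
    MAX_FIELD_LENGTH = 15  # the 15-character limit from the example above

    def validate_field(value: str) -> str:
        """Reject over-long input before it is processed further.

        On its own this control is only partially effective against XSS: short
        payloads still pass, so the underlying weakness (e.g. missing output
        encoding) remains, but the attacker's room to maneuver is reduced.
        """
        if len(value) > MAX_FIELD_LENGTH:
            raise ValueError("input exceeds maximum allowed length")
        return value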

Value | Code | Weight | Description
None | N | 1.0 | No controls exist.
Limited | L | 0.9 | There are simplistic methods or accidental restrictions that might prevent a casual attacker from exploiting the issue.
Moderate | M | 0.7 | The protection mechanism is commonly used but has known limitations that might be bypassed with some effort by a knowledgeable attacker. For example, the use of HTML entity encoding to prevent XSS attacks may be bypassed when the output is placed into another context such as a Cascading Style Sheet or HTML tag attribute.
Indirect (Defense-in-Depth) | I | 0.5 | The control does not specifically protect against exploitation of the weakness, but it indirectly reduces the impact when a successful attack is launched, or otherwise makes it more difficult to construct a functional exploit. For example, a validation routine might indirectly limit the size of an input, which might make it difficult for an attacker to construct a payload for an XSS or SQL injection attack.
Best-Available | B | 0.3 | The control follows best current practices, although it may have some limitations that can be overcome by a skilled, determined attacker, possibly requiring the presence of other weaknesses. For example, the double-submit method for CSRF protection is considered one of the strongest available, but it can be defeated in conjunction with behaviors of certain functionality that can read raw HTTP headers.
Complete | C | 0.0 | The control is completely effective against the weakness, i.e., there is no bug or vulnerability, and no adverse consequence of exploiting the issue. For example, a buffer copy operation that ensures that the destination buffer is always larger than the source (plus any indirect expansion of the original source size) will not cause an overflow.
Default | D | 0.6 | Median of the weights for Complete, Best-Available, Indirect, Moderate, Limited, and None.
Unknown | Unk | 0.5 |
Not Applicable | NA | 1.0 |

Note that this factor cannot be quantified.

Finding Confidence (FC)

Finding Confidence is the confidence that the reported issue:

  • (1) is a weakness, and
  • (2) can be triggered or utilized by an attacker.

Value | Code | Weight | Description
Proven True | T | 1.0 | The weakness is reachable by the attacker.
Proven Locally True | LT | 0.8 | The weakness occurs within an individual function or component whose design relies on safe invocation of that function, but attacker reachability to that function is unknown or not present. For example, a utility function might construct a database query without encoding its inputs, but if it is only called with constant strings, the finding is locally true.
Proven False | F | 0.0 | The finding is erroneous (i.e. the finding is a false positive and there is no weakness), and/or there is no possible attacker role.
Default | D | 0.8 | Median of the weights for Proven True, Proven Locally True, and Proven False.
Unknown | Unk | 0.5 |
Not Applicable | NA | 1.0 | This factor might not be applicable in an environment with high assurance requirements; the user might want to investigate every weakness finding of interest, regardless of confidence.
Quantified | Q | | This factor could be quantified with custom weights. Some code analysis tools have precise measurements of the accuracy of specific detection patterns.
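Pulling the Base Finding factors together, the sketch below records one hypothetical finding, for example a weakness with High Technical Impact that yields Administrator privileges at the Application layer, with no internal controls and Proven True confidence. The weights are copied from the tables above; how they are combined into a subscore is defined later, in the CWSS Score Formula section, so this sketch only gathers the factor selections:

    # Hypothetical Base Finding factor selections for a single weakness finding.
    # (value, weight) pairs use the weights listed in the factor tables above;
    # the combination into a subscore is defined by the CWSS score formula,
    # which is not reproduced here.
    base_finding = {
        "TI": ("High", 0.9),           # Technical Impact
        "AP": ("Administrator", 1.0),  # Acquired Privilege
        "AL": ("Application", 1.0),    # Acquired Privilege Layer
        "IC": ("None", 1.0),           # Internal Control Effectiveness: no controls
        "FC": ("Proven True", 1.0),    # Finding Confidence
    }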

Attack Surface Metric Group

The Attack Surface metric group consists of the following factors:

  • Required Privilege (RP)
  • Required Privilege Layer (RL)
  • Access Vector (AV)
  • Authentication Strength (AS)
  • Authentication Instances (AI)
  • Level of Interaction (IN)
  • Deployment Scope (SC)

Required Privilege (RP)

The Required Privilege identifies the type of privileges required for an entity to reach the code/functionality that contains the weakness.

The aconym "
