Common Weakness Scoring System (CWSS™)
The MITRE Corporation Copyright © 2011
cwe.mitre.org/cwss/

CWSS version: 0.8
Document version: 0.8
Revision Date: June 27, 2011

Project Coordinator: Bob Martin (MITRE)
Document Editor: Steve Christey (MITRE)
Introduction
When a security analysis of a software application is performed, such as when using an automated code auditing tool, developers often face
hundreds or thousands of individual bug reports for weaknesses that
are discovered in their code. In certain circumstances, a software
weakness can lead to an exploitable vulnerability. For example, a
buffer overflow vulnerability might arise from a weakness in which the
programmer does not properly validate the length of an input buffer.
This weakness only contributes to a vulnerability if the input can be
influenced by a malicious party, and if that malicious input can
be copied to an output buffer that is smaller than the input.
Due to the high volume of reported weaknesses, developers must prioritize which issues they
should investigate and fix first. Similarly, when assessing design
and architecture choices and their weaknesses, there needs to be a
method for prioritizing them relative to each other and to the other
issues of the application. Finally, software consumers want to know
what they should worry about the most, and what to ask for to get a
more secure product from their vendors and suppliers.
Further complicating the problem, the importance of a weakness may vary depending on business or mission needs, the kinds of technologies
in use, and the threat environment.
In short, people need to be able to reason and communicate about the relative importance of different weaknesses. While various scoring
methods are used today, they are either ad hoc or inappropriate for
application to the still-imprecise evaluation of software security.
The Common Weakness Scoring System (CWSS) provides a mechanism for scoring weaknesses in a consistent, flexible, open manner while
accommodating context for the various business domains. It is a
collaborative, community-based effort that is addressing the needs of
its stakeholders across government, academia,
and industry. CWSS is a part of the Common Weakness Enumeration (CWE)
project, co-sponsored by the Software Assurance program in the
National Cyber Security Division (NCSD) of the US Department of
Homeland Security (DHS).
CWSS:
- provides a common framework for prioritizing security errors ("weaknesses") that are discovered in software applications
- provides a quantitative measurement of the unfixed weaknesses that are present within a software application
- can be used by developers to prioritize unfixed weaknesses within their own software
- in conjunction with the Common Weakness Risk Analysis Framework (CWRAF), can be used by consumers to identify the most
important weaknesses for their business domains, in order to inform
their acquisition and protection activities as one part of the
larger process of achieving software assurance.
Table of Contents
- Stakeholders
- CWSS Design Considerations
- Scoring Methods within CWSS
- CWSS 0.6 Scoring for Targeted Software
- Scoring
- CWSS Metric Groups
- Supporting Uncertainty and Flexibility Within Factors
- Base Finding Metric Group
- Technical Impact (TI)
- Acquired Privilege (AP)
- Acquired Privilege Layer (AL)
- Internal Control Effectiveness (IC)
- Finding Confidence (FC)
- Attack Surface Metric Group
- Required Privilege (RP)
- Required Privilege Layer (RL)
- Access Vector (AV)
- Authentication Strength (AS)
- Authentication Instances (AI)
- Level of Interaction (IN)
- Deployment Scope (SC)
- Environmental Metric Group
- Business Impact (BI)
- Likelihood of Discovery (DI)
- Likelihood of Exploit (EX)
- External Control Effectiveness (EC)
- Remediation Effort (RE)
- Prevalence (P)
- CWSS Score Formula
- Base Finding Subscore
- Attack Surface Subscore
- Environmental Subscore
- Additional Features of the Formula
- CWSS Vectors, Scoring, and Score Portability
- Example 1: Business-critical application
- Example 2: Wiki with limited business criticality
- Other Approaches to CWSS Score Portability
- Considerations for CWSS beyond 0.6
- Current Limitations of the Scoring Method
- Community Review and Validation of Factors
- Additional CWSS Factors
- Constituency-focused Scoring
- Impact of CWSS and CWE Changes to Factors and Subscores
- Future Activities
- Community Participation in CWSS
- Appendix A: CVSS
- CVSS in a Software Assurance Context
- Adaptations of CVSS
- Comparison of CWSS Factors with CVSS
- Other Differences between CVSS and CWSS
- Appendix B: Other Scoring Methods
- 2008 CWSS Kickoff Meeting
- 2010 SANS/CWE Top 25
- 2010 OWASP Top Ten
- Other Models
- Appendix C: Generalized Scoring Approaches
- Appendix D: Aggregated Scoring Methods: Measuring Weakness Surface
- Change Log
Stakeholders
To be most effective, CWSS supports multiple usage scenarios by different stakeholders who all have an interest in a consistent
scoring system for prioritizing software weaknesses that could
introduce risks to products, systems, networks and services. Some of
the primary stakeholders are listed below.
Stakeholder | Description
Software developers | Often operate within limited time frames, due to release cycles and limited resources. As a result, they are unable to investigate and fix every reported weakness. They may choose to concentrate on the worst problems, or on those that are easiest to fix. In the case of automated weakness findings, they might choose to focus on the findings that are least likely to be false positives.
Software development managers | Create strategies for prioritizing and removing entire classes of weaknesses from the entire code base, or at least the portion that is deemed to be most at risk, by defining custom "Top-N" lists. They must understand the security implications of integrating third-party software, which may contain its own weaknesses. They may need to support distinct security requirements and prioritization for each product line.
Software acquirers | Want to obtain third-party software with a reasonable level of assurance that the software provider has performed due diligence in removing or avoiding weaknesses that are most critical to the acquirer's business and mission. Related stakeholders include CIOs, CSOs, system administrators, and end users of the software.
Code analysis vendors and consultants | Want to provide a consistent, community-vetted scoring mechanism for different customers.
Evaluators of code analysis capabilities | Evaluate the capabilities of code analysis techniques (e.g., NIST SAMATE). They could use a consistent weakness scoring mechanism to support sampling of reported findings, as well as understanding the severity of these findings without depending on ad hoc scoring methods that may vary widely by tool/technique.
Other stakeholders | May include vulnerability researchers, advocates of secure development, and compliance-based analysts (e.g., PCI DSS).
CWSS Design Considerations
For CWSS to be most effective to its stakeholders, several aspects of the problem area must be considered when designing the framework and
metrics. Some of these considerations might not be resolved until
several revisions of CWSS have been released and tested.
- CWSS scoring will need to account for incomplete information throughout much of the lifecycle of a reported weakness. For
example, scoring may be necessary before the weakness is even known
to contribute to a vulnerability, e.g. in the initial output from
an automated code scanning tool. Second, the entity (human or
machine) assigning the initial CWSS score might not have complete
information available, e.g. the expected operating environment.
Finally, some factors in the CWSS score might rely on trend
information (such as frequency of occurrence) that is only
estimated due to lack of sufficient statistical data. For example,
the 2010 CWE Top 25 relied on survey results, because very few
sources had prevalence data at the same level of detail as the
weaknesses being considered for the list.
Incomplete information is a challenge for CVSS scoring, and it is
expected to be an even greater challenge for CWSS.
- It is assumed that portions of CWSS scores can be automatically generated. For example, some factors may be dependent on the type
of weakness being scored; potentially, the resulting subscores could
be derived from CWE data. As another example, a web script might
only be accessible by an administrator, so all weaknesses may be
interpreted in light of this required privilege.
- CWSS should be scalable. Some usage scenarios may require the scoring of thousands of weaknesses, such as defect reports from an
automated code scanning tool. When such a high volume is
encountered, there are too many issues to analyze manually. As a
result, automated scoring must be supported.
- The potential CWSS stakeholders, their needs, and associated use cases should be analyzed to understand their individual
requirements. This might require support for multiple scoring
techniques or methods.
- Associated metrics must balance usability with completeness, i.e., they cannot be too complex.
- Environmental conditions and business/mission priorities should impact how scores are generated and interpreted.
- CWSS should be automatable and flexible wherever possible, but support human input as well.
Scoring Methods within CWSS
The stakeholder community is collaborating with MITRE to investigate several different scoring methods that might need to be supported
within the CWSS framework.
Method | Notes
Targeted | Score individual weaknesses that are discovered in the design or implementation of a specific ("targeted") software package, e.g. a buffer overflow in the username of an authentication routine in line 1234 of vuln.c in an FTP server package. Automated tools and software security consultants use targeted methods when evaluating the security of a software package in terms of the weaknesses that are contained within the package.
Generalized | Score classes of weaknesses independent of any particular software package, in order to prioritize them relative to each other (e.g. "buffer overflows are higher priority than memory leaks"). This approach is used by the CWE/SANS Top 25, the OWASP Top Ten, and similar efforts, but also by some automated code scanners. The generalized scores could vary significantly from the targeted scores that would result from a full analysis of the individual occurrences of the weakness class within a specific software package. For example, while the class of buffer overflows remains very important to many developers, individual buffer overflow bugs might be considered less important if they cannot be directly triggered by an attacker and their impact is reduced due to OS-level protection mechanisms such as ASLR.
Context-adjusted | Modify scores in accordance with the needs of a specific analytical context that may integrate business/mission priorities, threat environments, risk tolerance, etc. These needs are captured using vignettes that link inherent characteristics of weaknesses with higher-level business considerations. This method could be applied to both targeted and generalized scoring.
Aggregated | Combine the results of multiple, lower-level weakness scores to produce a single, overall score (or "grade"). While aggregation might be most applicable to the targeted method, it could also be used in generalized scoring, as occurred in the 2010 CWE/SANS Top 25.
The current focus for CWSS is on the Targeted scoring method and a framework for context-adjusted scoring. Methods for aggregated
scoring will follow. Generalized scoring is being developed
separately, primarily as part of the 2011 Top 25 and CWRAF.
CWSS 0.6 Scoring for Targeted Software
Scoring
In CWSS 0.6, the score for a weakness, or a weakness bug report ("finding"), is calculated using 18 different factors
across three metric groups:
- the Base Finding group, which captures the inherent risk of the weakness, confidence in the accuracy of the finding, and strength of
controls.
- the Attack Surface group, which captures the barriers that an attacker must cross in order to exploit the weakness.
- the Environmental group, which includes factors that may be specific to a particular operational context, such as business impact,
likelihood of exploit, and existence of external controls.
[Figure: CWSS Metric Groups]
CWSS can be used in cases where there is little information at first, but the quality of information can improve over time. It is
anticipated that in many use-cases, the CWSS score for an individual
weakness finding may change frequently, as more information is
discovered. Different entities may evaluate separate factors at
different points in time.
As such, every CWSS factor effectively has "environmental" or "temporal" characteristics, so it is not particularly useful to adopt
the same types of metric groups as are used in CVSS.
Metric Group | Factors
Base Finding Group | Technical Impact (TI), Acquired Privilege (AP), Acquired Privilege Layer (AL), Internal Control Effectiveness (IC), Finding Confidence (FC)
Attack Surface Group | Required Privilege (RP), Required Privilege Layer (RL), Access Vector (AV), Authentication Strength (AS), Authentication Instances (AI), Level of Interaction (IN), Deployment Scope (SC)
Environmental Group | Business Impact (BI), Likelihood of Discovery (DI), Likelihood of Exploit (EX), External Control Effectiveness (EC), Remediation Effort (RE), Prevalence (P)
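For readers who handle these factors programmatically, the grouping above can be captured in a small lookup structure. The following is a minimal, illustrative Python sketch; the variable name and layout are ours, not part of the CWSS specification:

    # Illustrative only: the 18 CWSS 0.6 factor codes, keyed by metric group.
    CWSS_METRIC_GROUPS = {
        "Base Finding": ["TI", "AP", "AL", "IC", "FC"],
        "Attack Surface": ["RP", "RL", "AV", "AS", "AI", "IN", "SC"],
        "Environmental": ["BI", "DI", "EX", "EC", "RE", "P"],
    }

    # Sanity check: CWSS 0.6 defines 18 factors in total.
    assert sum(len(codes) for codes in CWSS_METRIC_GROUPS.values()) == 18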
Supporting Uncertainty and Flexibility Within Factors
Most factors have several values in common:
Value | Usage
Unknown | The entity calculating the score does not have enough information to provide a value for the factor. This can be a signal for further investigation. For example, an automated code scanner might be able to find certain weaknesses, but be unable to detect whether any authentication mechanisms are in place. The use of "Unknown" emphasizes that the score is incomplete or estimated, and further analysis may be necessary. This makes it easier to model incomplete information, and for the Business Value Context to influence final scores that were generated using incomplete information. The weight for this value is 0.5 for all factors, which generally produces a lower score; the addition of new information (i.e., changing some factors from "Unknown" to another value) will then adjust the score upward or downward based on the new information.
Not Applicable | The factor is being explicitly ignored in the score calculation. This effectively allows the Business Value Context to dictate whether a factor is relevant to the final score. For example, a customer-focused CWSS scoring method might ignore the remediation effort, and a high-assurance environment might require investigation of all reported findings, even if there is low confidence in their accuracy. For a set of weakness findings for an individual software package, it is expected that all findings would have the same "Not Applicable" value for the factor that is being ignored.
Quantified | The factor can be weighted using a quantified, continuous range of 0.0 through 1.0, instead of the factor's defined set of discrete values. Not all factors are quantifiable in this way, but it allows for additional customization of the metric.
Default | The factor's weight can be set to a default value. Labeling the factor as a default allows for investigation and possible modification at a later time.
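To make the handling of these values concrete, the sketch below shows how a scorer might resolve one factor to a numeric weight. This is an illustrative Python fragment under our own naming, not normative CWSS: the 0.5 weight for "Unknown" is taken from the table above, and the 1.0 weight for "Not Applicable" matches the per-factor tables later in this document.

    def resolve_factor_weight(discrete_weights, value, quantified=None):
        """Resolve one CWSS factor value to a numeric weight (illustrative).

        discrete_weights maps the factor's named values (including
        "Default") to the weights defined in its table.
        """
        if value == "Quantified":
            # The caller supplies a continuous weight in the range 0.0-1.0.
            if quantified is None or not 0.0 <= quantified <= 1.0:
                raise ValueError("Quantified factors require a weight in [0.0, 1.0]")
            return quantified
        if value == "Unknown":
            return 0.5  # incomplete information; revisit as analysis proceeds
        if value == "Not Applicable":
            return 1.0  # explicitly ignored in the calculation (per the factor tables)
        return discrete_weights[value]  # a named value, e.g. "High" or "Default"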
Base Finding Metric Group
The Base Finding metric group consists of the following factors:
- Technical Impact (TI)
- Acquired Privilege (AP)
- Acquired Privilege Layer (AL)
- Internal Control Effectiveness (IC)
- Finding Confidence (FC)
The combination of values from Technical Impact, Acquired Privilege, and Acquired Privilege Layer gives the user some expressive power.
For example, the user can characterize "High" Technical Impact with
"Administrator" privilege at the "Application" layer.
Technical Impact (TI)
Technical Impact is the potential result that can be produced by the weakness, assuming that the weakness can be successfully reached and
exploited. This is expressed in terms that are more fine-grained than
confidentiality, integrity, and availability.
Value | Code | Weight | Description
Critical | C | 1.0 | Complete control over the software, the data it processes, and the environment in which it runs (e.g. the host system), to the point where operations cannot take place.
High | H | 0.9 |
Medium | M | 0.6 |
Low | L | 0.3 |
None | N | 0.0 |
Default | D | 0.6 | The Default weight is the median of the weights for Critical, High, Medium, Low, and None.
Unknown | Unk | 0.5 |
Not Applicable | NA | 1.0 | This factor might not be applicable in an environment with high assurance requirements; the user might want to investigate every weakness finding of interest, regardless of confidence.
Quantified | Q | | This factor could be quantified with custom weights.
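The Default entry can be checked mechanically: the median of the five named weights is 0.6. A short, illustrative Python fragment:

    from statistics import median

    # Technical Impact weights, as defined in the table above.
    TI_WEIGHTS = {"Critical": 1.0, "High": 0.9, "Medium": 0.6, "Low": 0.3, "None": 0.0}

    # The Default weight is the median of the weights for the five named values.
    assert median(TI_WEIGHTS.values()) == 0.6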
If this set of values is not precise enough, CWSS users can use their own Quantified methods to derive a subscore. One such method involves
using the Common Weakness Risk Analysis
Framework (CWRAF) to define a vignette and a Technical Impact
Scorecard. The Impact weight is
calculated using
vignette-specific Importance ratings for different technical impacts
that could arise from exploitation of the weakness, such as
modification of sensitive data, gaining privileges, resource
consumption, etc.
Acquired Privilege (AP)
The Acquired Privilege identifies the type of privileges that are obtained by an entity who can successfully exploit the weakness. In
some cases, the acquired privileges may be the same as the required
privileges, which implies either (1) "horizontal" privilege escalation
(e.g. from one unprivileged user to another), or (2) privilege
escalation within a sandbox, such as an FTP-only user who can escape
to the shell.
Notice that the values are the same as those for Required Privilege, but the weights are different.
The aconym "RUGNAP" can serve as a mnemonic for remembering the key values ("Regular User", "Guest", "None", "Admin",
"Partially-Privileged").
Value | Code | Weight | Description
Administrator | A | 1.0 | The entity has administrator, root, SYSTEM, or equivalent privileges that imply full control over the software or the underlying OS.
Partially-Privileged User | P | 0.9 | The entity is a valid user with some special privileges, but not privileges equivalent to an administrator's. For example, a user might have privileges to make backups, but not to modify the software's configuration or install updates.
Regular User | RU | 0.7 | The entity is a regular user who has no special privileges.
Guest | G | 0.6 | The entity acquires limited or "guest" privileges that can significantly restrict allowable activities. This could happen in an environment that uses strong privilege separation.
None | N | 0.1 | No extra privileges are acquired.
Default | D | 0.7 | The Default weight is the median of the weights for None, Guest, Regular User, Partially-Privileged User, and Administrator.
Unknown | Unk | 0.5 |
Not Applicable | NA | 1.0 | This factor might not be applicable in an environment with high assurance requirements that wants strict enforcement of privilege separation, even between already-privileged users.
Note that this factor cannot be quantified.
Acquired Privilege Layer (AL)
The Acquired Privilege Layer identifies the operational layer to which the entity gains access if the weakness can be successfully exploited.
A mnemonic for this factor is "SANE" (System, Application, Network, Enterprise).
Value | Code | Weight | Description
Application | A | 1.0 | The entity must have access to an affected application.
System | S | 0.9 | The entity must have access to, or control of, a system or physical host.
Network | N | 0.7 | The entity must have access to/from the network.
Enterprise | E | 1.0 | The entity must have access to a critical piece of enterprise infrastructure, such as a router, switch, DNS, domain controller, firewall, identity server, etc.
Default | D | 0.9 | The Default weight is the median of the weights for the SANE values.
Unknown | Unk | 0.5 |
Not Applicable | NA | 1.0 | This factor might not be applicable in an environment with high assurance requirements that wants strict enforcement of privilege separation, even between already-privileged users.
Note that this factor cannot be quantified.
Internal Control Effectiveness (IC)
An Internal Control is a control, protection mechanism, or mitigation that has been explicitly built into the software (whether through
architecture, design, or implementation). Internal Control
Effectiveness measures the ability of the control to render the
weakness unexploitable by an attacker. For example, an input
validation routine that restricts input length to 15 characters might
be moderately effective against XSS attacks by reducing the size of
the XSS exploit that can be attempted.
Value | Code | Weight | Description
None | N | 1.0 | No controls exist.
Limited | L | 0.9 | There are simplistic methods or accidental restrictions that might prevent a casual attacker from exploiting the issue.
Moderate | M | 0.7 | The protection mechanism is commonly used but has known limitations that might be bypassed with some effort by a knowledgeable attacker. For example, the use of HTML entity encoding to prevent XSS attacks may be bypassed when the output is placed into another context such as a Cascading Style Sheet or HTML tag attribute.
Indirect (Defense-in-Depth) | I | 0.5 | The control does not specifically protect against exploitation of the weakness, but it indirectly reduces the impact when a successful attack is launched, or otherwise makes it more difficult to construct a functional exploit. For example, a validation routine might indirectly limit the size of an input, which might make it difficult for an attacker to construct a payload for an XSS or SQL injection attack.
Best-Available | B | 0.3 | The control follows best current practices, although it may have some limitations that can be overcome by a skilled, determined attacker, possibly requiring the presence of other weaknesses. For example, the double-submit method for CSRF protection is considered one of the strongest available, but it can be defeated in conjunction with behaviors of certain functionality that can read raw HTTP headers.
Complete | C | 0.0 | The control is completely effective against the weakness, i.e., there is no bug or vulnerability, and no adverse consequence of exploiting the issue. For example, a buffer copy operation that ensures that the destination buffer is always larger than the source (plus any indirect expansion of the original source size) will not cause an overflow.
Default | D | 0.6 | The Default weight is the median of the weights for Complete, Best-Available, Indirect, Moderate, Limited, and None.
Unknown | Unk | 0.5 |
Not Applicable | NA | 1.0 |
Note that this factor cannot be quantified.
Finding Confidence (FC)
Finding Confidence is the confidence that the reported issue:
- (1) is a weakness, and
- (2) can be triggered or utilized by an attacker.
Value | Code | Weight | Description
Proven True | T | 1.0 | The weakness is reachable by the attacker.
Proven Locally True | LT | 0.8 | The weakness occurs within an individual function or component whose design relies on safe invocation of that function, but attacker reachability to that function is unknown or not present. For example, a utility function might construct a database query without encoding its inputs, but if it is only called with constant strings, the finding is locally true.
Proven False | F | 0.0 | The finding is erroneous (i.e. the finding is a false positive and there is no weakness), and/or there is no possible attacker role.
Default | D | 0.8 | The Default weight is the median of the weights for Proven True, Proven Locally True, and Proven False.
Unknown | Unk | 0.5 |
Not Applicable | NA | 1.0 | This factor might not be applicable in an environment with high assurance requirements; the user might want to investigate every weakness finding of interest, regardless of confidence.
Quantified | Q | | This factor could be quantified with custom weights. Some code analysis tools have precise measurements of the accuracy of specific detection patterns.
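As one hypothetical illustration of the Quantified option: a tool vendor that has measured the historical precision of each detection pattern could use that precision directly as the Finding Confidence weight. All names below are invented for the example:

    # Hypothetical per-checker precision, measured from past triage results.
    CHECKER_PRECISION = {
        "sql-injection-string-concat": 0.92,
        "unreachable-code-path": 0.55,
    }

    def quantified_finding_confidence(checker_id):
        # Fall back to the "Unknown" weight (0.5) for unmeasured checkers.
        return CHECKER_PRECISION.get(checker_id, 0.5)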
Attack Surface Metric Group
The Attack Surface metric group consists of the following factors:
- Required Privilege (RP)
- Required Privilege Layer (RL)
- Access Vector (AV)
- Authentication Strength (AS)
- Authentication Instances (AI)
- Level of Interaction (IN)
- Deployment Scope (SC)
Required Privilege (RP)
The Required Privilege identifies the type of privileges required for an entity to reach the code/functionality that contains the weakness.
The aconym "
|