Frequently Asked Questions

The following are answers to frequently asked questions we have received, intended to help everyone understand our transparent testing process:

Round 1 was split between an initial cohort and subsequent rolling admissions. The first cohort results were released in a single batch when all vendors in the cohort had completed their evaluation and subsequent review process. The rolling admissions are released as they are completed.

First cohort participants: Carbon Black, CrowdStrike, CounterTack, Endgame, Microsoft, RSA, SentinelOne
Rolling admission participants: Cybereason, FireEye

All vendors received a copy of the techniques to be tested and the general evaluation process overview. No vendor had access to the detailed procedures or results prior to their evaluation. FireEye and Cybereason's feedback period occurred after the launch of the ATT&CK Evaluations website.

We hope to announce and release details of Round 2 soon.

We don’t limit participation based on market segment. Requirements include:
  • Technology must address the detection of post-compromise behaviors as described by ATT&CK
  • Protections/preventions/responses must be disabled to allow for execution of our emulation, but sensors that drive these actions can still be used as data sources to identify behavior
  • Technology must deploy into the Microsoft Azure environment
  • Sensors and data sources beyond those provided by default in Azure must be provided by the vendor

Let us know your needs and where the current limitations of our methodology fall short of them. This will help us shape our evaluation road map.

Email for more information.

Yes. There was significant demand for unbiased ATT&CK evaluations and MITRE needed to create a mechanism to open up evaluations to the security vendor market. Participating companies understand that all results will be publicly released, which is true to MITRE's mission of providing objective insight.

Vendors get a third-party evaluation of their ATT&CK detection capabilities. These evaluations are not ATT&CK certifications, nor are they a guarantee that you are protected against the adversary we are emulating. Adversary behavior changes over time. The evaluations provide vendors with insight into, and confidence in, how their capabilities map to ATT&CK techniques. Equally important, because we are publicly releasing the results, we enable their customers, and potential customers, to understand how to utilize their tools to detect ATT&CK-categorized behaviors.

ATT&CK evaluations are built on the publicly-available information captured by ATT&CK, but they are separate from the ongoing work to maintain the ATT&CK knowledge base. The team who maintains ATT&CK will continue to accept contributions from anyone in the community. The ATT&CK knowledge base will remain free and open to everyone, and vendor participation in the evaluations has no influence on that process.

The evaluations use adversary emulation, which is a way of testing "in the style of" a specific adversary. This allows us to select a relevant subset of ATT&CK techniques to test. To generate our emulation plans, we use public threat intel reporting, map it to ATT&CK, and then determine a way to replicate the behaviors. The Round 1 emulated adversary was APT3. APT29 is the anticipated Round 2 emulation. We plan to offer new emulations in subsequent rounds to complement previous evaluations.
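As a purely illustrative sketch of the mapping step described above, an emulation plan can be thought of as a list of reported adversary behaviors, each tied to an ATT&CK technique ID and a procedure for replicating it. The structure below is our own assumption, not MITRE's actual plan format; the technique IDs and names come from the public ATT&CK knowledge base.

```python
# Illustrative only: a minimal, hypothetical representation of emulation plan
# steps that map publicly reported adversary behavior to ATT&CK techniques.
# This is NOT MITRE's actual plan format.

emulation_plan = [
    {
        "behavior": "Adversary dumps credentials from LSASS memory",
        "technique_id": "T1003",  # OS Credential Dumping
        "procedure": "Run a credential-dumping tool against lsass.exe",
    },
    {
        "behavior": "Adversary persists via a scheduled task",
        "technique_id": "T1053",  # Scheduled Task/Job
        "procedure": "Create a scheduled task that relaunches the implant",
    },
]

def techniques_covered(plan):
    """Return the unique ATT&CK technique IDs a plan exercises."""
    return sorted({step["technique_id"] for step in plan})

print(techniques_covered(emulation_plan))  # ['T1003', 'T1053']
```

Selecting the subset of techniques this way keeps the evaluation grounded in behaviors the emulated group has actually been reported to use, rather than testing the full ATT&CK matrix.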

The ATT&CK evaluations are based on a four-phased approach:
    1. Setup
The vendor installs their tool in a MITRE-provided cyber range. The tool is deployed for detection and alerting only; preventions, protections, and responses are prohibited.
    2. Evaluation
      During a joint evaluation session, MITRE adversary emulators ("red team") execute an emulation in the style of an adversary group, technique-by-technique. The vendor being tested will provide the personnel who review tool output to detect each technique ("blue team"). MITRE provides the personnel to oversee the evaluation and facilitate communication between red and blue, as well as capture results ("white team").
    3. Feedback
      Vendors are provided an opportunity to offer feedback on the preliminary results, but the feedback does not obligate MITRE to make any modification to the results.
    4. Release
      MITRE publicly releases the evaluation methodology and results of the tool evaluations. For additional details refer to our methodology.

Round 1 results are available now. We released an initial cohort to maximize fairness, giving a group of vendors an equal opportunity to have their results released at the same time. Vendors who participate in the subsequent rolling admissions will have their results released as they complete.

No, all vendors signing up for the evaluation agree to have their results publicly released upon conclusion of their test.

Public evaluations are the only vendor-paid evaluations provided at this time.

We don't score, rank, or rate vendors. We look at each vendor independently, evaluating their ability to detect ATT&CK techniques, and publish our findings.

The stoplight chart (which uses red, yellow, and green to indicate level of confidence for detection of techniques) has been used since ATT&CK's creation because it is a simple yet powerful way to understand ATT&CK coverage. While a stoplight chart may be useful to show coverage and gaps, we do not use this visualization because it is not granular enough to convey our results.

While we understand the importance of minimizing false positives, they are often tied to environment noise. Without a good source of emulated noise in our testing environment, we don't address false positives directly, but rather address them indirectly in a number of ways:
    1. Vendors are required to define how they configured their capabilities. With that provided configuration and the evaluation's results as a baseline, users can then customize detections to reduce false positives in their unique environment.
    2. We do not score results or weight an alert higher than any other detection category. This should minimize the incentive for vendors to alert on every technique.
    3. We articulate how the tool can perform detection. By releasing how to detect, as well as our methodology, organizations can implement their own tests to determine how the tools operate in their specific environment.

We do not make any judgments about one detection being better than another. We distinguish between different types of detection, and describe our rationale for doing so in Part 1 and Part 2 of a blog series.

  • Feedback. We are always looking for feedback on what works and what doesn’t in our results and methodology. Learning how you use the results and what you want to get out of them helps us shape our work to help you, and your peers.
  • Intel. We frame our evaluations in the context of the known threat to ensure our results are relevant and useful. These emulations are driven by available intel. If you share your insights, you can improve our plans.