
Round 1 Overview

Round 1 comprises the vendors that participated in our APT3 evaluations, whether as part of the “initial cohort” or through subsequent “rolling” admissions. The initial cohort included vendors that had executed contracts by June 29, 2018; its results were released simultaneously to ensure fairness. After the initial cohort’s evaluations concluded, rolling admissions began: new vendors may participate, new versions of products may be tested, and previously evaluated vendors may be re-tested. We will formally close the round of APT3 testing when technical limitations, such as changes to Windows versions, make the test obsolete or impossible to execute.
We chose to emulate APT3 for our initial evaluation because there is substantial public reporting of their post-exploit behavior, enough for us to create a suitable emulation plan for evaluations. Their publicly known post-exploit behavior relies on harvesting credentials, issuing on-keyboard commands (as opposed to Windows API calls), and using programs already trusted by the operating system (“living off the land”). Conversely, they are not known to use elaborate scripting techniques, leverage exploits after initial access, or deploy anti-EDR capabilities such as rootkits or bootkits.
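To make the “living off the land” idea concrete, the sketch below shows what on-keyboard discovery can look like: every command is a binary the operating system already trusts, typed as a human operator would, rather than invoked through API calls or custom malware. The specific commands are illustrative assumptions on our part, not drawn from a particular APT3 playbook.

```shell
# Illustrative "living off the land" discovery sequence. Each line uses an
# OS-provided, trusted binary issued as an on-keyboard command -- no dropped
# tools, no direct API calls. A defender focused on IoCs sees nothing novel
# here; a defender focused on behaviors can flag the pattern of activity.
whoami      # which account is the operator running as?
hostname    # which machine has been reached?
```

Because these binaries are legitimate and present on every host, detecting this behavior depends on recognizing the technique (suspicious sequences of built-in discovery commands), not on matching a known-bad file hash or tool signature.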

Reflecting our belief in the value of behavior-based detections, we focused our adversary emulation on ATT&CK techniques and behaviors, not on the tools or Indicators of Compromise (IoCs) associated with the group. To apply a scientific approach that allows other testers to reproduce our evaluation results, we chose publicly available tooling. For the Round 1 evaluations, we selected Cobalt Strike (a commercial tool) and Empire (an open-source tool) because they enabled us to emulate the adversary’s techniques closely. By using more than one tool to test the same technique, we could vary the IoCs each tool introduces, as well as vary implementations of the same technique to evaluate whether capabilities detect different implementations. Our goal with this multi-tool approach was to shift the focus away from detecting tools and IoCs and toward detecting techniques and behaviors.

More methodology specific to Round 1: