Technology Assisted Review (TAR): Five Facts in a Flash
The case law, technology, and practice behind technology-assisted review (TAR) are coming together faster and more spontaneously than a flash mob. Do you feel like your head is spinning yet? If so, read these five short “what you need to know” bits about TAR. Afterward, you might feel so enlightened that you will want to gather a group of friends and start dancing in the street!
- Vernacular: Although this technology is the most talked about development in ediscovery, there’s little to no agreement about what to call it. Whether it’s “predictive coding,” “technology assisted review,” “computer assisted review,” “automated document review,” “predictive priority,” or some other two- or three-word combination, there is no consistent use of any single phrase. You could also refer to it as “machine learning,” but that sounds like something straight out of 1940s science fiction.
- Machine learning: Despite the old-fashioned imagery it evokes, machine learning is a key component of this advanced technology. Generally speaking, machine learning uses advanced algorithms and analytics to supplement the judgment of expert human reviewers regarding the responsiveness, non-responsiveness or privilege of documents in a data set. With enough iterations and sampling, the technology learns from the expert reviewer’s decisions to judge the responsiveness of the documents with a high degree of reliability.
- Predictive algorithms: If you shop through online marketplaces or watch movies through streaming services, you’ve likely already been exposed to some sort of predictive technology. For example, Amazon.com will suggest products based on your browsing history or past purchases, and Netflix will often make “we think you’ll enjoy” suggestions based on a recommendation algorithm, which accounts for films you’ve watched and how you or others rated them. Many online advertisements are based on a predictive algorithm, too, so these technologies aren’t quite as unfamiliar or untested as some would have you believe.
- Key terminology: In addition to the litany of names thrown around for TAR, discussion is likely to revolve around recall, precision, and f-measure. Understanding these key terms is essential to understanding TAR (and sounding like you know what you’re talking about!):
- Recall is a measure of completeness, referring to the fraction of all relevant documents in the data set that were actually retrieved. Studies such as the TREC Legal Track have shown that traditional, linear review produces an average recall of approximately 60 percent, meaning roughly 40 percent of the relevant documents were missed! On the other hand, reviewers leveraging TAR produced much better results, averaging approximately 77 percent recall in the 2009 TREC Legal Track.
- Precision refers to exactness, or the fraction of identified documents that are actually relevant. Studies show that linear reviewers average about 60 percent precision, meaning 40 percent of the identified documents were, on average, incorrectly coded as relevant. Precision averages for TAR are also notably higher at about 75 percent.
- F-measure is the harmonic mean of recall and precision, a single score that strikes a balance between the two. Rather than measuring the above metrics independently (which can be cumbersome), the two are often assessed through a combined f-measure percentage. Because a harmonic mean is pulled toward the lower of its two inputs, a high f-measure requires both precision and recall to be high, while a low f-measure signals that at least one of them is lagging. Currently, there is no industry standard for an appropriate f-measure percentage, and it is up to the parties involved in ediscovery to define a reasonable percentage depending on the particular needs of the case (higher-risk cases might demand higher recall and precision).
- Case law – Mainstreaming TAR: In 2012, several opinions opened the door for judicial acceptance of TAR. Notably, in Da Silva Moore v. Publicis Groupe, U.S. Magistrate Judge Andrew Peck stated that TAR was “judicially-approved for use in appropriate cases,” such as those with larger and/or more complex data sets. Additionally, Peck heavily emphasized that counsel design an appropriate process when implementing TAR. Subsequent opinions, such as Global Aerospace v. Landow Aviation, have similarly emphasized thorough processes with sampling and quality control. Going forward, the judiciary will likely say more about the “dos and don’ts” of TAR, so it’s crucial to stay on top of the latest developments.
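For readers who want to see the arithmetic behind the key terminology above, here is a minimal Python sketch of how recall, precision, and f-measure could be computed for a single review. It assumes the balanced form of the f-measure (the plain harmonic mean, often called F1). The function name `review_metrics` and the document numbers are hypothetical and chosen purely for illustration; they are not drawn from any study or product.

```python
def review_metrics(relevant, retrieved):
    """Return (precision, recall, f_measure) for one review.

    relevant  -- IDs of documents that truly are relevant
    retrieved -- IDs of documents the review coded as relevant
    """
    relevant = set(relevant)
    retrieved = set(retrieved)

    # Documents that were both coded relevant and truly relevant.
    correctly_identified = len(relevant & retrieved)

    # Precision: share of the identified documents that are actually relevant.
    precision = correctly_identified / len(retrieved) if retrieved else 0.0

    # Recall: share of all relevant documents that the review found.
    recall = correctly_identified / len(relevant) if relevant else 0.0

    # Balanced f-measure (assumed here): harmonic mean of precision and recall.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)

    return precision, recall, f_measure


# Hypothetical data: documents 1-10 are truly relevant. The review codes
# eight documents as relevant, six of them correctly (11 and 12 are not).
relevant_docs = set(range(1, 11))
flagged_docs = {1, 2, 3, 4, 5, 6, 11, 12}

precision, recall, f_measure = review_metrics(relevant_docs, flagged_docs)
print(f"precision={precision:.2f} recall={recall:.2f} f-measure={f_measure:.2f}")
```

In this made-up example, precision is 6 of 8 (75 percent), recall is 6 of 10 (60 percent), and the f-measure lands at about 67 percent, between the two but closer to the weaker score.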