Predictive Coding Semantics: Step out of the Rain!

Wednesday, March 19, 2014 by Thought Leadership Team

A few years ago, we wrote about predictive coding going mainstream; shortly thereafter, we wrote a series debunking the most common myths about predictive coding; then we even went so far as to break down predictive coding lifecycles (actually, we did that twice!), explain how to maximize training for machine learning, and teach a short lesson about the most commonly used stats for evaluating your machine’s work. So, to make a long story short, we’re pretty big on using predictive coding as part of a search methodology, and we’re doing everything we can to demystify the process so that organizations can use this technology to increase their ediscovery efficiency and, in turn, save a significant amount of money. If you don’t believe me, check out one of our case studies illustrating the myriad benefits of employing predictive coding for search, analysis, and review.

According to Fulbright’s 9th Annual Litigation Trends Report in 2013, however, only 35% of the respondent companies indicated they were using some form of predictive coding. While that percentage is likely a significant increase from the pre-Da Silva Moore days, when only a few ediscovery pioneers were brave enough to dive into their data with predictive coding, it still suggests some hesitancy on the part of practitioners.

Defining “Predictive Coding” and “Technology Assisted Review”

Whether the legal community’s hesitancy is related to cost, uncertainty, or a lack of matters deemed appropriate for predictive coding, the process itself remains fairly opaque, and that opacity starts at one of the most basic levels: vernacular. The ediscovery community loves this technology… but it still hasn’t settled on what to call it. Whether it’s “predictive coding,” “technology assisted review,” or “computer assisted review,” there’s a smattering of terms floating around and little consensus about which one is preferred.

As I stated before, Kroll Ontrack is committed to demystifying this process, and we’re drawing some lines to help clear up the semantic ambiguities related to predictive coding. It’s best to think of technology assisted review, or TAR, as an umbrella term that encompasses a variety of advanced litigation technologies designed to help litigation teams work more efficiently and ease the burdens of standard linear review. Under this approach, predictive coding is just one prong of the larger array of complementary TAR and Early Data Assessment tools, such as near-dupe identification, email threading, visual analytics, workflow automation, and other reporting tools.