The Final Act: Predictive Coding Takes Centre Stage

18 February 2015 by Hitesh Chowdhry

February 2015 marks the twenty-third month since the beginning of my MBA programme. Fourteen exams, twenty-five assignments, two study tours to Asia, a thousand social occasions I couldn’t attend, and a few gallons of (inter alia) coffee later, the curtain has been raised on the denouement – the dreaded dissertation. Given that ediscovery has played the perfect support act throughout my period of study – I worked as a document review lawyer during the first year, and then joined Kroll Ontrack at the start of the second – I thought it only fitting that the industry should now take centre stage, as the subject of my 15,000-word thesis.

Technology underpins much of what is taught in business schools these days. During the last week alone, I have studied Netflix’s algorithmic approach to producing TV shows, and analysed the CRM practices of retailers Target and Ocado. In respect of ediscovery, predictive coding remains the most innovative, talked-about, and frankly downright sexy technology on the market. As my colleague Katie Fitzgerald details in her blog about The Imitation Game’s applicability to ediscovery, predictive coding uses machine-learning algorithms to: a) learn how lawyers categorise particular documents on the basis of a sample set; and b) apply that learning across the remaining unreviewed documents in a data set.
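For readers who like to see the moving parts, the two steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: it uses a simple naive Bayes word-count classifier, which is my own stand-in and not necessarily what any commercial predictive coding product uses, and the documents, labels, and function names (`train`, `predict`) are invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(labelled_docs):
    """Step a): learn from a sample set of (text, label) pairs coded by a lawyer."""
    word_counts = {}        # label -> Counter of words seen under that label
    doc_counts = Counter()  # label -> number of sample documents
    for text, label in labelled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Step b): apply that learning to an unreviewed document."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior plus log likelihood with add-one smoothing.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in the sample set
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical sample set, invented purely for illustration.
sample_set = [
    ("merger pricing agreement with competitor", "relevant"),
    ("confidential pricing discussion before merger", "relevant"),
    ("office lunch menu for friday", "not relevant"),
    ("friday car park closure reminder", "not relevant"),
]
model = train(sample_set)

unreviewed = ["revised pricing agreement attached", "lunch on friday"]
predictions = [predict(model, doc) for doc in unreviewed]
print(predictions)
```

In practice the sample set would run to hundreds or thousands of documents, and the lawyers' coding decisions would typically be fed back in over several rounds; the sketch shows only the basic learn-then-apply shape.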

The potential implications are revolutionary. Predictive coding can make the process of large-scale document review – a pivotal exercise in many litigations and investigations – significantly cheaper, more accurate, and more strategically advantageous. Lawyers can find potential ‘smoking-gun’ documents more quickly, and wade through fewer non-relevant documents – something which is particularly appealing given the gargantuan data volumes that many companies now hold. No irony lost, then, that I’m planning to write many a page about a technology which is meant to reduce the amount people have to read.

However, whilst ediscovery providers wax lyrical about the technology, its usage is far from prevalent. One recent Kroll Ontrack survey of circa 70 lawyers revealed that 44% did not know what predictive coding was, and a further 36% had never used it. That paradox prompted me to explore how lawyers perceive predictive coding, and what factors are involved in deciding whether or not to use it.

To help shed light on these issues, I have designed a survey which has been sent to thousands of legal professionals, and the results will underpin the conclusions of my paper. The questions have been formulated on the basis of both academic literature on the diffusion of innovations, and qualitative information gleaned from ediscovery practitioners’ experiences of marketing predictive coding to lawyers.

The response so far has been overwhelming – many lawyers simply cannot wait to tell us just how much they love (or hate) the idea of a computer trying to understand and mimic what they’re thinking; whilst others are keen to help because we are donating money to the charity SHINE for every survey completed. Whatever the results may indicate, I hope that the analysis will help all of us understand better what it is that lawyers want from technology, and from technology providers. I welcome all legal professionals out there who may have used ediscovery services (in any capacity) to participate by clicking here. After all, we don’t have an algorithm to predict how you would have filled out a survey…

Yet.