Predictive coding and Benedict Cumberbatch

28 November 2014 by Adrienn Toth

Artificial Intelligence has clearly become a provocative topic in popular culture once again – you’ve only to watch ‘Her’ and ‘Transcendence’ to see that. However, the most recent movie to catch my eye is ‘The Imitation Game’, and not just because the lovely Benedict Cumberbatch has the starring role, but rather because Alan Turing is the central character.

Mr Turing is known not only as the grandfather of computers, but also as the grandfather of artificial intelligence. In his seminal paper he asked: “Can machines think?” But what is “thinking”? What would it mean for a machine to “think”? He refined the question and looked instead to the Imitation Game (hence the title of the film…): a machine can be deemed to “think” if its answers to questions are indistinguishable from a human’s. This became fundamental to the philosophy of artificial intelligence.

This question was so overwhelming that it became obvious there would be no one way to make this happen. And so several sub-fields began to develop within artificial intelligence – data mining, image processing, natural language processing, speech recognition, machine learning, etc. With recent web developments, a culmination of all these techniques means that we are closer than ever before to the Holy Grail of imitation – you just need to watch IBM’s Watson on Jeopardy! to see that.

But it’s one particular sub-field that is of interest to me: machine learning. This is the underlying technology used in Predictive Coding, and it is for machine learning that the question of machine thinking is incredibly pertinent. To start at the beginning, not all predictive coding technologies are created equal. Different technologies use different algorithms, so their approaches to machine thinking differ and ultimately produce different results, with varying degrees of success.

When reduced to its core parts, predictive coding is a two-step process. First, the machine learns through human intervention: the human provides the machine with the criteria a document needs to meet in order to be considered X. For this step, a small subset of the data corpus is used. Second, the machine thinks by applying that learning and predicting whether unreviewed documents (the remaining data corpus, not used in the first step) meet those criteria.
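To make the two steps concrete, here is a minimal sketch in Python. It is not how any particular predictive coding product works: the word-counting "model", the function names `learn` and `predict`, and the sample documents are all assumptions chosen purely for illustration.

```python
from collections import Counter

def tokenize(text):
    """Split a document into lowercase words."""
    return text.lower().split()

def learn(reviewed):
    """Step 1: count how often each word appears under each human-assigned label."""
    counts = {}
    for text, label in reviewed:
        counts.setdefault(label, Counter()).update(tokenize(text))
    return counts

def predict(counts, text):
    """Step 2: pick the label whose reviewed documents share the most words with the text."""
    words = tokenize(text)
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)

# A small, human-reviewed subset of the corpus (hypothetical documents).
reviewed = [
    ("merger agreement draft for board review", "X"),
    ("board approves merger terms", "X"),
    ("lunch menu for friday", "not X"),
    ("office lunch order friday", "not X"),
]

# The remaining, unreviewed corpus.
unreviewed = ["revised merger agreement", "friday lunch plans"]

model = learn(reviewed)
predictions = {doc: predict(model, doc) for doc in unreviewed}
print(predictions)
```

Real systems use far richer features and algorithms, but the shape is the same: a human labels a small sample, and the machine extends those labels to everything else.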

The learning element is similar for all predictive coding technologies – a human must review documents to input the relevant criteria and teach the machine. Although there are differing schools of thought as to the best approach to this – automatic versus manual training, for example – the fundamentals are the same. A human must input good information for the machine to learn.

It is the thinking element that is the defining factor. How well can the machine think, and how well does it play its ‘Imitation Game’? For effective thinking, we really want the machine to be able to ‘actively learn’. Active learning allows the machine to interactively query the user to obtain further information: it allows the machine to say “I’m confused about this shade of grey.”

Why are the shades of grey important? Well, an algorithm that is powered by, say, an analytics engine is more akin to passive than active learning. When documents are processed, an analytics engine will cluster them together based on features such as topic. The human will teach the system, the machine will learn – but the thinking element is lacking. The machine will predict that if a document belonging to a particular cluster is X, then all documents in that cluster are X. This would be fine if everything were black and white – but there are always shades of grey.

Take a case involving baking. Sponge cakes belong to one cluster, and the human can state that sponge cakes are X. The human can also teach the machine that chocolate biscuits are Y. Then what of chocolate cakes? This is a shade of grey that results in a crossover between the criteria for X and Y. The machine cannot think past black and white to consider the crossover.
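The baking example can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real analytics engine: the `clusters` dictionary, the topic words and the word-overlap matching are all invented for illustration.

```python
# Hypothetical clusters: each has a set of topic words and the label a human gave it.
clusters = {
    "sponge cakes": {"words": {"sponge", "cake"}, "label": "X"},
    "chocolate biscuits": {"words": {"chocolate", "biscuit"}, "label": "Y"},
}

def overlaps(document):
    """Count how many topic words each cluster shares with the document."""
    words = set(document.lower().split())
    return {name: len(words & c["words"]) for name, c in clusters.items()}

def passive_predict(document):
    """Assign the document to its best-matching cluster and copy that cluster's label."""
    scores = overlaps(document)
    best = max(scores, key=scores.get)  # a tie resolves to the first cluster listed
    return clusters[best]["label"]

print(passive_predict("sponge cake"))        # X
print(passive_predict("chocolate biscuit"))  # Y
print(overlaps("chocolate cake"))            # each cluster matches exactly one word
print(passive_predict("chocolate cake"))     # X, decided only by the tie-break
```

The chocolate cake matches both clusters equally, yet the passive approach still returns a black-and-white answer and never signals that it was unsure.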

On the other hand, active learning can push documents that relate to chocolate cakes forward to the human for clarification. It is smart enough to recognise the crossover in criteria that needs clarifying. By being able to expressly ask for clarity, the machine is far better at joining the dots of the criteria and understanding the subtleties, and is ultimately able to make better predictions. For that reason, it is far closer to being able to imitate the thought processes of humans, and a step towards being able to think for itself.
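One common way to get this behaviour is uncertainty sampling: the machine only predicts when one label clearly beats the other, and queries the human otherwise. The sketch below assumes a toy word-overlap score; the `criteria` word lists, the `triage` function and the `margin` threshold are hypothetical and not taken from any specific product.

```python
# Hypothetical word lists the reviewer has taught the machine for each criterion.
criteria = {
    "X": {"sponge", "cake"},
    "Y": {"chocolate", "biscuit"},
}

def score(document):
    """Count how many words the document shares with each criterion."""
    words = set(document.lower().split())
    return {label: len(words & vocab) for label, vocab in criteria.items()}

def triage(documents, margin=1):
    """Split documents into confident predictions and queries for the human."""
    predicted, ask_human = {}, []
    for doc in documents:
        ranked = sorted(score(doc).items(), key=lambda kv: kv[1], reverse=True)
        (top_label, top), (_, runner_up) = ranked[0], ranked[1]
        if top - runner_up < margin:
            ask_human.append(doc)  # "I'm confused about this shade of grey"
        else:
            predicted[doc] = top_label
    return predicted, ask_human

predicted, ask_human = triage(["sponge cake", "chocolate biscuit", "chocolate cake"])
print(predicted)   # the clear-cut documents, each with its label
print(ask_human)   # the documents sent back to the reviewer
```

Here the sponge cake and the chocolate biscuit are labelled automatically, while the chocolate cake lands in `ask_human`. In a fuller system the reviewer's answer would be fed back into the training data before the next round of predictions.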

So, when choosing predictive coding providers to support your legal document review, consider Alan Turing and ask the question: Can machines think?