Forrester’s TAR Report Underscores Different Predictive Coding Methodologies

When it comes to Predictive Coding, the eDiscovery world is drowning in alphabet soup. Fortunately, Forrester has published a report that explains the different TAR methodologies and clarifies their distinctions. We were among the 17 organizations interviewed by Forrester; for us, the report validates the message we’ve been promoting for years in webinars, eBooks, seminars, and more.

One Thing We Can All Agree On: TAR Helps

Forrester takes an enterprise-centric approach to discussing TAR, opening with the strong statement that “[m]anual document review can hurt your customers.” This is an appropriate place to start, as corporations have taken a larger role in dictating the review process.

The report goes on to highlight particular risks facing enterprises that depend on traditional, manual review methods: PII identification, keeping trade secrets and confidential business documents safe, and human resource allocation. These are areas ripe for machine augmentation. For example, PII identification frequently revolves around patterns. Social Security numbers follow the pattern XXX-XX-XXXX, and machines excel at matching such patterns with regular expressions, without fatigue. Indeed, this was a key talking point raised by panelists at our 2016 LegalTech NY session.
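To make the pattern point concrete, here is a minimal sketch in Python of how a machine might flag text shaped like XXX-XX-XXXX. The function name and sample text are hypothetical, and the pattern only checks the shape of the number; it does not validate that a match is a real SSN, and it is not a description of how any particular review platform implements PII detection.

```python
import re

# Assumed pattern: three digits, two digits, four digits, joined by hyphens.
# The lookarounds keep the match from starting or ending inside a longer
# run of digits.
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def find_ssn_candidates(text):
    """Return every substring of text shaped like XXX-XX-XXXX."""
    return SSN_PATTERN.findall(text)


# Hypothetical sample: one SSN-shaped value and one phone number.
sample = "Employee SSN 123-45-6789 on file; call 555-867-5309 for details."
print(find_ssn_candidates(sample))  # ['123-45-6789']
```

The phone number is skipped because its digit grouping (3-3-4) does not fit the 3-2-4 shape. A reviewer would still need to confirm each hit, since the pattern flags candidates rather than confirmed PII.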

…And Continuous Learning Helps the Most

Forrester offers a straightforward explanation in a chart that breaks down the different technologies. The chart compares and contrasts the main TAR training protocols (simple and continuous), and the report dedicates several paragraphs to the major differences in sampling, training, and validation workflows. You can download the full report below.

Analytics in Concert

The Forrester report does not suggest using TAR in a vacuum. Instead it highlights additional analytics like email threading, deduplication, visualizations, and the pattern identification discussed above. None of these analytics rely on supervised learning; for instance, threading automatically detects emails that are part of a string and groups them together for expedited review. Having all of these analytics plus continuous machine learning in the same platform is a powerful solution to the challenge of protecting client data.
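As a rough illustration of why threading needs no training, here is a simplified Python sketch that groups emails by subject line after stripping reply and forward prefixes. This is an assumption made for illustration only: the function names and the dictionary layout are hypothetical, and production threading typically relies on richer signals, such as message headers and quoted content, rather than subjects alone.

```python
import re
from collections import defaultdict

# Assumed rule: strip any leading run of "Re:", "Fw:", or "Fwd:" prefixes.
_PREFIX = re.compile(r"^\s*((re|fw|fwd)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Lowercase the subject and drop reply/forward prefixes."""
    return _PREFIX.sub("", subject).strip().lower()


def group_into_threads(emails):
    """Map each normalized subject to the ids of the emails that share it."""
    threads = defaultdict(list)
    for email in emails:
        threads[normalize_subject(email["subject"])].append(email["id"])
    return dict(threads)


# Hypothetical data: three messages in one conversation, one unrelated.
emails = [
    {"id": 1, "subject": "Q3 contract"},
    {"id": 2, "subject": "RE: Q3 contract"},
    {"id": 3, "subject": "Fwd: Re: Q3 Contract"},
    {"id": 4, "subject": "Lunch"},
]
print(group_into_threads(emails))
```

The grouping follows a fixed rule rather than anything learned from reviewer decisions, which is what separates this kind of analytic from supervised learning.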

Are We Mainstream Yet?

The Forrester report provides strong validation of the message we’ve been writing about for years. Yet despite these efforts, now reinforced by analyst recommendations, we still routinely see TAR protocols in published judicial orders that call for stabilization workflows, aggressive transparency, and a lot of unnecessary legal work to get to the same output (for instance, take a look at the collaborative training model put forward in this recent protocol).

To that end, we’ve focused our upcoming LegalTech 2017 session on the ever-broadening use cases of continuous learning across the enterprise: fact investigations, due diligence, and, of course, litigation. Join our panel of AGCs from leading corporations and experienced outside counsel for the bluntly titled session: Enough Already! Predictive Coding is for Every Matter.

We welcome you to join us at LTNY and continue the dialogue.

Adam Kuhn

Adam is an eDiscovery attorney and the Director of Product Marketing at OpenText Discovery. He holds an advanced certification for the Axcelerate eDiscovery platform and is responsible for research, education and outreach programs. Adam also serves as a Senior Research Fellow at the McCarthy Institute for IP & Technology Law at the University of San Francisco School of Law.
