
Redaction blunders in context

Finding and protecting sensitive data for eDiscovery and investigations

There are many great articles that focus on the headaches (and harm) that occur when sensitive data is not redacted properly in eDiscovery, investigations, and regulatory compliance, including data privacy mandates. In many recent redaction blunders, the redactions were not permanent and irrevocable, and sensitive data was disclosed.

This blog focuses on sensitive data detection as a precursor to applying effective redaction, because data can’t be protected unless its existence is known. It also covers the types of sensitive data common in eDiscovery and investigations, the detection tools within eDiscovery platforms such as OpenText™ Axcelerate™, and what to look for in redaction capabilities.

The types of sensitive data in eDiscovery and investigations 

There are three primary types of sensitive data that must be protected in eDiscovery, investigations, and compliance mandates:

1. Privileged data can exist as entire communications or portions of documents safeguarded under attorney-client privilege. Privileged data is typically text-based, varies by matter, and does not follow common strings or patterns. Failing to find and protect all privileged data can have disastrous consequences, including court sanctions, handing the smoking gun to opposing counsel, and a black eye with current and prospective clients.

Regular Expression engines and search filters are the primary tools to detect privileged data wherever it exists. Redaction is rarely applied because the objective is to safeguard privileged data by preventing it from leaving the private and secure environment where it already resides.

2. Personal data that contributes to the identity of individuals is protected by eDiscovery rules (e.g., FRCP 5.2[a]) that dictate how redactions must be applied before the data is produced. Failing to comply with eDiscovery data protection rules can result in court sanctions and loss of goodwill with current and prospective clients.

Personal data is also protected under data privacy laws such as GDPR in the EU and CCPA in California (CPRA as of January 1, 2023). Failing to comply with data privacy laws can lead to substantial fines and loss of customer loyalty along with a direct impact on revenue.

Personal data typically follows known patterns, such as email addresses, birth dates, and Social Security numbers. Redaction is universally required to protect personal data.

3. Commercially sensitive data includes IP, trade secrets, communications about pending M&A activities, and pharmaceutical molecule names, among many others. Commercially sensitive data can be either text-based (e.g., internal code names) or pattern-based (e.g., patient identifier numbers in healthcare or supplier identifier numbers in manufacturing).

Failing to protect commercially sensitive data can disclose critical IP and trade secrets to competitors, derail M&A activities, or impact customer loyalty. Redaction is universally required to protect commercially sensitive data. 
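To make the idea of pattern-based data concrete, here is a minimal Python sketch of how email addresses, dates, and Social Security numbers might be located in extracted text. The pattern definitions and the `find_personal_data` helper are illustrative assumptions, not how Axcelerate or any other platform implements detection, and the patterns are deliberately simplified.

```python
import re

# Simplified, illustrative patterns (assumptions for this sketch).
# Production pattern libraries cover many more formats and edge cases.
PERSONAL_DATA_PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


def find_personal_data(text):
    """Return (label, matched_text, start, end) for every hit, in document order."""
    hits = []
    for label, pattern in PERSONAL_DATA_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(), match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


sample = "Contact jane.doe@example.com, born 04/12/1985, SSN 123-45-6789."
for hit in find_personal_data(sample):
    print(hit)
```

Text-based data such as privileged communications or internal code names has no fixed shape, which is why it relies on matter-specific keywords and search filters rather than a shared pattern like these.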

Sensitive data detection tools

Pre-configured libraries of personal data patterns are the primary tool for efficiently and effectively locating personal data. Regular Expression engines are used for custom personal data patterns but can also be used for keyword searches alongside or instead of search filters. Another valuable tool in some eDiscovery platforms, including Axcelerate, is the automated detection of people, places, and organizations.
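As a rough illustration of one Regular Expression engine handling both custom patterns and keyword searches, the sketch below compiles a keyword list and a custom identifier format into a single detector. The keywords, the `SUP-` supplier identifier format, and the function names are all hypothetical and chosen only for this example.

```python
import re

# Hypothetical matter-specific keywords and a hypothetical identifier format.
KEYWORDS = ["Project Falcon", "attorney-client", "privileged and confidential"]
CUSTOM_PATTERNS = {"supplier_id": r"\bSUP-\d{6}\b"}


def build_detector(keywords, custom_patterns):
    """Compile keywords and custom patterns into one case-insensitive regex."""
    parts = [f"(?P<{name}>{pattern})" for name, pattern in custom_patterns.items()]
    if keywords:
        # re.escape keeps keyword punctuation literal rather than regex syntax.
        alternation = "|".join(re.escape(keyword) for keyword in keywords)
        parts.append(f"(?P<keyword>{alternation})")
    if not parts:
        raise ValueError("at least one keyword or pattern is required")
    return re.compile("|".join(parts), re.IGNORECASE)


def detect(detector, text):
    """Return (group_name, matched_text) pairs in document order."""
    return [(match.lastgroup, match.group()) for match in detector.finditer(text)]


detector = build_detector(KEYWORDS, CUSTOM_PATTERNS)
print(detect(detector, "Re: project falcon pricing from SUP-004217"))
```

Naming each group lets a reviewer see which rule fired for each hit, which helps when deciding whether a document should be withheld or redacted.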

Sensitive data detection tools should be displayed as central features within main action menus and should be combined with search filters for surfacing relevant data. Data protection becomes a mainstream routine when searches for sensitive data can be stacked with searches for relevant data.
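Stacking can be pictured as composing independent filters over the same document set. The following sketch assumes documents are simple dictionaries with `id` and `text` fields and uses a single simplified Social Security number pattern; the data model and function names are assumptions for illustration only.

```python
import re

# Simplified SSN pattern (assumption for this sketch).
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def is_relevant(doc, terms):
    """True if the document text mentions any of the relevance terms."""
    text = doc["text"].lower()
    return any(term.lower() in text for term in terms)


def has_sensitive_data(doc):
    """True if the document text contains something shaped like an SSN."""
    return SSN.search(doc["text"]) is not None


def stack_filters(docs, *filters):
    """Keep only the documents that pass every filter."""
    return [doc for doc in docs if all(check(doc) for check in filters)]


def mentions_merger(doc):
    return is_relevant(doc, ["merger"])


docs = [
    {"id": 1, "text": "Merger pricing memo. Employee SSN 123-45-6789."},
    {"id": 2, "text": "Merger timeline draft."},
    {"id": 3, "text": "Cafeteria menu. SSN 987-65-4321."},
]

needs_redaction = stack_filters(docs, mentions_merger, has_sensitive_data)
print([doc["id"] for doc in needs_redaction])
```

Only the first document is both relevant and sensitive, so it is the one that must be redacted before production.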

Sensitive data redaction tools  

Redaction tools must permanently and irrevocably mask or remove sensitive data before documents are exposed. Redaction tools should work cleanly and efficiently across all forms of data including documents, email, chat, and Microsoft® Excel®.
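The difference between permanent redaction and a cosmetic overlay can be shown on plain text: the matched characters are replaced in the underlying content, so nothing remains to be recovered by copying or selecting. This is only a sketch under that assumption. Real file formats such as PDF, email, and spreadsheets also carry metadata, hidden layers, and embedded objects that need format-aware handling not shown here.

```python
import re

# Simplified SSN pattern (assumption for this sketch).
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
REDACTION_CHAR = "\u2588"  # full block character


def redact(text, patterns):
    """Replace every match with block characters of the same length.

    The original characters are removed from the returned string,
    not merely hidden behind a visual overlay.
    """
    for pattern in patterns:
        text = pattern.sub(lambda m: REDACTION_CHAR * len(m.group()), text)
    return text


original = "SSN 123-45-6789 on file"
redacted = redact(original, [SSN])
print(redacted)
```

Keeping the replacement the same length as the original preserves layout, though some matters may call for a fixed-width marker so that the length of the redacted value is not revealed.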

Automated QC verifies that redactions are executed properly before production, which increases confidence that sensitive data is protected and eases the burden of manual QC.
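One simple form of automated QC is to re-run detection over the text that is about to be produced and flag any document where a sensitive pattern still matches. The sketch below assumes the production set is a mapping of document IDs to extracted text; the patterns, IDs, and the `qc_production` function are hypothetical.

```python
import re

# Simplified patterns re-used for the QC pass (assumptions for this sketch).
QC_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
}


def qc_production(documents):
    """Return {doc_id: [(label, matched_text), ...]} for documents with residual hits."""
    failures = {}
    for doc_id, text in documents.items():
        residual = [
            (label, match.group())
            for label, pattern in QC_PATTERNS.items()
            for match in pattern.finditer(text)
        ]
        if residual:
            failures[doc_id] = residual
    return failures


block = "\u2588"
production_set = {
    "DOC-001": "SSN " + block * 11 + " on file",
    "DOC-002": "Reach me at " + block * 10 + " or jane.doe@example.com",
}
print(qc_production(production_set))
```

An empty result means no known pattern survived redaction. It does not prove that text-based sensitive data was caught, so a check like this complements manual QC rather than replacing it.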

Redaction as routine

Sensitive data comes in many forms. The methods to detect and protect sensitive data must be able to accommodate the particular requirements of detecting both pattern-based and text-based data. The threshold is onerous and is set no lower than finding absolutely everything that must be protected. Courts, regulators, clients, and competitors do not care that “most” of the data was detected and protected.

Redaction tools should be flexible and executable as a mainstream routine so sensitive data protection is managed as a top-of-mind activity concurrent with overall data review. The burden of compliance is further reduced when redaction tools are easy to access and easy to use.

Please see the Axcelerate product page for more information on how Axcelerate enables effective sensitive data detection and protection.

Duncan Bradley

With over 25 years of experience in Market Intelligence and Product Marketing, Duncan brings a rich set of perspectives as a subject matter expert in a broad array of tech topics including content process automation and analytics, data privacy, and regulatory compliance. As Product Marketing Manager, OpenText Legal Tech, Duncan contributes to product positioning and messaging across OpenText’s suite of Legal Tech products. Duncan holds an MBA from the University of British Columbia, a BA Hons in Political Science from the University of Guelph, and an Executive Certificate in Negotiation and Conflict Management from the University of Notre Dame.
