The seven critical requirements for modern data collection


October 2, 2020

In an earlier blog, I discussed the trends necessitating a modern approach to data collection to support early case assessment, investigations and eDiscovery: an increasingly remote workforce, new sources of electronically stored information (ESI), including ephemeral data, and a rise in regulatory compliance mandates and litigation, to name a few.

This blog looks at the seven critical requirements for collections software to be effective in today’s environment, at the scale required to address the new world of data.

1. Comprehensiveness

Modern data collection software must be comprehensive enough to collect data from all relevant sources. This includes physical endpoints such as laptops and desktops, on-premises and cloud-based content repositories, and email systems, including archived PST files. Modern forensic software includes robust connectors to all of these data sources as well as remote agents for collecting data from endpoints.

2. Speed and efficiency

Data collection software must work quickly and efficiently. It should use advanced search filtering to cull data at the point of collection, and it should automatically deduplicate and deNIST data sets (that is, remove known system and application files listed in the NIST National Software Reference Library) to limit the size of collections. This minimizes costly over-collection in eDiscovery, where the cost of review is often proportional to the volume of data that requires review. Efficient data collection also helps organizations identify conclusive information as quickly as possible in investigations where time is of the essence.
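To make the culling step concrete, here is a minimal Python sketch of hash-based deduplication and deNISTing. It is an illustration under stated assumptions, not how any particular product works: the function names `sha256_of` and `cull` are hypothetical, and `known_hashes` stands in for a reference list of known-file hashes that you would have to load yourself.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Iterator, Set


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cull(paths: Iterable[Path], known_hashes: Set[str]) -> Iterator[Path]:
    """Yield only files that are neither known system files nor duplicates.

    known_hashes is a hypothetical, caller-supplied set of hex digests.
    """
    seen: Set[str] = set()
    for path in paths:
        file_hash = sha256_of(path)
        if file_hash in known_hashes:  # deNIST: content matches a known file
            continue
        if file_hash in seen:  # deduplicate: identical content already kept
            continue
        seen.add(file_hash)
        yield path
```

Because the comparison is on content hashes rather than file names, two copies of the same attachment saved under different names are collected only once.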

3. Ease of use and insight

Modern data collection software should crawl multiple target sources in parallel to shorten collection time, and it should let users save frequently used criteria as templates and automate them. Collections should also be granular enough to target specific sources. For example, users should be able to select individual folders within a multi-tiered folder structure to avoid collecting irrelevant data.

Sophisticated data collection software makes it easy to gain insight into the subject and content of the data itself. With pre-collection analytics, users can rapidly understand the scope of the data before it is collected, while advanced search helps extract the specific data of interest from large volumes of irrelevant data. Collection jobs should also run in parallel so that data can be analyzed as each job completes, instead of waiting for the entire process to finish.
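The idea of analyzing results as individual jobs complete can be sketched with Python's standard `concurrent.futures` module. This is only an illustration: `collect_in_parallel` is a hypothetical name, and the `collect` callable is a placeholder for whatever actually gathers data from one source.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, Iterator, Tuple


def collect_in_parallel(
    sources: Iterable[str],
    collect: Callable[[str], int],
    max_workers: int = 4,
) -> Iterator[Tuple[str, int]]:
    """Run one collection job per source and yield each result as soon as it is ready."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each pending job back to the source it belongs to.
        futures = {pool.submit(collect, source): source for source in sources}
        for future in as_completed(futures):
            # Results arrive in completion order, not submission order.
            yield futures[future], future.result()
```

A caller can loop over the returned iterator and start reviewing the first finished source while slower sources are still being collected.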

4. Unobtrusive operation

Data collection should run quietly in the background. This avoids monopolizing system resources and interrupting the custodian’s normal operation of the device and the applications they rely on to execute their work.

5. Failsafe measures

With dispersed data, sporadic connectivity is a fact of life. Data collection solutions should maintain logs of their successful processes and automatically reattempt any collection that fails, such as when a device drops off a network. Advanced forensic data collection solutions communicate with remote agents to monitor connection attempts and automatically execute retries until devices re-appear on networks and collections are completed.
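A retry loop with logging might look like the following Python sketch. It is a simplified illustration, not a description of any vendor's agent: the name `collect_with_retry` is hypothetical, the `job` callable is assumed to raise `ConnectionError` when a device is unreachable, and the sketch gives up after a fixed number of attempts, whereas a production tool might keep retrying until the device reappears.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("collection")


def collect_with_retry(
    job: Callable[[], str],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Run a collection job, logging every attempt and retrying on connection loss."""
    for attempt in range(1, max_attempts + 1):
        try:
            result = job()
        except ConnectionError as error:
            logger.warning("attempt %d failed: %s", attempt, error)
            if attempt < max_attempts:
                # Wait longer after each failure: 1x, 2x, 4x the base delay.
                sleep(base_delay * 2 ** (attempt - 1))
        else:
            logger.info("attempt %d succeeded", attempt)
            return result
    return None  # every attempt failed; the log records each one
```

Doubling the delay between attempts avoids hammering a laptop that has just dropped off the network, and the log gives reviewers a record of which attempts succeeded and which did not.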

6. Defensible process and results

Data collection must not alter metadata or compromise chain of custody. Both the process and the results must be defensible and forensically sound. Data collection methods should generate legally sanctioned output formats such as EnCase Information Assurance’s LEF (Logical Evidence File).
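One building block of a defensible process is recording a cryptographic hash and key metadata for each file at the moment of collection, so that any later change can be detected. The Python sketch below illustrates that idea only. It does not produce an LEF or any other evidence file format, the names `custody_record` and `still_matches` and the record fields are hypothetical, and simply reading a file may update its last-access timestamp on some systems, which purpose-built forensic tools are designed to avoid.

```python
import hashlib
import os
from datetime import datetime, timezone


def file_sha256(path: str) -> str:
    """Hash a file in 1 MiB chunks, opening it read-only."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_record(path: str, collector: str) -> dict:
    """Capture size, modification time and hash before the file is copied."""
    stat = os.stat(path)  # read metadata first, before touching the content
    return {
        "path": path,
        "size_bytes": stat.st_size,
        "modified_utc": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
        "sha256": file_sha256(path),
        "collected_by": collector,
        "collected_utc": datetime.now(timezone.utc).isoformat(),
    }


def still_matches(path: str, record: dict) -> bool:
    """Return True if the file content is unchanged since the record was made."""
    return file_sha256(path) == record["sha256"]
```

Storing such a record alongside each collected item lets anyone rerun the hash later and confirm that the content reviewed is the content that was collected.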

7. Easily transferable outputs

One of the core objectives of data collection is to enable subsequent data review, so data must be collected in an industry-standard format that can be readily ported to industry-standard analytics and review platforms. Streamlining the effort required to load data into review platforms makes access to the data faster and easier, so review can get underway without delay.

Taken together, these seven criteria define a modern approach to data collection at scale, one that meets the critical requirements for efficiency, effectiveness and defensibility.

For more information, read the white paper “Modern Data Collection: New Imperatives and Critical Requirements.”

OpenText, The Information Company, enables organizations to gain insight through market-leading information management solutions, powered by OpenText Cloud Editions.
