Early Data Assessment – Too Valuable To Ignore

Wednesday, December 29, 2010 by Thought Leadership Team

Electronic discovery continues to be viewed as one of the most expensive parts of litigation.

As electronically stored information (ESI) grows more voluminous and more costly to handle with the expanding information and technology bubble, attorneys must be attuned to the ediscovery savings available through advanced technologies, including early data assessment (EDA). Using EDA to reduce data before proceeding with data processing and review can corral discovery costs, allow for greater efficiency and increase defensibility in the courtroom. Through EDA, counsel can conduct a true “assessment” of the data before making strategic decisions and incurring costs that may prove to be unnecessary.

EDA as an Advantage

The first vital step in reducing litigation costs is narrowing the scope of potentially relevant data. Because attorneys involved in complex litigation struggle to review enormous volumes of ESI, reducing the number of custodians prior to document review substantially reduces review effort later. To do so, attorneys need to “weed out” irrelevant custodians and retain only the pertinent custodian files. An overall reduction in the number of custodians decreases the number of electronic files, which shortens data processing time and allows for more accurate expense forecasting.

When using EDA, also take advantage of email analytics – a technology-enabled process in which emails in the document set are organized and analyzed. This process recognizes and visually represents relationships between people, events, timelines and communication patterns. By representing data in this fashion, people involved in the ediscovery process can analyze emails quickly, form legal case strategy, investigate any potential internal incidents and intelligently collect data in preparation for discovery.
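To make the idea of communication patterns concrete, the short Python sketch below counts how often each pair of people exchanged email. It is only an illustration: the function name communication_pairs and the "from" and "to" fields are assumptions made for this example, not the data model of any particular EDA product.

```python
from collections import Counter


def communication_pairs(emails):
    """Count messages exchanged between each pair of people.

    `emails` is a list of dicts with a "from" address and a list of "to"
    addresses (hypothetical field names chosen for this sketch).
    """
    pairs = Counter()
    for email in emails:
        for recipient in email["to"]:
            # Sort the two names so that A -> B and B -> A count as the same pair.
            pair = tuple(sorted((email["from"], recipient)))
            pairs[pair] += 1
    return pairs
```

Sorting the resulting counts from highest to lowest gives a rough first view of which custodians communicated most, which is the kind of relationship an analytics tool would present visually.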

Important Advanced Searching Techniques

A proper EDA tool will also include advanced features such as concept searching, topic grouping, email threading and near-duplication that will allow for a quicker, more efficient and accurate review of the data sets. Using these tools prior to exporting the data for review will further aid in narrowing data sets to save money during the document review process.

Concept searching allows a user to enter a keyword or phrase and obtain conceptually related items. This search technique can identify associated terms and concepts, even if they do not match the exact search terms, and can deliver faster and more accurate results than conventional keyword searches. For example, in the case of litigation related to the crash of a Boeing 727 aircraft, a query for “Boeing 727” would also retrieve documents concerning “Boeing,” “727,” “B727,” “airliner,” “plane” and other pertinent topics without having to search for each permutation individually. The search may also turn up technical terms associated with the structure of the aircraft or other crashes, even if the user is not familiar with those terms and would therefore not be able to search on them.
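One simple way to picture concept searching is query expansion: find the terms that most often appear alongside the query term, then search for those as well. The Python sketch below does exactly that with plain co-occurrence counts. It is a deliberately simplified stand-in, not how any specific EDA tool works, and the names tokenize, related_terms and concept_search, along with the small stopword list, are assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "on"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def related_terms(docs, query, top_n=3):
    """Return the terms that most often co-occur with the query terms."""
    query_terms = set(tokenize(query))
    counts = Counter()
    for text in docs:
        tokens = set(tokenize(text))
        if query_terms & tokens:
            counts.update(tokens - query_terms)
    # Sort by descending count, then alphabetically, so ties are stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


def concept_search(docs, query, top_n=3):
    """Return indexes of documents matching the query or its related terms."""
    expanded = set(tokenize(query)) | set(related_terms(docs, query, top_n))
    return [i for i, text in enumerate(docs) if expanded & set(tokenize(text))]
```

In this sketch, a query for "boeing" over a set where "airliner" frequently appears next to "Boeing" would also return a document that mentions only "airliner". Commercial concept search engines typically rely on richer statistical models, so this should be read as an intuition aid only.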

Another advanced searching technique is topic grouping, which groups similar documents together and labels each group for quick identification. With this technique, users do not need to “seed” the processing engine by providing keywords. Email threading likewise makes identification more efficient by allowing users to identify, group and review email conversations based on content. Using the actual content of the emails to identify threads is a reliable method, as it will not fail to recognize a thread if the subject line changes or if emails are exchanged across different email applications.
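Content-based threading can be illustrated with a small Python sketch. It assumes that a reply quotes the message it answers, and it ignores subject lines entirely. The function names normalize and thread_by_content and the (id, body) input format are assumptions made for this example, not the method of any particular product.

```python
import re


def normalize(body):
    """Strip leading quote markers and collapse whitespace.

    This lets quoted text inside a reply match the original message.
    """
    lines = [re.sub(r"^[>\s]+", "", line) for line in body.splitlines()]
    return " ".join(" ".join(lines).lower().split())


def thread_by_content(emails):
    """Group emails into threads using body text only.

    `emails` is a list of (id, body) tuples in chronological order.
    An email joins the first thread containing an earlier message whose
    text appears inside it; otherwise it starts a new thread.
    """
    threads = []
    for email_id, body in emails:
        text = normalize(body)
        for thread in threads:
            if any(earlier and earlier in text for earlier in thread["texts"]):
                thread["ids"].append(email_id)
                thread["texts"].append(text)
                break
        else:
            threads.append({"ids": [email_id], "texts": [text]})
    return [thread["ids"] for thread in threads]
```

Because only the body is compared, a reply stays in its thread even when the subject line has been rewritten. A very short message (for example, "ok") could match unrelated emails in this sketch, so a real implementation would need a more robust comparison.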

Finally, users can apply near-duplication technology to identify and compare documents that are very similar to one another, but are not exact duplicates. This technology assesses the similarities within the document set, identifying the most representative document in each group as “the core.” All related documents are then grouped around the core, allowing the user to engage in a side-by-side comparison of the data to quickly determine whether the near-duplicate documents in the set can be discarded as irrelevant, or whether the differences should be reviewed.
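Near-duplicate grouping can be sketched in Python using word shingles and Jaccard similarity, a common textbook approach. Note the simplification: here the first document encountered in a group serves as the core, whereas the tools described above select the most representative document. The function names, the shingle size k and the 0.5 threshold are all assumptions made for this example.

```python
def shingles(text, k=2):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(docs, threshold=0.5):
    """Group documents around a core document.

    `docs` maps a document id to its text. Returns a dict mapping each
    core id to the list of ids judged near-duplicates of that core.
    Simplification: the first unmatched document becomes a new core.
    """
    shingle_sets = {doc_id: shingles(text) for doc_id, text in docs.items()}
    groups = {}
    for doc_id in docs:
        for core in groups:
            if jaccard(shingle_sets[doc_id], shingle_sets[core]) >= threshold:
                groups[core].append(doc_id)
                break
        else:
            groups[doc_id] = []
    return groups
```

Raising the threshold makes the grouping stricter, so fewer documents are treated as near-duplicates; lowering it pulls more loosely related documents around each core.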

Preparing for the Rule 26(f) Conference

Using EDA also aids in preparation for the Rule 26(f) conference. Using visualization, search and reporting features to identify relevant custodians and formulate keyword and date criteria for filtering can inform and support counsel’s arguments on whether and how to reduce or augment the scope of discovery. Search features allow counsel to test proposed filtering criteria before committing to them in a case, and to determine whether a given search term is over- or under-inclusive or will result in a multitude of false hits. In addition, EDA reporting features can be used to quantify the number of documents or gigabytes responsive to certain keyword or date criteria, and thereby estimate the ediscovery costs that would likely be incurred to process, review and produce those documents. Visualization features can also be used to identify and prioritize custodians for discovery, bolstering arguments for or against the necessity of preserving data held by certain custodians.
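The kind of report described above can be approximated with a few lines of Python. The sketch below counts the documents and gigabytes that hit each proposed search term and multiplies the volume by a per-gigabyte rate. The function name term_report, the "text" and "size_bytes" fields and the cost_per_gb rate are hypothetical; actual pricing and matching rules vary by provider and by matter.

```python
def term_report(docs, terms, cost_per_gb):
    """Summarize how many documents and gigabytes each search term hits.

    `docs` is a list of dicts with "text" and "size_bytes" keys
    (hypothetical field names). `cost_per_gb` is an assumed flat rate.
    Matching is a simple case-insensitive substring test.
    """
    report = {}
    for term in terms:
        hits = [d for d in docs if term.lower() in d["text"].lower()]
        gigabytes = sum(d["size_bytes"] for d in hits) / 1e9
        report[term] = {
            "documents": len(hits),
            "hit_rate": len(hits) / len(docs) if docs else 0.0,
            "gigabytes": gigabytes,
            "estimated_cost": gigabytes * cost_per_gb,
        }
    return report
```

A term with a very high hit rate is a candidate for being over-inclusive, while a term with no hits may be under-inclusive or misspelled. Either result is useful to know before agreeing to search criteria.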


Litigation is a strategic endeavor, and EDA yields early insights that benefit every subsequent discovery process. Although discovery accounts for a large share of the costs associated with litigation, those costs can be mitigated by effective case strategy and data-narrowing efforts. EDA not only increases defensibility, but also reduces cost and promotes efficiency in the long run. Any reduction in the amount of electronic data that must be reviewed translates to immediate and sizeable cost reductions.