Multilingual Madness

Wednesday, January 5, 2011 by Thought Leadership Team

Cross-border discovery is an increasing reality in today’s globalized world. Corporations and practitioners alike must begin educating themselves about the intricacies of multilingual discovery in order to avoid common ediscovery pitfalls.


One of the first challenges is data collection, where the primary considerations are location, people, tools and laws.

  • Location: It is important to identify as early as possible where the data you need to collect is located in order to create an efficient discovery strategy. In many cases, the data is scattered across several countries and spans multiple languages, so you may need a separate collection plan for each location.
  • People: A thorough and defensible, yet timely, collection is vital to your discovery efforts. Be sure to send only qualified and experienced people to work on the collection, especially when the collection occurs abroad. If possible, use local collection experts to carry out the work.
  • Tools: Certain collection tools commonly used for electronic English-language data collection do not support collecting data in other languages. You must ensure that your collection tools support any languages or character sets you might encounter in collection.
  • Laws: Collecting data on foreign soil may raise legal roadblocks not common in the US. It is prudent to confer with local counsel to fully identify local data transfer rules, privacy regulations and jurisdictional issues specific to that location before you deploy your collection team.
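The point about tools and character sets can be made concrete with a small check. The following is a minimal Python sketch, not any particular collection tool's method: it tries a list of candidate encodings on a collected file's raw bytes and reports the first one that decodes without error. The function name and the candidate list are assumptions made for illustration, and a clean decode does not by itself prove the encoding is the right one.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical candidate list: a real matter would choose encodings
# based on the languages expected in the collection.
CANDIDATE_ENCODINGS = ("utf-8", "shift_jis")


def first_clean_decoding(
    raw: bytes, candidates: Sequence[str] = CANDIDATE_ENCODINGS
) -> Optional[Tuple[str, str]]:
    """Return (encoding, text) for the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return encoding, raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent the bytes; try the next
    return None  # no candidate worked: flag the file for manual inspection


# Japanese text saved in a legacy encoding is not valid UTF-8,
# so the sketch falls through to the second candidate.
legacy_bytes = "日本語".encode("shift_jis")
result = first_clean_decoding(legacy_bytes)
print(result)
```

A file for which every candidate fails is a signal that the toolset, or at least the candidate list, does not cover a character set present in the collection.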

Filtering & Processing

After data collection is complete, legal teams face the challenge of filtering and processing the data. First, the legal team should consider character encoding, which allows a computer system to recognize and display characters in English and other languages. In addition, the legal team should look for an ediscovery provider who can pass along language identification information via a metadata field. Next, legal teams should evaluate what search options will be available for a data set that includes multilingual data and consult with an expert regarding search considerations and methodologies. Finally, legal teams should look for a service provider who can generate reports early in the process covering language identification, document count for each language and language breakdown by custodian.
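To show what such an early report might contain, here is a minimal Python sketch that tallies documents by language and by custodian. It assumes each document already carries a language value in a metadata field. The field names (`doc_id`, `custodian`, `language`) and the sample records are hypothetical and do not reflect any specific provider's schema.

```python
from collections import Counter, defaultdict

# Hypothetical records: a real provider's metadata field names will differ.
documents = [
    {"doc_id": "DOC-001", "custodian": "Sato", "language": "ja"},
    {"doc_id": "DOC-002", "custodian": "Sato", "language": "en"},
    {"doc_id": "DOC-003", "custodian": "Müller", "language": "de"},
    {"doc_id": "DOC-004", "custodian": "Müller", "language": "de"},
    {"doc_id": "DOC-005", "custodian": "Sato", "language": "ja"},
]


def language_report(docs):
    """Count documents per language, and per language within each custodian."""
    by_language = Counter()
    by_custodian = defaultdict(Counter)
    for doc in docs:
        by_language[doc["language"]] += 1
        by_custodian[doc["custodian"]][doc["language"]] += 1
    return by_language, by_custodian


by_language, by_custodian = language_report(documents)

# Document count for each language, largest first.
for language, count in by_language.most_common():
    print(f"{language}: {count}")

# Language breakdown by custodian.
for custodian, counts in sorted(by_custodian.items()):
    print(custodian, dict(counts))
```

Counts like these help a legal team estimate early how many native-speaking reviewers or how much translation a matter is likely to need.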

Review & Production

Finally, multilingual data presents unique challenges in review and production. In terms of document review, a legal team has a few options: use native-speaking attorneys to review documents in their native language, translate the documents into English for review or use a hybrid of these methods. As a general rule, review by a native speaker is more accurate than translation, but is also more expensive. Likewise, human translation is generally more accurate than computer translation, but more costly.

When deciding which review method to use, the prudent practitioner must consider the importance and the volume of the multilingual documents that need to be reviewed. Review by native-speaking attorneys or human translation may not be necessary, financially feasible or practical to complete within discovery deadlines. On the other hand, in many jurisdictions the producing party has the burden of proving its privilege review was reasonable for purposes of assessing whether it waived attorney-client privilege in the event of inadvertent disclosure. Deliberately choosing the review method (native review, human translation or computer translation) for multilingual documents increases the defensibility of the review. In other words, the ability to articulate reasons why one review method was chosen over another strengthens the "reasonableness" argument.

Turning to production, multilingual data raises several considerations of its own. First, agreeing with opposing counsel early on regarding production language is crucial. A document may be produced in its native language, in English or in both. Second, parties should agree on production format. Production format options for multilingual data are the same as for other electronically stored information: native format, a load file, an online repository or print. Last, parties should agree in advance on production ordering and sequencing. Multilingual documents can be organized for production solely by custodian, irrespective of each document's primary language, or can be further organized by both custodian and language. The old saying is true: an ounce of prevention is worth a pound of cure. The key to managing each of these production considerations is the same: agree in advance! The ideal time to discuss and reach an agreement on multilingual ediscovery production options is during a Rule 26(f) conference. Early party consensus on production language, format, and ordering and sequencing is crucial to developing your discovery plan, and may avoid costly disputes and potential re-dos later in litigation.
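The two ordering options described above can be sketched in a few lines of Python. This is an illustration only. The record fields, the sample data and the `order_for_production` helper are hypothetical, and using the document identifier as a tiebreaker is an assumption rather than a requirement of any production protocol.

```python
# Hypothetical production set; field names are illustrative only.
production_set = [
    {"doc_id": "DOC-001", "custodian": "Sato", "language": "ja"},
    {"doc_id": "DOC-002", "custodian": "Müller", "language": "de"},
    {"doc_id": "DOC-003", "custodian": "Sato", "language": "en"},
    {"doc_id": "DOC-004", "custodian": "Sato", "language": "ja"},
    {"doc_id": "DOC-005", "custodian": "Müller", "language": "en"},
]


def order_for_production(docs, group_by_language=False):
    """Order by custodian, optionally sub-grouped by primary language."""
    if group_by_language:
        # Custodian first, then language, then document ID as a tiebreaker.
        return sorted(
            docs, key=lambda d: (d["custodian"], d["language"], d["doc_id"])
        )
    # Custodian only, irrespective of language.
    return sorted(docs, key=lambda d: (d["custodian"], d["doc_id"]))


by_custodian_only = [d["doc_id"] for d in order_for_production(production_set)]
by_custodian_and_language = [
    d["doc_id"] for d in order_for_production(production_set, group_by_language=True)
]
print(by_custodian_only)
print(by_custodian_and_language)
```

Whichever ordering the parties choose, writing it down as an explicit sort rule makes it easy to apply consistently across rolling productions.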

The bottom line: a legal team can avoid discovery disasters and leverage opportunities throughout the process by educating itself about the process, planning ahead for the inevitable and partnering with an experienced ediscovery service provider. That provider should possess the technological tools and expertise necessary to navigate the waters of multilingual ediscovery in complex litigation, investigations or compliance matters.