Social Media Data Collection Best Practices

Thursday, December 30, 2010 by Thought Leadership Team

In a rapidly changing social environment, social media sites remain a dominant force in connecting the public, businesses and government entities to one another.

Social networking has become so common that hundreds of millions of users subscribe to and communicate using social media sites each day. Indeed, Facebook’s global audience has blossomed to more than 465 million users, with more than 125 million in the United States alone. The growing prevalence of social networking sites, especially in the corporate environment, presents unique challenges. Attorneys and IT experts must take steps to understand the best and worst practices in preserving, reviewing and producing electronic data from these sites in order to obtain evidence that may make the difference in an investigation.

Investigation Techniques

When capturing data from social media sites, investigators often use video cameras to record relevant information from a user’s Web page. In more advanced, technical circumstances, capture software is the preferred method for recording a user’s Internet activities. This software allows for easy navigation while recording images, conversations, posts, messages or videos. Although hard copies of the data are always acceptable, courts readily favor capture software during the review process because it saves time and money.

Investigators are also free to search and extract information from an open page. For instance, many Facebook users leave their profiles open so anyone searching for their name can access their profile content, including wall posts, comments and pictures. Twitter also makes a vast amount of information available to the public, and courts often follow “tweets” back to the original conversation, allowing the introduction of such information as evidence if useful to the litigation at hand.

Even if profiles are set to “private,” investigators have access to a number of tools that can further data collection efforts. Privacy settings can be worked around to some extent by searching blog posts or other website content, or by using people search engines. One common Web-based tool that allows an investigator to access older information is the Wayback Machine. This website archives snapshots of Web pages over time and lets investigators see what certain websites once looked like. As a result, even though information may currently be blocked or secured, investigators can still produce electronic evidence or other data that was “public” at one point in time.
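As a rough illustration of the Wayback Machine lookup described above, the sketch below builds a query for the Internet Archive's availability endpoint and reads the closest archived snapshot out of the JSON reply. The function names, the sample page address and the returned field names (`snapshot_url`, `captured_at`) are illustrative assumptions, not part of the original article, and the endpoint's behavior should be confirmed against the Internet Archive's current documentation before relying on it. The sketch deliberately leaves out the network request itself.

```python
import json
from urllib.parse import urlencode

# Public availability endpoint of the Internet Archive (assumed unchanged).
WAYBACK_AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url, timestamp=None):
    """Return the query URL for the snapshot closest to `timestamp`.

    `timestamp` is an optional string of digits such as "20100101"
    (YYYYMMDD); when omitted, the service picks the snapshot itself.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def parse_closest_snapshot(response_text):
    """Extract the closest snapshot from the JSON reply, or None if absent."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return {
        "snapshot_url": closest["url"],        # address of the archived copy
        "captured_at": closest["timestamp"],   # YYYYMMDDhhmmss capture time
        "status": closest.get("status"),       # HTTP status seen at capture
    }
```

An investigator could fetch the URL returned by `build_availability_url` with any HTTP client, pass the response body to `parse_closest_snapshot`, and log the snapshot address and capture time alongside the rest of the collection record.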

In addition, sites like Twitter, BoardReader and Bing are developing more advanced search engines that will be able to quickly locate and recall information stored or communicated through social media sites. These new search engines can identify and characterize social media users, preserve older communications and produce electronic evidence with a few clicks of the mouse. For these reasons, investigators should use search engines and other repositories to collect and review data, rather than relying only on sites like Facebook and Twitter themselves.

Avoid Investigation Hazards

It is important to note that not every piece of information on the Web is usable or defensible. Due to confidentiality, privacy and security concerns, most courts will not allow a person to falsely represent themselves as a “friend” to collect data. This is known as “friending someone under false pretenses.” If an individual accesses the private data of another under an alias, or if the person being “friended” is unaware of the individual’s true intentions, courts will not hesitate to disregard information obtained involuntarily or without a user’s actual knowledge.

Even if data from an individual’s social networking page is lawfully obtained through a judge’s order, other evidentiary issues present concerns for investigators. Although an attorney may have found the “smoking gun” evidence, information gleaned from social media sites raises profound hearsay concerns, both with respect to whether it is intended to be “used for the purposes of communication” and whether it is authentic. Attorneys who rely on simple copy-and-paste efforts or screen prints face significant hurdles when trying to get this information admitted into evidence. To overcome these hurdles, experts are often used to extract, produce, review and testify about social media data. Experts can capture the evidence in a way that correctly answers the questions of “who,” “what,” and “where” that juries or judges consider when observing electronic evidence. This identification will help address any authentication issues posed by this evidence.
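One way to picture the “who,” “what,” and “where” identification is a capture record stored alongside each collected item. The sketch below is only an illustration under stated assumptions: the `CaptureRecord` fields, the function names and the use of a SHA-256 digest to show the item has not changed since collection are hypothetical choices, not a method described by the article or required by any court.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CaptureRecord:
    """Hypothetical metadata kept with one captured social media item."""
    who: str           # account or author the content is attributed to
    what: str          # short description of the item (post, message, photo)
    where: str         # address of the page the item was captured from
    collected_by: str  # person who performed the capture
    captured_at: str   # ISO 8601 time of capture, in UTC
    sha256: str        # digest of the captured bytes


def make_capture_record(content, who, what, where, collected_by, captured_at=None):
    """Build a record for `content` (bytes), stamping the time if none is given."""
    if captured_at is None:
        captured_at = datetime.now(timezone.utc).isoformat()
    digest = hashlib.sha256(content).hexdigest()
    return CaptureRecord(who, what, where, collected_by, captured_at, digest)


def verify_capture(content, record):
    """Return True if `content` still matches the digest in `record`."""
    return hashlib.sha256(content).hexdigest() == record.sha256
```

A simple copy-and-paste preserves none of this context. A record like the one above at least gives a testifying expert a fixed reference point for where an item came from and whether it has been altered since collection.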


Although the way we collect social media data is still evolving, one thing is certain: social media will continue to grow as an important data source to consider when conducting an investigation or engaging in litigation. Ultimately, attorneys and IT experts must ensure that steadfast preservation, review and production techniques are implemented. Because even the savviest of attorneys and IT experts may experience preservation or production issues when collecting social media information, be sure to consult with an expert to bolster the credibility and defensibility of the investigation.

Special thanks to Dave E. Canfield, E.J.D., Managing Consultant in Kroll Ontrack’s Electronically Stored Information (ESI) Consulting group. In his role, Mr. Canfield assists clients in the creation of discovery, data collection and case management strategies, and with the creation of systematic processes and documentation to support repeatable, efficient and reliable electronic discovery procedures. Mr. Canfield may be reached for questions at: