Twitter has released a data store of posts from 3,841 accounts that have been identified as being connected to the Internet Research Agency (IRA), the Russian “troll factory” that used Twitter and Facebook to conduct an “influence campaign” aimed at causing political turmoil during the 2016 US presidential election as well as undermining the political process in other countries, including Germany and Ukraine.
The company has also released a second data set covering 770 accounts believed to be tied to an Iranian influence campaign.
Totaling over 360 gigabytes—including more than 10 million tweets and associated metadata and over 2 million images, animated GIFs, videos, and Periscope streams—the data store provides a picture of how state-sponsored agencies have used the Twitter platform. Some of the content dates as far back as 2009.
In a post announcing the release, Twitter Legal, Policy, and Trust & Safety lead Vijaya Gadde and Twitter’s head of Site Integrity Yoel Roth wrote that Twitter was providing the data “with the goal of encouraging open research and investigation of [state-sponsored influence and information campaigns] from researchers and academics around the world.”
The archive of the IRA’s tweet metadata alone is 5.4GB of comma-separated data when expanded. The user IDs and screen names of many accounts (those with fewer than 5,000 followers) have been concealed with hash values to “reduce the potential negative impact on real or compromised accounts,” a Twitter spokesperson said in a statement on the data archive. The hash values still allow individual accounts to be analyzed without exposing the actual names associated with them.
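To illustrate why hashed identifiers still permit per-account analysis, here is a minimal Python sketch of threshold-based pseudonymization. Twitter has not described its hashing scheme in the statement quoted here, so the use of salted SHA-256, the function names, and the record fields (`userid`, `user_screen_name`, `follower_count`) are all assumptions made for illustration. Only the 5,000-follower cutoff comes from the article.

```python
import hashlib

# Cutoff taken from the article: accounts below it are concealed.
FOLLOWER_THRESHOLD = 5000


def pseudonymize(value: str, salt: str) -> str:
    """Return a stable hex digest that stands in for an identifier.

    Salted SHA-256 is an assumption; the real scheme is not documented here.
    """
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def conceal_account(record: dict, salt: str) -> dict:
    """Hash the ID and screen name of accounts under the follower cutoff.

    The field names are hypothetical and chosen only for this example.
    """
    out = dict(record)  # copy so the caller's record is left unchanged
    if record["follower_count"] < FOLLOWER_THRESHOLD:
        out["userid"] = pseudonymize(record["userid"], salt)
        out["user_screen_name"] = pseudonymize(record["user_screen_name"], salt)
    return out
```

Because the same input and salt always produce the same digest, every tweet from a concealed account carries the same hashed ID, so its activity can still be grouped and counted without revealing the original name.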
Ars is currently pulling the post data into a database for analysis—with that many records, it might take a while—but an initial look at the data shows that the IRA accounts targeted both domestic and foreign audiences. They used a mix of tools to post, ranging from official Twitter Web and Android clients to third-party clients (Sociable, TweetDeck, etc.) to custom-coded “bot” clients with Slavic names—including “rostislav,” “bronislav,” and “iziaslav.” And they aimed for audiences across the political spectrum.
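For readers who want to try a similar first pass, the following Python sketch loads tweet metadata from a CSV file into SQLite and counts tweets per posting client. It is not a description of Ars's actual tooling. The column names (`tweetid`, `userid`, `tweet_client_name`) and both function names are assumptions for illustration and should be checked against the header row of the real archive.

```python
import csv
import sqlite3


def load_tweets(csv_file, conn):
    """Load rows from an open CSV file object into a SQLite table.

    Column names are hypothetical; adjust them to the archive's header.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "tweetid TEXT PRIMARY KEY, userid TEXT, tweet_client_name TEXT)"
    )
    reader = csv.DictReader(csv_file)
    # Stream rows through a generator so the file is never held in memory whole.
    rows = ((r["tweetid"], r["userid"], r["tweet_client_name"]) for r in reader)
    conn.executemany("INSERT OR IGNORE INTO tweets VALUES (?, ?, ?)", rows)
    conn.commit()


def tweets_per_client(conn):
    """Return (client name, tweet count) pairs, most-used client first."""
    cur = conn.execute(
        "SELECT tweet_client_name, COUNT(*) AS n FROM tweets "
        "GROUP BY tweet_client_name ORDER BY n DESC, tweet_client_name"
    )
    return cur.fetchall()
```

A typical call would open the expanded CSV with `open(path, newline="", encoding="utf-8")`, pass the file object and a `sqlite3.connect(...)` connection to `load_tweets`, then inspect `tweets_per_client(conn)` to see how posts split between official, third-party, and custom clients.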
A common tactic used by the IRA was to create “local news” accounts for major US cities, seeding them with posts linking to local news outlets. The accounts, such as “Atlanta Online,” “Baltimore Online,” “Baton Rouge Voice,” “Chicago Daily News,” and “Dallas Top News,” would also post tweet-length news headlines with no link (such as “Obama draws sharp contrasts with ‘mean’ Republicans” and “Trump’s immigration stance dividing GOP in Arizona again”).
Aside from hundreds of pro-Trump and alt-right accounts (many registered with the location “Estados Unidos”), some of the most widely followed included:
Gadde and Roth noted that Twitter expects these sorts of campaigns to continue and said that Twitter’s Site Integrity team will “continue to proactively combat nefarious attempts to undermine the integrity of Twitter, while partnering with civil society, government, our industry peers, and researchers to improve our collective understanding of coordinated attempts to interfere in the public conversation.”