The Library of Congress Twitter Archive: A Failure of Historic Proportions

Published in DMRC at large

Jan 2, 2018

It’s dead: the U.S. Library of Congress has officially pulled the plug on its project to create a complete archive of every tweet, past, present, and future. The Library will now only archive tweets “on a selective basis”.

The aborted Twitter Archive project was established after Twitter, in a well-publicised move, gifted the Library access to its tweet archive and live feed in 2010. But unfortunately, beyond the fanfare, the Library never provided the project with the support it required.

Twitter’s announcement of its gift to the Library of Congress, in April 2010

Released on Boxing Day, in a period when public scrutiny of official announcements is generally limited, news of the project’s failure arrives at a time when Twitter’s public and political importance has never been greater. The project’s end creates substantial concerns both for present analysts and for future historians.

Exhibit A: Donald J. Trump. The President’s use of Twitter is as famous as it is infamous, and the status of his tweets as official government statements remains a matter of debate. More specifically, some analysts following the Mueller probe into the Trump campaign’s alleged collusion with Russian operatives have already pointed to Trump’s tweets as a potential admission of obstruction of justice.

Clearly, the tweets posted by @realdonaldtrump, @POTUS, and other accounts associated both with the administration itself and with its political opponents in Congress and beyond must be preserved for further study in the shorter and longer term. An impartial public service institution like the Library of Congress is inherently best-placed to address that task; whether the LoC’s much-reduced approach to archiving selected tweets can still achieve it remains unclear.

Similarly, well beyond the United States there is considerable concern at present about the role of fake and automated accounts on Twitter and other social media platforms. Acting in unison and controlled by nefarious state and commercial actors, these accounts are suspected of seeking to affect the public perception of particular issues and individuals. Their mis- and disinformation, their ‘fake news’, their ‘computational propaganda’, has been implicated in the context of the Brexit referendum and various national elections.

Many research projects around the world are now attempting to unravel the networks and activities of the accounts engaged in such practices, on Twitter and elsewhere. But in the absence of a comprehensive dataset on contemporary social media activities, each is picking only at a handful of strands of the network, and there is a real sense that the bot-herding information warriors behind it all remain one step ahead of those who would seek to stop them.

A comprehensive Twitter Archive in the shape initially envisaged by the Library of Congress would have provided a major boost to such efforts to protect public debate from outside manipulation. While methods to analyse the activities of Twitter accounts have much improved in recent years, access to the large-scale datasets to which such methods could be applied remains the major bottleneck. The LoC archive could have provided a master dataset for this critical effort.

Both these examples relate to our current needs. But however the world manages its many contemporary challenges, future historians will no doubt also want to retrace how we got here, and what role social media played in all this. Are the critics right, and did social media lead to a fragmentation of the public into different echo chambers, or did we also use these tools to organise and fight back against those who sought to undermine civic society to further their own interests?

This is not a question which can be answered by studying a Twitter archive, however comprehensive, on its own. Ideally, we would need similar archives for Facebook and other social media platforms, as well as large-scale archives of mainstream media around the globe. But we crucially need the LoC’s Twitter Archive, or something like it, as part of the mix, and the future’s understanding of our troubled present will be all the poorer for its absence.

All this makes the Library’s decision to give up on the project all the more tragic. That said, it is not unexpected: especially under its recent leadership, the Library of Congress has remained a deeply traditional institution with a limited understanding of new technologies. Beyond the initial fanfare about the project, LoC updates on its progress have remained scant and infrequent. Visiting researchers who had expected to work with the dataset found it unavailable.

A variety of grassroots, scholarly, and commercial projects have already sprung up around the world to support social media archiving attempts of their own, at varying scales: they range from capturing only the deleted tweets of selected politicians to tracking entire national Twitterspheres, as my colleagues and I have attempted in Australia.

Even in combination, though, none of these can possibly come close to what the Library of Congress Twitter Archive as originally conceived had promised: a live, comprehensive archive of all tweets, from Twitter’s launch in March 2006 to the present day, and continuing into the foreseeable future. That goal can only be achieved with the active support of Twitter itself.

Now that the Library of Congress, after years of prevarication, has finally given up on its Twitter Archive project, it is therefore high time that Twitter found one or more new partners for this initiative — the Internet Archive or other national libraries, perhaps. Ideally these would be partners with a greater affinity for digital content than the Library of Congress has shown to date.

To do so would also provide a major signal of Twitter’s commitment to corporate social responsibility, at a time when its reputation is under siege both from Donald Trump’s exploits and from substantial concerns about the volume of malicious human and automated actors present on the platform. The establishment of a comprehensive Twitter Archive — properly this time — might even shame Facebook and other platform providers, traditionally even more reluctant than Twitter to engage with public-interest social media research, into following suit.

So, now that the Library of Congress has failed to deliver, Twitter has an opportunity to regain the initiative. But whether it seizes that opportunity or not, we must not allow the role of social media in contemporary society to remain poorly documented. With Twitter’s support or without it, we must continue to push forward.

Written by Axel Bruns