The struggle to preserve today’s digital archives

Much of contemporary life is experienced through a screen. Moments that were once captured exclusively in journals, newspapers and books now also, and sometimes only, exist digitally. Libraries and archives that were adept at preserving the analog components of cultural memory have been slow to adapt their methods of capture and access to the nature of digital content.

Archiving today’s digital content poses different challenges to analog archiving.

But how can we keep up? Some 2.5 quintillion bytes of data are created every day. On social media alone, we send an average of 500 million tweets per day, upload an average of 300 hours of video to YouTube every minute, and share over 95 million photos and videos on Instagram every day.

The Internet Archive

As an institution born of a will to preserve our digital online lives while the Internet was still in its infancy, the Internet Archive has been a leader in protecting digital heritage since 1996.

Brewster Kahle started the Internet Archive with the goal of collecting the web. By 2001, the famous Wayback Machine offered access to what is now a collection of the incremental history of over 500 billion URLs. Now, the Internet Archive has expanded far beyond the Wayback Machine: it is a non-profit public library of millions of free books, movies, software, music, television, and more, with a stated mission of ‘universal access to all knowledge’. Such a lofty goal cannot be easily achieved without help, so we seek out partnerships all over the world with like-minded libraries that want to preserve and provide widespread access to their cultural materials.

The Internet Archive created the open source web archiving technologies for crawling and access that are used by us and by organizations around the world. While our own global crawling for the Wayback Machine captures a huge swath of the rapidly growing web, the web evolves in ways that require us to adapt constantly. Our strategy is to build partnerships to help curate our web archiving collections. We engage memory professionals and renegade archivists alike to do a better job of saving the most important parts of the web. Researchers help us to tailor our global crawls. We also offer a crawling service for national libraries to capture their national web domains, as well as Archive-It, a subscription service with well over 400 partners, for creating, managing, accessing and storing web archive collections.

Of late, we’ve been focused on significant technological progress so that we are not outpaced by the fast-evolving web. We are improving access to the Wayback Machine to allow for text searching, building better tools for collecting social media and audiovisual content, and continuing our outreach to find collaborative opportunities to do what we do better. Over the years, we’ve also expanded our collections as we have all witnessed our civilization shift to online interactions for academic research, news, finding and reading books, watching television and movies, listening to music and radio, and even our social connections. It has become clear that to offer universal access to all knowledge, we must build new partnerships focused on more than just collecting the web.

In my Library Science Talk, I will discuss new ways we’re improving our strategies for collecting the web, as well as the strides we are making towards massive-scale online book lending, access to music, television and film, and many other programs. I hope you’ll come and engage in an exciting discussion about the future of the Internet Archive and how you can participate in collecting our world’s memory together.

The ITU Talk will be webcast live on 12 September 2016 at 15:30 (GMT+2). Watch it here.

Courtney Mumma

Courtney is a Program Manager at the Internet Archive, focusing on collaborative partnerships, grants management, community development, research initiatives, new services and sustainable innovation in web archives. Throughout her career, she has worked to advance the field of digital cultural heritage preservation, including helping to build the Archivematica open source digital preservation system and community. She has also taught and lectured in several cultural heritage venues on topics related to digital preservation and curation.
