Researching the Internet Archives

When a print publication is issued an ISBN in the United States, an archive copy of the publication is sent to the Library of Congress. Similar procedures for creating a permanent record of print publications are in place in other countries. This isn't the case for web publications. Indeed, some fear we've stumbled into an Internet Dark Age, with our digital cultural history vanishing as we speak. However, the need to archive Internet information is being recognized around the world. Increasingly, national libraries are making copies of culturally significant web pages. Of particular note is Egypt's multilingual archive effort, the Library of Alexandria: http://www.bibalex.org/website

The Internet Archive

In the United States, a partnership between Alexa and the Internet Archive is creating a huge collection that documents the World Wide Web back to 1996. The Internet Archive claims to have 30 billion pages in the vault, and its holdings include web pages, moving images, texts, and audio files. These archived pages can be accessed via the Wayback Machine: http://www.archive.org

Screenshot of the Wayback Machine search box, part of the Internet Archive

Unlike a search engine, the Wayback Machine cannot be searched by concept or keyword, and it does not rank results by popularity. You'll need at least the domain name of the website you are seeking. The Internet Archive has released Recall, a new tool that performs a full-text search of about a third (11 billion) of the pages in the archive.
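
Because the Wayback Machine is looked up by address rather than by keyword, a lookup can be reduced to building a web address from a domain name. The short Python sketch below illustrates the idea. It assumes the address pattern https://web.archive.org/web/<timestamp>/<address>, which is not given in this article and may change; the function name wayback_url and its parameters are hypothetical names chosen for illustration.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlsplit

# Assumed address pattern for archived pages (not stated in the article).
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(site: str, when: Optional[date] = None) -> str:
    """Build a Wayback Machine address for a site (hypothetical helper).

    site: a domain name such as "example.com", or a full web address.
    when: optional date; if given, the address asks for a capture near
          that date, otherwise for the list of all captures.
    """
    site = site.strip()
    if not site:
        raise ValueError("a domain name or web address is required")

    # Accept a bare domain by assuming the http:// scheme.
    if "://" not in site:
        site = "http://" + site

    # Reject input that still has no host part, e.g. "http://".
    if not urlsplit(site).netloc:
        raise ValueError("could not find a domain name in the input")

    # Assumption: "*" requests the capture listing, while a YYYYMMDD
    # stamp requests the capture closest to that date.
    stamp = when.strftime("%Y%m%d") if when else "*"
    return f"{WAYBACK_PREFIX}/{stamp}/{site}"


if __name__ == "__main__":
    print(wayback_url("www.bibalex.org"))
    print(wayback_url("http://www.archive.org", date(2003, 6, 1)))
```

Pasting either printed address into a browser should lead to the archived copies, if any exist for that site. The sketch only builds the address; it does not contact the archive or check that a capture is available.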


Authored by Dennis O'Connor 2003