The most powerful archiving tool on the Internet is in danger


This month, USA Today published an excellent report that revealed how US Immigration and Customs Enforcement delayed disclosing basic information about the effects of its detention policies. The authors used the Internet Archive’s Wayback Machine to compile and analyze detention statistics from ICE and track how the agency has changed under the Trump administration. This story is one of countless examples of how the Wayback Machine, which crawls and archives web pages, helps preserve information for the common good. It was also, says Wayback Machine director Mark Graham, “a bit ironic.”

USA Today Co., the publishing group formerly known as Gannett that operates the newspaper of the same name and more than 200 additional media outlets, is prohibiting the Wayback Machine from archiving its work. “They’re able to compile their anecdotal research because there’s a Wayback Machine,” Graham says. “At the same time, they’re blocking access to it.”

A number of other major news organizations, including The New York Times, have also recently moved to restrict the Wayback Machine from archiving their stories. According to an analysis by AI-detection startup Originality AI, 23 major news sites currently block ia_archiverbot, the web crawler commonly used by the Internet Archive’s Wayback Project. The social platform Reddit blocks it as well. Other outlets limit the project in different ways: The Guardian does not ban the crawler, but it excludes its content from the Internet Archive API and filters its articles out of the Wayback Machine interface, making it difficult for laypeople to access archived versions of them.

USA Today spokesperson Lark Marie Anton stressed that “this effort is not about blocking the Internet Archive specifically” but is instead part of the company’s broader efforts to block all scraping bots. Robert Hahn, director of business affairs and licensing at The Guardian, says he has been in conversation with the archive about “concerns about potential misuse by AI companies of content sets crawled for preservation purposes.”

Now, individual reporters are bucking the trend. This week, advocacy organizations, including the Electronic Frontier Foundation and Fight for the Future, rallied journalists around the Wayback Machine issue. The coalition collected more than 100 signatures from working journalists who recognize the value of the tool and submitted a letter of support to the Internet Archive. Signatories range from TV mainstay Rachel Maddow to freelance reporters like Spitfire News’ Kat Tenbarge and User Mag’s Taylor Lorenz. “In previous generations, journalists would turn to the physical archives of a local newspaper or local public library to access historical reports and follow present-day leads back into history,” the letter said. “With many newspapers closing, and no clear path for local public libraries to preserve digital-only reporting, the work of protecting the press record increasingly falls to the Internet Archive.”

Laura Flynn, a signatory and a podcast producer and moderator at The Intercept, says the Internet Archive has been an “essential tool” throughout her career, playing an integral role in validating and presenting audio clips. Another signatory, Chicago Reader writer Mikko Caporale, says the Wayback Machine helps when writing about old bands and cultural figures by providing access to old fan sites that might otherwise be lost to time.

Caporale says the tool has also been helpful in their role as a union organizer. “I’ve also been using the Wayback Machine a lot in union organizing work to find old job listings so we know what the company claimed to hire people for versus what tasks were actually assigned or to see how different positions were retooled at different points,” Caporale says. “These posts also help us track wage fluctuations across the organization over time.”
