The Internet Age that vanished

There are no yellowing, faded newspaper Web sites. Except for the Wayback Machine (the best we have, but admittedly spotty), the whole dawn of Internet newspapers from the mid-1990s on could vanish, never to appear in a yard sale or as a treasure proffered on PBS’s Antiques Roadshow.

A database crash, a decision to shut down some old servers, or even some spirited housecleaning, and, blink, days, months, years of an electronic newspaper could be gone.

It’s already happening. The Wayback archive does have some early Knoxnews home pages, but many of the graphics and photos are either gone or moved to a different location and don’t work. In our early days in Vignette, we aged stories off in a couple of weeks, including online-only stories for which there was no archive elsewhere. They are just gone. The organization of many static packages has been lost. Even if we still have them, we don’t know where they are. When we transferred our articles from our old Fast Forward Web publishing platform to our current Ellington/Django platform, the stories were ported, but all comments were lost.

It’s ironic that the digital forms of paper newspapers are making for a better historical record than the now often more robust online versions. For example, our digital archives hold the printed version of a story on a dramatic jury trial (in fact, we have more than one archive solution for print), but the online version probably also contains important supporting documents as PDF files, such as motions and rulings, that would be of use to future researchers. The comments added to the story also provide some glimpse into the public’s view at the time of the events.

Other than posting on the Web site, there is no archiving done with future use in mind. There’s a good chance that over time the links will be broken as some new platform or technology comes along and changes everything.

Michael Miner explores this issue, using the dormant site as an example and noting that the newspaper archives being transferred to the Denver library system don’t include those on the Web site. It’s a good piece except for an odd aside about online comment management.

He writes:

The point is that real archiving’s not a business–it’s a public service. The digital newspapers of the early 21st century will be unknown in the 22nd unless they’re aggressively safeguarded. They won’t sit around in boxes until they’re shredded or burned. Simple neglect will destroy them. 

Do any newspapers have explicit archiving strategies for Web content?


  1. I attended a seminar on archiving animation elements; the Disney guy was asked: “In the old days original animation cels were archived for 20 years before reevaluating their value. What is the policy at Disney for digital elements?” He answered: “It doesn’t matter. All the media is corrupt within five years anyway.”

  2. Remember the scene in Rollerball with James Caan?
    The computer that stores all history loses the 12th or 13th century.
    Oh well, who will care in a world that barely remembers campaign promises from 8 months ago?

  3. I’ve thought it would be nice to save comments on stories on The Oak Ridger’s Web site, but apparently those comments get lost at some point when the stories become dated and move toward/into archives.
    Sometimes the comments are worth saving, I think. Occasionally, I’ll print some, but this seems like a waste of paper when, theoretically, we should be able to save them digitally.

  4. Personally, I wouldn’t trust the IT guys at my local newspaper to back things up, even if they had a surefire way to do so. I know them. They’re not the most skilled guys to start with, and most of the time, they’re too busy answering people’s questions about their voice-/e-mail or wondering how they caught a virus from that e-card they opened.
    The Web site, and concerns for its posterity, fall by the wayside.
