


  1. Start documenting policy decisions. This page, its siblings, and its children are good starting points.
    1. As we make decisions, determine which documents the decisions belong in, and which person or group should manage each document. Many documents will be managed by the Repository Manager on a daily basis, but regularly reviewed by the Repository Preservation Board.
  2. Look over documentation from other preservation repositories. Possibilities include:
  3. Determine who should be invited to join a Repository Preservation Board. This group needs broad representation across the libraries and the major users of the repository, but needs to be small enough to make decisions effectively. This group should meet regularly to ensure that policies meet the needs of the repository's users, and ensure that policies are being followed.
  4. As collections are added to the repository, validate whether they meet the current policies. Provide some central way to track this...
  5. Move towards a better Fedora/HPSS connection, so we can take advantage of preservation tools that are developed by the Fedora community.
  6. Set up an initial meeting of a Repository Preservation Board.

RLG "Trusted Digital Repository" checklist

A new version of this checklist, called TRAC (Trustworthy Repositories Audit & Certification), has been released.

The most important document related to preservation is RLG's Audit Checklist for the Certification of Trusted Digital Repositories. While this checklist is a good place to start working on our preservation system, we won't treat it as a mandate.

Melanie Schlosser has started a workspace to organize the items on the checklist and document how the current DLP activities relate to them.


  • Failure to access HPSS
  • Failure of HPSS tapes
    • Simultaneous failure in both locations is possible (nuclear war)
  • Fire, tornado, or other catastrophic failure in WCC machine room
  • Disgruntled employee or random attacker (who obtains the password) attempts to delete everything from HPSS
  • Disgruntled employee attempts to delete random sets of items from HPSS


  • We need a system to provide periodic reports on the existence of orphaned and incomplete objects, so we can keep the contents of the repository "clean".
  • Eventually, we would like to set up a "digital preservation advisory board" to review our decisions. This page and its children will eventually be maintained by the board.
  • We want to preserve metadata as well as the master files, so all preservation activities should apply to the local disk space that Fedora manages in addition to the HPSS storage.
  • For maximum security, we will need to make offline copies of everything. But how can we verify the integrity of the offline copies? What if the verification process destroys the copies? The only way to manage this is to ensure that the offline copies are read-only.
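
The periodic report on orphaned and incomplete objects mentioned in the list above could start out as something like the following. This is only a sketch under stated assumptions: the page does not define "orphaned" or "incomplete", so here an object counts as incomplete when it lacks a required datastream, and a stored file counts as orphaned when no object references it. `RepoObject`, `REQUIRED_DATASTREAMS`, and `cleanliness_report` are hypothetical names, not part of Fedora or HPSS.

```python
from dataclasses import dataclass, field

# Assumption: the datastreams every object must carry to count as complete.
REQUIRED_DATASTREAMS = {"DC", "MASTER"}


@dataclass
class RepoObject:
    """Hypothetical stand-in for a repository object."""
    pid: str
    # Maps datastream id -> path of the stored file backing it.
    datastreams: dict = field(default_factory=dict)


def cleanliness_report(objects, stored_files):
    """Return (incomplete, orphaned).

    incomplete: pid -> sorted list of required datastreams that are missing
    orphaned:   sorted list of stored files that no object references
    """
    incomplete = {}
    referenced = set()
    for obj in objects:
        missing = REQUIRED_DATASTREAMS - set(obj.datastreams)
        if missing:
            incomplete[obj.pid] = sorted(missing)
        referenced.update(obj.datastreams.values())
    orphaned = sorted(set(stored_files) - referenced)
    return incomplete, orphaned
```

A real report would build the object list from the repository and the file list from the storage layer; both inputs are left to the caller here because those interfaces are not described on this page.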

Open questions

  1. Can we eventually store lots of small files (page images) directly in HPSS via the filesystem interface without aggregation, or will it simply take too long to retrieve these files?
  2. Should we store copies of our metadata in HPSS, or are the regular server backups good enough for this?
  3. How do we manage preservation packages for materials that will be accessed through Variations? We don't want to store duplicate copies of the derivative files, because they are quite large. Perhaps Fedora can store these as Redirect datastreams, and just keep the appropriate metadata in the actual repository directories.
  4. What is the best way to provide "proof" that something hasn't been altered since it was originally digitized?
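
One common approach to the last question is to record a cryptographic digest of each file at digitization time and recompute it later. The sketch below assumes SHA-256 and an in-memory manifest; `sha256_of`, `record_manifest`, and `changed_since` are hypothetical names. A matching digest only shows that a file equals what was recorded, so it counts as "proof" only to the extent that the manifest itself is stored somewhere trusted. Files are opened read-only, so the check does not modify the copies it verifies.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:  # read-only: never writes to the copy
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_manifest(paths):
    """Build a path -> digest manifest, e.g. right after digitization."""
    return {str(p): sha256_of(p) for p in paths}


def changed_since(manifest):
    """Return the paths that are missing or whose digest no longer matches."""
    changed = []
    for name, expected in manifest.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            changed.append(name)
    return sorted(changed)
```

Running `changed_since` on a schedule against each storage location would flag altered or missing files; where the manifest lives, and who may write to it, is a policy decision this sketch does not settle.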