...

If all goes well, an integrity checker will never report a problem. But a checker that does not exist or is not running would behave in exactly the same way, so we must set up tests that ensure the integrity checker is regularly exercised. One simple way would be to put a test object in the repository, deliberately corrupt it, and then measure how long the integrity checker takes to notice.
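The corrupt-and-measure test could be sketched roughly as follows. This is only an illustration under stated assumptions: the checksum manifest, the `check_integrity` stand-in, the canary file name, and the polling parameters are all hypothetical, and the real integrity checker would run on its own schedule and be polled through whatever report it produces.

```python
import hashlib
import time
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_integrity(manifest: dict, root: Path) -> list:
    """Hypothetical stand-in for the real checker.

    Returns the names of objects that are missing or whose current
    digest no longer matches the digest recorded in the manifest.
    """
    return sorted(
        name
        for name, digest in manifest.items()
        if not (root / name).is_file() or sha256_of(root / name) != digest
    )


def plant_and_corrupt_canary(root: Path, manifest: dict,
                             name: str = "canary.bin") -> float:
    """Store a known-good test object, record its digest, then corrupt it.

    Returns the monotonic time at which the corruption was introduced.
    """
    path = root / name
    path.write_bytes(b"known good canary content\n")
    manifest[name] = sha256_of(path)

    # Flip every bit of the first byte so the stored digest no longer matches.
    data = bytearray(path.read_bytes())
    data[0] ^= 0xFF
    path.write_bytes(bytes(data))
    return time.monotonic()


def seconds_until_detected(root: Path, manifest: dict, name: str,
                           corrupted_at: float, timeout: float = 5.0,
                           poll_interval: float = 0.1) -> Optional[float]:
    """Poll the checker until it reports the canary.

    Returns the elapsed seconds since corruption, or None if the checker
    has not noticed before the timeout (itself a sign it is not running).
    """
    deadline = corrupted_at + timeout
    while time.monotonic() < deadline:
        if name in check_integrity(manifest, root):
            return time.monotonic() - corrupted_at
        time.sleep(poll_interval)
    return None
```

In a real deployment the timeout would be set somewhat longer than the checker's expected cycle, and a `None` result would be treated as an alert that the checker is not being exercised.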

Other thoughts

  • We need a system to provide periodic reports on the existence of orphaned and incomplete objects, so we can keep the contents of the repository "clean".

Open questions

  1. Can we eventually store lots of small files (page images) directly in HPSS via the filesystem interface without aggregation, or will it simply take too long to retrieve these files?
  2. Should we store copies of our metadata in HPSS, or are the regular server backups good enough for this?
  3. How do we manage preservation packages for materials that will be accessed through Variations? We don't want to store duplicate copies of the derivative files, because they are quite large. Perhaps Fedora can store these as Redirect datastreams, and just keep the appropriate metadata in the actual repository directories.