We want to run more than one repository: at least one for cataloging and testing, and one for production. Will Cowan thinks it may be useful to keep one centralized repository for the master metadata and periodically export that data to one or more production repositories.
Fedora will eventually have built-in support for federated repositories.
There are two Fedora repositories, "development" and "production". The production repository always contains the master copy of the data. The contents of the development repository are never considered authoritative; they are only there for testing purposes. Specialized projects, like Evia, may have their own development repository. Individual developers may maintain their own development repository.
All features are deployed on the development repository for testing. If the development repository does not contain data relevant to testing the feature, data may be copied from the production repository, or new test objects may be created. Any new features should be tested to ensure they do not break existing features or corrupt existing data.
All data is ingested into the "development" repository while a collection is first being developed. This ensures that the production repository does not become littered with test data. Once the collection has been deemed stable, it will be moved to the production repository.
Incremental updates to a published collection will be performed directly on the production repository. This allows users of the Cataloging Tool to see the results of their updates immediately. It also allows us to better track changes to the collection. Any changes that will affect a large portion of the collection should be tested on the development repository before being applied to the production repository.
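The routing rules in the two paragraphs above can be summarized as a small decision function. This is only a sketch under stated assumptions: the `Change` type, the `target_repository` function, and the `LARGE_CHANGE_FRACTION` threshold are hypothetical names introduced for illustration, and the 10% value is a placeholder, since the policy does not define what counts as "a large portion" of a collection.

```python
from dataclasses import dataclass

DEVELOPMENT = "development"
PRODUCTION = "production"

# Assumption: the policy does not say what "a large portion" means.
# 10% is a placeholder value, not an agreed threshold.
LARGE_CHANGE_FRACTION = 0.10


@dataclass
class Change:
    """A proposed ingest or update (hypothetical type for illustration)."""
    collection_published: bool       # has the collection been deemed stable?
    objects_affected: int            # objects touched by this change
    collection_size: int             # total objects in the collection
    tested_on_development: bool = False


def target_repository(change: Change) -> str:
    """Return the repository a change should be applied to next."""
    # Collections still being developed live only on development.
    if not change.collection_published:
        return DEVELOPMENT

    # Guard against empty collections when computing the affected share.
    if change.collection_size > 0:
        fraction = change.objects_affected / change.collection_size
    else:
        fraction = 0.0

    # Large changes must be tested on development before production.
    if fraction >= LARGE_CHANGE_FRACTION and not change.tested_on_development:
        return DEVELOPMENT

    # Incremental updates to a published collection go straight to production.
    return PRODUCTION
```

For example, a single-record edit from the Cataloging Tool on a published collection would be routed to production, while a bulk change touching half of the collection would be routed to development until `tested_on_development` is set.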
- How can we keep "in progress" items from ending up on normal users' screens? For example, in a published collection of books, we don't want to display a book that only has half of its pages digitized.
- Will large ingests have an impact on the performance of the production repository?
- What kind of load will the preservation system put on our production repository? Probably not much, because it will spend most of its time waiting for retrieval of files from HPSS and processing the files.
- Do we need to separate statistics for our preservation integrity service from regular use statistics? If so, how? By IP address?
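One way to explore the last question is to classify each request by client address before counting it. The following is a minimal sketch, assuming the preservation integrity service always connects from a known address range. The `PRESERVATION_NETWORKS` value is a placeholder drawn from a documentation-only range, and the function names are hypothetical. It does not depend on any Fedora API.

```python
from ipaddress import ip_address, ip_network

# Assumption: the preservation integrity service runs on hosts in a known
# network. This range is a documentation placeholder, not a real address.
PRESERVATION_NETWORKS = [ip_network("192.0.2.0/28")]


def is_preservation_request(client_ip: str) -> bool:
    """True if the client address belongs to the preservation service."""
    addr = ip_address(client_ip)
    return any(addr in net for net in PRESERVATION_NETWORKS)


def split_access_counts(client_ips):
    """Count requests separately for preservation checks and regular use."""
    counts = {"preservation": 0, "regular": 0}
    for ip in client_ips:
        key = "preservation" if is_preservation_request(ip) else "regular"
        counts[key] += 1
    return counts
```

Filtering by address only works if the preservation service does not share hosts with ordinary clients. If it does, some other marker, such as a dedicated user account, would be needed instead.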