Coordinating Fedora Instances

We want to run more than one repository: at least one for cataloging and testing, and one for production. Will Cowan thinks it may be useful to keep one centralized repository for the master metadata and periodically export that data to one or more production repositories.

Fedora will eventually have built-in support for federated repositories.

Tentative plan:

  • All data is ingested to the "development" repository.
  • As each collection is finished, its contents will be exported from the development repository and imported into the production instance.
  • Periodically, an incremental update will be run to bring the production content in sync with changes made to the development content.
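
The tentative plan above can be sketched as a small in-memory model. This is only an illustration of the workflow, not Fedora's API: the `Repository` and `RepoObject` classes, the `publish_collection` and `incremental_sync` functions, and the use of a last-modified timestamp to detect changes are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field, replace
from datetime import datetime, timezone


@dataclass
class RepoObject:
    """Hypothetical stand-in for a repository object."""
    pid: str
    collection: str
    content: str
    modified: datetime


@dataclass
class Repository:
    """Hypothetical in-memory repository keyed by object identifier."""
    name: str
    objects: dict = field(default_factory=dict)

    def ingest(self, obj):
        # Store a copy so the two repositories never share state.
        self.objects[obj.pid] = replace(obj)

    def export_collection(self, collection):
        return [o for o in self.objects.values() if o.collection == collection]


def publish_collection(dev, prod, collection):
    """Export a finished collection from dev and import it into prod."""
    exported = dev.export_collection(collection)
    for obj in exported:
        prod.ingest(obj)
    return len(exported)


def incremental_sync(dev, prod, published, since):
    """Copy objects in already-published collections modified after `since`."""
    updated = []
    for obj in dev.objects.values():
        if obj.collection in published and obj.modified > since:
            prod.ingest(obj)
            updated.append(obj.pid)
    return sorted(updated)
```

In this model only collections that have been explicitly published are touched by the incremental update, so unfinished collections stay in the development repository. Whether the sync should run per collection or over the whole repository is one of the open questions below.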

Big questions:

  1. Do we need three Fedora instances: development, cataloging (master), and production?
    • Issues related to data stored on the instances:
      • There won't be much "in progress" data, and it will primarily be for collections that haven't been published yet.
      • Is there a need to separate cataloging from production? We don't want "in progress" items ending up on users' screens, but the catalogers need to view their items in the context of a real system.
    • Issues related to features:
      • The development server may go down more often than the production server, as new features are tested.
      • We want to make sure new features for one collection don't break anything for other collections. New features should always be tested on the development Fedora before being moved to the other instances.
      • Catalogers need a stable platform to work on.
      • Digitizers/submitters (people using ImageProc or requesting upload of data) need a stable platform to work on.
      • As we develop the Cataloging Tool and have a broader user base, people will expect their data to show up in the production system immediately, so the cataloging system should be the production system. (But we need a way to determine when something has enough data to be visible...)
  2. Are we preserving the data in the development repository or the production repository?
    • Variations treats the cataloging server as the master copy of the data, although some of the data in that server may be "in progress".
    • What kind of load will the preservation system put on our server? Is this load better suited to the dev server or the production server?
    • Does the dev server have enough space to store everything?
  3. Is it possible to "update" objects on the production server and maintain version history for these updates only?
  4. Is it better to do incremental updates per collection or over the entire repository? Does it hurt to have "junk" data in the production repository?
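
Question 1 notes that we need a way to determine when an item has enough data to be visible. One possible approach, sketched here under assumptions, is a simple completeness check against a list of required metadata fields. The field names in `REQUIRED_FIELDS` and the functions `missing_fields` and `is_visible` are hypothetical; the actual criteria would have to be decided per collection.

```python
# Hypothetical list of fields an item must have before it is shown to users.
REQUIRED_FIELDS = ("title", "creator", "date")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in a metadata record."""
    return [name for name in required if not str(record.get(name) or "").strip()]


def is_visible(record, required=REQUIRED_FIELDS):
    """An item is visible only when no required field is missing."""
    return not missing_fields(record, required)
```

A check like this would let the cataloging system double as the production system: "in progress" items stay in the repository but are filtered out of user-facing views until they pass.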