


Heidi Dowding
Brian Wheeler
Julie Hardesty
Nianli Ma
Randall Floyd
Will Cowan


  • File versioning
    • individual files vs entire object (might do one or the other but probably not both)
  • Archivematica - UK effort, integrate activities through Hydra
  • ArchiveSphere - Penn State, don’t know exactly how it fits with Sufia or if it’s standalone or can be incorporated into Hydra heads
  • Fixity checking - where do we want that to happen? Fedora, SDA, or Sufia/Hydra?
  • Share [Code4Lib article], look into ArchiveSphere, see if Ben Armintor has notes from the Hydra and Digital Preservation session somewhere ([Emory University gap analysis] link)
  • Use the manifest from the bag to populate HPSS fixity checks as technical metadata on the access, prod, and pres files (the files we get from Memnon/IU)
  • QUESTION: do we want to run fixity checks on derived files (ffprobe, high, med, low)?
    • potentially a lot of space use to do this, maybe collection by collection?
  • Metadata at collection level that can help decide how to do fixity checking on items? Maybe base it on file size?
  • 2018: moving to next release of HPSS - data validation between disk spool and tapes; need to figure out how to make this possible on the server without needing to do the transfer
  • Also need to do on-demand fixity check based on events - what is immediacy of that need?
  • Need meeting with Kristy Callback-Rose and Danko Antolovic to consider HPSS functionality for this size of data - HPSS is write once, read rarely; we are different in that we get things out with cron all the time - maybe Heidi to schedule that meeting? - DONE
  • Streaming server - $1/GB/year for Fedora (5-6 TB total); high-speed disk, and just need near-line storage (?). Amazon S3 server is a new possibility (a little less costly): it is a REST-based system, not a file system, so it can’t be read line by line, only whole object in or whole object out. Looking at this for streaming: 3-5 seconds wait time before it starts streaming, with a local cache on the streaming side (AMS doesn’t support this and has to continually clear its cache); testing right now
    • think there will be 350 TB of derivatives total
    • access controlled through Avalon (dark or light)
  • Light Avalon (vs Dark Avalon) will be completely separate
  • Thinking about automatically putting items into Dark Avalon as part of HPSS ingest process so they are available to collection managers at same time as Mike/Susan QA process
  • Date/timestamp question on the diagrams: how timestamps appear in file names depends on how we implement versioning
    • if we do per-file versioning, then it makes sense to keep file name and object name separate; if we do object-level versioning, only the object would have date/timestamp info
  • By January, probably just put everything as it is in HPSS into Avalon and then iterate over this again to fix things - HydraDAM2 is another piece that connects
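
The idea above of using the bag manifest for fixity checks could look roughly like the sketch below. It is only an illustration, not an agreed design: it assumes a BagIt-style manifest (one "checksum  path" pair per line) and MD5 checksums, and the function names parse_manifest, file_digest, and check_fixity are hypothetical. It does not talk to HPSS, Fedora, or Avalon; it only compares files on local disk against the manifest.

```python
import hashlib
from pathlib import Path


def parse_manifest(manifest_text):
    """Parse manifest lines of the form "<checksum> <path>" into {path: checksum}.

    Assumption: BagIt-style payload manifest, one entry per line.
    """
    entries = {}
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line:
            continue
        checksum, path = line.split(None, 1)
        entries[path.strip()] = checksum.lower()
    return entries


def file_digest(path, algorithm="md5", chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files are never fully loaded into memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(bag_dir, manifest_text, algorithm="md5"):
    """Return {path: "ok" | "mismatch" | "missing"} for every manifest entry."""
    results = {}
    for rel_path, expected in parse_manifest(manifest_text).items():
        target = Path(bag_dir) / rel_path
        if not target.is_file():
            results[rel_path] = "missing"
        elif file_digest(target, algorithm) == expected:
            results[rel_path] = "ok"
        else:
            results[rel_path] = "mismatch"
    return results
```

The per-file result could then be recorded as technical metadata alongside the expected checksum. Where that record lives (Fedora, SDA, or Sufia/Hydra) is the open question noted above.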

This page needs to be deleted or used for something else