Child pages
  • 2015-09-28 HydraDAM2 at IU Meeting notes

Date

Attendees

Notes

  • File versioning
    • individual files vs entire object (might do one or the other but probably not both)
  • Archivematica - UK effort, integrate activities through Hydra
  • ArchiveSphere - Penn State, don’t know exactly how it fits with Sufia or if it’s standalone or can be incorporated into Hydra heads
  • Fixity checking - where do we want that to happen? Fedora, SDA, Sufia/Hydra
  • Share Code4Lib article: Moab Design for Digital Object Versioning; look into ArchiveSphere; notes from Hydra and Digital Preservation session (Emory University mentioned the LIFE2 gap analysis: http://www.researchgate.net/publication/32894761_The_LIFE2_final_project_report)
  • Use manifest from bag to populate HPSS fixity checks as technical metadata on access, prod, pres files (files we get from Memnon/IU)
  • QUESTION: do we want to run fixity checks on derived files (ffprobe, high, med, low)?
    • potentially a lot of space use to do this, maybe collection by collection?
  • Metadata at collection level that can help decide how to do fixity checking on items? maybe base on file size?
  • 2018 moving to next release of HPSS - data validation between disk spool and tapes; need to figure out how to make this possible on the server without needing to do the transfer
  • Also need to do on-demand fixity check based on events - what is immediacy of that need?
  • Need meeting with Kristy Kallback-Rose and Danko Antolovic to consider HPSS functionality for this size of data - HPSS is write once, read rarely; we are different in that we get things out with cron all the time - maybe Heidi to schedule that meeting?
  • Streaming server
    • $1/GB/year for Fedora (5-6 TB total); high-speed disk, and we just need near-line storage (?)
    • Hitachi Content Platform using the Amazon S3 protocol is a new possibility (a little less costly); it is not a file system but a REST-based system, so it can’t be read line by line: whole object in or whole object out
    • looking at this for streaming: 3-5 seconds wait time before streaming starts, with a local cache on the streaming side (AMS doesn’t support this and has to continually clear its cache); testing right now
    • think there will be 350T of derivatives total
    • access controlled through Avalon (dark or light)
  • Light Avalon (vs Dark Avalon) will be completely separate
  • Thinking about automatically putting items into Dark Avalon as part of HPSS ingest process so they are available to collection managers at same time as Mike/Susan QA process
  • Date/timestamp question about file names on the diagrams: the answer depends on how we implement versioning
    • if we do per-file versioning, then it makes sense to keep file name and object name separate; if we do object-level versioning, only the object would have date/timestamp info
  • By January, probably just put everything as it is in HPSS into Avalon and then iterate over this again to fix things - HydraDAM2 is another piece that connects
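
The note above about using the bag manifest to populate fixity checks could be sketched roughly as follows. This is a minimal illustration, not an agreed design: it assumes the manifest follows the usual BagIt layout of one "checksum path" pair per line, that MD5 is the algorithm, and that files are readable on local disk rather than staged from HPSS. The function names (parse_manifest, file_digest, check_fixity) are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_manifest(manifest_text):
    """Parse manifest lines of the form "<checksum> <path>" into {path: checksum}."""
    expected = {}
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first run of whitespace; the rest of the line is the path.
        checksum, path = line.split(None, 1)
        expected[path.strip()] = checksum.lower()
    return expected


def file_digest(path, algorithm="md5", chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files are not loaded into memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(bag_dir, manifest_text, algorithm="md5"):
    """Return {path: "ok" | "mismatch" | "missing"} for every manifest entry."""
    results = {}
    for rel_path, expected in parse_manifest(manifest_text).items():
        target = Path(bag_dir) / rel_path
        if not target.is_file():
            results[rel_path] = "missing"
        elif file_digest(target, algorithm) == expected:
            results[rel_path] = "ok"
        else:
            results[rel_path] = "mismatch"
    return results
```

The per-file results could then be stored as technical metadata on the access, production, and preservation files. Whether the same check should run on derived files is still the open question noted above.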

Action items

  • Meet with Kristy Kallback-Rose and Danko Antolovic - Heidi and Brian [DONE]