Child pages
  • 2016-11-14 Demo notes

Date

Attendees

Goals

  • Check in on where we stand with a minimum viable product
  • Get community feedback on the design and implementation of features, especially with regard to PREMIS

Discussion items

Time | Item | Who | Notes

Notes

Andrew Woods: why not implement the community component for the PREMIS event logging?

Drew M: Esme pointed out we wouldn't need to do a data migration. What we want for this type of plugin is ease of trying it out; if someone has a standing Fedora, we would have to provide a pre-packaged version. The user that gets logged by the audit log service is the Fedora user; it does not track the actual logged-in user. We want our own workflow around what gets logged and when, and would need filtering to get rid of the noise and keep only the events we want to track. Our PREMIS event needs are pretty basic, not a complex data model. If we did enable the audit log service, the integration points would be a lot lower level.
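
A minimal sketch of the "filter out the noise" idea above, assuming events arrive as simple records with a type field. The event type names and the `filter_tracked_events` helper are hypothetical; they are not taken from Fedora's audit service or from CC.

```python
# Hypothetical allowlist of event types worth keeping; the real list would
# come out of the user stories, not from this sketch.
TRACKED_EVENT_TYPES = frozenset({"ingestion", "fixity check", "replication"})


def filter_tracked_events(events, tracked=TRACKED_EVENT_TYPES):
    """Keep only events whose 'type' is in the allowlist, preserving order."""
    return [event for event in events if event.get("type") in tracked]


if __name__ == "__main__":
    sample = [
        {"type": "fixity check", "object": "fileset-1"},
        {"type": "node modified", "object": "fileset-1"},  # low-level noise
    ]
    for event in filter_tracked_events(sample):
        print(event["type"], event["object"])
```

An allowlist keeps the decision about what gets logged in one place, which matches the wish for our own workflow around what gets logged and when.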

Andrew Woods: collectively this makes sense, small factors add up to this being a good decision

Drew: combining Sufia CC and plugin type model, the more that those plugin type things can live at the higher level, the more this makes sense for us

Demo

Drew: idea is that this will be an installable gem, but the process to make this a gem is more to do with Rails machinery and less with features

Decision in the September meeting: create a secondary Blacklight instance for tracking preservation events. Preservation events don't fall into the category of a Work; each is more of a single log entry. Lumping discovery into the existing catalog controller would have been problematic. The development challenge was finding what CC offers for creating this secondary search interface and getting rid of what's not needed.

Looking at how to facet these things. Currently there is a date facet; type needs to be indexed differently. This is just an example.
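
A rough sketch of what a faceted query for the event search could look like at the Solr level. `q`, `rows`, `facet` and `facet.field` are standard Solr request parameters; the field names `event_date` and `event_type` are assumptions for illustration and are not the names used in the demo index.

```python
from urllib.parse import urlencode


def build_facet_params(facet_fields, query="*:*", rows=10):
    """Build Solr request parameters that ask for one facet per field."""
    params = [("q", query), ("rows", str(rows)), ("facet", "true")]
    for field in facet_fields:
        # Solr accepts facet.field repeated once per faceted field.
        params.append(("facet.field", field))
    return params


def to_query_string(params):
    """URL-encode the parameter pairs for a Solr select request."""
    return urlencode(params)


if __name__ == "__main__":
    # Hypothetical field names; the real index would define its own.
    print(to_query_string(build_facet_params(["event_date", "event_type"])))
```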

For storing users, we can take the mailto URI and compare it to the users table in Hydra to see whether you're registered within HD2; you don't necessarily need to be an authenticated user in HD2 to be logged as a PREMIS agent
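
A minimal sketch of the mailto comparison described above, assuming the users table can be read as a mapping from email address to user record. The helper names and the `users_by_email` mapping are hypothetical; the real lookup would go through the Hydra users table.

```python
from urllib.parse import urlparse, unquote


def email_from_mailto(uri):
    """Return the lower-cased address from a mailto: URI, or None otherwise."""
    parsed = urlparse(uri)
    if parsed.scheme != "mailto":
        return None
    # urlparse puts the address in .path and any ?subject=... part in .query.
    address = unquote(parsed.path).split(",")[0].strip()
    return address.lower() or None


def find_registered_user(agent_uri, users_by_email):
    """Look up the agent's email in the (assumed) users mapping.

    Returns None when the agent is not registered; the agent can still be
    recorded on the event, it just has no matching user account.
    """
    email = email_from_mailto(agent_uri)
    if email is None:
        return None
    return users_by_email.get(email)


if __name__ == "__main__":
    users = {"jane@example.edu": {"id": 1, "name": "Jane"}}
    print(find_registered_user("mailto:Jane@example.edu", users))
    print(find_registered_user("mailto:someone.else@example.edu", users))
```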

Jon: Searching for objects based on preservation events could be problematic because we've separated the two in the search interface

Drew: there is a ticket in to create more user stories to define this sort of thing. As far as saving preservation events down to the fileset level, the question is what level of detail you would need to save on the fileset. One option is to reach down to the filename; if we have a use case that specifies that, you could search based on the filename

Jon: Important for manual auditing

Will: especially once we get to auto-fixity checking

How far up do we need to push this? Collection level? How high up do we need to be able to see events?

Andrew Myers: Any consideration for storing these events in a triplestore? Parents of parents of parents would make sense for querying a triplestore

Drew: any type of query logic goes to the Solr level when we run an index; haven't considered a triplestore

Andrew: possible benefits: various types of queries, can service those queries we've defined without having to do any advanced... Curious to what extent we're talking about PREMIS here? To what degree is the effort to conform to what PREMIS is saying; if serving the standard PREMIS ontology, opens the door for easier interoperability and discovery with others also using PREMIS ontology

https://wiki.duraspace.org/display/FEDORA4x/Audit+Events+for+External+Processes#AuditEventsforExternalProcesses-AddingEventstoanExternalTriplestore

Drew: dates - not as easy as adding a date in PREMIS; there is also this whole class of affiliated dates, and we're not using the complexity of PREMIS. We should consider that, because some of these events take some time, so having a duration is not well represented currently
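
One way to picture the duration point above: a sketch of an event record that carries a start and an optional end time rather than a single date. The class and field names are hypothetical and are not PREMIS property names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PreservationEvent:
    """Hypothetical event record with a start and an optional end time."""

    event_type: str
    started_at: datetime
    ended_at: Optional[datetime] = None  # None while the event is still running

    def duration(self) -> Optional[timedelta]:
        """Return how long the event took, or None if it has not ended."""
        if self.ended_at is None:
            return None
        return self.ended_at - self.started_at


if __name__ == "__main__":
    event = PreservationEvent(
        "fixity check",
        started_at=datetime(2016, 11, 14, 9, 0),
        ended_at=datetime(2016, 11, 14, 9, 30),
    )
    print(event.duration())
```

Keeping both timestamps means a single-date view is still available from `started_at`, while long-running events can report how long they took.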

Andrew: to what extent are the plans to expose this data beyond the search interface?

Jon: what about other potential events outside of the system? E.g., replicating to DPN - how do we capture that? Are there examples of other institutions that are using PREMIS to record replication in outside repos?

Andrew: DPN, APT institutions, but not as far along as HD2 in terms of preservation events

Jon: If there were standard tools for capturing that, we'd want to adhere, but it sounds like early days

Andrew: Understands argument for keeping things at the Hydra level, but there are functional reasons for keeping things at the Fedora level - created with internal and external events in mind

Looking at what we're showing, seems like it's the same info being captured, so it would be nice if ; same as like access controls, same information but not quite captured in the same way, makes sense to go for full alignment if there aren't any arguments really against this

Drew: If we were to embrace event logging in Fedora, the layers of the stack that would need to be created, kind of translate to all layers of the stack; seemed to make sense to start at the CC level; you do end up with a lot of duplicative data though.

Andrew: Curious if there's any reason, at the Hydra level, not to use the same details that are on the wiki page above when you're persisting the PREMIS event; it would just depend on what each property is named. Even though you're managing this completely in CC, what shows up on the works side is what you're tooling

Drew: Would be worth some reconciliation there to not deviate too much from what Fedora is giving you

If we deviate from what Fedora is doing, a system that uses Fedora's audit log would capture events differently from what our audit log is capturing

General discussion needed on how we need to discover this PREMIS information, etc.

Drew: If we understand the work priorities in place: this is on Drew's local machine and we need to get it into a state; what's our production instance?

Mike: fine focusing on this currently, if we put it off now we won't pick it back up easily; would rather prioritize this work and have it more stable

Will: good thing to discuss tomorrow