Workflow documentation

This is an attempt to decouple the ingest processes into something generic that can be abstracted into a gem and easily overridden by institutions that do not want to use the default behaviors. A first pass includes adding hooks for different events, extending the initializer, and writing an abstract ingest handler that can act as a proxy for requests arriving through both HTTP and batch workflows.

Configuration

An initializer under /config sets up the list of steps and defines any event hooks needed by the application. A default implementation might look something like the following code snippet.

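Code Block: config/initializers/hydrant.rb
include Hydrant::Workflow::Steps

# Register the ingest steps in the order they run.
# The list is an array; a hash literal without keys is not valid Ruby.
ingest_steps.load([
  Steps.file_upload,
  Steps.structure,
  Steps.metadata,
  Steps.access_control,
  Steps.preview
])

# do...end is required here: a brace block would bind to the symbol argument
after_step :file_upload do |context|
  # Kick off processing with Matterhorn
end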
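
One way the DSL in the initializer could be backed is a small registry that stores the ordered steps and the blocks attached to each hook. The sketch below is only an assumption about the design, not the gem's implementation: `WorkflowConfig`, `hooks_for`, the use of plain symbols for steps, and the hash used as `context` are all hypothetical names introduced for illustration.

```ruby
# Hypothetical registry backing the initializer DSL; names are illustrative only.
class WorkflowConfig
  attr_reader :steps

  def initialize
    @steps = []
    # Nested defaults so that @hooks[:after_step][:file_upload] is always an array
    @hooks = Hash.new { |h, event| h[event] = Hash.new { |hh, step| hh[step] = [] } }
  end

  # Replace the ordered list of ingest steps.
  def load(step_list)
    @steps = step_list.dup
  end

  # Register a block to run after the named step.
  def after_step(step, &block)
    @hooks[:after_step][step] << block
  end

  # Look up the blocks registered for an event and step.
  def hooks_for(event, step)
    @hooks[event][step]
  end
end

config = WorkflowConfig.new
config.load([:file_upload, :structure, :metadata, :access_control, :preview])
config.after_step(:file_upload) { |context| context[:queued] = true }

# Fire the hooks the way a workflow runner might after the upload step.
context = {}
config.hooks_for(:after_step, :file_upload).each { |hook| hook.call(context) }
```

Keeping the hooks in a per-event, per-step table means an institution could register additional blocks without touching the step definitions themselves.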

Event listeners

At various points in the workflow, custom handlers can be injected to deal with events. One key event that can be trapped is the generation of derivatives, which runs asynchronously from the rest of the ingest steps.

Before_ingest / After_ingest

Hooks can be configured to run both before the ingest process begins and after it ends. The before_ingest hook initializes any environment variables that might be needed and is called just before the object is first created. after_ingest is invoked when the workflow process terminates, whether it finishes normally, an exception is thrown, or the workflow is canceled. It is intended both as a safety valve for cleaning up state and as a place for any additional handling that the application needs.

Code Block: Before Ingest example
before_ingest do |context|
  # Look up the archivists who should be notified
  archivists_to_notify = Archivists.get_email_list
  MediaNotifier.send("Additional objects have been added to the system")
end
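
The guarantee that after_ingest runs however the workflow ends maps naturally onto Ruby's `ensure` clause. The following is a minimal sketch under that assumption; `run_ingest`, `WorkflowCanceled`, and the `context` hash are hypothetical and not part of the documented API.

```ruby
# Hypothetical error used to signal that a user canceled the workflow.
class WorkflowCanceled < StandardError; end

# Runs the given block between the two hooks. The after hook sits in
# `ensure`, so it fires on normal completion, on an exception, and on
# cancellation. This sketch swallows the error; a real handler might re-raise.
def run_ingest(before:, after:)
  context = { outcome: :completed }
  before.call(context)
  yield context
rescue WorkflowCanceled
  context[:outcome] = :canceled
rescue StandardError => e
  context[:outcome] = :failed
  context[:error] = e.message
ensure
  after.call(context)
end

log = []
before = ->(ctx) { ctx[:started] = true }
after  = ->(ctx) { log << ctx[:outcome] }

run_ingest(before: before, after: after) { |ctx| ctx[:pid] = "example:1" }
run_ingest(before: before, after: after) { raise WorkflowCanceled }
run_ingest(before: before, after: after) { raise "disk full" }
```

After the three runs, `log` holds one outcome per run, which shows the after hook being reached on every exit path.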

 

Before_step / After_step / Around_step

These three hooks wrap specific steps to take care of things that are not encapsulated within the step definition. One example appears in the configuration above, where an after_step hook begins the file conversion process. By default, around_step is configured to trap exceptions and log them. It can also be used for performance tuning, logging, or whatever else is needed to manage state during the ingest process.

Code Block: After Step example
after_step :metadata do |context|
  # Look up the librarians subscribed to this object
  metadata_librarians = getSubscribedUsers(@media_object)
  MediaNotifier.send("Metadata is available for the object #{@media_object.title} (#{media_path(@media_object)})")
end
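
The default around_step behaviour (trap exceptions and log them) could look roughly like the sketch below, which also records how long the step took. It is an illustration only: the method signature, the logger argument, and the choice to return nil on failure are assumptions, not the gem's behaviour.

```ruby
require "logger"
require "stringio"

# Hypothetical default around_step: wraps a step, logs how long it took,
# and traps any StandardError so it is logged instead of propagating.
def around_step(step_name, logger)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  logger.info("#{step_name} finished in #{elapsed.round(3)}s")
  result
rescue StandardError => e
  logger.error("#{step_name} failed: #{e.class}: #{e.message}")
  nil
end

# Log to an in-memory buffer so the example has no side effects.
output = StringIO.new
logger = Logger.new(output)

ok = around_step(:metadata, logger) { :saved }
failed = around_step(:structure, logger) { raise ArgumentError, "missing section" }
```

Here `ok` carries the step's return value, while `failed` is nil and the error details end up in the log.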

 

Before_file_conversion / During_file_conversion / After_file_conversion

To prepare files for conversion, use the before_file_conversion hook. Typical uses are setting up workflow properties for the conversion pipeline, creating checksums, or validating content. It is assumed that the file_upload step has already vetted which file formats are allowed.

during_file_conversion and after_file_conversion are intended to track the state of the conversion process. Which one gets called is determined by a callback handler that queries the external conversion tool. If the tool reports that the file is still being converted, during_file_conversion is invoked. If it reports that conversion is complete, after_file_conversion is called. Once conversion is complete, a default application supports three behaviours for handling the master files:

  1. Delete the master files from the system and do not keep a backup copy
  2. Move the master files to spinning disk so they can be archived
  3. Retain the master files for future reconversion in case there was a problem with the initial derivative creation process
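
The three behaviours above could be dispatched from a single helper, as in the sketch below. This is a guess at the shape rather than the application's code: `handle_master_file` is hypothetical, and `:delete_files` and `:retain_files` are assumed names (only `:move_files` is named in this document).

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical dispatcher for the three master-file behaviours.
# Returns the path where the master file now lives, or nil if it was deleted.
def handle_master_file(behaviour, master_path, archive_dir = nil)
  case behaviour
  when :delete_files
    FileUtils.rm_f(master_path)
    nil
  when :move_files
    FileUtils.mkdir_p(archive_dir)
    FileUtils.mv(master_path, archive_dir)
    File.join(archive_dir, File.basename(master_path))
  when :retain_files
    master_path
  else
    raise ArgumentError, "unknown behaviour: #{behaviour}"
  end
end

# Try the move behaviour against a throwaway directory.
Dir.mktmpdir do |dir|
  master = File.join(dir, "master.mov")
  File.write(master, "placeholder")
  archived = handle_master_file(:move_files, master, File.join(dir, "archive"))
  puts "archived copy present: #{File.exist?(archived)}"
end
```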

Code Block: File removal
after_file_conversion :move_files, "/mnt/video_archives/#{@media_object.pid}"
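
The choice between during_file_conversion and after_file_conversion described earlier in this section could be made by a small polling object. The sketch below assumes the external tool's status can be reduced to `:running` or `:complete`; `ConversionCallback`, `poll`, and the lambda standing in for the tool are hypothetical.

```ruby
# Hypothetical callback handler. `status_source` stands in for whatever
# queries the external conversion tool; it is assumed to return
# :running or :complete.
class ConversionCallback
  def initialize(status_source, during:, after:)
    @status_source = status_source
    @during = during
    @after = after
  end

  # Ask the tool for the current status and invoke the matching hook.
  def poll(media_object)
    case @status_source.call(media_object)
    when :complete then @after.call(media_object)
    else @during.call(media_object)
    end
  end
end

events = []
statuses = [:running, :running, :complete] # canned answers from the fake tool
callback = ConversionCallback.new(
  ->(_obj) { statuses.shift },
  during: ->(obj) { events << [:during, obj] },
  after:  ->(obj) { events << [:after, obj] }
)

3.times { callback.poll("example:1") }
```

With the canned statuses, the first two polls reach the during hook and the third reaches the after hook.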

 

Module API

IngestWorkflow

Attributes

Active?

Completed?

Completed_at

Skipped?

 

Current step

Next step

Previous step

Methods

Advance

Repeat

Skip
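
As a rough illustration of how the IngestWorkflow attributes and methods listed above might fit together, here is a minimal sketch. Only the names come from the list; the behaviour of each one, the constructor, and the step argument to `skipped?` are assumptions.

```ruby
# Hypothetical IngestWorkflow built from the listed names; the behaviour
# of every method here is assumed for illustration.
class IngestWorkflow
  attr_reader :completed_at

  def initialize(steps)
    @steps = steps
    @index = 0
    @skipped = []
    @completed_at = nil
  end

  def current_step
    @steps[@index]
  end

  def next_step
    @steps[@index + 1]
  end

  def previous_step
    @index.zero? ? nil : @steps[@index - 1]
  end

  def active?
    !completed?
  end

  def completed?
    !@completed_at.nil?
  end

  def skipped?(step)
    @skipped.include?(step)
  end

  # Move to the next step, or mark the workflow complete after the last one.
  def advance
    if next_step
      @index += 1
    else
      @completed_at = Time.now
    end
    current_step
  end

  # Stay on the current step so it can be run again.
  def repeat
    current_step
  end

  # Record the current step as skipped, then advance.
  def skip
    @skipped << current_step
    advance
  end
end

workflow = IngestWorkflow.new([:file_upload, :structure, :metadata])
workflow.advance # now on :structure
workflow.skip    # :structure recorded as skipped, now on :metadata
workflow.advance # no step follows :metadata, so the workflow completes
```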

IngestStep

Attributes

Label*

Summary*

Options

template

depends_on

(other extra properties)

Methods

Initialize

Perform

Rescue
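
A step class following the IngestStep outline above might be sketched as follows. The attribute and method names are taken from the list; the keyword-argument constructor, the `ready?` helper, the `MetadataStep` subclass, and the use of a hash as `context` are hypothetical.

```ruby
# Hypothetical base class for a step; everything beyond the listed
# names is assumed for illustration.
class IngestStep
  attr_reader :label, :summary, :options

  def initialize(label:, summary:, **options)
    @label = label
    @summary = summary
    @options = options # e.g. template:, depends_on:, and any extra properties
  end

  # Labels of steps that must finish before this one may run.
  def depends_on
    Array(options[:depends_on])
  end

  def ready?(finished_labels)
    (depends_on - finished_labels).empty?
  end

  # Subclasses do the real work here.
  def perform(context)
    raise NotImplementedError, "#{self.class} must implement perform"
  end

  # Called with the error when perform fails. `rescue` is a Ruby keyword,
  # but it is allowed as a method name and can be called on a receiver.
  def rescue(error, context)
    context[:errors] = Array(context[:errors]) + [error.message]
  end
end

class MetadataStep < IngestStep
  def perform(context)
    raise ArgumentError, "title is required" if context[:title].to_s.empty?
    context[:metadata_saved] = true
  end
end

step = MetadataStep.new(label: "Metadata", summary: "Describe the item",
                        template: "metadata", depends_on: ["File upload"])
context = {}
begin
  step.perform(context)
rescue ArgumentError => e
  step.rescue(e, context) # hand the failure to the step's own handler
end
```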