Introduction

Avalon's Batch Ingest feature provides a method of building one or more media objects at a time from uploaded content and metadata outside the user interface. A batch ingest is started by uploading an Ingest Package consisting of one Manifest File and zero or more Content Files to the Avalon dropbox.

Ingest Packages

An Ingest Package is the combination of content and metadata that make up a single batch.

Package Layout

The package must be uploaded somewhere within the batch subdirectory of the Avalon dropbox. It can either be at the root of the batch directory, or in any subdirectory thereof. The following is a very simple Package that has been uploaded:

Manifest File Format

The manifest file is a spreadsheet (xls, xlsx, csv, or ods) containing the metadata for the objects to be created, as well as the names of the content files that make up each object. In this case, the manifest file is named batch_manifest.xlsx.

  | A                          | B                              | C            | D                  | E
1 | Michael's First Test Batch | michael.klein@northwestern.edu |              |                    |
2 | Main Title                 | Creator                        | Date Created |                    |
3 | Test Object 1              | Klein, Michael B.              | 2012         | content/file_1.mp3 | content/file_2.mp4
4 | Test Object 2              | Northwestern                   | 1951         | content/file_3.mp4 |

Row 1, Column A contains a reference name for the batch.

Row 1, Column B contains the submitter's email address (to be used for notifications and exceptions).

Row 2 specifies the names of the metadata fields supplied in the following rows. Main Title, Creator, and Date Created are required. Each subsequent row represents a single MediaObject to be created. Metadata values are specified first, followed by a list of content files to be attached to each object. (The content file columns must not have headers, or they will be misinterpreted as metadata.) Content filenames are relative to the location of the manifest file itself.

Multivalued fields are specified by multiple columns with the same header, e.g.:

  | A                           | B                              | C            | D               | E               | F
1 | Michael's Second Test Batch | michael.klein@northwestern.edu |              |                 |                 |
2 | Main Title                  | Creator                        | Date Created | Topical Subject | Topical Subject |
3 | Nachos: A Memoir            | Klein, Michael B.              | 2012-12-22   | Meat            | Cheese          | tasty_tasty_nachos.mp4
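The row and column rules above can be sketched in a few lines of Python. This is only an illustration of the layout, not Avalon's own parser: it assumes the manifest has been saved as CSV, and the function name parse_manifest and the shape of the returned dictionary are hypothetical. Columns with a header are collected as metadata (repeated headers become a list of values), and columns without a header are treated as content files.

```python
import csv
import io


def parse_manifest(csv_text):
    """Parse a CSV manifest into a batch name, submitter email, and entries.

    Hypothetical helper: the returned structure is an assumption made for
    this example, not a format defined by Avalon.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    name = rows[0][0].strip()                                # Row 1, Column A
    email = rows[0][1].strip() if len(rows[0]) > 1 else ""   # Row 1, Column B
    headers = [h.strip() for h in rows[1]]                   # Row 2: field names
    entries = []
    for row in rows[2:]:
        if not any(cell.strip() for cell in row):
            continue  # skip completely blank rows
        fields = {}
        files = []
        for index, cell in enumerate(row):
            value = cell.strip()
            if not value:
                continue
            header = headers[index] if index < len(headers) else ""
            if header:
                # Repeated headers accumulate into one multivalued field.
                fields.setdefault(header, []).append(value)
            else:
                # No header: the cell names a content file.
                files.append(value)
        entries.append({"fields": fields, "files": files})
    return {"name": name, "email": email, "entries": entries}


SAMPLE = """\
Michael's Second Test Batch,michael.klein@northwestern.edu,,,,
Main Title,Creator,Date Created,Topical Subject,Topical Subject,
Nachos: A Memoir,"Klein, Michael B.",2012-12-22,Meat,Cheese,tasty_tasty_nachos.mp4
"""

batch = parse_manifest(SAMPLE)
entry = batch["entries"][0]
print(entry["fields"]["Topical Subject"])  # ['Meat', 'Cheese']
print(entry["files"])                      # ['tasty_tasty_nachos.mp4']
```

Keeping every field as a list, even the single-valued ones, means repeatable and non-repeatable fields can be handled by the same code path.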

Supported Field Names (required fields are marked "Required field"; open questions for Michael, Karen, and Julie are marked "ATTN")

  • Main Title
    • MODS mapping: titleInfo/title
    • Not repeatable
    • Required field – This should be the title used for display in browsing and search results
  • Alternative Title
    • MODS mapping: titleInfo@type="alternative"
    • Repeatable
  • Translated Title
    • MODS mapping: titleInfo@type="translated"
    • Repeatable
  • Uniform Title
    • MODS mapping: titleInfo@type="uniform"
    • Repeatable
  • Creator
    • MODS mapping: name/namePart
      • ATTN: Can we assign “creator” for this role to distinguish it from other names included with an item?  That would mean auto assignment of “creator” or some other role for name/role/roleTerm within this name element.
    • Not repeatable
    • No ability to specify Corporate Body in batch at this time
      • ATTN: Is this editable in the form after ingest?
    • Required field – This should be the main person or body associated with the item
  • Contributor
    • MODS mapping: name/namePart
      • ATTN: I don’t think there’s any role we can automatically assign for name/role/roleTerm with this name element?
    • Repeatable
    • No ability to specify Corporate Body in batch at this time
      • ATTN: Is this editable in the form after ingest?
  • Statement of Responsibility
    • MODS mapping: note@type="statement of responsibility"
    • Not repeatable
  • Resource Type
    • MODS mapping: typeOfResource
    • Not repeatable
    • Used to sort results and to organize browsable content. Use one of the following values:
      • sound recording-musical
      • sound recording-non-musical
      • sound recording
      • still image
      • moving image
  • Genre
  • Publisher
    • MODS mapping: originInfo/publisher
    • Not repeatable
  • Place of Origin
    • MODS mapping: originInfo/place/placeTerm
    • Not repeatable
  • Date Created
    • MODS mapping: originInfo/dateCreated@encoding="edtf"
    • Not repeatable
    • Use Date Created only when Date Issued holds a re-issue date; Date Created then holds the original publication date.
    • Enter dates in a format consistent with Extended Date/Time Format (EDTF) 1.0.
  • Date Issued
    • MODS mapping: originInfo/dateIssued@encoding="edtf"
    • Not repeatable
    • Required field – This should be the main date associated with the item; it is used to sort browse and search results.
    • Enter dates in a format consistent with Extended Date/Time Format (EDTF) 1.0.
  • Copyright Date
    • MODS mapping: originInfo/copyrightDate
    • Repeatable
    • This field does not need to follow any particular encoding standard.
  • Language Code
  • Language Text
  • Abstract
    • MODS mapping: abstract
    • Not repeatable
  • Note
    • MODS mapping: note
    • Repeatable
    • No ability to distinguish type of note for batch upload at this time
      • ATTN: Are we using an automatic type for note at this point or not specifying a type?
  • Topical Subject
    • MODS mapping: subject/topic
    • Repeatable
  • Geographic Subject
    • MODS mapping: subject/geographic
    • Repeatable
  • Temporal Subject
    • MODS mapping: subject/temporal
    • Repeatable
  • Occupation Subject
    • MODS mapping: subject/occupation
    • Repeatable
  • Person Subject
    • MODS mapping: subject/name@type="personal"/namePart
    • Repeatable
  • Corporate Subject
    • MODS mapping: subject/name@type="corporate"/namePart
    • Repeatable
  • Family Subject
    • MODS mapping: subject/name@type="family"/namePart
    • Repeatable
  • Title Subject
    • MODS mapping: subject/titleInfo/title
    • Repeatable
  • Related Item ID
    • MODS mapping: relatedItem/identifier
    • Repeatable
    • No ability to specify type of relation in batch at this time
      • ATTN: Are we using an automatic type at this point or not specifying a type?
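Before uploading a manifest, it can help to check each row against the repeatability rules and the Resource Type values listed above. The following is a minimal sketch of such a pre-flight check, not part of Avalon itself: the function name check_fields is hypothetical, and it assumes each row has already been read into a mapping from column header to the list of values supplied under that header. How Avalon itself reacts to a violation is not specified here.

```python
# Fields marked "Not repeatable" in the list above.
NON_REPEATABLE = {
    "Main Title", "Creator", "Statement of Responsibility", "Resource Type",
    "Publisher", "Place of Origin", "Date Created", "Date Issued", "Abstract",
}

# Resource Type values as listed above.
RESOURCE_TYPES = {
    "sound recording-musical", "sound recording-non-musical",
    "sound recording", "still image", "moving image",
}


def check_fields(fields):
    """Return a list of problems found in one row's metadata.

    `fields` maps a column header to the list of values given for it
    (an assumed structure for this example).
    """
    problems = []
    for name, values in fields.items():
        if name in NON_REPEATABLE and len(values) > 1:
            problems.append(
                f"{name} is not repeatable but has {len(values)} values"
            )
    for value in fields.get("Resource Type", []):
        if value not in RESOURCE_TYPES:
            problems.append(f"unrecognized Resource Type: {value}")
    return problems


problems = check_fields({
    "Main Title": ["Nachos: A Memoir"],
    "Creator": ["Klein, Michael B.", "Northwestern"],
    "Resource Type": ["video"],
})
for problem in problems:
    print(problem)
```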

In addition to the descriptive fields, there is one supported operational field, Publish (default: false). A value of "True" causes the newly ingested media object to be published immediately after ingest.
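As a rough illustration of how the Publish column might be interpreted, the sketch below treats only the value "True" as a request to publish and everything else, including a missing column, as the default of false. The function name should_publish is hypothetical, and comparing case-insensitively is an assumption; the exact matching rules Avalon applies are not stated here.

```python
def should_publish(fields):
    """Decide whether a row asks for immediate publication.

    `fields` maps a column header to the list of values given for it
    (an assumed structure for this example).
    """
    values = fields.get("Publish", [])
    # Assumption: ignore case and surrounding whitespace when comparing.
    return bool(values) and values[0].strip().lower() == "true"


print(should_publish({"Publish": ["True"]}))            # True
print(should_publish({"Main Title": ["No publish"]}))   # False
```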

Notes

The batch ingest process verifies that the package is complete (i.e., all content files specified in the manifest are present and not open by any other process) before attempting to ingest it. An incomplete package is skipped and retried on a subsequent pass.
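A similar check can be run locally before uploading. The sketch below covers only the first half of the completeness test, namely that every content file named in the manifest exists relative to the manifest's location; it does not try to detect files that are still open in another process. The function name missing_files and the shape of the entries (a list of dictionaries, each with a "files" list) are assumptions made for this example.

```python
import tempfile
from pathlib import Path


def missing_files(manifest_path, entries):
    """Return the content filenames that cannot be found on disk.

    Filenames are resolved relative to the directory holding the manifest.
    """
    base = Path(manifest_path).parent
    missing = []
    for entry in entries:
        for name in entry["files"]:
            if not (base / name).is_file():
                missing.append(name)
    return missing


# Build a throwaway package with one of its two content files present.
with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp)
    (package / "content").mkdir()
    (package / "content" / "file_1.mp3").write_bytes(b"")
    entries = [{"files": ["content/file_1.mp3", "content/file_2.mp4"]}]
    result = missing_files(package / "batch_manifest.xlsx", entries)
    print(result)  # ['content/file_2.mp4']
```

An empty result means every referenced file was found; a non-empty one lists what still needs to be uploaded.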

 
