This page is being updated for Release 2. For the Release 1 version of this page, see v.43 found under Page History.
Avalon's batch ingest feature provides a method of building one or more media objects at a time from uploaded content and metadata outside the user interface. A batch ingest is started by uploading an ingest package consisting of one manifest file and zero or more content files to the Avalon dropbox. For your convenience, there is a demo batch available to download and import into test systems.
An ingest package is the combination of content and metadata that make up a single batch.
The package must be uploaded somewhere within the batch subdirectory of the Avalon dropbox. It can be at the root of the batch directory or in any subdirectory of it. The following is a very simple package that has been uploaded:
Manifest File Format
The manifest file is a spreadsheet (such as xlsx or ods) containing the metadata for the objects to be created, as well as the names of the content files that make up each object. In this case, the manifest file is named batch_manifest.xlsx. See batch_manifest_template_R2.xlsx for an Excel example file. Required fields are in bold.
|1||Michael's First Test Batch||email@example.com|
|2||Main Title||Creator||Date Issued||Collection||File||Label||File||Label|
|3||Test Object 1||Klein, Michael B.||2012||Northwestern Video Collection||content/file_1.mp3||Part 1||content/file_2.mp4||Part 2|
|4||Test Object 2||Northwestern||1951||Northwestern Video Collection||content/file_3.mp4|
Row 1, Column A contains a reference name for the batch. This name is mostly for your own reference, so we recommend choosing one that will help you remember the contents.
Row 1, Column B contains the submitter's email address (to be used for notifications and exceptions). The submitter's email must be listed as a manager, editor, or depositor for each collection included in the manifest.
Row 2 specifies the names of the metadata fields supplied in the following rows. Main Title, Creator, Date Issued, Collection and File are required. These fields are shown in bold in the Excel example file. Each subsequent row represents a single media object to be created. Metadata values are specified first, followed by a list of content files to be attached to each object. (It is important that the content file columns not have headers, or they will be misinterpreted as metadata.) Content filenames are relative to the location of the manifest file itself.
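The row layout described above can be sketched in Python. This is a minimal illustration, not Avalon's own parser. It assumes the spreadsheet has already been loaded into lists of cell strings, and it follows the example table, where each content file sits in a File column optionally followed by a Label column. The function name `parse_manifest` and the shape of the returned dictionary are hypothetical.

```python
from typing import Dict, List

# Required column headers, as listed for Row 2 of the manifest.
REQUIRED_FIELDS = ["Main Title", "Creator", "Date Issued", "Collection", "File"]


def parse_manifest(rows: List[List[str]]) -> Dict:
    """Split manifest rows into batch info and one entry per media object.

    Assumption: `rows` holds the spreadsheet cells as strings, with
    Row 1 first. Loading the spreadsheet is outside this sketch.
    """
    batch_name, submitter_email = rows[0][0], rows[0][1]
    headers = [h.strip() for h in rows[1]]

    missing = [f for f in REQUIRED_FIELDS if f not in headers]
    if missing:
        raise ValueError("Missing required columns: " + ", ".join(missing))

    objects = []
    for row in rows[2:]:
        metadata = []   # (header, value) pairs, in column order
        files = []      # one dict per content file
        current = None  # the file that a following Label belongs to
        for header, value in zip(headers, row):
            value = value.strip()
            if header == "File":
                current = {"file": value, "label": ""} if value else None
                if current:
                    files.append(current)
            elif header == "Label":
                if current and value:
                    current["label"] = value
            elif value:
                metadata.append((header, value))
        objects.append({"metadata": metadata, "files": files})

    return {"name": batch_name, "email": submitter_email, "objects": objects}
```

Called with the four rows of the example table, this would return the batch name and submitter email from Row 1 and two objects, the first with two labeled files and the second with one unlabeled file.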
Content files listed in the manifest file must have the correct path to where those files are located in the Avalon dropbox, relative to the manifest file. Additionally, all content files must include a file extension. If necessary, include any directories or subdirectories (note the paths listed in columns E and G in the above example).
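The two path rules above (relative to the manifest, with a file extension) can be checked before uploading. The following is a rough sketch only. The function names and the `dropbox/batch/demo` directory in the usage note are made up for illustration, and the checks cover nothing beyond the two rules stated here.

```python
from pathlib import PurePosixPath
from typing import List


def check_content_path(entry: str) -> List[str]:
    """Return the problems found in one content file entry from the manifest."""
    problems = []
    path = PurePosixPath(entry)
    if path.is_absolute():
        problems.append("path must be relative to the manifest file")
    if not path.suffix:
        problems.append("file name must include an extension")
    return problems


def resolve_content_path(manifest_path: str, entry: str) -> PurePosixPath:
    """Resolve a content file entry against the directory holding the manifest."""
    return PurePosixPath(manifest_path).parent / entry
```

For a manifest at the hypothetical location `dropbox/batch/demo/batch_manifest.xlsx`, the entry `content/file_1.mp3` would resolve to `dropbox/batch/demo/content/file_1.mp3`, while an entry such as `content/file_1` would be reported as missing its extension.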
Multivalued fields are specified by multiple columns with the same header, e.g. Topical Subject in the following example:
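To illustrate how repeated headers could be read as one multivalued field, here is a small Python sketch. It assumes the header row and one data row are already available as lists of strings. The function name `group_fields` and the subject values in the usage note are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List


def group_fields(headers: List[str], row: List[str]) -> Dict[str, List[str]]:
    """Collect cell values under their header, keeping column order.

    Columns that share a header end up in the same list, which is how a
    multivalued field such as Topical Subject would be gathered.
    """
    grouped = defaultdict(list)
    for header, value in zip(headers, row):
        if header and value:
            grouped[header].append(value)
    return dict(grouped)
```

With headers `["Main Title", "Topical Subject", "Topical Subject"]` and a row `["Test Object 1", "Jazz", "Big band music"]`, the Topical Subject entry would hold both values in column order.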
- Main Title
- MODS mapping: titleInfo/title
- Not repeatable
- Required field. Title is used for display in search results and single item views. Only the first 32 characters of a title are included in search results listings. Recommended use is to reflect the content captured in digitized media files (such as the title of the piece performed or a short description of the content of a home movie).
- Editable after ingest in "Title" field of Resource Description form.
- Creator
- MODS mapping: name@usage="primary"/namePart (role/roleTerm set to "Creator")
- No ability to specify Corporate Body in batch at this time
- Required field. Main contributors are the primary persons or bodies associated with the creation of the content. Main contributors will be included in search results display and aggregated for browsing access. At this time there is no ability to specify a main contributor as a corporate body. When possible, use the Library of Congress Name Authority File.
- Editable after ingest in "Main contributor(s)" field of Resource Description form.
- Contributor
- MODS mapping: name/namePart (role/roleTerm set to "Contributor")
- Contributors are persons or bodies associated with the item but not considered primary to the creation of its content. Examples include performers in a band or opera, conductor, arranger, cinematographer, and choreographer. At this time there is no ability to specify a contributor as a corporate body. When possible, use the Library of Congress Name Authority File.
- Editable after ingest in "Contributor(s)" field of Resource Description form.
- Genre
- MODS mapping: genre
- Genre can be used to categorize an item by form, style, or subject matter. For consistency and to allow for sorting and aggregating, use terms from the Open Metadata Registry labels for PBCore: pbcoreGenre.
- Editable after ingest in "Genre(s)" field of Resource Description form.
- Publisher
- MODS mapping: originInfo/publisher
- Publisher of the content of the item.
- Editable after ingest in "Publisher(s)" field of Resource Description form.
- Date Created
- MODS mapping: originInfo/dateCreated@encoding="edtf"
- Not repeatable
- Creation date should only be used if Date Issued is a re-issue date. In that case, Creation date contains the original publication date. Enter date information in a format consistent with the options shown in Extended Date/Time Format (EDTF) 1.0.
- Editable after ingest in "Creation date" field of Resource Description form.
- Date Issued
- MODS mapping: originInfo/dateIssued@encoding="edtf"
- Not repeatable
- Required field. Date should be the main publication date associated with the item to be used for sorting browse and search results. Enter date information in a format consistent with the options shown in Extended Date/Time Format (EDTF) 1.0.
- Editable after ingest in "Publication date" field of Resource Description form.
- Abstract
- MODS mapping: abstract
- Not repeatable
- Abstract provides a space for describing the contents of the item. Examples include liner notes, contents list, or an opera scene abstract. This field is not meant for cataloger's descriptions but for descriptions that accompany the item. The first 15-20 words are included in search result listings.
- Editable after ingest in "Summary" field of Resource Description form.
- Topical Subject
- MODS mapping: subject/topic
- Subject should be used for the topical subject of the content. For consistency and to allow for sorting and aggregating, use terms from the Library of Congress Subject Headings. For temporal subjects (time periods), use Temporal Subject and for geographic subjects (locations), use Geographic Subject. See below.
- Editable after ingest in "Subject(s)" field of Resource Description form.
- Geographic Subject
- MODS mapping: subject/geographic
- Geographic Subject should be used for the location associated with the content. For consistency and to allow for sorting and aggregating, use terms from the Getty Thesaurus of Geographic Names.
- Editable after ingest in "Location(s)" field of Resource Description form.
- Temporal Subject
- MODS mapping: subject/temporal
- Temporal Subject should be used for the time period of the content (for example, years or year ranges). Enter date information in a format consistent with the options shown in Extended Date/Time Format (EDTF) 1.0.
- Editable after ingest in "Time period(s)" field of Resource Description form.
- Collection
- MODS mapping: relatedItem@type="host"/titleInfo/title
- Not repeatable. Each item can only belong to one collection.
- Required field. Collection is used for display in search results and single item views. The collection must be created in the Avalon system prior to batch ingest. The Collection field in the batch manifest must exactly match the Collection Name created in Avalon.
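Several of the field rules listed above lend themselves to a check before upload. The Python sketch below is one possible pre-flight check under stated assumptions, not Avalon's validation logic. It takes the descriptive fields of one object as a mapping from field name to a list of values (content files are not covered), and it needs the set of existing collection names supplied by the caller. The date pattern accepts only a simple subset of EDTF (a year, year-month, or full date, optionally as a start/end interval), so a value it flags may still be valid EDTF 1.0. All names in the block are hypothetical.

```python
import re
from typing import Dict, List, Set

REQUIRED = ["Main Title", "Creator", "Date Issued", "Collection"]
NOT_REPEATABLE = ["Main Title", "Date Created", "Date Issued", "Abstract", "Collection"]
DATE_FIELDS = ["Date Created", "Date Issued", "Temporal Subject"]

# Simple subset only: YYYY, YYYY-MM, or YYYY-MM-DD, optionally "start/end".
_DATE = r"\d{4}(?:-(?:0[1-9]|1[0-2])(?:-(?:0[1-9]|[12]\d|3[01]))?)?"
SIMPLE_EDTF = re.compile(rf"{_DATE}(?:/{_DATE})?")


def validate_record(fields: Dict[str, List[str]],
                    known_collections: Set[str]) -> List[str]:
    """Return a list of problems for one object's descriptive fields."""
    problems = []
    for name in REQUIRED:
        if not fields.get(name):
            problems.append(f"{name}: required field is missing")
    for name in NOT_REPEATABLE:
        if len(fields.get(name, [])) > 1:
            problems.append(f"{name}: field is not repeatable")
    for name in DATE_FIELDS:
        for value in fields.get(name, []):
            if not SIMPLE_EDTF.fullmatch(value):
                problems.append(
                    f"{name}: '{value}' is outside the simple date subset checked here")
    for value in fields.get("Collection", []):
        # The manifest value must match the collection name exactly.
        if value not in known_collections:
            problems.append(
                f"Collection: '{value}' does not exactly match an existing collection")
    return problems
```

An empty list means none of these particular checks failed. It does not guarantee that the batch will ingest cleanly.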