Object Ingest Tool
Added by Muzaffer Ozakca, last edited by Muzaffer Ozakca on Jul 17, 2008

System:

The Fedora POLICY mechanism is a work in progress.

General set up

Unlike the Ingest Tool, which relies on an elaborate mix of a config file, an EAD collection, transformations, and file names to ingest items correctly, this tool uses only a simple configuration file, directories, and file names to perform ingests.

Running the ingest tool

The program requires three parameters:

  • Path to collection directory: collDir
  • Path to item directory: itemDir
  • Path to config directory: configDir
    • Contains Repository.properties for the Fedora configuration
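
The three parameters above could be sanity-checked before an ingest run. The following is a minimal sketch, not part of the tool itself: the function name check_ingest_args is hypothetical, and it assumes each parameter is passed as a directory path and that Repository.properties sits directly inside configDir.

```python
from pathlib import Path


def check_ingest_args(coll_dir, item_dir, config_dir):
    """Return a list of problems found with the three ingest parameters.

    Hypothetical helper: the real tool may validate its arguments differently.
    """
    problems = []
    for label, value in (
        ("collDir", coll_dir),
        ("itemDir", item_dir),
        ("configDir", config_dir),
    ):
        if not Path(value).is_dir():
            problems.append(f"{label} is not a directory: {value}")

    # Assumption: Repository.properties lives directly inside configDir.
    config_path = Path(config_dir)
    if config_path.is_dir() and not (config_path / "Repository.properties").is_file():
        problems.append("configDir has no Repository.properties")
    return problems
```

An empty list means all three paths exist and the Fedora configuration file was found.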

Image Collection Ingests

In the structure below, the Hohenberger directory is the collDir and photos is the itemDir.

-- Hohenberger
 |
 |- IngestConfig.properties
 |- ATM-MC2-7-1-1-10
   |- ATM-MC2-7-1-1-10.tif
   |- ATM-MC2-7-1-1-10-full.jpg
   |- ATM-MC2-7-1-1-10-screen.jpg
   |- ATM-MC2-7-1-1-10-thumb.jpg
   |- ATM-MC2-7-1-1-10.txt          // OCR'ed page text
   |- ATM-MC2-7-1-1-10-mods.xml
   |- ATM-MC2-7-1-1-10-dc.xml
   |- mets-properties.xml           // optional
   |- policy.xml                    // optional
 |- ATM-MC2-7-1-1-11
   |- ...
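
The naming convention in the listing above (every file name starts with the item directory's ID) can be checked with a short script. This is a sketch under stated assumptions: missing_image_files is a hypothetical helper, and treating every non-optional file in the listing as required is an assumption, since the page only marks mets-properties.xml and policy.xml as optional.

```python
from pathlib import Path

# Suffixes copied from the Hohenberger listing. Treating all of them as
# required is an assumption of this sketch, not a documented rule.
EXPECTED_SUFFIXES = [
    ".tif",
    "-full.jpg",
    "-screen.jpg",
    "-thumb.jpg",
    ".txt",
    "-mods.xml",
    "-dc.xml",
]


def missing_image_files(item_dir):
    """List the expected file names that are absent from an image item directory."""
    item_dir = Path(item_dir)
    item_id = item_dir.name  # the directory name doubles as the item ID
    return [
        item_id + suffix
        for suffix in EXPECTED_SUFFIXES
        if not (item_dir / (item_id + suffix)).is_file()
    ]
```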

Paged Document Ingests

This directory structure shows how each book item should be laid out in the file system. The files belonging to each paged document are in a directory whose name is the assigned ID.

-- MassDigitization
 |- IngestConfig.properties
 |- VAA4276
    |- VAA4276-0001
      |- VAA4276-0001.tif
      |- VAA4276-0001-full.jpg
      |- VAA4276-0001-screen.jpg
      |- VAA4276-0001-thumb.jpg
      |- VAA4276-0001.txt              // OCR'ed page text
      |- mets-properties.xml           // optional
      |- policy.xml                    // optional
    |- VAA4276-0002
      |- ...
    |-metadata
      |- VAA4276-marc.xml
    |-pdf
      |- VAA4276.pdf
    |-text
      |- VAA4276.xml           // e.g. TEI
    |- mets-properties.xml           // optional
    |- policy.xml                    // optional
 |- VAA4592
    |- ...
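
As an illustration of the layout above, the sketch below summarizes one paged document directory: it collects the page subdirectories (those whose names start with the document ID) and reports which document-level files are present. The function name describe_paged_document and the keys of the returned dictionary are hypothetical; the file locations are taken from the listing.

```python
from pathlib import Path


def describe_paged_document(doc_dir):
    """Summarize a paged document directory laid out as in the listing above."""
    doc_dir = Path(doc_dir)
    doc_id = doc_dir.name  # the directory name is the assigned ID

    # Page directories are named <doc_id>-NNNN; metadata, pdf and text are not.
    pages = sorted(
        p.name
        for p in doc_dir.iterdir()
        if p.is_dir() and p.name.startswith(doc_id + "-")
    )

    # Document-level files as they appear in the listing.
    aux = {
        "marc": doc_dir / "metadata" / f"{doc_id}-marc.xml",
        "pdf": doc_dir / "pdf" / f"{doc_id}.pdf",
        "text": doc_dir / "text" / f"{doc_id}.xml",
    }
    present = sorted(key for key, path in aux.items() if path.is_file())
    return {"id": doc_id, "pages": pages, "present": present}
```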

Multi Copy Paged Document Ingests

In this case, the object hierarchy has three levels: manifestation -> paged document -> page image. In the structure below, isl-aad-8761 is the manifestation-level object and defines its own metadata. The sheet books (isl-aad-8761-01 and isl-aad-8761-03) are at the book level, and the page/image-level objects sit beneath them.

-- isl
 |- IngestConfig.properties
 |- isl-aad-8761
    |-metadata
      |- isl-aad-8761-mods.xml
      |- isl-aad-8761-dc.xml
    |-isl-aad-8761-01
      |-isl-aad-8761-01-01
        |- isl-aad-8761-01-01.tif
        |- isl-aad-8761-01-01-full.jpg
        |- isl-aad-8761-01-01-screen.jpg
        |- isl-aad-8761-01-01-thumb.jpg
        |- isl-aad-8761-01-01.txt          // OCR'ed page text
        |- mets-properties.xml             // optional
        |- policy.xml                      // optional
        |- ...
      |-isl-aad-8761-01-02
        |- ...
      |-pdf
        |- isl-aad-8761-01.pdf
      |- mets-properties.xml           // optional
      |- policy.xml                    // optional
    |-isl-aad-8761-03
      |- ...
    |- mets-properties.xml           // optional
    |- policy.xml                    // optional
 |- isl-aad-8765
    |- ...
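
The three-level hierarchy can be read back from the directory names alone, because each child directory's name extends its parent's name. The sketch below relies on that prefix convention as it appears in the listing; child_objects and manifestation_tree are hypothetical names, not functions of the ingest tool.

```python
from pathlib import Path


def child_objects(obj_dir):
    """Return subdirectories whose names extend the parent's ID with a hyphen.

    This skips auxiliary directories such as metadata and pdf.
    """
    obj_dir = Path(obj_dir)
    prefix = obj_dir.name + "-"
    return sorted(
        p for p in obj_dir.iterdir() if p.is_dir() and p.name.startswith(prefix)
    )


def manifestation_tree(manifest_dir):
    """Map each paged document (copy) ID to the list of its page IDs."""
    return {
        copy.name: [page.name for page in child_objects(copy)]
        for copy in child_objects(manifest_dir)
    }
```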

Journal Ingests

There are four levels: volume, issue, article and page image. In the structure below, VAA4025-060 is the volume identifier and VAA4025-060-4 is the issue identifier.

-- imh
 |- IngestConfig.properties
 |- VAA4025-060                       // volume level
   |- metadata
      |- VAA4025-060-tei.xml          // TEI header
   |- VAA4025-060-4                   // issue level
      |-VAA4025-060-4-001             // these are pages
        |- VAA4025-060-4-001.tif
        |- VAA4025-060-4-001-full.tif
        |- VAA4025-060-4-001-screen.tif
        |- VAA4025-060-4-001-thumb.tif
        |- ...
        |- mets-properties.xml           // optional
        |- policy.xml                    // optional
      |-VAA4025-060-4-002
        |- ...
      |-articles
        |-VAA4025-060-4-a01
          |-page-list.txt            // flat file with 1 page-id per line
          |-pdf
            |-VAA4025-060-4-a01.pdf
          |-metadata
            |-VAA4025-060-4-a01-mods.xml
            |-VAA4025-060-4-a01-dc.xml
            |-VAA4025-060-4-a01-tei.xml  // TEI independent header
          |- mets-properties.xml           // optional
          |- policy.xml                    // optional
      |-metsnav
        |- VAA4025-060-4-metsnav.xml
      |-text
        |- VAA4025-060-4.xml
      |- mets-properties.xml           // optional
      |- policy.xml                    // optional
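
An article is tied to its page images through page-list.txt, a flat file with one page ID per line. The sketch below reads that file and checks each ID against the page directories of the enclosing issue. It assumes the article directory sits at <issue>/articles/<article-id>, as in the listing; article_pages is a hypothetical helper, and skipping blank lines is an assumption about the file format.

```python
from pathlib import Path


def article_pages(article_dir):
    """Return (page_ids, missing) for one article directory.

    page_ids: IDs listed in page-list.txt, in file order.
    missing:  listed IDs that have no page directory under the issue.
    """
    article_dir = Path(article_dir)
    # Layout from the listing: <issue>/articles/<article-id>/page-list.txt
    issue_dir = article_dir.parent.parent

    text = (article_dir / "page-list.txt").read_text(encoding="utf-8")
    # Assumption: blank lines and surrounding whitespace are not significant.
    page_ids = [line.strip() for line in text.splitlines() if line.strip()]

    missing = [pid for pid in page_ids if not (issue_dir / pid).is_dir()]
    return page_ids, missing
```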

Configuration

The outdated IngestConfig.properties attachment, which listed the configuration items, has been removed. No configuration file is currently attached to this page.
