To facilitate automatic generation of METS documents for text collections on ingest into Fedora, we are mapping the descriptive and structural information in TEI documents to METS.
- Wiki page describing our use of METS in Fedora: Fedora Metadata Storage Philosophy
- Sample Fedora METS documents:
- We can use the TEI Header to MODS mapping to create MODS for the descriptive metadata
- Need to find source for <fileGrp> and <structMap>
- Relevant Fedora content models:
General Issues
METS Area | Source |
---|---|
Physical Struct Map (page sequence) | TEI or File system |
Logical Struct Map (internal structure) | TEI |
Actual page numbering | TEI |
Derivatives (different image sizes) | File system |
Whole document representation (e.g., PDF) | File system or Config file |
Descriptive metadata | TEI Header and possibly other places like MARC records |
- Need to finish MODS mapping for descriptive metadata, including mapping for issue-level object (Needs input from Jenn)
- Need to find/create MODS-DC mapping for OAI purposes
- Need to determine which structural elements in METS can be generated from the TEI itself and create mapping.
- Jenn and Michelle will work on this mapping
- Working to develop a process that will use both the structural information from the TEI, as well as from the file system, and will check them against each other (verification step).
- This helps ensure that file naming is accurate and that the TEI and the image file names match, especially when the TEI and images aren't automatically QC'd in batch. If both are available, we need guidelines for which is ingested first: the TEI or the corresponding images.
- Ingest process needs to be able to accept an existing METS document and add information that can't be pre-generated from the TEI (e.g. image derivatives, PDFs, etc.).
- Need to generate XSLT at ingest (or pre-ingest) that draws what we need from TEI document, but that can be further manipulated to include information required from non-TEI sources.
- XSLT needs to be configurable per collection (or new XSLT per collection; hard to generalize)
- Draft workflow section for preparing text collections for Fedora ingest
- Determine structural needs (what objects are necessary, which ones will contain what metadata)
- Consider applications that will be using the collection (e.g. search, OAI, METS Nav, etc.)
- Customize (if necessary) MODS, DC, and METS mappings
- Generate MODS and DC records
- Generate METS files using MODS, DC, and TEI
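The verification step mentioned above (checking structural information from the TEI against the file system) could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes namespaced TEI P5 in which each `<pb>` carries a `facs` attribute naming the page image, and the function names and sample file names are hypothetical, not part of any existing ingest tool.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# Assumption: TEI P5 with the standard TEI namespace.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical sample; real documents and file names will differ.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <pb n="1" facs="images/imh-001.tif"/>
    <p>First page.</p>
    <pb n="2" facs="images/imh-002.tif"/>
    <p>Second page.</p>
  </body></text>
</TEI>"""


def tei_page_images(tei_xml):
    """Return the image file names referenced by <pb facs="..."> in document order."""
    root = ET.fromstring(tei_xml)
    return [
        PurePosixPath(pb.get("facs")).name
        for pb in root.iter(f"{TEI_NS}pb")
        if pb.get("facs")  # skip page breaks without an image reference
    ]


def compare_pages(tei_images, disk_images):
    """Report names that appear on only one side of the TEI / file system pair."""
    tei_set, disk_set = set(tei_images), set(disk_images)
    return {
        "missing_on_disk": sorted(tei_set - disk_set),
        "not_in_tei": sorted(disk_set - tei_set),
    }


pages = tei_page_images(SAMPLE_TEI)
report = compare_pages(pages, ["imh-001.tif", "imh-003.tif"])
print(report)
```

In a real run the second argument would come from a directory listing of the image folder rather than a hard-coded list. An empty report on both sides would mean the TEI and the images agree.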
For the IMH ...
- See the Journal Content Model page for METS structure
- We will have Fedora objects representing articles, as well as issues, but the text can be stored at either level. Need to ask David what would be easiest for XTF to handle.
- METS Documents: for Fedora and METS Navigator
- METS Navigator: Generate METS document for Issue with pointers to other METS documents for each article (to support article-level pagination and issue-level navigation); pre-generated METS documents
- Fedora: possibly for managing all the related components of the collection (to be confirmed); auto-generated
- There are page objects at two different levels - front- and back-matter pages are children of the issue-level object, other pages are children of article-level objects. (See Ryan's note on the Journal Content Model page.)
- This will need to be addressed in the fileGrp and structMap sections, but should it be duplicated in the issue and the article? How exactly?
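One possible shape for the issue-level METS document described above is a logical `<structMap>` whose article `<div>` elements each carry an `<mptr>` pointing to that article's own METS document. The sketch below is not a decision on the open question about duplication. The function name, labels, and file names are hypothetical, and whether METS Navigator follows `<mptr>` pointers in this way is an assumption that would need checking.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def issue_struct_map(issue_label, articles):
    """Build a <mets> element with one logical structMap for an issue.

    `articles` is a list of (label, url) pairs; each url is assumed to
    locate the pre-generated METS document for that article.
    """
    mets = ET.Element(f"{{{METS_NS}}}mets")
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap", TYPE="logical")
    issue = ET.SubElement(
        struct_map, f"{{{METS_NS}}}div", TYPE="issue", LABEL=issue_label
    )
    for order, (label, url) in enumerate(articles, start=1):
        article = ET.SubElement(
            issue,
            f"{{{METS_NS}}}div",
            TYPE="article",
            LABEL=label,
            ORDER=str(order),
        )
        # <mptr> points to a separate METS document instead of local files.
        ET.SubElement(
            article,
            f"{{{METS_NS}}}mptr",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url},
        )
    return mets


mets = issue_struct_map(
    "Sample issue",
    [("Article A", "article-a.mets.xml"), ("Article B", "article-b.mets.xml")],
)
print(ET.tostring(mets, encoding="unicode"))
```

Front- and back-matter pages, which belong to the issue-level object, are not covered here; they would need their own `<div>` elements with file pointers in the issue document.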
Preliminary Mappings
Descriptive Metadata <dmdSec>
- Map TEI Header to MODS (for repository and OAI)
- Map MODS to QDC (for OAI and repository)
- Pointer to TEI Header <mdRef> (for all the other stuff in the header, not captured in MODS)
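As a rough illustration of the descriptive metadata plan above, the sketch below builds two `<dmdSec>` elements: one wrapping a MODS record in `<mdWrap>`, and one using `<mdRef>` with `MDTYPE="TEIHDR"` to point at the TEI file for everything the MODS does not capture. The section IDs, the helper names, and the sample title and URL are assumptions for the example only, and the MODS record is a stand-in for whatever the TEI Header to MODS mapping produces.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
XLINK_NS = "http://www.w3.org/1999/xlink"
for prefix, uri in (("mets", METS_NS), ("mods", MODS_NS), ("xlink", XLINK_NS)):
    ET.register_namespace(prefix, uri)


def q(ns, tag):
    """Return a namespace-qualified tag name in ElementTree's {uri}tag form."""
    return f"{{{ns}}}{tag}"


def mods_with_title(title):
    """Stand-in for the output of the TEI Header to MODS mapping."""
    mods = ET.Element(q(MODS_NS, "mods"))
    title_info = ET.SubElement(mods, q(MODS_NS, "titleInfo"))
    ET.SubElement(title_info, q(MODS_NS, "title")).text = title
    return mods


def build_dmd_sections(mods_record, tei_url):
    """Return a <mets> element holding a wrapped MODS record and a TEI header pointer."""
    mets = ET.Element(q(METS_NS, "mets"))

    # dmdSec 1: MODS embedded directly in the METS document.
    wrapped = ET.SubElement(mets, q(METS_NS, "dmdSec"), ID="DMD1")
    md_wrap = ET.SubElement(wrapped, q(METS_NS, "mdWrap"), MDTYPE="MODS")
    ET.SubElement(md_wrap, q(METS_NS, "xmlData")).append(mods_record)

    # dmdSec 2: pointer to the TEI file that contains the full header.
    referenced = ET.SubElement(mets, q(METS_NS, "dmdSec"), ID="DMD2")
    ET.SubElement(
        referenced,
        q(METS_NS, "mdRef"),
        {"LOCTYPE": "URL", "MDTYPE": "TEIHDR", q(XLINK_NS, "href"): tei_url},
    )
    return mets


mets = build_dmd_sections(mods_with_title("Sample title"), "sample-tei.xml")
print(ET.tostring(mets, encoding="unicode"))
```

A QDC record derived from the MODS could be added as a further `<dmdSec>` in the same way, with `MDTYPE="DC"`.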
Administrative Metadata <amdSec>
- The TEI does not contain a copyright statement for the source, only one for the electronic file. How do we handle this in MODS versus <rightsMD>?