MODS capture by Zotero as observed in

American Social History Online

http://www.dlfaquifer.org/home

 

Preliminary notes for Aquifer Metadata Working Group

Laura Akerman, 2008-10-29, revised by Laura 2009-03-25

liblna@emory.edu

 

Issues arranged by Zotero field name:

 

creator:  #4 , #5

creatorTypes:  #6

identifier:  #2

itemType: #1 , #9 , #14 , #15 , #16

language:  #3

place:  #7

publisher:  #8

title:  #13

 

Tags tab:  #10

no Zotero field mapping; don't we need one?  #11 , #12

 

Analysis arranged by MODS element; issues are numbered sequentially

 

abstract

 

Mapping appears OK.  MODS abstract maps to Zotero abstractNote.  (NOTE that this kind of note appears as a field in the Info tab of Zotero, whereas other types of notes, such as those in the MODS "note" field, are "pushed" to the Notes tab.)

 

accessCondition

 

Mapping appears OK.  MODS accessCondition of any type maps to the Zotero rights element.

 

classification

 

Mapping appears OK.  MODS classification element is mapped to a Zotero callNumber element.

 

extension

 

Mapping appears OK;  no Zotero mappings for extension were found, or expected.

 

genre

 

Issue #1 .  MODS genre is only mapped if it matches one of the Zotero itemType types.  There are many more types of form/genre terms used in that element, more detailed and from different angles (literary or film genres such as "cartoons" "mystery stories" etc. or physical genres such as "stereographs").  If Zotero could add this field to all itemType field sets, that would be lovely, if all genre terms could be captured there.  If it can't, could we consider mapping these terms to the Tags tab, along with subjects?  (Can Zotero differentiate kinds of Tags - for subject and genre?)

 

Examples of Aquifer MODS records containing a <genre> element:

 

title:  

Journal of a voyage across the Atlantic: with notes on Canada & the United States, and return to Great Britain in 1844  (genre - Biography)

title:

Every girl pulling for victory : Victory Girls : United War Work Campaign  (genre - Posters)
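
The fallback suggested in Issue #1 (use genre as itemType when it matches, otherwise keep it as a tag) could be sketched roughly as follows.  This is only an illustration in Python, not the translator's code; the item type list is a small illustrative subset and the function name route_genre is hypothetical.

```python
# Illustrative subset of Zotero item types (assumption: the real list is longer
# and includes mixed-case names that would need their own normalization).
ZOTERO_ITEM_TYPES = {"book", "letter", "manuscript", "map", "thesis", "interview", "film"}


def route_genre(genre_text):
    """Return ("itemType", value) when the genre names an item type, else ("tag", value)."""
    cleaned = " ".join(genre_text.split())
    if cleaned.lower() in ZOTERO_ITEM_TYPES:
        return ("itemType", cleaned.lower())
    # Anything else is kept rather than dropped, as a tag.
    return ("tag", cleaned)


for genre in ("Biography", "Posters", "map"):
    print(genre, "->", route_genre(genre))
```

With this approach "Biography" and "Posters" from the records above would survive as tags instead of being discarded.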

 

identifier

 

Issue #2 .  Cannot fully assess processing of the MODS identifier element to a Zotero identifier field (?) from the translator code; it apparently calls other code not present in the translator (processIdentifiers).  It would be helpful to know how that code operates, so that we can determine whether there are any issues.

 

Example Aquifer MODS records with identifier elements:

 

title:

Every girl pulling for victory : Victory Girls : United War Work Campaign  (genre - Posters) -- <identifier>msp00003</identifier>

 

title:

Wild wild women

<identifier type="local" displayLabel="Call number">091074</identifier>

 

Note:  in Zotero, or when exported to either Zotero RDF or MODS, no identifier field appears.

 

language

 

Issue #3 .  No mapping of MODS language/languageTerm (with type attribute of either term or code) could be found.  This is puzzling because Zotero has a "language" item field.

 

Example Aquifer MODS records with language elements:

 

title:  Cortés Nos Chingó In A Big Way The Hüey (has two elements for English and Spanish)

 

title:  A travers la somme devastee : le cimetiere Allemand de Nontecourt, dans le fond, o droite, vue des ruines du village de Nontecourt, o gauche, des ruines du bourg de Monchy-la-Gache (French)

 

location

 

Mapping appears OK. 

 

MODS location/physicalLocation is mapped to Zotero archiveLocation element.

 

MODS location/url is mapped to Zotero url element.

 

name

 

Issue #4 .  Some names are not coming through in the Zotero metadata record; only the date at the end of the name string appears.  NACO Authority File names are often qualified by date.

 

Example:  captured MODS for "Famous actresses of the day in America" from Aquifer, shows

 

Author:  1869-1935 ,    (first)

 

The MODS record has

 

<name type="personal">

<namePart>Strang, Lewis Clinton</namePart>

<namePart type="date">1869-1935</namePart>

<role>

<roleTerm authority="marcrelator" type="text">creator</roleTerm>

</role>

</name>

 

It appears that the code is dealing appropriately with MODS name/namePart elements that have a type attribute of "family" or "given" (mapping them to Zotero creator.lastName and creator.firstName elements), but then assumes that any other namePart subelement should be stored in the variable "backup name" and run through the Zotero "cleanAuthor" utility.  This is probably designed for namePart elements with no type attribute, which (if in AACR2 form) have the form  Lastname, Firstname M. I.

 

However, there are two other defined MODS namePart type attributes that are not dealt with:  @type=date and @type=termsOfAddress.  These need to be either specifically ignored, or mapped in to the end of the name somehow (if this is possible in Zotero?).  The results make it appear that the Zotero translator is processing them through "clean author" as if they were a name.
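
One way to "specifically ignore" the date and termsOfAddress parts is sketched below in Python.  It is an illustration of the logic only, not the translator's actual code: the function name parse_personal_name is hypothetical, the sample XML omits the MODS namespace for brevity, and the comma split stands in for whatever "cleanAuthor" does.

```python
import xml.etree.ElementTree as ET

# namePart types that are not part of the name proper.
NON_NAME_PART_TYPES = {"date", "termsOfAddress"}


def parse_personal_name(name_elem):
    """Return lastName/firstName, skipping date and termsOfAddress parts."""
    family, given, untyped = [], [], []
    for part in name_elem.findall("namePart"):
        text = (part.text or "").strip()
        part_type = part.get("type")
        if part_type in NON_NAME_PART_TYPES:
            continue  # never hand dates or titles to the name parser
        if part_type == "family":
            family.append(text)
        elif part_type == "given":
            given.append(text)
        else:
            untyped.append(text)
    if family or given:
        return {"lastName": " ".join(family), "firstName": " ".join(given)}
    # Untyped namePart: assume the inverted form "Lastname, Firstname M. I."
    last, _, first = " ".join(untyped).partition(",")
    return {"lastName": last.strip(), "firstName": first.strip()}


name = ET.fromstring("""
<name type="personal">
  <namePart>Strang, Lewis Clinton</namePart>
  <namePart type="date">1869-1935</namePart>
</name>
""")
print(parse_personal_name(name))
# {'lastName': 'Strang', 'firstName': 'Lewis Clinton'}
```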

 

Issue #5 .  There is a comment in the code area where Zotero "creator" field is mapped: "// TODO: institutional authors".  Please follow through on this.  Right now, MODS name elements with type "corporate" or "conference" are showing up in Zotero looking like this:

 

Author:           United States,      (first)

 

Where the MODS record has:

 

<name type="corporate">

<namePart>United States</namePart>

<role>

<roleTerm authority="marcrelator" type="text">creator</roleTerm>

</role>

</name>

 

Corporate bodies don't have a first name, so the (first) need not display.

 

Example Aquifer records containing corporate body names include:

 

Title:  Olympic Boulevard, State Route 173, looking east from point 200 feet west of Irolo Street, Los Angeles County, 1940

 

Title:  Organization and historical sketch of the Women's Anthropological Society of America
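
For Issue #5, a minimal Python sketch of treating corporate and conference names as a single undivided string is given here.  The function name institutional_creator and the "singleField" flag are hypothetical, and joining subordinate units with a period and a space is an assumption based on the usual display form of such headings; the sample XML omits the MODS namespace.

```python
import xml.etree.ElementTree as ET

INSTITUTIONAL_NAME_TYPES = {"corporate", "conference"}


def institutional_creator(name_elem):
    """Return a single-field creator for corporate/conference names, else None."""
    if name_elem.get("type") not in INSTITUTIONAL_NAME_TYPES:
        return None
    parts = [
        " ".join((part.text or "").split())
        for part in name_elem.findall("namePart")
        if part.get("type") not in ("date", "termsOfAddress")
    ]
    # Hypothetical output shape: one name string, no first/last split.
    return {"name": ". ".join(p for p in parts if p), "singleField": True}


name = ET.fromstring("""
<name type="corporate">
  <namePart>United States</namePart>
</name>
""")
print(institutional_creator(name))
# {'name': 'United States', 'singleField': True}
```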

 

Issue #6 .  Only 3 terms, when found in MODS name/role/roleTerm, are mapped to Zotero's "creatorTypes", and these are only mapped if the "code" form of term is used.  Many more mappings are possible.  The current mapping handles code "edt" mapped to "editor", "ctb" mapped to "contributor", and "trl" mapped to "translator". 

 

This MODS roleTerm element can contain either a code or a term (governed by the "type" attribute), and a standard vocabulary used is the MARC Relators code list (http://www.loc.gov/marc/relators/relators.html) referred to in the MODS documentation.  Below are examples of more mappings that could be made from this list.  The Zotero team may wish to review the definitions in the MARC documentation to see if they are in harmony with Zotero definitions or if more mappings could be made.

 

MARC term: Editor

map to Zotero: creatorTypes.editor

 

MARC term: Contributor

map to Zotero: creatorTypes.contributor  

 

MARC term: Translator

map to Zotero: creatorTypes.translator

 

MARC code: ive

or MARC term: Interviewee

map to Zotero: creatorTypes.interviewee

 

MARC code: ivr

or MARC term: Interviewer

map to Zotero: creatorTypes.interviewer

 

MARC code: drt
or MARC term: Director

map to Zotero: creatorTypes.director  

 

MARC code: aus
or MARC term: Author of screenplay, etc.

map to Zotero: creatorTypes.scriptwriter

 

MARC code: pro
or MARC term: Producer

map to Zotero: creatorTypes.producer

 

MARC code: act
or MARC term: Actor

map to Zotero: creatorTypes.castMember

 

MARC code: spn
or MARC term: Sponsor

map to Zotero: creatorTypes.sponsor

 

MARC code: inv
or MARC term: Inventor

map to Zotero: creatorTypes.inventor  

 

MARC code: rcp
or MARC term: Recipient

map to Zotero: creatorTypes.recipient

 

MARC code: prf
or MARC term: Performer

map to Zotero: creatorTypes.performer

 

MARC code: cmp
or MARC term: Composer

map to Zotero: creatorTypes.composer  

 

MARC code: lbt
or MARC term: Librettist

map to Zotero: creatorTypes.wordsBy

 

MARC code: ctg
or MARC term: Cartographer

map to Zotero: creatorTypes.cartographer

 

MARC code: prg
or MARC term: Programmer

map to Zotero: creatorTypes.programmer

 

MARC code: art
or MARC term: Artist

map to Zotero: creatorTypes.artist

 

MARC code: cmm
or MARC term: Commentator

map to Zotero: creatorTypes.commenter

 

MARC code: cwt
or MARC term: Commentator for written text

map to Zotero: creatorTypes.commenter
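
The mappings listed above could be held in two small lookup tables, so that either the code or the text form of roleTerm resolves to the same creator type.  The Python sketch below only illustrates that idea; the function name and the fallback to "author" for unmapped roles are assumptions, not something confirmed from the translator.

```python
ROLE_CODE_TO_CREATOR_TYPE = {
    "edt": "editor", "ctb": "contributor", "trl": "translator",
    "ive": "interviewee", "ivr": "interviewer", "drt": "director",
    "aus": "scriptwriter", "pro": "producer", "act": "castMember",
    "spn": "sponsor", "inv": "inventor", "rcp": "recipient",
    "prf": "performer", "cmp": "composer", "lbt": "wordsBy",
    "ctg": "cartographer", "prg": "programmer", "art": "artist",
    "cmm": "commenter", "cwt": "commenter",
}

ROLE_TERM_TO_CODE = {
    "editor": "edt", "contributor": "ctb", "translator": "trl",
    "interviewee": "ive", "interviewer": "ivr", "director": "drt",
    "author of screenplay, etc.": "aus", "producer": "pro", "actor": "act",
    "sponsor": "spn", "inventor": "inv", "recipient": "rcp",
    "performer": "prf", "composer": "cmp", "librettist": "lbt",
    "cartographer": "ctg", "programmer": "prg", "artist": "art",
    "commentator": "cmm", "commentator for written text": "cwt",
}

DEFAULT_CREATOR_TYPE = "author"  # assumption: fallback for unmapped roles


def creator_type_for_role(role_text, role_type="text"):
    """Resolve a MODS roleTerm (type="code" or type="text") to a creator type."""
    key = " ".join(role_text.split()).lower()
    code = key if role_type == "code" else ROLE_TERM_TO_CODE.get(key)
    return ROLE_CODE_TO_CREATOR_TYPE.get(code, DEFAULT_CREATOR_TYPE)


print(creator_type_for_role("prf", "code"))          # performer
print(creator_type_for_role("Cartographer", "text"))  # cartographer
print(creator_type_for_role("creator", "text"))       # author
```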

 

Example Aquifer records containing some of these terms:

 

Title:  Map of city and county of San Francisco (Cartographer)

Title:  Performance by Tito Vasconcelos (prf)

Title:  White Eagle and Pura Fé sing Rudy Martin's songs (cmp, prf, as well as additional codes not mentioned above: mus, lyr, voc)

 

note

 

Mapping appears OK.  MODS note element is assigned to a variable and "pushed" to the Zotero notes tab for this item.

 

originInfo

 

Mappings that appear OK: 

 

MODS originInfo/edition subelement is mapped to Zotero edition field.

 

There are mappings from MODS originInfo subelements, either copyrightDate, dateIssued, or dateCreated (in that order) to one Zotero date field.

MODS originInfo/dateModified is mapped to a Zotero lastModified element.

 

MODS originInfo/dateCaptured is mapped to a Zotero accessDate element.

 

Issue #7 .  MODS originInfo/place/placeTerm is mapped to the Zotero place field only if type="text".  It could be possible to use a table with the MARC Code List for Countries ( http://www.loc.gov/marc/countries/ ) to look up the text form when only type="code" is present here.  For most MODS records, especially those mapped from MARC, a type="text" form is likely to be present, so this may not be worth the effort.

 

Aquifer records where only a type="code" form of originInfo/place/placeTerm is present in the MODS record:

 

Title:  White Eagle and Pura Fé sing Rudy Martin's songs

Title:  Biographical dictionary and portrait gallery of representative men of Chicago and the World's Columbian Exposition
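
The table lookup suggested in Issue #7 might look something like this Python sketch.  The table holds only a handful of entries from the MARC country list, the function name place_for_zotero is hypothetical, and the sample record is invented for illustration (MODS namespace omitted).

```python
import xml.etree.ElementTree as ET

# Small excerpt of the MARC Code List for Countries, for illustration only.
MARC_COUNTRY_EXCERPT = {
    "cau": "California",
    "ilu": "Illinois",
    "nyu": "New York (State)",
    "xxu": "United States",
    "fr": "France",
}


def place_for_zotero(place_elem):
    """Prefer a type="text" placeTerm; otherwise look up a marccountry code."""
    code_value = None
    for term in place_elem.findall("placeTerm"):
        text = (term.text or "").strip()
        if term.get("type") == "text" and text:
            return text
        if term.get("type") == "code" and term.get("authority") == "marccountry":
            code_value = text.lower()
    # Returns None when there is no code or the code is not in the table.
    return MARC_COUNTRY_EXCERPT.get(code_value)


place = ET.fromstring(
    '<place><placeTerm type="code" authority="marccountry">ilu</placeTerm></place>'
)
print(place_for_zotero(place))  # Illinois
```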

 

Issue #8 .  MODS originInfo/publisher.  Not sure about this one - it appears that MODS publisher maps to a Zotero "publisher" element, except when the Zotero itemType is "website" or "webpage", in which case, it is mapped to Zotero "publicationTitle" !

Is this because the "publisher" field is not defined for the "webpage" set of elements?  If so, this is problematic on two fronts: 

   a.  Items published on the web as webpages or websites can have publishers (entities responsible for the webpages or sites). 

   b.  See under "physicalDescription", the note about setting itemType to Zotero webpage based on value "electronic" in the MODS physicalDescription/form element.  This means that, for example, digitized books could end up with an itemType of "webpage" and their publisher would not be captured.

 

This appears to have been partially improved since the first draft of this analysis.  For records getting type "Website", the publisher element still does not appear as publisher, but it no longer shows up as "Website title" either.  MODS <relatedItem type="host"> now appears to be mapped to "Website title", which makes more sense.

 

Aquifer records whose Zotero itemType comes through as "website" or "webpage", which have an <originInfo><publisher> element:

 

Title:  Studies on Inbreeding.   Publisher is The Wistar Institute of Anatomy and Biology;

Title:  The woman who wouldn't.    Publisher is G.P. Putnam's Sons,

 

Note that publisher (in these cases, of the original item) does not show up anywhere in Zotero record.   Both of these items are actually digitized books.

 

part

 

Mapping to Zotero "volume", "issue", or "section" field:  may be OK; not a complete mapping.  The code maps part/detail elements that have type "volume", "issue", or "section" to Zotero fields with the same name.  It uses variables, first looking for a part subelement of the relatedItem element, then looking at part as a top-level element.  The code maps part/detail/number if it is present; otherwise it maps detail as text (Is this possible?).  It seems to ignore the part/detail/caption and part/detail/title subelements, as well as the part/detail "level" attribute.

 

The Aquifer group may have more comment on this later if/when we have time and examples of MODS records using "part" element, to test with.

 

Page(s):  Seems OK.  Maps start and ending pages to a Zotero "pages" element, separated by a dash if start and end are different pages.
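
To make the behaviour described for "part" concrete, here is a rough Python reconstruction of the logic as read from the translator.  It is a sketch, not the actual code: the function name fields_from_part is hypothetical, accepting both "page" and "pages" as the extent unit is an assumption, and the sample XML is invented (MODS namespace omitted).

```python
import xml.etree.ElementTree as ET


def fields_from_part(part_elem):
    """Map part/detail and part/extent to volume, issue, section, and pages."""
    fields = {}
    for detail in part_elem.findall("detail"):
        kind = detail.get("type")
        if kind not in ("volume", "issue", "section"):
            continue
        number = detail.findtext("number")
        # Use detail/number when present; otherwise fall back to all text in detail.
        value = number if number is not None else "".join(detail.itertext())
        fields[kind] = " ".join(value.split())
    for extent in part_elem.findall("extent"):
        if extent.get("unit") not in ("page", "pages"):
            continue
        start = (extent.findtext("start") or "").strip()
        end = (extent.findtext("end") or "").strip()
        if start:
            # A dash is used only when start and end differ.
            fields["pages"] = start if not end or end == start else f"{start}-{end}"
    return fields


part = ET.fromstring("""
<part>
  <detail type="volume"><number>3</number></detail>
  <detail type="issue"><caption>no.</caption><number>2</number></detail>
  <extent unit="pages"><start>11</start><end>24</end></extent>
</part>
""")
print(fields_from_part(part))
# {'volume': '3', 'issue': '2', 'pages': '11-24'}
```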

 

I have done a lot of hunting but have been unable to find Aquifer records using the top-level MODS Part element.  This is a newer MODS element and has apparently gotten limited application in the Aquifer metadata collections.

 

 

physicalDescription

 

Issue #9 .  Zotero is using the MODS physicalDescription/form element with @authority="marcform", where content is "electronic", to set the Zotero itemType element to be "webpage". 

 

This may have unintended effects, because almost anything from the web captured in Zotero could get the designation "electronic" (particularly if it is a MODS record converted from MARC, where this is mapped from a "fixed field code" that's widely used for web resources of all types). 

 

It would be better to omit this mapping. 

 

Aquifer records containing <physicalDescription><form authority="marcform">electronic</form></physicalDescription> which are getting inappropriately mapped to "Web Page" instead of "Book" itemType:

 

Title:  Studies on Inbreeding.   Publisher is The Wistar Institute of Anatomy and Biology;

Title:  The woman who wouldn't.    Publisher is G.P. Putnam's Sons,

 

 

recordInfo

 

Mapping is OK.  There is a mapping of content of recordInfo/recordContentSource to Zotero's source field, and a mapping of recordInfo/recordIdentifier to Zotero's accessionNumber field.

 

relatedItem

 

Mapping appears OK. 

 

MODS relatedItem type="host" subelement titleInfo/title, where titleInfo has type="abbreviated", is mapped to the Zotero journalAbbreviation element, and to the publicationTitle element if that has not been mapped from other content yet.  Since serials are generally the types of "hosts" for which abbreviated titles are supplied, this seems safe.

 

MODS relatedItem type="series" subelement titleInfo/title is mapped to Zotero's series element; titleInfo/partTitle for a series is mapped to Zotero's seriesTitle element; titleInfo/subtitle is mapped to Zotero's seriesText element; titleInfo/partNumber is mapped to Zotero seriesNumber.

 

 

 

subject

 

Issue #10 .  Mapping of MODS subject subelements is missing a lot!  Subject is "pushed" to the Tags tab in Zotero.  Only the MODS subject/topic subelements are mapped; this leaves out many other types of subject or parts of subjects, which could be useful to Zotero users (who wouldn't likely care about separation of the "subject facets").  

 

Subelements for name, titleInfo, geographic, temporal, and occupation could be mapped directly; geographicCode, hierarchicalGeographic, and cartographics might present more difficulty to map and are less critical to use (usually records containing these elements have other types of subject terms used for the same entities, that are more easily mapped). 

 

The genre subelement under subject is a special case.

 

#10a .  Would it be possible, based on attribute authority="lcsh" in the subject element, to map all the subelements of such a MODS subject into one string, sequentially, with a space, two dashes, and a space as a delimiter?  (This is how LC subject headings are intended to be viewed but may not fit with Zotero's functionality.)

 

#10b .   Right now there seems to be no way to "group" or differentiate types of tags... is anything in the offing?   If we mix different kinds of subjects there (or subject plus other kinds of descriptors such as genre), it makes it difficult to "map back out" the tags field to MODS or other metadata formats that make these differences.  But, that being said, mapping subject/genre or MODS genre to the tags tab is an option.

 

If neither 10a nor 10b is possible, it would be better to leave the subject/genre subelement unmapped.

 

An example MODS subject, from Aquifer record "Washington and his comrades in arms"

 

<subject authority="lcsh">

   <geographic>United States</geographic>

   <topic>History</topic>

   <temporal>Revolution, 1775-1783</temporal>

</subject>

 

In "LCSH display form":  United States -- History -- Revolution, 1775-1783.

 

The translator will only pick up "History" from this subject element.  That's missing a lot.
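
The string-building asked about in #10a could be sketched in Python along these lines.  This is an illustration only: the function name subject_to_tags is hypothetical, the three harder subelements are simply skipped, the genre subelement gets no special treatment here, and the sample XML omits the MODS namespace.

```python
import xml.etree.ElementTree as ET

# Subelements noted above as harder to map; skipped in this sketch.
SKIPPED_SUBELEMENTS = {"geographicCode", "hierarchicalGeographic", "cartographics"}


def subject_to_tags(subject_elem):
    """Return one joined tag for LCSH subjects, else one tag per subelement."""
    parts = []
    for child in subject_elem:
        if child.tag in SKIPPED_SUBELEMENTS:
            continue
        # itertext() also collects nested text, e.g. name/namePart.
        text = " ".join(" ".join(child.itertext()).split())
        if text:
            parts.append(text)
    if subject_elem.get("authority") == "lcsh" and parts:
        return [" -- ".join(parts)]
    return parts


subject = ET.fromstring("""
<subject authority="lcsh">
   <geographic>United States</geographic>
   <topic>History</topic>
   <temporal>Revolution, 1775-1783</temporal>
</subject>
""")
print(subject_to_tags(subject))
# ['United States -- History -- Revolution, 1775-1783']
```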

 

Other Aquifer records with multifaceted subjects:

 

Title:  Southern women in the recent educational movement in the South (topic/temporal/geographic)

Title:  A history of Williams College (corporate name/topic)

Title:  The memorial life of General William Tecumseh Sherman (personal/topic, geographic/topic/temporal)

 

Some of the Aquifer records don't separate out the facets into separate subelements, but just have a string with dashes.  From an example title,  Letter to Adelina from Juanita Wolfskill:

 

<subject authority="lcsh">

  <geographic>Orange County (Calif.)--History</geographic>

</subject> 

 

Because this example is under the subelement <geographic> it doesn't get mapped to tags in Zotero.

 

But if the <topic> subelement had been used (which is how the MODS instructions say to treat undifferentiated "subject headings"), it would have mapped, dashes and all, as in the Aquifer example record titled:  Ruins of Prager's Department Store:

 

<subject>

  <topic>Earthquakes--California--San Francisco--Photographs</topic>

</subject>

 

Zotero captures this and two other similar <topic> subjects as Tags that look like this:

 

Earthquakes--California--San Francisco--Photographs

 

Note that this example does not identify the subject as an LC subject heading.  It follows the form of printed LCSH but doesn't have the subelement structure.

 

In summary, the options for Zotero seem to be:

 

1.  Map all types of subject subelements (name, title, topic, chronological, geographic, and form) separately as Tags.

2.  Map multiple subelements of subjects having authority="lcsh" as a string, with each subelement in the sequence separated from others by two dashes.

3.  Allow "typing" of Tags so that topic tags, name tags, geographic tags, and time period tags can be grouped by type (probably a "new feature").

 

tableOfContents

 

Issue #11 .  The translator does not appear to make use of this element at all.  If there is a Zotero "abstractNote" field, why not have a "contentsNote" field?  Or, if the size of some tables of contents might be an issue, could it at least get "pushed" to the Note tab?

 

An example of an Aquifer ASHO record with a TOC is the title "The people of the Eastern Orthodox churches, the separated churches of the east, and other Slavs: report of the Commission Appointed by the Missionary Department of New England to Consider the Work of Co-operating with the Eastern Orthodox Churches, the Separated Churches of the East, and Other Slavs"

 

targetAudience

 

Issue #12 .  There appears to be no Zotero field to map the content of targetAudience field to.  This is not a major issue, but if adding a Zotero field is not feasible, could this be a separate kind of Tag, in which case the element could be mapped to the Tags tab?

 

Example Aquifer record with a targetAudience field: 

 

titleInfo 

 

Issue #13 .  It appears that the current code creates a Zotero newItem.title for each MODS titleInfo/title, if the @type is not equal to "abbreviated".  However, when there are multiple titleInfo elements, only one seems to get picked to appear in the brief list and the "title" box when viewing in the Zotero plugin.  Zotero might even prefer a titleInfo element with an attribute - but it shouldn't.   

 

The titleInfo element without any type attribute is the actual title of the work and should be the primary one to capture and display - and although MODS does not require it, standard practice is to always have at least one "typeless" titleInfo element (and usually only one).  If there's a way to capture and use titles with type attributes (translated, alternative, uniform as well as abbreviated) elsewhere, that would be nice, but one of these should not get to be the "title" if a type-less titleInfo exists, just because the translator encounters it in a certain order (last?).  Please change the logic to prefer a titleInfo element with no type attribute as the source for title.

 

An Aquifer record's multiple titleInfo elements:

 

<titleInfo>

  <nonSort>The </nonSort>

  <title>Constitution of the United States of America</title>

  <subTitle>

as proposed by the Convention, held at Philadelphia, September 17, 1787, and since ratified by the several states : with the several amendments thereto

  </subTitle>

</titleInfo>

<titleInfo type="uniform">

  <title>Constitution</title>

</titleInfo>

 

The only title showing up in Zotero is "Constitution".
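
The preference requested in Issue #13 could be expressed as in the Python sketch below, which picks the first titleInfo without a type attribute regardless of the order in which elements are encountered.  The function name preferred_title is hypothetical, the handling of nonSort (a single space before the title) is a simplification, and the sample is a trimmed, reordered version of the record above with the MODS namespace and subtitle omitted.

```python
import xml.etree.ElementTree as ET


def preferred_title(mods_elem):
    """Pick the type-less titleInfo if one exists; never pick "abbreviated"."""
    infos = mods_elem.findall("titleInfo")
    untyped = [info for info in infos if info.get("type") is None]
    fallback = [info for info in infos if info.get("type") != "abbreviated"]
    chosen = (untyped or fallback or [None])[0]
    if chosen is None:
        return None
    non_sort = " ".join((chosen.findtext("nonSort") or "").split())
    title = " ".join((chosen.findtext("title") or "").split())
    subtitle = " ".join((chosen.findtext("subTitle") or "").split())
    full = f"{non_sort} {title}".strip()
    return f"{full}: {subtitle}" if subtitle else full


record = ET.fromstring("""
<mods>
  <titleInfo type="uniform"><title>Constitution</title></titleInfo>
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Constitution of the United States of America</title>
  </titleInfo>
</mods>
""")
print(preferred_title(record))
# The Constitution of the United States of America
```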

 

 

typeOfResource

 

Issue #14 .  There is no Zotero type for "sheet music", so "book" is being used.  We want to request a "sheet music" type.  A Metadata Working Group member has agreed to work on the request for Zotero elements for this type;  that information will be furnished later.

 

Aquifer records for sheet music may be found in several collections of sheet music (Music for the Nation: American Sheet Music, 1820-1860, 1870-1885, Musical Scores, and the Starr Sheet Music Collection).

 

Title:  Who tied the can on the old dog's tail? Did you tie the can on the old dog's tail?

Title:  If I should take a notion to jump into the ocean.

 

Issue #15 .  Zotero's mapping to its itemType element does not take MODS typeOfResource values into account at all.  These are more likely to be present than genre elements and are required in many contexts, including Aquifer.  While a one-to-one mapping isn't possible for all Zotero types, the following could help provide some better defaults than the all-purpose "book" if other data doesn't supply a different type.

  • text              could map to Zotero itemTypes.book by default (could also be periodical, newspaper, thesis, letter, but those mappings would have to come from genre)
  • cartographic   could map to itemTypes.map by default
  • notated music    Need an itemType for this!   book by default...
  • sound recording, or
    • sound recording-musical
    • sound recording-nonmusical
      all three could map to Zotero itemTypes.audioRecording
  • still image     could map to itemTypes.artWork  by default, although that's too specific (since there's not a generic "image" itemType for Zotero).
  • moving image    could map to itemTypes.videoRecording by default  
  • three dimensional object     (uh, probably won't encounter one of these on the web, perhaps itemTypes.artWork would be a better guess than itemTypes.text).
  • software, multimedia      could map to  itemTypes.computerProgram
  • mixed material    not sure there's a Web equivalent of this ("collections") and probably won't get any but closest match is itemTypes.webpage
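
The defaults proposed in the list above amount to a lookup table that is consulted only when nothing more specific (such as a genre match) has set the type.  A minimal Python sketch follows; the function name and the genre_item_type parameter are hypothetical, and the item type spellings are written as lower-camel-case strings on the assumption that this matches Zotero's own names.

```python
TYPE_OF_RESOURCE_DEFAULTS = {
    "text": "book",
    "cartographic": "map",
    "notated music": "book",  # no sheet music type yet, see Issue #14
    "sound recording": "audioRecording",
    "sound recording-musical": "audioRecording",
    "sound recording-nonmusical": "audioRecording",
    "still image": "artwork",
    "moving image": "videoRecording",
    "three dimensional object": "artwork",
    "software, multimedia": "computerProgram",
    "mixed material": "webpage",
}


def default_item_type(type_of_resource, genre_item_type=None):
    """Use a genre-derived type if there is one, else the typeOfResource default."""
    if genre_item_type:
        return genre_item_type
    key = " ".join((type_of_resource or "").split()).lower()
    # "book" stays the last-resort default, as in the current behaviour.
    return TYPE_OF_RESOURCE_DEFAULTS.get(key, "book")


print(default_item_type("cartographic"))                # map
print(default_item_type("sound recording-nonmusical"))  # audioRecording
print(default_item_type("text", "journalArticle"))      # journalArticle
```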

Example Aquifer records with typeOfResource:

 

text:               

- Biographical sketches of the founder and principal alumni of the Log college (book; captures as Book)

- Witness log (2 page handwritten document; captures as Book)

 

text: 

- The New England home magazine  (a serial:  captures as Journal article)

 

cartographic: 

- Peninsula between Delaware & Chesopeak Bays (a map; captures as Book)

 

notated music: 

- Oh! You beautiful doll, you great, big beautiful doll! (sheet music, captures as Book)

sound recording-nonmusical:

- Title: Poetry reading and Creator: Frost, Robert, 1874-1963 (streaming audio, captures as Book)

 

still image:

- Flowering Currant at Boonville Mendocino county (digitized photograph, captures as Book)

 

moving image: 

- Visitin' 'round at Coolidge Corners (streaming video; captures as Film)              

- Chavela Vargas en vivo en El Hábito (versión sin editar) (streaming video; captures as Book)

 

Aquifer doesn't contain any true examples of software, multimedia right now, although some records are mis-coded as that type (but are really electronic texts). 

 

mixed material:   is the MODS type for collections (such as archival collections).

- W. Stewart Evans Collection, 1967-1979

 

Other issues:

 

Issue #16 .  There's a "TODO:  thesis type" note in the code section that deals with Zotero itemTypes.  We feel it would be very useful to map to this Zotero type from MODS if possible.  Note that the MARC Genre Terms list ( http://www.loc.gov/marc/sourcecode/genre/genrelist.html ), which maps to various MARC fixed field values, contains the term "thesis"; MODS records using that vocabulary in the genre element will have an authority attribute "marcgt".  This would be one "hook" in a MODS record that could map to a "thesis" itemType; there may be other possibilities.

 

Aquifer records for theses:

 

The development of Chicago and vicinity as a manufacturing center prior to 1880 (does not contain a genre element with "thesis", but contains a "thesis note"; captured as Book)

 

The progress of the fire in San Francisco April 18th-21st, 1906 : as shown by an analysis of original documents (contains a genre element with "academic dissertations" and a "thesis note"; captured as Book)
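
The possible "hooks" for a thesis itemType discussed under this issue (a marcgt genre of "thesis", the word "dissertation" in a genre term, or a note that begins with "Thesis") could be combined as in this Python sketch.  The function name looks_like_thesis is hypothetical, and the sample record and its note text are invented for illustration (MODS namespace omitted).

```python
import xml.etree.ElementTree as ET


def looks_like_thesis(mods_elem):
    """Heuristic check for thesis markers in genre and note elements."""
    for genre in mods_elem.findall("genre"):
        text = (genre.text or "").strip().lower()
        if genre.get("authority") == "marcgt" and text == "thesis":
            return True
        if "dissertation" in text:
            return True
    for note in mods_elem.findall("note"):
        if (note.text or "").strip().lower().startswith("thesis"):
            return True
    return False


record = ET.fromstring("""
<mods>
  <genre>academic dissertations</genre>
  <note>Thesis (Ph. D.)--Example University.</note>
</mods>
""")
print(looks_like_thesis(record))  # True
```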

 

Wasn't able to find an Aquifer example of use of genre element containing "thesis".  Presence of the word at the beginning of a note element may be a more likely marker at present, especially for MODS records mapped from MARC.  Presence of "dissertation" in the genre field might also be useful.