PURLs are "persistent URLs". They may serve as a type of Identifier.

We use PURLs for two primary purposes:

  1. Advertising our content to people outside the DLP. These are always collection-level and item-level references. We make a commitment to maintain the PURLs in working order.
  2. Maintaining working references to files that are used by multiple machines/processes within the DLP. These can be references to individual files. Unfortunately, once a PURL is created, it is impossible to tell which machines/processes depend on it, so we must make some effort to maintain these in working order as well.

Creating PURLs:

  • The preferred form is http://purl.dlib.indiana.edu/iudl/collectionID/[formatID/]shortItemID
    • collectionID includes the name of the institution that holds the physical materials as well as the name of the collection itself. In the case of a collection held by the main IU library system (not a special library/archive), the institution name will be omitted and only the name of the collection will be used.
    • formatID is omitted for the default view.
    • formatID is typically a single word, like "screen".
    • Not all PURLs listed below follow this format. We will move to it over time as collections move into the repository. We may phase out the old formats (since the image-level PURLs haven't been used much outside the DLP), or we may have these PURLs point to PURLs in the preferred format.
  • The item-level PURL should always resolve to the item in context (renderFullView).
  • A PURL is somewhat abstract. For example, a formatID of "screen" will resolve to an image "suitable for on-screen viewing". It doesn't refer to any specific image resolution. As screen sizes change, this PURL may be updated to reference a different size of image.
  • Notes for Finding Aids:
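
As an illustration of the preferred form above, here is a minimal sketch of how such a PURL could be assembled. The function name `build_purl` and the item ID `EXAMPLE001` are hypothetical; only the base URL, the segment order, and the sample collectionID and formatID values come from this page.

```python
# Base of the preferred PURL form described on this page.
PURL_BASE = "http://purl.dlib.indiana.edu/iudl"


def build_purl(collection_id, short_item_id, format_id=None):
    """Assemble a PURL as collectionID/[formatID/]shortItemID.

    format_id is omitted for the default view, so the result
    is then the item-level PURL.
    """
    parts = [PURL_BASE, collection_id.strip("/")]
    if format_id:
        parts.append(format_id)
    parts.append(short_item_id)
    return "/".join(parts)


# Hypothetical item ID, used only for illustration.
print(build_purl("lilly/slocum", "EXAMPLE001"))
print(build_purl("lilly/slocum", "EXAMPLE001", format_id="screen"))
```

The first call yields the default (item-level) form; the second inserts the formatID between the collectionID and the shortItemID.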

Technical implications

Our current PURL resolver architecture has some technical implications that may unfortunately impact the collection identifier assignment for items.

  • The Java PURL resolution servlet considers only collectionID, formatID, and content model when resolving a PURL. Therefore:
    • Two similar items within the same collection must resolve to the same place for each format. This typically means that branding, unless it can be included in parameters to the URL, must be bound to the collectionID.
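
The constraint above can be illustrated with a small sketch. This is not the servlet's actual code. It only assumes that resolution is a lookup keyed on collectionID, formatID, and content model. The `ROUTES` table, the `image` content model name, the `example.org` target URLs, and the item IDs are all hypothetical.

```python
# Hypothetical routing table: (collectionID, formatID, content model) -> URL template.
# A formatID of None stands for the default view.
ROUTES = {
    ("archives/cushman", None, "image"): "http://example.org/cushman/view?id={item}",
    ("archives/cushman", "screen", "image"): "http://example.org/cushman/screen/{item}",
}


def resolve(collection_id, format_id, content_model, short_item_id):
    """Pick a target using only the three fields the resolver considers.

    The item ID is substituted into the template but never affects
    which template is chosen.
    """
    template = ROUTES[(collection_id, format_id, content_model)]
    return template.format(item=short_item_id)


# Two items in the same collection necessarily land in the same place,
# differing only in the item ID.
print(resolve("archives/cushman", "screen", "image", "ITEM_A"))
print(resolve("archives/cushman", "screen", "image", "ITEM_B"))
```

Because the item ID plays no part in choosing the template, per-item branding would need either its own collectionID or a URL parameter.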

MIME types

We haven't yet decided what to do with file types and MIME types. In most cases, we don't need to refer to a specific file type, but some applications may require this. Two options:

  • Allow a file extension to be appended to the PURL. This means that the regular PURL would return the "current best" type of file for the required purpose, but applications could append specific file extensions to request other representations of the same object. This solution has the advantage of being simple, but the disadvantage that it would result in legal PURLs (even if unadvertised) that would live on long after the actual file formats had fallen out of use. We don't want to imply persistence if we're not committing to keep these files around.
  • Use some other URL. This removes us from persistence claims, but it is not intuitive, and would be difficult to maintain in parallel with the "real" PURLs.
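
Since no decision has been made, the following is only a sketch of how the first option (an appended file extension) might be parsed. It assumes that shortItemIDs never contain a dot. The function name `split_requested_type` and the example path are hypothetical.

```python
import posixpath


def split_requested_type(purl_path):
    """Split an optional file extension off a PURL path.

    Returns (path_without_extension, extension or None).
    Assumes item IDs themselves contain no dots.
    """
    base, ext = posixpath.splitext(purl_path)
    # ext is "" when there is no extension, otherwise ".tif", ".jpg", etc.
    return base, (ext[1:].lower() or None)


# No extension: the resolver would serve the "current best" file type.
print(split_requested_type("/iudl/lilly/slocum/screen/EXAMPLE001"))
# Explicit extension: the application asks for a specific representation.
print(split_requested_type("/iudl/lilly/slocum/screen/EXAMPLE001.TIF"))
```

Any extension accepted this way would become a legal PURL in its own right, which is the persistence concern raised above.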

Sample PURLs

Item-level:

Page/image level:

Old format image level (may want to migrate):

Sample collectionID values

Main IU library collections:

  • general/pageturner
  • rifias
  • dido
  • hoagy
  • variations/sound
  • variations/score
  • variations/access
  • variations/program

IU archives:

  • archives/adminrecords
  • archives/papers
  • archives/photo
  • archives/cushman

Lilly library:

  • lilly/hohenberger
  • lilly/devincent
  • lilly/janejohnson
  • lilly/slocum
  • lilly/starr

Other IU collections:

  • nw/cra/ussteel

New Harmony Workingmens Institute:

  • workingmens/branigan
  • workingmens/archive

Sample formatID values

  • thumbnail
  • screen
  • large (formerly full)
  • mets
  • printable (for PDF datastreams)
  • encodedtext (for TEI datastreams)
  • scalable (for JPEG2000 datastreams)
  • text (tentative, used for the textual representation of an item, typically OCR)

See also
