PURLs are "persistent URLs". They may serve as a type of Identifier.
We use PURLs for two primary purposes:
- Advertising our content to people outside the DLP. These are always collection-level or item-level references. We commit to maintaining these PURLs in working order.
- Maintaining working references to files that are used by multiple machines/processes within the DLP. These can be references to individual files. Unfortunately, once a PURL is created, it is impossible to tell which machines/processes depend on it, so we must make some effort to maintain these in working order as well.
- The preferred form is http://purl.dlib.indiana.edu/iudl/collectionID/\[formatID/\]shortItemID
- collectionID includes the name of the institution that holds the physical materials as well as the name of the collection itself. In the case of a collection held by the main IU library system (not a special library/archive), the institution name will be omitted and only the name of the collection will be used.
- formatID is omitted for the default view.
- formatID is typically a single word, like "screen".
- Not all PURLs listed below follow this format. We will move to it over time as collections move into the repository. We may phase out the old formats (since the image-level PURLs haven't been used much outside the DLP), or we may have these PURLs point to PURLs in the preferred format.
- The item-level PURL should always resolve to the item in context (renderFullView).
- A PURL is somewhat abstract. For example, a formatID of "screen" will resolve to an image "suitable for on-screen viewing". It doesn't refer to any specific image resolution. As screen sizes change, this PURL may be updated to reference a different size of image.
- Notes for Finding Aids:
- Finding aids will typically receive a PURL that is part of the main "finding aids" collection, such as http://purl.dlib.indiana.edu/iudl/findingaids/archives/abc1234
- Objects referenced by node number from the finding aid will typically receive a PURL that indicates the collection for which the object was first digitized, such as
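Returning to the preferred form described above, the following is a minimal sketch of how such a PURL might be built and split into its parts. It is an illustration only, not the resolver's code. It assumes that collectionID may span more than one path segment (institution plus collection), that shortItemID is always the last segment, and that a formatID can be recognized by comparing against the sample formatID values listed on this page. The names `Purl`, `build_purl`, `parse_purl`, and the collection `examplecollection` are hypothetical.

```python
from typing import FrozenSet, NamedTuple, Optional

PURL_PREFIX = "http://purl.dlib.indiana.edu/iudl/"

# Assumption: the sample formatID values from this page are the full set.
KNOWN_FORMAT_IDS = frozenset(
    {"screen", "large", "printable", "encodedtext", "scalable", "text"}
)


class Purl(NamedTuple):
    collection_id: str         # may contain "/", e.g. "findingaids/archives"
    format_id: Optional[str]   # None means the default view
    short_item_id: str


def build_purl(collection_id: str, short_item_id: str,
               format_id: Optional[str] = None) -> str:
    """Assemble collectionID/[formatID/]shortItemID under the PURL prefix."""
    parts = [collection_id.strip("/")]
    if format_id:
        parts.append(format_id)
    parts.append(short_item_id)
    return PURL_PREFIX + "/".join(parts)


def parse_purl(url: str,
               known_format_ids: FrozenSet[str] = KNOWN_FORMAT_IDS) -> Purl:
    """Split a preferred-form PURL into its three components."""
    if not url.startswith(PURL_PREFIX):
        raise ValueError("not a PURL in the preferred form: " + url)
    segments = url[len(PURL_PREFIX):].strip("/").split("/")
    if len(segments) < 2 or "" in segments:
        raise ValueError("expected at least collectionID/shortItemID: " + url)
    short_item_id = segments[-1]
    rest = segments[:-1]
    format_id = None
    # Treat the segment before the item ID as a formatID only if it is a
    # known format and at least one collectionID segment remains.
    if len(rest) >= 2 and rest[-1] in known_format_ids:
        format_id = rest.pop()
    return Purl("/".join(rest), format_id, short_item_id)


if __name__ == "__main__":
    print(parse_purl(
        "http://purl.dlib.indiana.edu/iudl/findingaids/archives/abc1234"))
    print(build_purl("examplecollection", "item001", "screen"))
```

One consequence of this sketch is that a collection name could never coincide with a formatID without making the PURL ambiguous, which may be worth keeping in mind when assigning collectionID values.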
Our current PURL resolver architecture has some technical implications that may unfortunately impact the collection identifier assignment for items.
- The Java PURL resolution servlet considers only collectionID, formatID, and content model when resolving a PURL. Therefore:
- Two similar items within the same collection must resolve to the same place for each format. This typically means that branding, unless it can be included in parameters to the URL, must be bound to the collectionID.
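The constraint above can be pictured as a lookup keyed only on collectionID, formatID, and content model. The sketch below is not the servlet's actual code (which is written in Java and is not shown on this page); it is a simplified Python model, and the table contents, the `example.org` target URLs, the content model name `image`, and the function `resolve` are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Key: (collectionID, formatID or None for the default view, content model).
ResolutionKey = Tuple[str, Optional[str], str]

# Hypothetical target templates; "{item}" is replaced by the shortItemID.
TARGET_TEMPLATES: Dict[ResolutionKey, str] = {
    ("examplecollection", None, "image"):
        "http://example.org/examplecollection/view?id={item}",
    ("examplecollection", "screen", "image"):
        "http://example.org/examplecollection/screen/{item}.jpg",
}


def resolve(collection_id: str, format_id: Optional[str],
            content_model: str, short_item_id: str) -> str:
    """Pick a target using only the three fields the resolver considers.

    The item ID is substituted into the chosen template but plays no part
    in choosing it, so two items that share a key necessarily share a
    destination pattern (and therefore any branding tied to it).
    """
    key = (collection_id, format_id, content_model)
    template = TARGET_TEMPLATES.get(key)
    if template is None:
        raise KeyError("no resolution rule for %r" % (key,))
    return template.format(item=short_item_id)


if __name__ == "__main__":
    print(resolve("examplecollection", "screen", "image", "item001"))
    print(resolve("examplecollection", "screen", "image", "item002"))
```

In this model, giving two items in the same collection different branding would require either a different collectionID or extra URL parameters, which matches the limitation described above.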
We haven't yet decided what to do with file types and MIME types. In most cases, we don't need to refer to a specific file type, but some applications may require this. Two options:
- Allow a file extension to be appended to the PURL. This means that the regular PURL would return the "current best" type of file for the required purpose, but applications could append specific file extensions to request other representations of the same object. This solution has the advantage of being simple, but the disadvantage that it would result in legal PURLs (even if unadvertised) that would live on long after the actual file formats had fallen out of use. We don't want to imply persistence if we're not committing to keep these files around.
- Use some other URL. This removes us from persistence claims, but it is not intuitive, and would be difficult to maintain in parallel with the "real" PURLs.
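To make the first option more concrete, here is a rough sketch of how an optional file extension might be separated from a PURL path. Since no decision has been made, this is purely illustrative. It assumes that shortItemID values never contain a dot, and the `CURRENT_BEST` table, its extension strings, and the function names are hypothetical.

```python
import posixpath
from typing import Optional, Tuple

# Hypothetical "current best" file type per formatID. The formatIDs come
# from the sample list on this page; the extension strings are assumptions.
CURRENT_BEST = {
    "printable": "pdf",
    "scalable": "jp2",
}


def split_extension(purl_path: str) -> Tuple[str, Optional[str]]:
    """Separate an optional trailing file extension from a PURL path.

    Assumes item IDs contain no dots, so any dot in the last path
    segment marks the start of a requested file extension.
    """
    base, ext = posixpath.splitext(purl_path)
    return base, (ext[1:].lower() or None)


def requested_type(purl_path: str, format_id: str) -> Tuple[str, str]:
    """Return (path without extension, file type to serve).

    Without an extension, fall back to the current best type for the
    formatID; with one, honor the explicit request.
    """
    base, ext = split_extension(purl_path)
    return base, ext or CURRENT_BEST[format_id]


if __name__ == "__main__":
    print(requested_type("examplecollection/scalable/item001", "scalable"))
    print(requested_type("examplecollection/scalable/item001.tif", "scalable"))
```

The sketch also shows the drawback noted above: any extension an application appends becomes a URL that someone may come to rely on, whether or not that file type is still kept.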
Old format image level (may want to migrate):
Sample collectionID values
Main IU library collections:
Other IU collections:
New Harmony Workingmens Institute:
Sample formatID values
- large (formerly full)
- printable (for PDF datastreams)
- encodedtext (for TEI datastreams)
- scalable (for JPEG2000 datastreams)
- text (tentative, used for the textual representation of an item, typically OCR)