
We currently have "informal" identifiers that are the latter portion of the PURL (lilly/hohenberger/Hoh010.000.0049 or archives/cushman/P03412). We will retain this scheme until we find something significantly better.

We may eventually assign a DOI for each item, but we don't have a need to do that yet.

Some older collections don't include a fully-qualified ID in their OAI records. For example, in Hohenberger, we have Hoh007.013.0016 instead of lilly/hohenberger/Hoh007.013.0016. We will have to support the old forms of the ID.
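Supporting the old forms could be as simple as qualifying a bare ID before lookup. The following is a minimal sketch, not our actual implementation: the `COLLECTION_PATHS` table and the `qualify` function are hypothetical names, and the only mapping shown is the Hohenberger one taken from the example above.

```python
# Hypothetical table mapping the leading characters of a bare (old-form) ID
# to the collection path that a fully-qualified ID would carry.
COLLECTION_PATHS = {
    "Hoh": "lilly/hohenberger",
}


def qualify(identifier: str) -> str:
    """Return the fully-qualified form of an ID, accepting old bare forms."""
    # An ID that already contains a path separator is assumed to be qualified.
    if "/" in identifier:
        return identifier
    for prefix, path in COLLECTION_PATHS.items():
        if identifier.startswith(prefix):
            return f"{path}/{identifier}"
    raise ValueError(f"cannot qualify identifier: {identifier!r}")


print(qualify("Hoh007.013.0016"))
print(qualify("archives/cushman/P03412"))
```

A table lookup keeps the old-form knowledge in one place, so collections can be added without touching the resolution logic.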

Semantics in identifiers

Semantic IDs have "meaning" in at least some portion of the ID.

Advantages:

  • Memorable
  • May encode some metadata
  • May include a "brand"

Disadvantages:

  • "Ultimately, all semantic identifiers are incorrect" --Sean McGrath
  • Changes in language use may make some semantic terms confusing or offensive over time.

Opaqueness in identifiers

Opaque IDs have no apparent meaning.

Advantages:

  • Encode no metadata (which avoids problems when the metadata changes)
  • Include no branding, which can be useful when one project/collection absorbs another
  • Allow automatic generation
  • May be easier to manage (there is no need to maintain the semantics)

Disadvantages:

  • Not memorable
  • It is nearly impossible to create a "purely opaque" identifier system:
    • Globally-unique identifiers typically have a portion to indicate the institution that generated each identifier.
    • Many identifier generators include some sort of sequential information, from which users will often make inferences about the objects.
    • Even John Kunze's noid program allows users to enter a prefix for each identifier sequence, which will likely end up having some meaning.

NOID (Nice Opaque IDentifier)

NOID avoids the use of vowels, to prevent the unintentional creation of words (which may be misleading, as in a Variations ID like bad6666). It also avoids the use of lowercase 'l', to minimize confusion with the number 1.

The noid program allows creation of an unlimited number of "minters", which can generate IDs of differing types. IDs may consist of:

  • a fixed prefix
  • numeric digits
  • lower-case ASCII characters, except 'l' and vowels (see above)
  • a single checksum character to allow detection of mis-typed IDs
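To illustrate how a single check character can catch mis-typed IDs, here is a minimal sketch of a position-weighted check character, modelled on the scheme described in the noid documentation as we understand it. The 29-character alphabet below is an assumption (it omits 'y' as well as 'l' and the vowels), the function names are our own, and the sample ID is purely illustrative.

```python
# Assumed alphabet: digits plus lower-case consonants, without 'l' or 'y'.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def check_char(identifier: str) -> str:
    """Compute a check character from a position-weighted sum of the ID."""
    total = 0
    for position, ch in enumerate(identifier, start=1):
        # Characters outside the alphabet (such as '/') contribute zero.
        value = ALPHABET.index(ch) if ch in ALPHABET else 0
        total += position * value
    return ALPHABET[total % len(ALPHABET)]


def is_valid(identifier_with_check: str) -> bool:
    """Return True if the final character matches the computed check."""
    body, check = identifier_with_check[:-1], identifier_with_check[-1]
    return bool(body) and check_char(body) == check


print(check_char("13030/xf93gt2"))
print(is_valid("13030/xf93gt2q"))   # well-formed ID
print(is_valid("13030/xf93tg2q"))   # two adjacent characters swapped
```

Because each character is weighted by its position and the alphabet length is prime, a single wrong character or a swap of two adjacent characters changes the sum and therefore the check character.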

If a minter is set to produce IDs with a fixed number of digits, it may be set to generate the IDs in a random order; otherwise, they are generated sequentially.
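The difference between sequential and random minting can be sketched as follows. This is not how the noid program is implemented; the `mint` function, its parameters, and the "x" prefix are hypothetical, and the sketch only handles fixed-width numeric IDs small enough to enumerate in memory.

```python
import random
from typing import Iterator


def mint(prefix: str, width: int, shuffle: bool = False, seed: int = 0) -> Iterator[str]:
    """Yield every ID of the form prefix + `width` digits exactly once."""
    # A fixed width means the ID space is finite, so it can be enumerated
    # up front and, if requested, shuffled into a random order.
    numbers = list(range(10 ** width))
    if shuffle:
        random.Random(seed).shuffle(numbers)
    for n in numbers:
        yield f"{prefix}{n:0{width}d}"


sequential = list(mint("x", 2))
randomized = list(mint("x", 2, shuffle=True))
print(sequential[:3])
print(randomized[:3])
```

Both orders produce the same set of IDs; random order simply avoids the sequential information from which users tend to draw inferences about the objects.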

Identifier resolution systems

PURL: Persistent URL. Simply a URL redirect, which an institution plans to maintain in a working state indefinitely. PURLs can always be resolved by a web browser.

ARK: Archival Resource Key. An identifier of the form ark:/NAME_GENERATING_AUTHORITY/NAME, more commonly seen as a URL of the form http://NAME_MAPPING_SERVICE/ark:/NAME_GENERATING_AUTHORITY/NAME. The name mapping service is replaceable, and there is a system for looking up a new service if an old one is non-functional. Eventually, ARK hopes to drop everything before the "ark:" and have the name mapping service be dynamic, but the service is included for now so that web browsers can support name resolution. Appending "?" to the URL retrieves metadata about the object, and appending "??" retrieves a commitment statement. Note: Even though CDL primarily relies on ARK, they use PURL for items they do not control, because they don't want to adhere to an ARK persistence statement for these items.
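The structure of an ARK, and the fact that the name mapping service can be swapped, can be sketched as below. This is an illustrative parser only: the `Ark` type, the function names, the host names, and the sample ARK are all hypothetical, and the pattern covers just the simple forms described above.

```python
import re
from typing import NamedTuple, Optional

# Optional name mapping service, then "ark:/", the name generating
# authority, and the name (up to any "?" or "??" suffix).
ARK_PATTERN = re.compile(
    r"^(?:(?P<nma>https?://[^/]+)/)?ark:/(?P<naan>[^/]+)/(?P<name>[^?]+)"
)


class Ark(NamedTuple):
    nma: Optional[str]  # name mapping service, if the ARK was given as a URL
    naan: str           # name generating authority
    name: str


def parse_ark(text: str) -> Ark:
    """Split a bare ARK or its URL form into its parts."""
    match = ARK_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not an ARK: {text!r}")
    return Ark(match.group("nma"), match.group("naan"), match.group("name"))


def ark_url(ark: Ark, nma: str, suffix: str = "") -> str:
    """Build the URL form against a given name mapping service.

    Pass suffix="?" for metadata or suffix="??" for the commitment statement.
    """
    return f"{nma}/ark:/{ark.naan}/{ark.name}{suffix}"


ark = parse_ark("http://old.example.org/ark:/12345/xt12t3")
print(ark_url(ark, "http://new.example.org"))
print(ark_url(ark, "http://new.example.org", "??"))
```

Keeping the name mapping service out of the stored identity means an ID survives a change of resolver: only the part before "ark:" changes.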

DOI/handle: Digital Object Identifier. An identifier of the form NAME_GENERATING_AUTHORITY/NAME, but sometimes seen written as a URI (with "doi:"). DOIs are usually associated with a resolver via a URL (like "").

URI: Uniform Resource Identifier. A string (the scheme), followed by a colon, followed by a string. The scheme should be registered with IANA. Web browsers internally support resolution of a subset of URI schemes. All URLs are URIs. ARKs and DOIs have not been recognized as standard URIs yet. Of course, the URL form of an ARK is a URI. ARKs may also be referenced through the "info:" URI scheme.
