Identifiers

We currently use "informal" identifiers, each of which is the trailing portion of the item's PURL (for example, lilly/hohenberger/Hoh010.000.0049 or archives/cushman/P03412). We will retain this scheme until we find something significantly better.

We may eventually assign a DOI to each item, but we have no need to do so yet.

Some older collections don't include a fully-qualified ID in their OAI records. For example, in Hohenberger, we have Hoh007.013.0016 instead of lilly/hohenberger/Hoh007.013.0016. We will have to support the old forms of the ID.
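Supporting both forms could be handled by normalizing every incoming ID to its fully-qualified form before lookup. The following is a minimal sketch of that idea, not an existing implementation: the `LEGACY_PREFIXES` table and the `qualify_id` function are hypothetical names, and only the Hohenberger mapping is taken from the example above.

```python
# Hypothetical lookup table: bare-ID prefix -> collection path.
# Only the Hohenberger entry comes from the example in the text;
# a real table would need one entry per legacy collection.
LEGACY_PREFIXES = {
    "Hoh": "lilly/hohenberger",
}


def qualify_id(raw_id: str) -> str:
    """Return the fully-qualified form of an identifier.

    IDs that already contain a collection path are returned unchanged.
    Bare IDs are matched against the known legacy prefixes.
    """
    if "/" in raw_id:
        return raw_id  # already fully qualified
    for prefix, collection in LEGACY_PREFIXES.items():
        if raw_id.startswith(prefix):
            return f"{collection}/{raw_id}"
    raise ValueError(f"unrecognized legacy identifier: {raw_id!r}")


print(qualify_id("Hoh007.013.0016"))
print(qualify_id("archives/cushman/P03412"))
```

Raising an error for unknown bare IDs, rather than guessing a collection, keeps a mis-typed ID from silently resolving to the wrong item.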

Semantics in identifiers

Semantic IDs have "meaning" in at least some portion of the ID.


  • memorable
  • may encode some metadata
  • may include a "brand"


  • "Ultimately, all semantic identifiers are incorrect" --Sean McGrath

Opaqueness in identifiers

Opaque IDs have no apparent meaning.


  • encode no metadata (which avoids problems when the metadata changes)
  • include no branding, which can be useful when one project/collection absorbs another
  • allow automatic generation
  • may be easier to manage (there is no need to maintain the semantics)


  • It is nearly impossible to create a "purely opaque" identifier system:
    • Globally unique identifiers typically have a portion that indicates the institution that generated each identifier.
    • Many identifier generators include some sort of sequential information, from which users will often make inferences about the objects.
    • Even John Kunze's noid program allows users to enter a prefix for each identifier sequence, which will likely end up having some meaning.


NOID avoids the use of vowels, to prevent the unintentional creation of words (which may be misleading, as in a Variations ID like bad6666). It also avoids the use of lowercase 'l', to minimize confusion with the number 1.

The noid program allows creation of an unlimited number of "minters", which can generate IDs of differing types. IDs may consist of:

  • a fixed prefix
  • numeric digits
  • lower-case ASCII characters, except 'l' and vowels (see above)
  • a single checksum character to allow detection of mis-typed IDs
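To illustrate how a single check character can catch mis-typed IDs, here is a minimal sketch of a position-weighted checksum over the vowel-free alphabet. It follows the scheme as we understand it from the noid documentation, so treat the details as assumptions: the 29-character alphabet (which also omits 'y'), the rule that characters outside the alphabet count as zero, and the function names `check_char` and `is_valid`, which are ours.

```python
# Digits plus lowercase consonants, omitting 'l' and the vowels
# (this sketch also omits 'y'). 29 characters in total.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def check_char(base_id: str) -> str:
    """Compute a check character for an ID that does not yet have one.

    Each character's index in ALPHABET is multiplied by its 1-based
    position; characters outside the alphabet (such as '/') count as 0.
    The sum modulo the alphabet size selects the check character.
    """
    total = 0
    for position, ch in enumerate(base_id, start=1):
        value = ALPHABET.find(ch)
        if value < 0:
            value = 0
        total += position * value
    return ALPHABET[total % len(ALPHABET)]


def is_valid(full_id: str) -> bool:
    """Check an ID whose last character is its check character."""
    return len(full_id) > 1 and check_char(full_id[:-1]) == full_id[-1]


print(check_char("13030/xf93gt2"))
print(is_valid("13030/xf93gt2q"))
print(is_valid("13030/xf39gt2q"))  # two adjacent characters swapped
```

Weighting each character by its position is what lets the check catch transposed neighbors as well as single wrong characters. An unweighted sum would miss transpositions.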

If a minter is set to produce IDs with a fixed number of digits, it may be set to generate the IDs in random order; otherwise, they are generated sequentially.
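The difference between the two orders can be sketched as follows. This is an illustration only and does not reproduce how noid itself is implemented: the `mint` function, its parameters, and the digits-only ID space are all assumptions made for the example. It also shows why random order depends on a fixed width: only a finite ID space can be shuffled.

```python
import random

DIGITS = "0123456789"


def mint(prefix: str, width: int, shuffle: bool = False, seed=None):
    """Yield every ID of the form prefix + `width` digits, exactly once.

    With shuffle=False the IDs come out in counting order; with
    shuffle=True the same finite set is yielded in a random order.
    """
    # The whole ID space is materialized here, which is only
    # practical for small widths; it keeps the sketch short.
    counters = list(range(len(DIGITS) ** width))
    if shuffle:
        random.Random(seed).shuffle(counters)
    for n in counters:
        yield f"{prefix}{n:0{width}d}"


sequential = list(mint("x", 2))
randomized = list(mint("x", 2, shuffle=True, seed=1))
print(sequential[:3])
print(sorted(randomized) == sequential)
```

Both orders produce the same set of IDs. Random order only hides the sequence information from which users might otherwise draw inferences about the objects.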
