Added by Perry Willett, last edited by John Kunze on Mar 10, 2012

ARK (Archival Resource Key) Identifiers

ARKs are URLs designed to support long-term access to information objects. They identify objects of any type:

  • digital objects – documents, databases, images, software, websites, etc.
  • physical objects – books, bones, statues, etc.
  • living beings and groups – people, animals, companies, orchestras, etc.
  • intangible objects – places, chemicals, diseases, vocabulary terms, performances, etc.

ARKs are assigned for a variety of reasons:

  • affordability – there are no fees to assign or use ARKs
  • self-sufficiency – you can host ARKs on your own web server
  • portability – you can move ARKs to other servers without losing their core identities
  • global resolvability – you can host ARKs at a well-known server, such as n2t.net/
  • density – ARKs handle mixed case, permitting shorter identifiers (CD, Cd, cD, cd are all distinct)

Some unique advantages of ARKs:

  • simplicity – access relies only on mainstream web "redirects" and ordinary "get" requests
  • versatility – with "inflections" (different endings), an ARK can give access to data, metadata, promises, and more
  • transparency – no identifier can guarantee stability, and ARK inflections help users make informed judgements
  • visibility – syntax rules make ARKs easy to extract from texts and to compare for variant and containment relationships
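
As a rough illustration of the visibility point, the sketch below pulls ARKs out of free text with a regular expression and then tests a containment relationship. The pattern, the function names, and the rule that "/" separates a parent from its parts are simplifying assumptions made for this example, not the full rules of the ARK specification.

```python
import re

# Simplified pattern (an assumption, not the full ARK grammar):
# the "ark:" label, an optional "/", a NAAN of digits, then a name.
ARK_PATTERN = re.compile(r"ark:/?\d+/[A-Za-z0-9=~*+@_$./-]+")


def extract_arks(text):
    """Return every ARK-like string found in text, without the hostname."""
    # Trailing "." or "/" is treated as sentence punctuation and dropped.
    return [m.group(0).rstrip("./") for m in ARK_PATTERN.finditer(text)]


def contains(parent, child):
    """True if child names something inside parent (assumed "/" hierarchy)."""
    return child.startswith(parent + "/")


text = ("See example.org/ark:/12025/654xz321 and its page image "
        "example.org/ark:/12025/654xz321/s3/f8.05v.tiff.")
arks = extract_arks(text)
print(arks)
print(contains(arks[0], arks[1]))
```

Because the "ark:" label and the NAAN have a fixed shape, a short pattern like this is enough to find candidates; stricter comparison would follow the normalization rules in the specification.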

Between 2001 and 2011, about a hundred organizations registered to assign ARKs. Some of the largest users are:

  • The California Digital Library
  • The Internet Archive
  • National Library of France (Bibliothèque nationale de France)
  • Portico Digital Preservation Service
  • University of California Berkeley
  • University of Chicago

We are very interested in building a community of users and will be announcing an email forum soon.  Here is a brief summary of other resources relevant to ARKs.

  • The ARK Identifier Scheme Specification (PDF version, TXT version)
  • Towards Electronic Persistence Using ARK Identifiers (July 2003)
  • ARK and CDL Identifier conventions
  • Archival Resource Key - Wikipedia
  • EZID service: long term identifiers made easy
  • N2T resolver: Name-to-Thing
  • NOID: (Nice Opaque Identifier) Minting and Binding Tool
  • CDL Identifier Conventions

ARK Anatomy

An ARK is represented by a sequence of characters that contains the label "ark:". When embedded in a URL, it is preceded by the protocol ("http://") and the name of a service that provides support for that ARK. That service name, or "Name Mapping Authority" (NMA), is mutable and replaceable, as neither the web server itself nor the current web protocols are expected to last longer than the identified objects. The immutable, globally unique identifier follows the "ark:" label. It consists of a "Name Assigning Authority Number" (NAAN) identifying the naming organization, followed by the name that the organization assigns to the object.

Here is a diagrammed example:

   http://example.org/ark:/12025/654xz321/s3/f8.05v.tiff
   \________________/ \__/ \___/ \______/ \____________/
     (replaceable)     |     |      |       Qualifier
          |       ARK Label  |      |    (NMA-supported)
          |                  |      |
Name Mapping Authority       |    Name (NAA-assigned)
         (NMA)               |
                  Name Assigning Authority Number (NAAN)

The ARK syntax can be summarized as:

  [NMA/]ark:/NAAN/Name[Qualifier]

The NMA part, which makes the ARK actionable (clickable in a web browser), is in brackets to indicate that it is optional and replaceable. ARKs are intended to work with objects that last longer than the organizations that provide services for them, so when the provider changes it should not affect the object's identity. A different provider hosting the object would simply replace the NMA to reflect the new "home" of the object. For example,

  bnf.fr/ark:/13030/tf5p30086k

might become

  portico.org/ark:/13030/tf5p30086k

Note that the ark:/NAAN/Name remains the same.
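
The anatomy above can be sketched as a small parser that splits an ARK into its parts and swaps the NMA while leaving the core identity untouched. This is a minimal sketch under stated assumptions: the regular expression, the `Ark` tuple, and the function names are invented here, the NAAN is assumed to be all digits, and the qualifier is assumed to begin at the first "/" after the name.

```python
import re
from typing import NamedTuple, Optional


class Ark(NamedTuple):
    nma: Optional[str]  # hostname of the service; optional and replaceable
    naan: str           # Name Assigning Authority Number
    name: str           # name assigned by the naming authority
    qualifier: str      # optional suffix, empty if absent


# Assumption: NAAN is digits only and the name itself contains no "/".
_ARK_RE = re.compile(
    r"^(?:(?:https?://)?(?P<nma>[^/]+)/)?"
    r"ark:/(?P<naan>\d+)/(?P<name>[^/]+)(?P<qualifier>/.*)?$"
)


def parse_ark(s: str) -> Ark:
    """Split a string of the form [NMA/]ark:/NAAN/Name[Qualifier]."""
    m = _ARK_RE.match(s)
    if m is None:
        raise ValueError(f"not an ARK: {s!r}")
    return Ark(m["nma"], m["naan"], m["name"], m["qualifier"] or "")


def core_identity(ark: Ark) -> str:
    """The part that stays the same when the hosting service changes."""
    return f"ark:/{ark.naan}/{ark.name}{ark.qualifier}"


def rehost(s: str, new_nma: str) -> str:
    """Return the same ARK as served by a different NMA."""
    return f"{new_nma}/{core_identity(parse_ark(s))}"


old = "bnf.fr/ark:/13030/tf5p30086k"
new = rehost(old, "portico.org")
print(new)  # portico.org/ark:/13030/tf5p30086k
print(core_identity(parse_ark(old)) == core_identity(parse_ark(new)))  # True
```

Comparing only `core_identity` values is one way to treat two differently hosted URLs as references to the same object.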

NAAN: the Name Assigning Authority Number

The NAAN part, following the "ark:" label, uniquely identifies the organization that assigned the Name part of the ARK. Often the initial access provider (the first NMA) coincides with the original namer (represented by the NAAN); however, access may be provided by one or more different entities instead of, or in addition to, the original naming authority.

The NAAN used above, 13030, represents the California Digital Library.  As of 2012, roughly a hundred organizations have registered for ARK NAANs, including numerous universities, Google, the Internet Archive, WIPO, the British Library, and other national libraries.

Any stable memory organization may obtain a NAAN at no cost and begin assigning ARKs. Please contact the CDL if you are interested in generating and using ARKs for your information objects.

CDL maintains a complete registry of all currently assigned NAANs, which is mirrored at the (U.S.) National Library of Medicine and the Bibliothèque nationale de France.

Creating and Managing ARKs

Once your organization has a Name Assigning Authority Number (NAAN), you may begin using it immediately to assign ARKs.

In thinking about how to manage the namespace, you may find it helpful to consider the usual practice of partitioning it with reserved prefixes of, say, 1-5 characters, e.g., names of the form "ark:/NAAN/xt3...." for each "sub-publisher" in an organization. Opaque prefixes that have meaning only to information professionals are often a good idea and have precedent in schemes such as ISBN and ISSN. The ARK specification is currently the best guide to creating URLs that comply with ARK rules, although it is fairly technical.
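
One way to picture prefix partitioning is a small registry that maps each reserved prefix to a sub-publisher. The sketch below is illustrative only: the NAAN 12345 is a placeholder, and the prefixes, unit names, and function names are all invented for this example. It also checks that no reserved prefix is the start of another, since overlapping prefixes would make ownership ambiguous.

```python
from typing import Optional

NAAN = "12345"  # placeholder NAAN, not a real assignment

# Hypothetical reserved prefixes; opaque on purpose.
PREFIXES = {
    "xt3": "university press",
    "k7": "special collections",
    "m": "research data service",
}


def check_prefix_free(prefixes) -> None:
    """Raise ValueError if one reserved prefix is the start of another."""
    ordered = sorted(prefixes)
    # After sorting, a prefix is immediately followed by a string it prefixes.
    for a, b in zip(ordered, ordered[1:]):
        if b.startswith(a):
            raise ValueError(f"prefix {a!r} overlaps with {b!r}")


def sub_publisher(ark: str) -> Optional[str]:
    """Return the unit whose reserved prefix begins the name, if any."""
    label = f"ark:/{NAAN}/"
    if not ark.startswith(label):
        return None
    name = ark[len(label):]
    for prefix, unit in PREFIXES.items():
        if name.startswith(prefix):
            return unit
    return None


check_prefix_free(PREFIXES)
print(sub_publisher("ark:/12345/xt3b8w2"))  # university press
```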

You can use any system you wish to manage your identifiers. One approach is to create and assign ARKs as a side effect of deposit into a content repository, with ARKs publicized as being hosted on your server, e.g.,

  myrepo.example.org/ark:/12345/bcd987

Another option is to use the EZID service (n2t.net/ezid), which means your ARKs would appear to be hosted at n2t.net, as in

  n2t.net/ark:/12345/bcd987

As with any identifier scheme, persistence requires a redirectable reference to content in stable storage. EZID operates on a cost-recovery basis and can be used to manage your namespace, which includes minting and resolving ARKs (and other identifiers), as well as maintaining metadata. Guidance is also available in the CDL Identifier Conventions.
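
The idea of a redirectable reference can be sketched as a lookup table from each ARK to the current location of its content. This is only a sketch: the table, the target URL, and the `resolve` function are hypothetical, and a production resolver (your own or a service such as EZID) would keep the table in durable storage and sit behind a real web server.

```python
# Hypothetical mapping from ARK to the current content location.
TARGETS = {
    "ark:/12345/bcd987": "https://storage.example.org/objects/bcd987",
}


def resolve(request_path):
    """Map a request path such as "/ark:/12345/bcd987" to (status, location).

    Returns an HTTP 302 redirect when the ARK is known, otherwise 404.
    """
    ark = request_path.lstrip("/")
    target = TARGETS.get(ark)
    if target is None:
        return 404, None
    return 302, target


print(resolve("/ark:/12345/bcd987"))
print(resolve("/ark:/12345/unknown"))
```

When the content moves, only the entry in the table changes; the published ARK stays the same.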

Because long-term identifiers often look like random strings of letters and digits, organizations typically use software to generate (or mint, in ARK parlance) and track identifiers. To mint ARKs, you may use any software that can produce identifiers conforming to the ARK specification. CDL uses the open-source NOID (nice opaque identifiers, rhymes with "employed") software, which creates minters and accepts commands that operate them. The noid software documentation explains how to use noid not only to mint identifiers but also to serve as an institution's "identifier resolver".
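
To make the idea of minting concrete, here is a minimal sketch of a minter that issues random, opaque names and never repeats one. It is not NOID and does not reproduce NOID's templates or commands; the `Minter` class, the alphabet (digits plus consonants, chosen so names do not spell words), and the default length are assumptions made for this example.

```python
import secrets

# Assumed alphabet for this sketch: digits and consonants only.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


class Minter:
    """Issue unique ARKs under one NAAN, optionally with a reserved prefix."""

    def __init__(self, naan, prefix="", length=6):
        self.naan = naan
        self.prefix = prefix
        self.length = length
        # Kept in memory here; a real minter needs persistent storage.
        self.issued = set()

    def mint(self):
        """Return a new ARK that this minter has not issued before."""
        while True:
            body = "".join(secrets.choice(ALPHABET) for _ in range(self.length))
            ark = f"ark:/{self.naan}/{self.prefix}{body}"
            if ark not in self.issued:
                self.issued.add(ark)
                return ark


minter = Minter("12345", prefix="xt3")
print(minter.mint())
```

Tracking issued names is what distinguishes a minter from a plain random-string generator: the same name must never be handed out twice.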

Once minted and publicized as being associated with a specific object, the ARK becomes a stable, unique, and compact reference that can be included in metadata records, da
