A document’s DOI (http://www.doi.org/ or on Wikipedia under Digital Object Identifier) is an important part of the citation of a document[ref 1]. Many style sheets allow for just the DOI of a paper as the citation. Because DOIs are unique they can act as URIs which are resolvable and look like URLs[ref 2]. However, a DOI is not the same thing as a URL that records where a digital object happens to be located. It can well be argued that a DOI should be tracked in the metadata schemes of archives which collect language and linguistic data.
The following video[ref 3][ref 4] from PsycINFO and the APA talks about DOIs. Notice that the DOIs in the different systems are presented as links to web users, but in print they do not begin with http://. (There is one exception in the demo video.) The DOI is an alphanumeric text string with “DOI:” preceding it. It is also treated as a separate metadata element.
It is true that DOIs are not the same as handles[ref 5][ref 6]. It is also true that both DOIs and handles attempt to do the same thing: be a permanent reference through which things can be located. Both can be expressed as resolvable URIs but, where handles are inherently URIs, DOIs are not inherently URIs, although they can be made into resolvable URIs. Additionally, DOIs (from what I have seen in use) often go deeper and can reference even paragraphs or objects inside of academic papers or data sets. This would suggest that DOIs and handles (when presented to the user) need to be distinguished as separate items.
I often use sil.org to look for research done in minority languages. I have noticed that the web presentation there sometimes has a referring URL but never contains a DOI. While DOIs can be made into links, strictly speaking they are not links; they are part of the professional metadata about an object.
Some of my acquaintances have claimed that DOIs are just URLs and therefore do not require special handling in metadata schemes beyond the kind of handling a URL would receive. I would like to dispel that myth and argue that, in an object-oriented model, URLs and DOIs are different objects, though they may (but need not) share a common class. Additionally, handles and DOIs would share a common class which would differentiate them from URLs. From an information architecture point of view, this essentially means that URLs and DOIs need to be treated as different metadata elements (with separate keys and values). DOIs and handles can be treated the same with respect to being references to a digital object, but they cannot be treated the same as resolvable URIs or URLs, because handles need a different resolver than DOIs do.[ref 7][ref 8][ref 9]
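The class relationships argued for here can be sketched in a few lines of Python. This is only an illustration of the idea, not a schema that any archive actually uses: the class names (`Url`, `PersistentIdentifier`, `Handle`, `Doi`, `Record`) and the example values are my own assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Url:
    """A location on the web; the string itself is the address."""
    value: str


@dataclass(frozen=True)
class PersistentIdentifier:
    """Hypothetical shared parent: names an object, not a location,
    and needs a resolver before it can be followed as a link."""
    value: str


class Handle(PersistentIdentifier):
    """A handle, resolved through a handle resolver."""


class Doi(PersistentIdentifier):
    """A DOI, stored as a bare string such as '10.1000/1'."""


@dataclass
class Record:
    """A metadata record with a separate slot (key) per identifier type."""
    title: str
    url: Optional[Url] = None
    handle: Optional[Handle] = None
    doi: Optional[Doi] = None


# Illustrative values only: the URL is a placeholder, not a real item.
record = Record(
    title="Example item",
    url=Url("http://example.org/item/1"),
    doi=Doi("10.1000/1"),
)
```

Because `Doi` and `Handle` share a parent that `Url` does not, code that only needs "a reference to a digital object" can accept either one, while anything that builds a clickable link has to know which resolver to use.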
There are three major reasons for keeping DOIs separate from regular URLs in data structures.
- Interoperability of metadata requires it.
- DOIs are not URLs.
- DOIs function (or are allowed to function) differently than URLs.
Interoperability of metadata requires it.
Metadata clients like Zotero and EndNote allow authors to use metadata to build citations. In each of these applications DOIs are treated independently from URLs. This is because style sheet authors and publication editors treat these objects differently. File formats like .ris and BibTeX also treat these objects differently (presumably also to support the academic publishing industry). Web apps like Drupal (RIS Exporter) in turn support the difference based on their clients’ needs.
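To make the file-format point concrete, here is a minimal sketch of exporting one record to RIS and to BibTeX. It assumes the RIS tags `DO` (DOI) and `UR` (URL) and the commonly used BibTeX fields `doi` and `url`; the record values, the citation key, and the function names are hypothetical, and a real exporter would handle many more fields and escaping rules.

```python
def to_ris(record):
    """Serialize a record to RIS, with DOI and URL under separate tags."""
    lines = ["TY  - JOUR", f"TI  - {record['title']}"]
    if record.get("doi"):
        lines.append(f"DO  - {record['doi']}")   # DOI as a bare string
    if record.get("url"):
        lines.append(f"UR  - {record['url']}")   # URL in its own tag
    lines.append("ER  - ")
    return "\n".join(lines)


def to_bibtex(key, record):
    """Serialize a record to a BibTeX entry with separate doi/url fields."""
    fields = [f"  title = {{{record['title']}}}"]
    if record.get("doi"):
        fields.append(f"  doi = {{{record['doi']}}}")
    if record.get("url"):
        fields.append(f"  url = {{{record['url']}}}")
    return "@article{" + key + ",\n" + ",\n".join(fields) + "\n}"


# Placeholder record: the title and URL are invented for illustration.
example = {
    "title": "Example item",
    "doi": "10.1000/1",
    "url": "http://example.org/item/1",
}

print(to_ris(example))
print(to_bibtex("example2012", example))
```

If an archive stores only a URL, an exporter like this has nothing to put in the DOI slot, and the citation manager on the other end cannot recover it.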
DOIs are not URLs.
As far as what is actually stored in a database, DOIs do not require a resolver or the “http://” at the beginning of them. They can be communicated (exchanged) just like a text string. Notice that in the image above there is both a URL with an http:// and a DOI with nothing preceding it.
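One way to act on this is to store the DOI as a bare string and build a link only at display time. The sketch below is an assumption-laden illustration: the list of prefixes to strip and the pattern used to recognize a DOI are rough heuristics of my own, not the official DOI syntax, and the function names are hypothetical.

```python
import re
from urllib.parse import quote

# Heuristic only: "10.", a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Presentation prefixes that are not part of the DOI itself (assumed list).
PREFIXES = (
    "doi:",
    "http://dx.doi.org/",
    "https://dx.doi.org/",
    "http://doi.org/",
    "https://doi.org/",
)


def bare_doi(text):
    """Return the DOI as a plain string, without 'DOI:' or a proxy URL."""
    value = text.strip()
    lowered = value.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            value = value[len(prefix):].strip()
            break
    if not DOI_PATTERN.match(value):
        raise ValueError(f"does not look like a DOI: {text!r}")
    return value


def doi_to_url(doi, proxy="http://dx.doi.org/"):
    """Turn a stored DOI into a resolvable link for web presentation."""
    return proxy + quote(doi, safe="/")


stored = bare_doi("DOI: 10.1000/1")   # what goes in the database
link = doi_to_url(stored)             # what the web page shows
```

An ordinary URL such as `http://example.org/item/1` fails the check in `bare_doi`, which is the point: the two kinds of value belong in different fields.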
But even this interoperability of metadata between clients/consumers and institutions, or the question of “what constitutes a URI”, doesn’t really display the functional difference between URLs and URIs.
DOIs function (or are allowed to function) differently than URLs.
The Thing About DOI[ref 10]
With Library of Congress sometime back (Feb. ’08) announcing LCCN Permalinks and NLM also (Mar. ’08) introducing simplified web links with its PubMed identifier one might be forgiven for wondering what is the essential difference between a DOI name and these (and other) seemingly like-minded identifiers from a purely web point of view. Both these identifiers can be accessed through very simple URL structures:
- http://www.ncbi.nlm.nih.gov/pubmed/16481614 (although http://pubmed.com/1386390 also works as noted here)
And the DOI itself can be resolved using an equally simple URL structure:
So, why does DOI not just present itself as a simple database number which is accessed through a simple web link and have done with it, e.g. a page for the object named by the DOI “10.1000/1” is retrieved from the DOI proxy server at http://dx.doi.org/?
Essentially the typical DOI link presents an elementary web-based URL which performs a useful redirect service. What is different about this and, say a PURL, which offers a similar redirect service? What’s the big deal?
Well, the thing about DOI is that it is built upon a directory service – the Handle System – and can be accessed either through native directory calls or more likely through standard web interfaces. From a web point of view we are usually interested in the latter. Differently from a simple lookup and/or redirect service which has a fixed entry point on the Web, the DOI can be serviced at any DOI service access point on the Internet. There are potentially multiple entry points which can be hosted by different organizations with separate IP addresses and/or DNS names.
For example, the DOI proxy (described here) is just one instance of such a service. Others could equally exist. And, in fact, they do. The following handle web services will also take the DOI and do the business:
With handle we have in essence a redirect to a redirect. Or in the case of a web service, a redirect (from HTTP to HDL) to a redirect (from HDL to HDL) to a redirect (from HDL to HTTP). That is, switch down from the web interface to the native handle layer, route the call from this local handle server (via the global handle server) to the DOI handle server, fetch the URL stored with the DOI and switch back to the Web at that location.
But there’s more. The standard URL redirect is just one example of a DOI service. But multiple services can also be provided for the DOI. Currently the DOI travels light and is bound to the minimum of useful data, essentially just the URL for a splash page in the case of many CrossRef DOIs. But it could also carry pointers to structured information or to relationships with other objects.
As yet, the DOI is a fledgling in terms of realizing its true potential as a seasoned actor that can play out many roles – assume many guises. A queen bee, in effect, with a hive of worker bees servicing it. It is not joined at the hip with any particular web service as might be commonly understood with the current simple redirect service. It offers much more.
It is, however, true that both for reasons of link persistency and in order to maintain link ranking with search crawlers that a preferred web entry point is via the DOI proxy. It just doesn’t have to be that way – that’s all. Hard linking is something we are beginning to unlearn and instead we are taking our first steps towards embracing service-mediated links such as OpenURL and DOI can both offer.
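The point in the quoted passage about multiple entry points can be illustrated by deriving several links from one stored DOI. This is a sketch under assumptions: the DOI proxy address comes from the passage above, while the use of `http://hdl.handle.net/` as a second access point that accepts DOIs is my assumption, and the names `ACCESS_POINTS` and `links_for` are hypothetical.

```python
# Access points keyed by a human-readable label.
# The second entry is an assumption, not taken from the quoted text.
ACCESS_POINTS = {
    "DOI proxy": "http://dx.doi.org/",
    "Handle proxy": "http://hdl.handle.net/",
}


def links_for(doi):
    """Build one candidate link per access point for the same DOI."""
    return {name: base + doi for name, base in ACCESS_POINTS.items()}


# The stored identifier never changes; only the entry point does.
links = links_for("10.1000/1")
for name, url in links.items():
    print(f"{name}: {url}")
```

A plain URL has no equivalent of this: if the host in a URL changes, the stored value itself has to be rewritten.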
Implications and Take-Aways
An institution which archives linguistic and language materials should use persistent resource identifiers, of which DOIs are a type (Bird & Simons p. 567[ref 11]). However, these same classes of institutions (archives) may also end up archiving materials which have handles or DOIs from other institutions. Metadata schemas should have a slot for this data, especially if it is a DOI. One reason that people go to digital repositories is that they are looking for the authoritative citation data so they can reference the object accurately. As language and linguistics related archives (often in collaboration with language documentation efforts) expand the number and types of submitters they interact with, it is important that tools developed for submitters, like SIL International’s RAMP application[ref 12], also make this distinction so that submitters can provide this information when they have it.
- Chelsea Lee. 21 September 2009. A DOI Primer. APA Style Blog. http://blog.apastyle.org/apastyle/2009/09/a-doi-primer.html [Accessed: 10 April 2011]
- Dion Almaer. 23 November 2007. URI vs. URL: What’s the difference? Ajaxian. http://ajaxian.com/archives/uri-vs-url-whats-the-difference [Accessed: 10 April 2012]
- PsycINFO. 23 November 2009. How to Find DOIs in APA PsycINFO. On Youtube: http://www.youtube.com/watch?v=D9Afmknkzeo [Accessed: 9 April 2012]
- Timothy McAdoo. 10 December 2009. How to Find a DOI. APA Style. http://blog.apastyle.org/apastyle/2009/12/how-to-find-a-doi.html [Accessed: 10 April 2012]
- The International DOI Foundation. Updated 21 September 2006. Factsheet: DOI® System and the Handle System®. http://www.doi.org/factsheets/DOIHandle.html [Accessed: 10 April 2012]
- Ben Richardson. 8 November 2009 08:48:47. Technology Comparison: Handle,DOI,LSID. Biodiversity Information Standards (TDWG), also known as the Taxonomic Databases Working Group. http://wiki.tdwg.org/twiki/bin/view/GUID/TechnologyComparison [Accessed: 10 April 2012]
- OpenHandle Project on Google Code. 2010. http://code.google.com/p/openhandle/wiki/HandleUriScheme [Accessed: 17 May 2012]
- “info” URI Scheme. http://info-uri.info/registry/OAIHandler?verb=GetRecord&metadataPrefix=reg&identifier=info:hdl/ [Accessed: 17 May 2012]
- The Handle System. November 2010. System Fundamentals. http://www.handle.net/overviews/system_fundamentals.html [Accessed: 17 May 2012]
- Tony Hammond. 30 June 2008 10:55 AM. The Thing About DOI. Crossref Blog. http://www.crossref.org/CrossTech/2008/06/the_thing_about_doi.html [Accessed: 9 April 2012]
- Steven Bird & Gary Simons. 2003. Seven dimensions of portability for language documentation and description. Language 79(3): 557-82.
- Jeremy Nordmoe. 2011. Introducing RAMP: an application for packaging metadata and resources offline for submission to an institutional repository. In Proceedings of Workshop on Language Documentation & Archiving 18 November 2011 at SOAS, London. Edited by: David Nathan. p. 27-32. [Preprint PDF]