I have been reviewing applications for library, research and citation metadata: things like RDF, METS, Dublin Core, .ris and BibTeX. In some ways these things are related, since they are all metadata. But in other ways they are different animals.
In my search I have found two very different classes of metadata schemes based on two different kinds of end users.
- End users who are machines (Metadata for interoperability or resource discovery).
- End users who are human.
End users who are machines are usually concerned with the interoperability of metadata for search, storage, and advertisement. Systems of this kind are usually engineered to use metadata schemes like Dublin Core, MODS and METS. Often these systems communicate high-level metadata in generic categories.
However, end users who are human are usually concerned with putting the metadata to use in creative processes, and in general they want to use and appropriate more specific elements of metadata. This is especially true with citation metadata: students and researchers want to be able to build bibliographies with the data. Additionally, many of the more detailed metadata elements, that is, overly detailed from a Dublin Core perspective (e.g. a geo-location name, a latitude value or an altitude value), could be classified as technical metadata according to the first listing below. Technical metadata is especially relevant for users of audio objects and graphic objects (photos and moving picture objects).
Users looking to construct bibliographies and citations often want that metadata in one of the interchange formats BibTeX, EndNote XML or .ris. Users interested in finding things based on technical metadata, such as audio technicians, linguists, ethnographers, and ethnomusicologists, want to use the metadata and the object it describes in a workflow. To use that media object as they need to, those users have to make sure that the digital object fits their workflow criteria.
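As a rough illustration of that last point, a workflow check over technical metadata might look like the sketch below. The field names (`sampling_rate_hz`, `bit_depth`, `channels`), the threshold values and the `unmet_criteria` helper are assumptions made up for this example; they are not taken from any particular schema.

```python
# Hypothetical technical metadata for one audio object.
# Field names and values are invented for illustration.
recording = {
    "sampling_rate_hz": 44100,
    "bit_depth": 16,
    "channels": 2,
}

# Hypothetical minimum requirements for some audio workflow.
workflow_minimums = {
    "sampling_rate_hz": 48000,
    "bit_depth": 16,
}


def unmet_criteria(technical, minimums):
    """Return the names of fields that are missing or below the minimum."""
    failures = []
    for field, minimum in minimums.items():
        value = technical.get(field)
        if value is None or value < minimum:
            failures.append(field)
    return failures


print(unmet_criteria(recording, workflow_minimums))
```

A check like this only works if the delivery system exposes the technical metadata in the first place, which is the point of serving both kinds of users.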
This discrepancy between metadata for system-to-system transmission and metadata for end users creates a somewhat complex situation, in that delivery systems need to consider both sets of users.
Which information to record?
Structured metadata is divided into four main categories that contain information which is defined by the schemas or extension schemas being used:
- Structural metadata. This is information about the structural relationship with other parent or family files and how the metadata relates to the file.
- Descriptive metadata. This is information about the content of the digital file. The information recorded here is more curatorial than technical, and is the primary portal for users to access your resource. Data including file name, creator, associated dates, description, summary, locations, etc. should be standardised using an interoperable schema such as Simple DC or MODS.
- Administrative metadata. This contains information about the analogue source material, the rights of the content and any preservation information. Information here provides support to the managerial team of the collection and researchers in organising and providing access to the resource. Information about rights, ownership and usage restrictions is also kept within the administrative metadata.
- Technical metadata. To make good use of the digital object, data is required which describes the technical qualities of the physical and/or digital object. This includes information such as channel number, bit-depth, sampling rate, and the unique file identifier. AudioMD is an XML-based schema that has been designed primarily for this purpose. It is soon to be superseded by AES-X098, developed by the Audio Engineering Society, upon its formal release.
It is possible, though, to separate out some finer-grained metadata categories. Compare the categories above with those below, which were part of my post about Metadata for Socio-linguistic Corpora:
- Descriptive meta-data: supports discovery, attribution and identification of resources created.
- Administrative meta-data: supports management, preservation, and appropriate usage of resources created.
- Technical: About the machinery used to create the resource and the technical aspects of the resource.
- Use (meaning how one may use the objects) and Rights: Copyright, license and moral ownership of the items.
- Structural meta-data: maintains relationships between the parts of complex, multi-part resources (Spanne 2008).
- Situational: this is metadata which describes the events around the creation of the work. It asks questions about the social setting or the precursory events. It follows ideas put forward by Bergqvist (2007).
- Use metadata: metadata collected from or about the users themselves (e.g. user annotations, number of people accessing a particular resource)
In that post I also said:
I think it is only fair to point out to archivists and to librarians that linguists and language documenters do not see a difference between descriptive and non-descriptive metadata in their workflows. That is, sometimes we want to search all the corpora by licenses or by a technical attribute. This elevates these attributes to the function of discovery metadata. It does not remove the function of descriptive metadata from its role in finding things, but it does functionally mean that the other metadata is also viable as discovery metadata.
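Here is a minimal sketch of what that means in practice, where every metadata category is searchable and not only the descriptive one. The corpus records, the dictionary keys and the `find` helper are all hypothetical, invented for this example.

```python
# Hypothetical corpus: each record groups its metadata by category.
corpus = [
    {
        "descriptive": {"title": "Interview A", "creator": "Researcher One"},
        "administrative": {"license": "CC BY 4.0"},
        "technical": {"sampling_rate_hz": 48000},
    },
    {
        "descriptive": {"title": "Word list B", "creator": "Researcher Two"},
        "administrative": {"license": "All rights reserved"},
        "technical": {"sampling_rate_hz": 44100},
    },
]


def find(records, category, field, value):
    """Return records whose metadata holds `value` at category/field."""
    return [
        record
        for record in records
        if record.get(category, {}).get(field) == value
    ]


# Discovery by license (administrative) and by a technical attribute.
open_items = find(corpus, "administrative", "license", "CC BY 4.0")
high_rate = find(corpus, "technical", "sampling_rate_hz", 48000)
print([r["descriptive"]["title"] for r in open_items])
print([r["descriptive"]["title"] for r in high_rate])
```

The same lookup function serves all categories, so nothing in the search itself treats descriptive metadata as special.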
Compare and match three
My goal here is to compare Dublin Core [http://www.feedforall.com/dublin-core.htm] with BibTeX [there is a nice crosswalk technology for BibTeX resources on SourceForge: http://bibtexml.sourceforge.net/details.html] and with .ris.
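To give a feel for what such a comparison involves, here is a small and deliberately lossy sketch that renders one simple Dublin Core style record as BibTeX and as .ris. The mapping choices (creator to `author`/`AU`, date to `year`/`PY`, identifier to `url`/`UR`), the generic entry types `@misc` and `GEN`, and the function names are my assumptions for illustration, not an official crosswalk. It also assumes the DC date holds only a year.

```python
# A simple Dublin Core style record. The element names are DC elements;
# the values are invented for this example.
dc_record = {
    "title": "An Example Report",
    "creator": ["Doe, Jane", "Roe, Richard"],
    "date": "2011",
    "publisher": "Example Press",
    "identifier": "http://example.org/report",
}


def dc_to_bibtex(record, key):
    """Render a DC-style record as a generic BibTeX @misc entry."""
    fields = {
        "author": " and ".join(record.get("creator", [])),
        "title": record.get("title", ""),
        "year": record.get("date", ""),
        "publisher": record.get("publisher", ""),
        "url": record.get("identifier", ""),
    }
    body = ",\n".join(
        f"  {name} = {{{value}}}" for name, value in fields.items() if value
    )
    return f"@misc{{{key},\n{body}\n}}"


def dc_to_ris(record):
    """Render a DC-style record as a generic (GEN) RIS record."""
    lines = ["TY  - GEN"]
    # RIS repeats the AU tag once per author.
    lines += [f"AU  - {name}" for name in record.get("creator", [])]
    pairs = (("TI", "title"), ("PY", "date"),
             ("PB", "publisher"), ("UR", "identifier"))
    for tag, element in pairs:
        if record.get(element):
            lines.append(f"{tag}  - {record[element]}")
    lines.append("ER  - ")
    return "\n".join(lines)


print(dc_to_bibtex(dc_record, "doe2011"))
print(dc_to_ris(dc_record))
```

Even this toy version shows the mismatch: BibTeX and .ris want to know what type of thing is being cited, while the simple DC record above carries nothing that maps cleanly to an entry type.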
- “RIS” Format Documentation
- Adding a “Direct Export” Button to Your Web Page or Web Application
- List of Mappings: not .ris or BibTeX to DC, but many other crosswalks.