Metadata and the Target Audience

I have been reviewing applications for library, research and citation metadata: things like RDF, METS, Dublin Core, .ris and BibTeX. In some ways these things are related, since they are all metadata. But in other ways they are different animals.

In my search I have found two very different classes of metadata schemes based on two different kinds of end users.

  1. End users who are machines (Metadata for interoperability or resource discovery).
  2. End users who are human.

End users who are machines are usually concerned with the interoperability of metadata for search, storage, and advertisement. Such systems are usually engineered to use metadata schemes like Dublin Core, MODS and METS, and they often communicate high-level metadata in generic categories.

However, end users who are human are usually concerned with purposing the metadata in creative processes, and in general they want to use and appropriate more specific elements of metadata. This is especially true with citation metadata: students and researchers want to be able to build bibliographies with the data. Additionally, many of the more detailed metadata elements (overly detailed from a Dublin Core perspective, e.g. a geo-location name, a latitude value or an altitude value) could be classified as technical metadata according to the first listing below. Technical metadata is especially relevant for users of audio objects and graphic objects (photos and moving picture objects).

Users who want to construct bibliographies and citations often look for that metadata in one of the interchange formats BibTeX, EndNote XML or .ris. Users who want to find things based on technical metadata, such as audio technicians, linguists, ethnographers, and ethnomusicologists, want to use the metadata and the object it describes in a workflow. To purpose a media object as they need to, those users must make sure that the digital object fits their workflow criteria.
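To make the interchange formats concrete, here is a minimal sketch that writes one citation out as both BibTeX and .ris. The record is the Bergqvist (2007) entry from the bibliography of this post; the dictionary layout, the citation key and the function names are my own assumptions for illustration, and only a handful of common fields and RIS tags are covered.

```python
# Hypothetical in-memory record; the key names are illustrative only.
record = {
    "key": "bergqvist2007",
    "author": "Bergqvist",
    "year": "2007",
    "title": "The role of metadata for translation and pragmatics "
             "in language documentation",
    "journal": "Language Documentation and Description",
    "volume": "4",
    "pages": ("163", "173"),
    "url": "http://www.elpublishing.org/PID/055",
}


def to_bibtex(rec):
    """Serialize the record as a BibTeX @article entry."""
    start, end = rec["pages"]
    fields = [
        ("author", rec["author"]),
        ("title", rec["title"]),
        ("journal", rec["journal"]),
        ("year", rec["year"]),
        ("volume", rec["volume"]),
        ("pages", f"{start}--{end}"),
        ("url", rec["url"]),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@article{{{rec['key']},\n{body}\n}}"


def to_ris(rec):
    """Serialize the record as RIS tagged lines (TY first, ER last)."""
    start, end = rec["pages"]
    lines = [
        ("TY", "JOUR"),
        ("AU", rec["author"]),
        ("PY", rec["year"]),
        ("TI", rec["title"]),
        ("JO", rec["journal"]),
        ("VL", rec["volume"]),
        ("SP", start),
        ("EP", end),
        ("UR", rec["url"]),
        ("ER", ""),
    ]
    return "\n".join(f"{tag}  - {value}" for tag, value in lines)


print(to_bibtex(record))
print(to_ris(record))
```

Both outputs carry page ranges and volume numbers, which is exactly the kind of detail a human building a bibliography needs and a generic discovery record tends to flatten.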

This discrepancy between metadata for system-to-system transmission and metadata for end users creates a somewhat complex situation, in that delivery systems need to consider both sets of users.

Which information to record?

Often no single metadata schema (set of controlled vocabularies and fields) will cover all the kinds of information a project or management practice requires to be recorded. This means that project staff will need to choose which metadata fields best fit their situation from a variety of schemas. The result is a home-grown schema called an application profile. JISC Digital Media from the UK has a really good article on how to select elements and their purpose in your application profile and workflow.1 In general, though, one needs to work through the following categories and ask questions in each phase to determine which information needs to be retained at each step. In this sense, structured metadata is divided into four broad categories, each containing information defined by the schemas or extension schemas being used:

  1. Structural metadata. This is information about the structural relationship with other parent or family files and how the metadata relates to the file.
  2. Descriptive metadata. This is information about the content of the digital file. The information recorded here is more curatorial than technical, and is the primary portal for users to access your resource. Data including file name, creator, associated dates, description, summary, locations, etc. should be standardised using an interoperable schema such as Simple DC or MODS.
  3. Administrative metadata. This contains information about the analogue source material, the rights of the content and any preservation information. Information here provides support to the managerial team of the collection and researchers in organising and providing access to the resource. Information about rights, ownership and usage restrictions is also kept within the administrative metadata.
  4. Technical metadata. To make good use of the digital object, data is required that describes the technical qualities of the physical and/or digital object. This includes information such as channel number, bit-depth, sampling rate, and the unique file identifier. AudioMD is an XML-based schema that has been designed primarily for this purpose. It is soon to be superseded by AES-X098, developed by the Audio Engineering Society, upon its formal release.
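One way to picture an application profile built on these four categories is as a record with one slot per category, plus a list of the fields a project has decided it must retain. The sketch below is only that, a sketch: the class name, the field names and the sample values are hypothetical and do not come from Simple DC, MODS, AudioMD or any other schema.

```python
from dataclasses import dataclass, field


@dataclass
class ApplicationProfileRecord:
    """One digital object, with metadata grouped by the four categories."""
    structural: dict = field(default_factory=dict)
    descriptive: dict = field(default_factory=dict)
    administrative: dict = field(default_factory=dict)
    technical: dict = field(default_factory=dict)

    def missing(self, required):
        """Return {category: [field names]} for required fields not recorded."""
        gaps = {}
        for category, names in required.items():
            present = getattr(self, category)
            absent = [name for name in names if name not in present]
            if absent:
                gaps[category] = absent
        return gaps


# Hypothetical project requirements (the home-grown application profile).
REQUIRED = {
    "descriptive": ["file_name", "creator"],
    "administrative": ["rights"],
    "technical": ["channels", "bit_depth", "sampling_rate_hz"],
}

# Hypothetical audio record with placeholder values.
rec = ApplicationProfileRecord(
    structural={"parent": "session-01"},
    descriptive={"file_name": "interview01.wav", "creator": "Example Creator"},
    technical={"channels": 2, "sampling_rate_hz": 48000},
)

gaps = rec.missing(REQUIRED)
print(gaps)
```

Running a check like this at each phase of a workflow is one way to ask the question "which information needs to be retained at this step?" before the object moves on.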

It is possible, though, to separate out some finer-grained metadata categories. Compare the categories above with those below, which were part of my post about Metadata for Socio-linguistic Corpora:

  • Descriptive meta-data: supports discovery, attribution and identification of resources created.
  • Administrative meta-data: supports management, preservation, and appropriate usage of resources created.
    • Technical: About the machinery used to create the resource and the technical aspects of the resource.
    • Use and Rights: Copyright, license and moral ownership of the items.
  • Structural meta-data: maintains relationships between the parts of complex, multi-part resources (Spanne 2008).
  • Situational: this is metadata which describes the events around the creation of the work, asking questions about the social setting or the precursory events. It follows ideas put forward by Bergqvist (2007).
  • Use metadata: metadata collected from or about the users themselves (e.g. user annotations, number of people accessing a particular resource) (JISC 2010).

In that post I also said:

I think it is only fair to point out to archivists and to librarians that linguists and language documenters do not see a difference between descriptive and non-descriptive metadata in their workflows. That is, sometimes we want to search all the corpora by licenses or by a technical attribute. This elevates these attributes to the function of discovery metadata. It does not remove the function of descriptive metadata from its role in finding things, but it does functionally mean that the other metadata is also viable as discovery metadata.
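The point in that passage can be sketched in a few lines: if records keep their administrative and technical metadata alongside the descriptive metadata, one search function can treat any of them as discovery metadata. The records, field names and values below are hypothetical placeholders, not data from a real corpus.

```python
# Hypothetical corpus records; all names and values are placeholders.
records = [
    {
        "descriptive": {"title": "Interview A"},
        "administrative": {"license": "CC-BY"},
        "technical": {"sampling_rate_hz": 48000},
    },
    {
        "descriptive": {"title": "Interview B"},
        "administrative": {"license": "All rights reserved"},
        "technical": {"sampling_rate_hz": 44100},
    },
    {
        "descriptive": {"title": "Word list C"},
        "administrative": {"license": "CC-BY"},
        "technical": {"sampling_rate_hz": 44100},
    },
]


def find(items, category, field_name, value):
    """Return records whose metadata in `category` has field_name == value."""
    return [
        item for item in items
        if item.get(category, {}).get(field_name) == value
    ]


def titles(items):
    """Convenience helper: list the descriptive titles of the records."""
    return [item["descriptive"]["title"] for item in items]


by_license = titles(find(records, "administrative", "license", "CC-BY"))
by_rate = titles(find(records, "technical", "sampling_rate_hz", 44100))
print(by_license)
print(by_rate)
```

Nothing in `find` privileges the descriptive category, which is the functional sense in which licenses and technical attributes become discovery metadata.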

Compare and match three

My goal here is to compare Dublin Core with BibTeX. There is a nice crosswalk technology for BibTeX resources on SourceForge with some .ris support.2 In terms of resources, there is a list of mappings3 which covers not .ris or BibTeX to DC but many other crosswalks. I started my own crosswalk work in Google Spreadsheets. I started with roles but also have an interest in item types for use in XLingPaper.
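As a rough illustration of what such a crosswalk does, the sketch below maps a few BibTeX fields onto elements of the simple Dublin Core element set and reports whatever has no target. The mapping table is my own assumption for illustration, not an established or published crosswalk, and the sample entry reuses the Bergqvist (2007) reference from the bibliography.

```python
# Assumed mapping from BibTeX field names to simple Dublin Core elements.
# This table is illustrative only; a real crosswalk needs more care.
BIBTEX_TO_DC = {
    "author": "creator",
    "editor": "contributor",
    "title": "title",
    "year": "date",
    "publisher": "publisher",
    "journal": "source",
    "url": "identifier",
}


def crosswalk(bibtex_fields):
    """Return (dc, unmapped): DC elements with value lists, plus leftovers."""
    dc = {}
    unmapped = {}
    for name, value in bibtex_fields.items():
        target = BIBTEX_TO_DC.get(name.lower())
        if target is None:
            # No equivalent in the assumed mapping: keep it visible.
            unmapped[name] = value
        else:
            # DC elements are repeatable, so collect values in a list.
            dc.setdefault(target, []).append(value)
    return dc, unmapped


entry = {
    "author": "Bergqvist",
    "title": "The role of metadata for translation and pragmatics "
             "in language documentation",
    "journal": "Language Documentation and Description",
    "year": "2007",
    "volume": "4",
    "pages": "163--173",
    "url": "http://www.elpublishing.org/PID/055",
}

dc, leftover = crosswalk(entry)
print(dc)
print(leftover)
```

The leftovers (volume and pages here) show the tension described earlier: the generic categories serve machines well, while the specific elements that humans need for citations fall outside them.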

Bibliography

Bergqvist (2007). The role of metadata for translation and pragmatics in language documentation. Language Documentation and Description, 4. 163–173. Retrieved from http://www.elpublishing.org/PID/055
Joint Information Systems Committee (JISC) (2010). Retrieved from http://www.jiscdigitalmedia.ac.uk/crossmedia/advice/an-introduction-to-metadata/
Spanne (2008). Metadata: Why, What and How (the “Who” is You).

  1. JISC Digital Media. 07 January 2010. Metadata and Audio Resources. http://www.jiscdigitalmedia.ac.uk/audio/advice/metadata-and-audio-resources [Accessed: 19 March 2012]

  2. Feed for All has a nice list view of Dublin Core and a bunch of resources for making different kinds of feeds, especially good for podcast or RSS feed building.

  3. Michael Day. August 1996, Last updated: 22-May-2002. Metadata: Mapping between metadata formats. UKOLN: The UK Office for Library and Information Networking, University of Bath, Bath, BA2 7AY, United Kingdom. http://www.ukoln.ac.uk/metadata/interoperability/ [Accessed: 19 March 2012]

Hugh Paterson III
Collaborative Scholar

I specialize in bespoke research at the intersection of Linguistics, Law, Languages, and Technology; specifically utility and life-cycle management for information products in these spaces.
