
The Journeyler

A walk through: Life, Leadership, Linguistics, Language Documentation, WordPress, and OS X (and a bit of Marketing & Business Administration)


Tag Archives: VRACore

VRA Core and its use of xml:lang

Posted on December 15, 2023 by Hugh Paterson III

Some information professionals might be confused about the use of language identification metadata in larger bibliographic metadata standards. For example, VRA Core (from the Visual Resources Association) is a metadata standard used to describe visual artifacts. It is implemented in XML and therefore takes on all the descriptive power of XML, including the use of the xml:lang attribute.

The following observations are made using the VRA (Visual Resources Association) Core 4 XML Schema, version 0.42. This schema implements the final VRA Core 4.0 guidelines, 2007-04-09. It is important to note that metadata standards implemented by memory institutions really have two parts: the "guidelines" and the "implementation" of those guidelines (in this case an XSD validation file). These two documents may not always be congruent, even if that is the intention. Where they differ, I argue that the technical implementation takes precedence over the guidelines, as that seems to be the most defensible way to settle which one is the definitive authority.

The XSD validation document contains the following annotation around the use of the xml:lang attribute.

VRA Core metadata attributes which can be applied to virtually any element. Note that xml:lang should contain ISO 639 language codes, not the English names of languages. Although the XML Schema defines xml:lang as allowing ISO 639-2 (three-letter) codes, some validators will only accept ISO 639-1 (two-letter) codes.

This annotation is misleading. First, the VRA Core authors are trying to alert catalogers and technologists that they should not use the full-text name of a language, as might be done in other "library oriented standards", but should use language codes instead. In general this is a good thing. However, the VRA authors misread the XML specification. Specifically, they indicate the need to use ISO 639 language codes. This is not accurate. XML requires BCP-47 language tags. This can be found in the specification for XML 1.0 fifth edition §2.12 https://www.w3.org/TR/xml/#sec-lang-tag. It is true that BCP-47 currently draws its language subtags from ISO 639 codes, but this might not always be true.
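
To make the point concrete, here is a minimal Python sketch of how one might pull the xml:lang values out of a record and spot full-text language names. The sample record is something I made up for illustration: the namespace URI and the titleSet/title element names are my assumptions about what a VRA Core 4 record looks like, and the pattern is a deliberately simplified shape check, not the full BCP-47 grammar (it ignores private-use and grandfathered tags, among other things).

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical record for illustration only; the namespace URI and
# element names are assumptions, not copied from a real VRA Core file.
SAMPLE = """<work xmlns="http://www.vraweb.org/vracore4.htm">
  <titleSet>
    <title xml:lang="fr">La Joconde</title>
    <title xml:lang="en-US">Mona Lisa</title>
    <title xml:lang="French">La Gioconda</title>
  </titleSet>
</work>"""

# Simplified shape check (an assumption, not the BCP-47 ABNF):
# a 2-3 letter primary subtag, then optional subtags of 1-8 alphanumerics.
TAG_SHAPE = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*$")


def collect_lang_values(xml_text):
    """Return every xml:lang value found in the document, in document order."""
    root = ET.fromstring(xml_text)
    return [el.get(XML_LANG) for el in root.iter() if el.get(XML_LANG) is not None]


def looks_like_language_tag(value):
    """True if the value has the rough shape of a language tag."""
    return bool(TAG_SHAPE.match(value))


if __name__ == "__main__":
    for value in collect_lang_values(SAMPLE):
        print(value, looks_like_language_tag(value))
```

Passing this check only means a value has the shape of a tag. Deciding whether a tag is actually valid would require looking its subtags up in the IANA Language Subtag Registry, which this sketch does not do.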

A second issue with the annotation is how it distinguishes between ISO 639-2 and ISO 639-1. If there are VRA Core data consumers or producers who are not consuming or producing valid XML, then this is a transmission machinery issue, not a protocol issue. BCP-47 does not call for the use of ISO 639-2/3 tags when there is an equivalent ISO 639-1 tag; the three-letter codes come into play only for languages that have no two-letter code. If data ingest processes have only implemented ingest of ISO 639-1, then they haven't implemented VRA, because VRA stands on XML, which stands on BCP-47. BCP-47 is an algorithm which calls upon different standards at different times. Understanding the fallback nature of the algorithm would have clarified this point for the VRA authors.
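
The fallback described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the THREE_TO_TWO table is a tiny hand-picked excerpt that I typed in myself, and the function names are hypothetical. A real implementation would consult the IANA Language Subtag Registry rather than a hard-coded table.

```python
# Tiny illustrative excerpt (an assumption, not an authoritative list):
# ISO 639-2/3 three-letter codes that have an ISO 639-1 two-letter equivalent.
THREE_TO_TWO = {
    "eng": "en",
    "fra": "fr",
    "fre": "fr",  # ISO 639-2 bibliographic code for French
    "deu": "de",
    "ger": "de",  # ISO 639-2 bibliographic code for German
    "spa": "es",
}


def preferred_primary_subtag(code):
    """Prefer the two-letter code when one exists, else keep the code as given."""
    code = code.lower()
    return THREE_TO_TWO.get(code, code)


def normalize_tag(tag):
    """Swap only the primary language subtag; leave any later subtags untouched."""
    primary, sep, rest = tag.partition("-")
    return preferred_primary_subtag(primary) + sep + rest


if __name__ == "__main__":
    for tag in ("fra-CA", "eng", "haw", "de"):
        print(tag, "->", normalize_tag(tag))
```

With this table, "fra-CA" becomes "fr-CA" and "eng" becomes "en", while "haw" (Hawaiian, which has no two-letter code) passes through unchanged. That is the sense in which three-letter codes are a fallback and not an alternative.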

The following resources are useful for a better understanding of Language Tags in XML:

  • https://www.w3.org/International/articles/language-tags
  • https://www.w3.org/International/techniques/authoring-xml#natlang
  • https://www.w3.org/International/questions/qa-when-xmllang.en.html
Posted in Other Journals | Tagged in_Obsidian, Language metadata, metadata, Visual Metadata, VRACore, XML


One should not consider the content on this website to be an official opinion of any company associated with me. These posts are solely my opinion.


© 2005-2025 Hugh Paterson III All Rights Reserved.