In this post I take a look at some of the software needs of a language documentation team. One of my ongoing concerns of linguistic software development teams (like SIL International's Palaso or LSDev, or MPI's archive software group, or a host of other niche software products adapted from main stream open-source projects) is the approach they take in communicating how to use the various elements of their software together to create useful workflows for linguists participating in field research on minority languages. Many of these software development teams do not take the approach that potential software users coming to their website want to be oriented to how these software solutions work together to solve specific problems in the language documentation problem space. Now, it is true that every language documentation program is different and will have different goals and outputs, but many of these goals are the same across projects. New users to software want to know top level organizational assumptions made by software developers. That is, they want to evaluate how software will work in a given scenario (problem space) and to understand and make informed decisions based on the eco-system that the software will lead them into. This is not too unlike users asking which is better Android or iPhone, and then deciding what works not just with a given device but where they will buy their music, their digital books, and how they will get those digital assets to a new device, when the phone they are about to buy no-longer serves them. These digital consequences are not in the mind of every consumer... but they are nonetheless real consequences. Continue reading →
As linguistics and language documentation interface with digital humanities there has been a lot of effort to time-align texts and audio/video materials. At one level this is rather trivial to do and has the backing of comercial media processes like subtitles in movies. However, at another level this task is often done in XML for every project (digital corpus curation) slightly differently. At the macro-scale the argument is that if the annotation of the audio is in XML and someone wants to do something else with it, then they can just convert the XML to whatever schema they desire. This is true.
However, one antidotal point that I have not heard in discussion of time aligned texts is specifications for Audio Dominant Text vs. Text Dominant Audio. This may not initially seem very important, so let me explain what I mean. Continue reading →
In a recent paper Jeremy Nordmoe, a friend and colleague, states that:
Because most linguists archive documents infrequently, they will never be experts at doing so, nor will they be experts in the intricacies of metadata schemas. [1]Jeremy Nordmoe. 2011. Introducing RAMP: an application for packaging metadata and resources offline for submission to an institutional repository. In Proceedings of Workshop on Language Documentation … Continue reading
My initial reply is:
You are d@#n right! and it is because archives are not sexy enough!
Jeremy Nordmoe. 2011. Introducing RAMP: an application for packaging metadata and resources offline for submission to an institutional repository. In Proceedings of Workshop on Language Documentation & Archiving 18 November 2011 at SOAS, London. Edited by: David Nathan. p. 27-32. [Preprint PDF]
In a team framework where there are several members of a research team and the job requirements call for the sharing of bibliographic data (of materials referenced) as well as the actual resources being referenced. In this environment there needs to be a central repository for sharing both kinds of data. This is true for small localized (geographically) groups as well as large distributed research teams. New researchers joining a existing team need to be able to “plug-in” to existing foundational work on the project and be able to access bibliographic data as well as the resources those bibliographic details point to. It is my point here to outline some of the current challenges involved in trying to overcoming the collaborative obstacle when working in the fields of Linguistics and Language Documentation [1]Nikolaus P. Himmelmann. 1998. Documentary and Descriptive Linguistics. Linguistics vol. 36:161-195. [PDF] [Accessed 24 Dec. 2010].This sentiment is echoed by many in the world of science. Here is someone on Zetero’s forums [INSERT LINK]. (Though Zetero does claim to combat some of these issues.)
In July I presented a paper at CRASSH in Cambridge. It was a small conference, but being in Europe it was good to see many of the various kinds of projects which are going on in Digital Humanities and Linguists, or also Cloud Computing and Linguistics. One particular project, TypeCraft, stands out as being rather well done and promising was presented by Dorothee Beermann Hellan. I think the ideas presented in this project are well thought out and seem to be well implemented. It would be nice to see this product integrated with some other linguistics and language documentation cloud offerings. i.e. Project Lego from the Linguist’s List or the Max Planck Institute’s LEXUS project. While TypeCraft does allow for round tripping of data with XML, what I am talking about is a consolidated User Experience for both professional linguists and for Minority language users.
A couple of years ago I had a chance meeting with a cartographer in North Dakota. It was interesting because he asked us (a group of linguists) What is a language or linguistic map? So, I grabbed a few examples and put them into a brief for him. This past January at the LSA meeting in Portland, Oregon, I had several interesting conversations with the folks at the LL-Map Project under Linguists’ List. It occurred to me that such a presentation of various kinds of language maps might be useful to a larger audience. So this will be a bit unpolished but should show a wide selection of language and linguistic based maps, and in the last section I will also talk a bit about interactive maps. Continue reading →
This post is a open draft! It might be updated at any time… But was last updated on at .
In this reviewRegardless of the views expressed here in this review, it should be stated that I have high hopes for Webonary’s future. Some of the people working on Webonary are my colleagues so I attempt hedge my review with the understanding that this is not the final state of Webonary. I am excited that easy to use technology, like WordPress is being used, and that minority language groups around the world have the opportunity to use free software like webonary. I will be looking at the WordPress plugin, Webonary and several associated issues. Continue reading →
For the last few weeks I have been thinking about how can one measure the impact on a language due to a language communities' contact with other languages. I have been looking for ways that remoteness has been measured in the past. I recently ran across a note on my iPhone from when I was in Mexico dated March 8, 2011.
A metric for measuring the language language shift, contact, and relatedness of indigenous languages of Mexico
The formation of aerial features
Population density
Trade and social networks
Political affiliation
Geographic factors
Roads travel opportunities
I remember writing this note: I was standing in front of a topographical map showing terrain regions. This map also had the language areas of Mexico outlined. It occurred to me (having also recently had a conversation with a local anthropologist on the matter of trade routes and mountain passes) that as a factor in language endangerment that these sorts of factors should be accounted for and if it can be accounted for then it should also be able to be graphed (on a map of course). The major issue being that if one just plots a language area without showing population/speaker density in that area then the viewer of that map will get a warped view of the language situation. Population density also does not solely infer where language attrition will likely not occur. And language contact does not automatically happen on the edges of a language area. That is to say, in a country with mountain passes, there will likely be more language contact in the passes as various groups travel to market than in higher elevated mountain villages. This leads to the issue of language diffusion and the representation of language diffusion. But the issue is not just one of language diffusion, it is also one of population diffusion, and population mobility and accessibility to various areas. So in terms of projecting, assessing and plotting language vitality, considering remoteness should be part of the equation. But remoteness is not just a factor on its own, it is more of an index considering the issues mentioned above but specifically considering the issues of geographical remoteness and considering the issues of social remoteness (or contact, even with other villages and cities in the same language and ethnic communities).
I am not currently aware of any index, much less a project which plots this index to a geographical area. However, I have found some previous work worth mentioning which might be related and relevant.
Modeling Language Diffusion With ArcGIS
There is an interesting paper and project on modeling language diffusion with ArcGIS. It was prepared for Worldmap.org by Christopher Deckert in 2004 and presented at the 24th ESRI users conference. [1]Christopher Deckert. 2004. Modeling Language Diffusion With ArcGIS. Paper published in the proceedings of the 24th Annual Esri International User Conference, August 9–13, 2004. … Continue reading
Remote Areas of the World
The magazine NewScientist has an article from April 2009 [2]Caroline Williams. 20 April 2009. NewScientist. Where's the remotest place on Earth?. http://www.newscientist.com/article/mg20227041.500-wheres-the-remotest-place-on-earth.html. [Link] [Accessed: 27 … Continue reading about the Remotes places in the world it has several maps and abstractions showing how remote (with reference to travel time) places in the world are. The following maps come from the NewScientist article.
Map showing the access ability from one point to another.
Detail of roads in west Africa
Map showing the remoteness of the Tibetan Plateau
The ASGC Remoteness Structure
Another promising resource I found is the ASGC Remoteness Structure which Australia has developed to show how remote parts of Australia are. There is a series of papers explaining the methods behind the algorithms used and the purpose of the study. One of the outputs was the map below. [3]Commonwealth Department of Health and Aged Care. 2001, Measuring Remoteness: Accessibility/Remoteness Index of Australia (ARIA), Revised Edition, Occasional Papers: New Series No. 14 [PDF] [Link] … Continue reading
Australia Remoteness Map
The Territoriality of Public Health Governance in Mexico
The last resource I am going to mention here is The Territoriality of Public Health Governance in Mexico. A study which plots the Remoteness of Health Care in Mexico. [4] Alberto Díaz-Cayeros and Justin Levitt. August 30, 2011. The Territoriality of Public Health Governance in Mexico. http://irps.ucsd.edu/assets/001/502971.pdf [PDF] [Accessed: 12 February 2012]
Christopher Deckert. 2004. Modeling Language Diffusion With ArcGIS. Paper published in the proceedings of the 24th Annual Esri International User Conference, August 9–13, 2004. http://proceedings.esri.com/library/userconf/proc04/docs/pap1071.pdf [PDF] [Accessed: 27 February 2011]
Caroline Williams. 20 April 2009. NewScientist. Where's the remotest place on Earth?. http://www.newscientist.com/article/mg20227041.500-wheres-the-remotest-place-on-earth.html. [Link] [Accessed: 27 February 2011]
Commonwealth Department of Health and Aged Care. 2001, Measuring Remoteness: Accessibility/Remoteness Index of Australia (ARIA), Revised Edition, Occasional Papers: New Series No. 14 [PDF] [Link] [Accessed: 2 February 2012]
Alberto Díaz-Cayeros and Justin Levitt. August 30, 2011. The Territoriality of Public Health Governance in Mexico. http://irps.ucsd.edu/assets/001/502971.pdf [PDF] [Accessed: 12 February 2012]