Software Needs for a Language Documentation Project

Posted on April 20, 2013 by Hugh Paterson III

In this post I take a look at some of the software needs of a language documentation team. One of my ongoing concerns about linguistic software development teams (like SIL International's Palaso or LSDev, or MPI's archive software group, or the teams behind a host of other niche software products adapted from mainstream open-source projects) is the approach they take in communicating how to use the various elements of their software together to create useful workflows for linguists participating in field research on minority languages. Many of these teams do not recognize that potential users coming to their website want to be oriented to how these software solutions work together to solve specific problems in the language documentation problem space. Now, it is true that every language documentation program is different and will have different goals and outputs, but many of these goals are the same across projects. New users of software want to know the top-level organizational assumptions made by the software developers. That is, they want to evaluate how the software will work in a given scenario (problem space) and to make informed decisions based on the ecosystem that the software will lead them into. This is not too unlike users asking which is better, Android or iPhone, and then deciding not just what works with a given device but also where they will buy their music and their digital books, and how they will get those digital assets to a new device when the phone they are about to buy no longer serves them. These digital consequences are not in the mind of every consumer... but they are nonetheless real consequences.
Continue reading →

The Look of Language Archive Websites

Posted on September 19, 2012 by Hugh Paterson III

This is the start of a cross-archive look at the current state of UX design for presenting content generated in language documentation.

http://www.rnld.org/archives
http://www.mpi.nl/DOBES/language_archives

http://paradisec.org.au/
http://repository.digiarch.sinica.edu.tw/index.jsp?lang=en

http://alma.matrix.msu.edu/

http://www.thlib.org/

http://www.ailla.utexas.org/site/welcome.html

Leave Typology to the Typologists: I am a Linguist

Posted on September 13, 2012 by Hugh Paterson III

A User Experience look at Linguistic Archiving

In a recent paper Jeremy Nordmoe, a friend and colleague, states that:

Because most linguists archive documents infrequently, they will never be experts at doing so, nor will they be experts in the intricacies of metadata schemas. ^[1]

My initial reply is:

You are d@#n right! and it is because archives are not sexy enough!

Continue reading →

References
↑1	Jeremy Nordmoe. 2011. Introducing RAMP: an application for packaging metadata and resources offline for submission to an institutional repository. In Proceedings of Workshop on Language Documentation & Archiving 18 November 2011 at SOAS, London. Edited by: David Nathan. p. 27-32. [Preprint PDF]

Permanently accessible? to whom?

Posted on August 31, 2012 by Hugh Paterson III

Bush house: the BBC World Service is leaving its home after 71 years
Photo: Paul Grover via The Telegraph

There has recently been some discussion about the BBC selling its production facilities and moving from Bush House to somewhere else. The BBC World Service has been a major player in radio and oral culture in Great Britain and around the world for 71 years. A lot of history has been reported by the service, and the BBC's records (including its archive) hold oral histories of a variety of world events from the last 71 years in a variety of languages. (Wikipedia has a brief description of the collections at the BBC.) Continue reading →

References
↑1	Christopher Middleton. 7:30 am BST 10 Jul 2012. For sale: Bush House, a landmark of BBC World Service history. The Telegraph on-line. http://www.telegraph.co.uk/culture/tvandradio/bbc/9386848/For-sale-Bush-House-a-landmark-of-BBC-World-Service-history.html [Link] [Accessed: 19 July 2012]
↑2	Jonathan Prynn. 11 July 2012. Buy a bit of BBC radio history… or an entire studio. London Evening Standard on-line. http://www.standard.co.uk/news/uk/buy-a-bit-of-bbc-radio-history-or-an-entire-studio-7935734.html [Link] [Accessed: 19 July 2012]
↑3	Paul Ridden. 12:41 pm 12 July 2012. Updated: BBC World Service equipment and memorabilia to go under the auctioneer's hammer. gizmag online. http://www.gizmag.com/bbc-world-service-bush-house-auction/23292/ [Link] [Accessed: 19 July 2012]

Useful or Not?

Posted on August 31, 2012 by Hugh Paterson III

This post is an open draft! It might be updated at any time.

The online version of the SIL Bibliography contains a subset of over 29,000 citations from the more than 40,000 publications representing 75 years of SIL International's language research in over 2,700 languages. ^[1]

Finding resources through SIL.org's Bibliography (as of 2 August 2012) can be a challenge at times, maybe even a time-wasting endeavor, because consulting the online Bibliography might not turn out to be very useful.

The challenge, which affects usefulness, is primarily threefold:

Items known by SIL to have been created by SIL staff may or may not be listed. (The on-line Bibliography is a sub-set.)
Items listed in the Bibliography may or may not have digitally accessible resources.
Items created by SIL staff may or may not be in the bibliography because they have not been submitted to the Language and Culture Archive (managing division of the SIL Bibliography).

Continue reading →

References
↑1	SIL Bibliography Online. April 2012 version. SIL International on Ethnologue.com. http://www.ethnologue.com/bibliography.asp [Accessed: 21 August 2012] [Link]

The Citation Problem

Posted on August 28, 2012 by Hugh Paterson III

In a team framework there are several members of a research team, and the job requirements call for sharing both the bibliographic data (of materials referenced) and the actual resources being referenced. In this environment there needs to be a central repository for sharing both kinds of data. This is true for small, geographically localized groups as well as for large distributed research teams. New researchers joining an existing team need to be able to “plug in” to the existing foundational work on the project and to access the bibliographic data as well as the resources those bibliographic details point to. My point here is to outline some of the current challenges involved in trying to overcome this collaborative obstacle when working in the fields of Linguistics and Language Documentation. ^[1] This sentiment is echoed by many in the world of science. Here is someone on Zotero’s forums [INSERT LINK]. (Though Zotero does claim to combat some of these issues.)

Bibliographic Data vs. Citation Data

Continue reading →

References
↑1	Nikolaus P. Himmelmann. 1998. Documentary and Descriptive Linguistics. Linguistics vol. 36:161-195. [PDF] [Accessed 24 Dec. 2010]

Keyboard Design for Minority languages

Posted on June 22, 2012 by Hugh Paterson III

This post is an open draft! It might be updated at any time.

Keyboards Virtual and Physical

A pre-print draft will not be available through this means, though there is a video of the presentation.

A. Meꞌphaa Text Sample

A̱ ngui̱nꞌ, tsáanꞌ ninimba̱ꞌlaꞌ ju̱ya̱á Jesús, ga̱ju̱ma̱ꞌlaꞌ rí phú gagi juwalaꞌ ído̱ rí nanújngalaꞌ awúun mbaꞌa inii gajmá. Numuu ndu̱ya̱á málaꞌ rí ído̱ rí na̱ꞌnga̱ꞌlaꞌ inuu gajmá, nasngájma ne̱ rí gakon rí jañii a̱kia̱nꞌlaꞌ ju̱ya̱á Ana̱ꞌlóꞌ, jamí naꞌne ne̱ rí ma̱wajún gúkuálaꞌ. I̱ndo̱ó máꞌ gíꞌmaa rí ma̱wajún gúkuálaꞌ xúgíí mbiꞌi, kajngó ma̱jráanꞌlaꞌ jamí ma̱ꞌne rí jañii a̱kia̱nꞌlaꞌ, asndo rí náxáꞌyóo nitháan rí jaꞌyoo ma̱nindxa̱ꞌlaꞌ. [I̱yi̱i̱ꞌ rí niꞌtháán Santiágo̱ 1:2-4]

B. Sochiapam Chinantec Text Sample

Hnoh² reh², ma³hiún¹³ hnoh² honh² lɨ³ua³ cáun² hi³ quiunh³² náh², quí¹ la³ cun³ hi³ má²ca³lɨ³ ñíh¹ hnoh² jáun² hi³ tɨ³ jlánh¹ bíh¹ re² lı̵́²tɨn² tsú² hi³ jmu³ juenh² tsı̵́³, nı̵́¹juáh³ zia³² hi³ cá² lau²³ ca³tɨ²¹ hi³ taunh³² tsú² jáun² ta²¹. Hi³ jáun² né³, chá¹ hnoh² cáun² honh², hi³ jáun² lı̵́¹³ lɨ³tɨn² hnoh² re² hi³ jmúh¹³ náh² juenh² honh², hi³ jáun² hnoh² lı̵́¹³ lı̵́n³ náh² tsá² má²hún¹ tsı̵́³, tsá² má²ca³hiá² ca³táunh³ ca³la³ tán¹ hián² cu³tí³, la³ cun³ tsá² tiá² hi³ lɨ³hniauh²³ hí¹ cáun² ñí¹con² yáh³. [Jacobo Jmu² Cáun² Sí² Hi³ Ca³tɨn¹ Tsá² *Judíos, Tsá² Má²tiáunh¹ Ñí¹ Hliáun³ 1:2-4]

C. Spanish Text Sample

Hermanos míos, gozaos profundamente cuando os halléis en diversas pruebas, sabiendo que la prueba de vuestra fe produce paciencia. Pero tenga la paciencia su obra completa, para que seáis perfectos y cabales, sin que os falte cosa alguna. [Santiago 1:2-4 Reina-Valera 1995 (RVR1995)]

D. English Text Sample

Dear brothers and sisters, when troubles come your way, consider it an opportunity for great joy. For you know that when your faith is tested, your endurance has a chance to grow. So let it grow, for when your endurance is fully developed, you will be perfect and complete, needing nothing. [James 1:2-4 New Living Translation (NLT 2007)]

From Folksonomies to Taxonomies with Linguistic Metadata

Posted on March 22, 2012 by Hugh Paterson III

This post is an open draft! It might be updated at any time.

Metadata is very important; everyone agrees on that. However, there is some discussion about how to develop metadata and how to ensure that the metadata is accurate. Taxonomies are limited vocabularies (a set number of items) where each term has a predefined definition. A folksonomy is a vocabulary where people, usually the users of the data, assign their own useful words or metadata to an item. Folksonomies and taxonomies are both sets of terms, but a folksonomy is an open set whereas a taxonomy is a closed set.

An example of a taxonomy might be the colors of a traffic light: Red, Yellow, and Green. If this were a folksonomy, people might also suggest the colors Amber, Orange, Blue-Green and Blue. These additional terms may be accurate to some viewers of traffic lights, or in some cases, but they do not fit the stereotypical model of what the colors of traffic lights are.
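
The closed-set versus open-set distinction can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names TrafficLightColor, tag_with_taxonomy and Folksonomy are hypothetical and do not come from any metadata standard or archive software.

```python
from enum import Enum


class TrafficLightColor(Enum):
    """Taxonomy: a closed vocabulary whose terms are all defined up front."""
    RED = "Red"
    YELLOW = "Yellow"
    GREEN = "Green"


def tag_with_taxonomy(term):
    """Accept a term only if it belongs to the closed set.

    Enum lookup by value raises ValueError for any unknown term.
    """
    return TrafficLightColor(term)


class Folksonomy:
    """Folksonomy: an open vocabulary that grows with whatever users supply."""

    def __init__(self):
        self.terms = set()

    def tag(self, term):
        # Any term is accepted and becomes part of the vocabulary.
        self.terms.add(term)
        return term


if __name__ == "__main__":
    folk = Folksonomy()
    for word in ["Red", "Amber", "Blue-Green"]:
        folk.tag(word)
    print(sorted(folk.terms))

    print(tag_with_taxonomy("Red"))
    try:
        tag_with_taxonomy("Amber")
    except ValueError:
        print("Amber is not a term in the taxonomy")
```

The taxonomy rejects "Amber" outright, while the folksonomy simply records it. Moving from a folksonomy to a taxonomy then amounts to deciding which of the collected terms get a predefined definition and which get mapped onto an existing term.
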
Continue reading →

Types of Linguistic Maps: The Mapping of linguistic Features and Researcher Interactivity

Posted on March 22, 2012 by Hugh Paterson III

A couple of years ago I had a chance meeting with a cartographer in North Dakota. It was interesting because he asked us (a group of linguists): “What is a language or linguistic map?” So, I grabbed a few examples and put them into a brief for him. This past January at the LSA meeting in Portland, Oregon, I had several interesting conversations with the folks at the LL-MAP project under the LINGUIST List. It occurred to me that such a presentation of various kinds of language maps might be useful to a larger audience. So this will be a bit unpolished but should show a wide selection of language- and linguistics-based maps, and in the last section I will also talk a bit about interactive maps. Continue reading →

Developing an understanding on how multi-lingual content needs to work on sil.org

Posted on March 21, 2012 by Hugh Paterson III

Over the last few weeks I have been contemplating how multi-lingual content could work on sil.org. (I have had several helpful conversations to direct my thinking.)

As I understand the situation, there are basically three ways in which multi-lingual content could work.

First let me say that there is a difference between multi-lingual content, multi-lingual taxonomies, and multi-lingual menu structures. We are talking about content here, not menu and navigation structures or taxonomies. Facebook has probably presented the best framework to date for utilizing the power of crowds to translate navigation structures. ^[1] In just under two years they added over 70 languages to Facebook. However, Facebook has had some bumps along the way, as Dropbox points out in their post about their experience in translating their products and services. ^[2]

1. Use a mechanism which shows all the available languages for content and highlights which ones are available to the user. Zotero has an implementation of this on their support forums.
   [Image: Zotero language options]
2. Create a subsite for each language and then show only the pages which have content in that language. Wikipedia does this: it has a menu on the left side with links to articles with the same title in other languages. Only languages in which an article on that title has been started are shown in the menu.
   [Image: SIL International in English]
   Pages in other languages may not show the same content.
3. Finally, create a cascading structure for each page or content area, so that there is a primary language, a secondary language, a tertiary language, a quaternary language, etc., based on the browser's language of choice, with country IP playing a secondary role. If there is no page for the primary language, then the next in preference will show. This last option has been preferred by some because if an organization wants to present content to a user, then obviously it would be in the user's primary language. But if the content is not available in the primary language, then the organization would still want to let the user know that the content exists in another language.
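
The cascading approach can be sketched roughly as follows. This is a minimal illustration under my own assumptions, not how sil.org or Drupal implements it: the function names, the "en" site default, and the simplified reading of the browser's Accept-Language header are all hypothetical, and the country-IP signal mentioned above is left out entirely.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, most preferred first.

    Simplified: only the q parameter is read; entries with q=0 or a
    malformed q value are dropped.
    """
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by order of appearance.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(preferred, available, site_default="en"):
    """Walk the cascade: the first preferred language with a page wins."""
    for tag in preferred:
        if tag in available:
            return tag
        base = tag.split("-")[0]  # e.g. "fr-ca" falls back to "fr"
        if base in available:
            return base
    return site_default


if __name__ == "__main__":
    preferred = parse_accept_language("fr-CA, es;q=0.8, en;q=0.5")
    # Hypothetical page that exists only in Spanish and English.
    print(choose_language(preferred, {"es", "en"}))
```

A page using this would still need to tell the reader that they are seeing a fallback language rather than their first choice, which is the point made in the third option above.
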

It would also be good to understand the concepts used in Drupal 7 (and Drupal 8) for multi-lingual content. There are several resources which I have found helpful:

Localized and Multi-Lingual Content in Drupal 7 ^[3]
Drupal 7’s new multilingual systems (part 4) – Node translation ^[4]
Drupal 7’s new multilingual systems compilation ^[5]
Drupal 8 Multilingual Initiative ^[6]

It would appear from this list of resources that Drupal’s default behavior is more in line with the second of the three options given above.

References
↑1	Nico Vera. 11 February 2008. ¡Bienvenidos a Facebook en Español!. The Facebook Blog. https://blog.facebook.com/blog.php?post=10005792130 [Accessed: 5 March 2012]
↑2	Dan Wheeler. 18 April 2011. Translating Dropbox. http://tech.dropbox.com/?p=1 [Accessed: 5 March 2012]
↑3	Karen Stevenson. 17 November 2011. Localized and Multi-Lingual Content in Drupal 7. Lullabot Ideas. http://www.lullabot.com/articles/localized-and-multi-lingual-content-drupal-7 [Accessed: 5 March 2012]
↑4	Gábor Hojtsy. 31 January 2011. Drupal 7’s new multilingual systems (part 4) – Node translation. http://hojtsy.hu/blog/2011-jan-31/drupal-7039s-new-multilingual-systems-part-4-node-translation [Accessed: 5 March 2012]
↑5	Gábor Hojtsy. 5 May 2011. Drupal 7’s new multilingual systems compilation. http://hojtsy.hu/multilingual-drupal7 [Accessed: 5 March 2012]
↑6	Gábor Hojtsy. 26 January 2012. Drupal 8 Multilingual Initiative. http://hojtsy.hu/d8mi [Accessed: 5 March 2012]

The Journeyler

A walk through: Life, Leadership, Linguistics, Language Documentation, WordPress, and OS X (and a bit of Marketing & Business Administration)

Tag Archives: opendraft
