Language Documentation Gear

Posted on December 9, 2024 by Hugh Paterson III

I have been investigating some new video equipment for language documentation. Currently highly recommended is the Osmo Pocket 3 and the GoPro Hero 11 (as this has GPS embedded whereas the GP12 does not). It is not entirely clear if the GP13 has GPS or not.

Also my microphones need some attention. I need some endcaps and some windscreens.

I have been thinking about using updating my capabilities for recording and regularly recording Katja Playing Music.

Thoughts on file formats and file names in language documentation projects and archiving

Posted on June 4, 2024 by Hugh Paterson III

I’ve written about some of these file issues before.

https://hugh.thejourneyler.org/2012/the-workflow-management-for-linguists/

https://hugh.thejourneyler.org/2012/the-data-management-space-for-linguists/

https://hugh.thejourneyler.org/2012/resources-for-digitizing-audio-as-part-of-archiving/

https://hugh.thejourneyler.org/2011/presentation-version-vs-archival-version-of-digital-audio-files/

Lexical Data Management helps (with SIL software)

Posted on November 19, 2013 by Hugh Paterson III

This is a quick note to record some of the things I have learned this week about working with lexical data within SIL's software options.

There is information scattered all over the place:
- FLEx website: http://fieldworks.sil.org
  - Google Group:https://groups.google.com/forum/#!forum/flex-list
- Toolbox website: http://www-01.sil.org/computIng/toolbox/
  - Toolbox Google Grouphttps://groups.google.com/forum/#!forum/ShoeboxToolbox-Field-Linguists-Toolbox
- Webonary Website: http://webonary.org/
  And then on Webonary about Data transfer: http://webonary.org/data-transfer/
- Solid:http://solid.palaso.org/
- Wesay: http://wesay.palaso.org/
- LingTranSoft:
  - A redundancy of the FLEx Google group: http://tiki.lingtransoft.info/tiki-view_forum_thread.php?comments_parentId=27&topics_offset=1
  - Various introductions to FLEx: http://tiki.lingtransoft.info/Introduction+to+Flex?structure=Navmenu
- MDF documentation:http://www-01.sil.org/computing/shoebox/mdf.html including this PDF
- The LIFt format: https://code.google.com/p/lift-standard/
- LiftTweaker:http://projects.palaso.org/projects/show/lifttweaker
- LiftTools: http://downloads.palaso.org/LiftTools/
- xHtml expression of lift: http://pathway.sil.org/features/standards/dictionary-xhtml-proposed-standard/
What should the purpose of the websites be? to distribute the product or to build community around the product's existence?

Software Needs for a Language Documentation Project

Posted on April 20, 2013 by Hugh Paterson III

In this post I take a look at some of the software needs of a language documentation team. One of my ongoing concerns of linguistic software development teams (like SIL International's Palaso or LSDev, or MPI's archive software group, or a host of other niche software products adapted from main stream open-source projects) is the approach they take in communicating how to use the various elements of their software together to create useful workflows for linguists participating in field research on minority languages. Many of these software development teams do not take the approach that potential software users coming to their website want to be oriented to how these software solutions work together to solve specific problems in the language documentation problem space. Now, it is true that every language documentation program is different and will have different goals and outputs, but many of these goals are the same across projects. New users to software want to know top level organizational assumptions made by software developers. That is, they want to evaluate how software will work in a given scenario (problem space) and to understand and make informed decisions based on the eco-system that the software will lead them into. This is not too unlike users asking which is better Android or iPhone, and then deciding what works not just with a given device but where they will buy their music, their digital books, and how they will get those digital assets to a new device, when the phone they are about to buy no-longer serves them. These digital consequences are not in the mind of every consumer... but they are nonetheless real consequences.
Continue reading →

Audio Dominant Texts and Text Dominant Audio

Posted on March 18, 2013 by Hugh Paterson III

As linguistics and language documentation interface with digital humanities there has been a lot of effort to time-align texts and audio/video materials. At one level this is rather trivial to do and has the backing of comercial media processes like subtitles in movies. However, at another level this task is often done in XML for every project (digital corpus curation) slightly differently. At the macro-scale the argument is that if the annotation of the audio is in XML and someone wants to do something else with it, then they can just convert the XML to whatever schema they desire. This is true.

However, one antidotal point that I have not heard in discussion of time aligned texts is specifications for Audio Dominant Text vs. Text Dominant Audio. This may not initially seem very important, so let me explain what I mean.
Continue reading →

The Look of Language Archive Websites

Posted on September 19, 2012 by Hugh Paterson III

This the start of a cross-language archive look at the current state of UX design presenting Content generated in Language Documentation.

http://www.rnld.org/archives
http://www.mpi.nl/DOBES/language_archives

http://paradisec.org.au/
http://repository.digiarch.sinica.edu.tw/index.jsp?lang=en

http://alma.matrix.msu.edu/

http://www.thlib.org/

http://www.ailla.utexas.org/site/welcome.html

Leave Typology to the Typologists: I am a Linguist

Posted on September 13, 2012 by Hugh Paterson III

A User Experience look at Linguistic Archiving

In a recent paper Jeremy Nordmoe, a friend and colleague, states that:

Because most linguists archive documents infrequently, they will never be experts at doing so, nor will they be experts in the intricacies of metadata schemas. ^[1]Jeremy Nordmoe. 2011. Introducing RAMP: an application for packaging metadata and resources offline for submission to an institutional repository. In Proceedings of Workshop on Language Documentation … Continue reading

My initial reply is:

You are d@#n right! and it is because archives are not sexy enough!

Continue reading →

References[+]

References
↑1	Jeremy Nordmoe. 2011. Introducing RAMP: an application for packaging metadata and resources offline for submission to an institutional repository. In Proceedings of Workshop on Language Documentation & Archiving 18 November 2011 at SOAS, London. Edited by: David Nathan. p. 27-32. [Preprint PDF]

The Citation Problem

Posted on August 28, 2012 by Hugh Paterson III

In a team framework where there are several members of a research team and the job requirements call for the sharing of bibliographic data (of materials referenced) as well as the actual resources being referenced. In this environment there needs to be a central repository for sharing both kinds of data. This is true for small localized (geographically) groups as well as large distributed research teams. New researchers joining a existing team need to be able to “plug-in” to existing foundational work on the project and be able to access bibliographic data as well as the resources those bibliographic details point to. It is my point here to outline some of the current challenges involved in trying to overcoming the collaborative obstacle when working in the fields of Linguistics and Language Documentation ^[1]Nikolaus P. Himmelmann. 1998. Documentary and Descriptive Linguistics. Linguistics vol. 36:161-195. [PDF] [Accessed 24 Dec. 2010].This sentiment is echoed by many in the world of science. Here is someone on Zetero’s forums [INSERT LINK]. (Though Zetero does claim to combat some of these issues.)

Bibliographic Data v.s Citation Data

Continue reading →

References[+]

References
↑1	Nikolaus P. Himmelmann. 1998. Documentary and Descriptive Linguistics. Linguistics vol. 36:161-195. [PDF] [Accessed 24 Dec. 2010]

Reflections on CRASSH

Posted on August 21, 2012 by Hugh Paterson III

In July I presented a paper at CRASSH in Cambridge. It was a small conference, but being in Europe it was good to see many of the various kinds of projects which are going on in Digital Humanities and Linguists, or also Cloud Computing and Linguistics. One particular project, TypeCraft, stands out as being rather well done and promising was presented by Dorothee Beermann Hellan. I think the ideas presented in this project are well thought out and seem to be well implemented. It would be nice to see this product integrated with some other linguistics and language documentation cloud offerings. i.e. Project Lego from the Linguist’s List or the Max Planck Institute’s LEXUS project. While TypeCraft does allow for round tripping of data with XML, what I am talking about is a consolidated User Experience for both professional linguists and for Minority language users.

A note on foundational technologies:

It appears that Lexus is is built on BaseX with Cocoon and XML.
The front page of TypeCraft has a very Wikipedia like feel, but this might not be the true foundational technology.
Linguist’s List often does their work in ColdFusion and the LEGO project definitely has this feel about it.

Types of Linguistic Maps: The Mapping of linguistic Features and Researcher Interactivity

Posted on March 22, 2012 by Hugh Paterson III

A couple of years ago I had a chance meeting with a cartographer in North Dakota. It was interesting because he asked us (a group of linguists) What is a language or linguistic map? So, I grabbed a few examples and put them into a brief for him. This past January at the LSA meeting in Portland, Oregon, I had several interesting conversations with the folks at the LL-Map Project under Linguists’ List. It occurred to me that such a presentation of various kinds of language maps might be useful to a larger audience. So this will be a bit unpolished but should show a wide selection of language and linguistic based maps, and in the last section I will also talk a bit about interactive maps. Continue reading →

The Journeyler

A walk through: Life, Leadership, Linguistics, Language Documentation, WordPress, and OS X (and a bit of Marketing & Business Administration)

Tag Archives: Language Documentation