This post is an open draft! It might be updated at any time… But was last updated on at .
Meta-data is not just for Archives
Bringing the usefulness of meta-data to the language project workflow
It has recently come to my attention that finding a network-accessible file management solution is a real challenge during a language documentation project. This realization comes from my first linguistic field experience and my first field setting for a language documentation project.
The project I was involved with was documenting four languages in the same language family. The location was in Mexico. We had high-speed Internet, a Local Area Network, and stable electricity (more often than not). The heart of the language communities was a 2-3 hour drive from where we were staying, so we could make trips to different villages in the language community, and there were language consultants coming to us from various villages. Those consultants who came to us were computer literate and were capable of writing in their language.
The methods of the documentation project were motivated along the lines of: “we want to know ‘xyz’ so we can write a paper about ‘xyz’, so let's elicit things about ‘xyz’”. In a sense, the project was product oriented rather than (anthropological) framework oriented.
We had a recording booth. Our consultants could log into a Google-doc and fill out a paradigm; we could then run the list of words given to us through the Google-doc into a word processor and create a list to be recorded. We would give that list to the recording technician and then produce a recorded list. Our consultants could also create a story, and often did; we would then help them to revise it and record it. We had Geo-Social data from the Mexican government census. We had Geo-spatial data from our own GPS units.
During the course of the project massive amounts of data were created in a wide variety of formats. Additionally, in the case of this project, language description is happening concurrently with language documentation. The result is that additional data is desired and generated.
That is, language documentation and language description feed each other in a symbiotic relationship. Description helps us understand why this language is so important to document and which data to get; documenting it gives us the data for doing the analysis needed to describe the language. The challenge has been: how do we organize the data in meaningful and useful ways for current work and future work (archiving)? People are evidently doing it, all over the world… maybe I just need to know how they are doing it. In our project there were two opposing needs for the data:
- Data organization for archiving.
- Data organization for current use in analysis and evaluation of what else to document.
It could be argued that a well-planned corpus would eliminate, or at least reduce, the need for flexibility to decide what else there is to document. This line of thought does have its merits. But flexibility is still needed by those people who do not try to implement detailed plans.
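To picture how the two needs might coexist, here is a minimal sketch in Python. It assumes each file carries a small meta-data record, so the same set of files can be viewed in an archive-style grouping or in a working grouping without moving anything. The file names, field names, and values are all hypothetical and are not taken from the project.

```python
from collections import defaultdict

# Hypothetical records: file names, field names, and values are
# illustrative assumptions, not data from the project.
records = [
    {"file": "rec_001.wav", "language": "Language A", "kind": "word list", "topic": "verb paradigm"},
    {"file": "story_004.txt", "language": "Language B", "kind": "story", "topic": "narrative"},
    {"file": "rec_002.wav", "language": "Language A", "kind": "story", "topic": "narrative"},
]


def group_by(records, field):
    """Return {field value: [file names]} without moving any files."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record["file"])
    return dict(groups)


by_language = group_by(records, "language")  # an archive-style view
by_topic = group_by(records, "topic")        # a working view for analysis
```

The point of the sketch is only that the grouping lives in the meta-data rather than in the folder layout, so adding a new view does not require reorganizing the files themselves.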