This post is an open draft! It was originally started on April 23, 2011. Almost two years later it makes its public debut. It might be updated at any time.
In this post I take a look at some of the software needs of a language documentation team. One of my ongoing concerns about linguistic software development teams (like SIL International’s Palaso or LSDev, or MPI’s archive software group, or a host of other niche software products adapted from mainstream open-source projects) is the approach they take in communicating how to use the various elements of their software together to create useful workflows for linguists participating in field research on minority languages. Many of these teams do not assume that potential users coming to their website want to be oriented to how these software solutions work together to solve specific problems in the language documentation problem space. Now, it is true that every language documentation program is different and will have different goals and outputs, but many of these goals are the same across projects. New users of software want to know the top-level organizational assumptions made by software developers. That is, they want to evaluate how software will work in a given scenario (problem space) and to make informed decisions based on the eco-system that the software will lead them into. This is not too unlike users asking which is better, Android or iPhone, and then deciding not just what works with a given device but where they will buy their music and their digital books, and how they will get those digital assets to a new device when the phone they are about to buy no longer serves them. These digital consequences are not in the mind of every consumer… but they are nonetheless real consequences.
As linguistics and language documentation interface with digital humanities there has been a lot of effort to time-align texts and audio/video materials. At one level this is rather trivial to do and has the backing of commercial media processes like subtitles in movies. However, at another level this task is often done in XML for every project (digital corpus curation) slightly differently. At the macro-scale the argument is that if the annotation of the audio is in XML and someone wants to do something else with it, then they can just convert the XML to whatever schema they desire. This is true.
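As a rough illustration of that conversion argument, here is a minimal sketch in Python that turns a time-aligned XML annotation into SRT-style subtitles. The element and attribute names (`annotations`, `segment`, `start`, `end`) and the sample text are assumptions made up for this example; they are not the schema of any particular project or tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical time-aligned annotation. The element and attribute names
# are assumptions for illustration, not any project's real schema.
SAMPLE_XML = """
<annotations>
  <segment start="0.0" end="2.5">Hello, my name is Ana.</segment>
  <segment start="2.5" end="6.04">I will tell a story.</segment>
</annotations>
"""


def to_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(float(seconds) * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def xml_to_srt(xml_text):
    """Convert each <segment> into a numbered SRT subtitle block."""
    root = ET.fromstring(xml_text.strip())
    blocks = []
    for index, seg in enumerate(root.iter("segment"), start=1):
        start = to_timestamp(seg.get("start"))
        end = to_timestamp(seg.get("end"))
        text = (seg.text or "").strip()
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # SRT separates subtitle blocks with a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(xml_to_srt(SAMPLE_XML))
```

The conversion itself is short, which is the point of the macro-scale argument. What a sketch like this cannot recover is any modelling decision the source schema never recorded.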
However, one anecdotal point that I have not heard in discussions of time-aligned texts is a specification for Audio Dominant Text vs. Text Dominant Audio. This may not initially seem very important, so let me explain what I mean.
I have been working on describing the FLEx software eco-system (for both a blog post and an info-graphic). In the process I googled "language documentation" workflow and was promptly directed to resources created for InField and aggregated via ctldc.org. An amazing set of resources. The ctldc.org website is well put together and the content from InField 2010 and 2008 is amazing - I wish I could have been there. I am almost convinced that most SIL staff pursuing linguistic fieldwork should just go to InField... But it is true that InField seems to be targeted at someone who has had more than one semester of linguistics training.
I have been thinking about the language data marketplace (exchange if one prefers), and the role of archives in a world where minority language speakers are also internet users and digital file consumers. In particular I have been thinking about SIL’s Language and Culture Archive and the economic model called a two-sided market. SIL as “Partners in Language Development” seems to be well situated for a two-sided market analysis (matching linguists and professionals with language development skills, and persons with language development skills with parties interested in developing their language). On the surface, it seems that the SIL archive would also benefit from being the center of exchange between these same two groups. This is the subject of one of my slides for an upcoming presentation, so I sketched out the interactions various SIL staff might have with the archive to see if I could diagram the social interactions around language data in SIL’s two-sided market. To my surprise, the two-sided nature of access to data in the archive is not supported, thereby blocking a data-centric archiving service. It makes me wonder what the perceived value of the archive really is, and if the perceived value is low, then why bother? What is the return on investment (ROI) for users on either side of the market?
I tried to summarize the relationships between the various clients of the archive in the following image.
Media and relationships among different roles in SIL projects.
I have been working with SIL team members to help create a better experience on SIL.org. So, I am constantly looking at how people on different web projects talk about user experience making a difference. Today I was visiting the Noun Project. There were some things I didn’t like about the website, so I tried to give them some feedback. I found out that my ideas had already been suggested and that they were under review by the management and implementation team. A+ to the management team of the Noun Project – not for being perfect, but for communicating through imperfection, for being concerned enough with users to add a feedback loop, and for listening to user suggestions. The Noun Project has the edge on being the Wikipedia for icons. However, it is the project’s organizational commitment to User Experience and User Interaction which will make them succeed. As I looked at what they are doing, I noticed this quote by their co-founder:
I find working on The Noun Project inspiring because I know what we’re doing is making a difference. I constantly get emails from teachers, designers, architects…and it’s never about how much they just “like” the service. People who use The Noun Project fall in love with it, and that’s when you know you’ve built something worthwhile. - Sofya, Cofounder
At the end of the day, I want people to fall in love with the things I help build.
I feel that in the language and culture documentation community there is a tension between “documenting” and “globalizing”, in the sense that what we as digital natives and cultural technologists think of as “living” is in part “documenting”.
Now, in some sense “Language Documentation” is an academic pursuit in its own right, independent of linguistics, if it has a plan and tries to capture elements of the expression of the culture and language as it is spoken or acted out. I think there is a bit of confusion in the literature as linguists move from linguistics to language development and community development. This is particularly evident with the use of video in language documentation.
I was looking through Facebook to see if I could generate a list of videos which I have shared from YouTube… I wanted to see what I have “liked”. It would appear that though this information is available to businesses it is not available to me as a user… Sad… I kinda wanted to see what my longitudinal tastes were for videos and how much YouTube watching I do… and whether it has increased over time…
Branding and video provider
In some respects this is motivated by wanting to become more able to communicate in video forms. The videos I have enjoyed span both various videographic styles and various content genres. I have noticed that some of the creative videos I like to watch have soundtracks drawn from MTV culture and music with which I have never been acquainted, but Becky has.
I think this stop motion video of headphones is an example:
They are willing to listen to the ideas of young, fresh people.
They are willing to work with temporary staff.
They are willing to mentor.
They are willing to trust (things like project goals and budding technologies).
Each of the things listed above is a social issue. They are social issues within the context of the corporate environment. Additionally, the company has to be conscious of them to the point that it implements HR processes to allow these sorts of things to happen. In this respect these four things have to be fought for (in order to maintain them as part of the corporate culture). I currently look at the NGO I work for and wonder, what would it take to harness the power of interns? We don’t currently have the corporate culture to facilitate interns, but why is that? Is our walled garden so well constructed with bricks from the baby-boomer generation that we forget the power which comes when we can run with young people? For businesses, even for NGOs, if we don’t fight for relevance within the social networks of the up-and-coming generation then we will marginalize our significance.
In 2008 I was contacted by a professor who wanted to be able to share various linguistics exercises with fellow professors. He asked for a website to be built so that if a professor were to translate the directions of these exercises, they could in turn put the translated versions back into the “set of exercises”.
This week I have been outlining the types of data that linguists need to be able to use and relate to each other as they do Language Documentation and Linguistic Research. I try to express these things graphically and then also express where some of the leading tools which SIL International is offering sit in the problem space.
The Data Management Space for linguists with SIL software.