Capitalization in indigenous writing systems

I was recently visiting a small, remote village surrounded by large sorghum fields. The village was notable for some of the environmental literacy one could find in the area, particularly the use of capitalization in names. In fact, the name of the village had two capital letters.

Village name sign

This sort of capitalization pattern, with capital letters used word-medially, has met with objections among onomastists. The suggestion has been that English does not allow names to contain two capital letters, and that reference materials written in English containing non-English names should therefore normalize capitalization so that only the first letter of a name is capitalized. Obviously this is an uninformed but principled position to take. It is a serious matter to regularize a reference resource, because doing so gives users a filtered (and biased) view.

Is and am

Katja has two verbs of note: is and am. Last night we heard “am” for the first time. We heard the response “me am” to “you should be laying on the pillow”. In contrast to this new verb, “is” has been a long-standing verb of location: “me is up”, “me is down”, or “me blanket is”. “Is” is used almost exclusively alongside ideas of location, and is often in phrase-final position. So “is mommy?” means something like “where is it, mommy?”, whereas “mommy is?” would be “where is mommy?”

It is cute how her language choice evolves. The new lexicon is displayed, the old home speak word diaper, pronunciation evolves. In one way I loathe the change. It is sad to lose the old forms. They are often so straightforward and morphologically simple. Part of me says I should be recording this speech, but I’ll never review it. Maybe 5-10 minutes of it the night before she gets married. But in reality, not really ever.

Color terms

I’m not really sure how my daughter started to learn color terms. How does one learn that a word is not a noun (the name of the object) but the name of a color? I mean, think about it: as a parent you point to something and say “blue” or “red”. How does the child know that the parent is talking about color?

For about 4 months Katja has correctly identified and labeled blue objects. This is likely due to blue bear being such a prominent part of her life.

I think it was back in May and June that we started coloring with crayons and writing with pens. This increased her exposure to color terms.

Somewhere along the line we started talking about the colors of the handholds on her climbing wall. I think her second color term, about two months ago, was “black”.

“Black” was quickly overshadowed by “pink”. Katja has a special blanket which is pink. But there are also her pink backpack, her shoes, and tonnes of other pink things.

Along the way “yellow” has been identified and pronounced in short sessions, like reading Curious George. But as a color term it has not been a stable, self-produced word. This may be due in part to the challenge Katja has with pronouncing liquids.

This week “green” became a stable word, and within a few hours “red” did too, though there is a lot of phonological reduction going on right now.

OER Links

A few weeks ago I put together a resource ("paper") outlining an economic strategy related to Open Educational Resources (OER) and mobile-compatible resources. The purpose was to kickstart, and provide ideas for, the organization I work for as it considers alternative models of information maintenance and dissemination. The following links are more or less my list of references which did not make it into that paper.

Economically (in terms of information economy), the problem I see with CommonCore, as it is implemented in the USA across grades 1-12, is that law and policy affect the kinds of resources being produced and subsequently shared in these curriculum development co-op endeavors (OER). I think the impact is greater than originally anticipated (or perhaps not; perhaps this is a foreign policy move affecting exports of knowledge). The indirect impact of CommonCore on the consumers of these OER materials is that when people from other countries consume Open Educational Resources, they are consuming CommonCore. Thankfully, there is a lot of OER work going on at the university level and outside the scope of CommonCore.

Lexical Data Management helps (with SIL software)

This is a quick note to record some of the things I have learned this week about working with lexical data within SIL's software options.

  1. There is information scattered all over the place:
  2. What should the purpose of the websites be? To distribute the product, or to build community around the product's existence?

Audio Dominant Texts and Text Dominant Audio

As linguistics and language documentation interface with digital humanities, there has been a lot of effort to time-align texts and audio/video materials. At one level this is rather trivial to do and has the backing of commercial media processes like subtitles in movies. However, at another level this task is often done in XML slightly differently for every project (digital corpus curation). At the macro-scale the argument is that if the annotation of the audio is in XML and someone wants to do something else with it, then they can just convert the XML to whatever schema they desire. This is true.
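To make the "just convert the XML" argument concrete, here is a minimal sketch in Python. It assumes a made-up annotation format (the `text` and `segment` elements and the `start`/`end` attributes are my own invention for illustration, not any project's actual schema) and converts it to SRT-style subtitle blocks. The function names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical time-aligned annotation. The element and attribute names
# are invented for this sketch and follow no particular project's schema.
SAMPLE_XML = """
<text>
  <segment start="0.0" end="2.5">First utterance</segment>
  <segment start="2.5" end="61.25">Second utterance</segment>
</text>
"""


def to_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def xml_to_srt(xml_text: str) -> str:
    """Convert the hypothetical <segment> annotation into SRT blocks."""
    root = ET.fromstring(xml_text)
    blocks = []
    for index, seg in enumerate(root.iter("segment"), start=1):
        start = to_srt_timestamp(float(seg.get("start")))
        end = to_srt_timestamp(float(seg.get("end")))
        text = (seg.text or "").strip()
        blocks.append(f"{index}\n{start} --> {end}\n{text}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(xml_to_srt(SAMPLE_XML))
```

The conversion itself is short; the part that varies from project to project is the schema being read, which is why each corpus tends to need its own version of `xml_to_srt`.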

However, one anecdotal point that I have not heard in discussions of time-aligned texts is the need for specifications for Audio Dominant Text vs. Text Dominant Audio. This may not initially seem very important, so let me explain what I mean.

InField

I have been working on describing the FLEx software eco-system (for both a blog post and an info-graphic). In the process I googled "language documentation" workflow and was promptly directed to resources created for InField and aggregated via ctldc.org. An amazing set of resources. The ctldc.org website is well put together, and the content from InField 2010 and 2008 is amazing. I wish I could have been there. I am almost convinced that most SIL staff pursuing linguistic fieldwork should just go to InField... But it is true that InField seems to be targeted at someone who has had more than one semester of linguistics training.