In a recent (2010-2011) language documentation project we decided to also collect GIS data (GPS coordinates): about our consultants (place of origin and place of current residence), about our recording locations, and for geo-tagging photos. We used a Garmin eTrex Venture HC to collect the data and then compared it with GIS information from Google Maps and the national GIS information service. This write-up and evaluation of the Garmin eTrex Venture HC is based on that experience.
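To give a sense of what the comparison step involves, here is a minimal sketch in Python (the coordinates below are invented placeholders, not our project data) of measuring how far a hand-held GPS fix sits from a reference coordinate such as one read off Google Maps:

```python
# A minimal sketch of the comparison step: how far apart are a GPS fix
# and a reference coordinate (e.g. looked up in Google Maps)?
# The coordinates below are invented placeholders, not our project data.
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

etrex = (17.0608, -99.5200)    # placeholder waypoint from the GPS unit
google = (17.0611, -99.5195)   # placeholder reference coordinate
print(f"Discrepancy: {haversine_m(*etrex, *google):.1f} m")
```

A discrepancy of a few metres is what we would expect from a consumer unit; anything much larger points to a transcription error or a bad fix.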
Software I would load on my Windows machine, because I can’t on my Mac…
While I was in Mexico I realized that for the way I work, virtualization was not the best solution… so here is a list of applications I would use:
Scan Tailor http://sourceforge.net/projects/scantailor/
Qiqqa http://www.qiqqa.com/About/Features#Compare
StatPlanet http://www.sacmeq.org/statplanet
FLEx http://fieldworks.sil.org/flex/
SayMore http://saymore.palaso.org/about
Chrome http://www.google.com/chrome/intl/en/make/features.html
GSpot www.headbands.com/gspot/
Network Language Documentation File Management
This post is an open draft! It might be updated at any time…
Meta-data is not just for Archives
Bringing the usefulness of meta-data to the language project workflow
It has recently come to my attention that there is a challenge when considering the need for a network-accessible file management solution during a language documentation project. This realization comes with my first introduction to linguistic fieldwork and my first field setting for a language documentation project.

The project I was involved with was documenting four languages in the same language family. The location was in Mexico. We had high-speed Internet and a Local Area Network, and stable electricity (more often than not). The heart of the language communities was a 2-3 hour drive from where we were staying, so we could make trips to different villages in the language community, and there were language consultants coming to us from various villages. Those consultants who came to us were computer literate and were capable of writing in their language. The method of the documentation project was motivated along the lines of: “we want to know ‘xyz’ so we can write a paper about ‘xyz’, so let’s elicit things about ‘xyz’”. In a sense, the project was product oriented rather than (anthropological) framework oriented. We had a recording booth. Our consultants could log into a Google Doc and fill out a paradigm; we could run the list of words given to us through the Google Doc to a word processor, create a list to be recorded, give that list to the recording technician, and then produce a recorded list. Our consultants could also create a story, and often did, and then we would help them to revise it and record it. We had geo-social data from the Mexican government census. We had geo-spatial data from our own GPS units.

During the course of the project massive amounts of data were created in a wide variety of formats. Additionally, in the case of this project, language description was happening concurrently with language documentation. The result is that additional data is desired and generated. That is, language documentation and language description feed each other in a symbiotic relationship: description helps us understand why this language is so important to document and which data to get, while documenting it gives us the data for doing the analysis to describe the language.

The challenge has been: how do we organize the data in meaningful and useful ways for current work and future work (archiving)? People are evidently doing it all over the world… maybe I just need to know how they are doing it. In our project there were two opposing needs for the data:
- Data organization for archiving.
- Data organization for current use in analysis and in evaluating what else to document.

It could be argued that a well-planned corpus would eliminate, or reduce, the need for flexibility in deciding what else there is to document. This line of thought does have its merits. But flexibility is needed by those people who do not try to implement detailed plans. One way I can imagine bringing meta-data into the workflow itself is sketched below.
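Purely as a sketch (none of the field names or file names below come from our actual project; they are placeholders), the idea would be to write a small metadata “sidecar” file next to every recording or document as it is created, so the same record serves day-to-day analysis now and an archive deposit later:

```python
# Sketch: write a JSON "sidecar" next to each new file so the same
# metadata can serve both day-to-day analysis and later archiving.
# All field names and values here are invented for illustration.
import json
from datetime import date
from pathlib import Path

def write_sidecar(media_path, **fields):
    """Save <media>.json alongside the media file and return its path."""
    record = {"file": Path(media_path).name,
              "date_created": date.today().isoformat(),
              **fields}
    sidecar = Path(media_path).with_suffix(".json")
    sidecar.write_text(json.dumps(record, ensure_ascii=False, indent=2))
    return sidecar

print(write_sidecar("wordlist_001.wav",
                    language_iso639_3="xxx",      # placeholder code
                    consultant="C01",             # anonymised speaker ID
                    location="17.06, -99.52",     # e.g. GPS coordinates
                    genre="elicited word list"))
```

Because the sidecar travels with the file on the LAN, the same records could later be harvested into whatever structure an archive requires.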
Riddles, Poems, and Tangle-Worded Couplets
We were sitting around the kitchen table after pizza one night when the neighbor started to tell some jokes. After a few jokes, others around the table started to tell their favorite jokes. Soon the neighbor turned to me and said, “you are up next”. Fear struck my heart.
EGIDS, SIL, and Language Documentation
About two or three weeks ago Gary Simons and Paul Lewis co-presented on an extension to Fishman’s Graded Intergenerational Disruption Scale (Lewis & Simons 2010). Fishman’s scale for measuring language vitality and language endangerment has been around for about two decades (almost longer than me ;-)). The Ethnologue in its most recent version has started to list the position of each language on the EGIDS scale, something the editors are looking to expand to all languages in the Ethnologue. This has some bearing on language documentation globally, both because grant writers and funders look at EGIDS as a pivot point for language vitality and because language documentation efforts typically focus on languages at level 7 or higher on the scale (Shifting, Moribund, Nearly Extinct, etc.).
References
Lewis, Paul M. & Gary F. Simons. 2010. Assessing endangerment: Expanding Fishman’s GIDS. Revue Roumaine de Linguistique 55.2: 103–120.
Meꞌphaa Bibliography
When a linked file ends in .pdf, the plugin re-codes the link name to "pdf". This is the advertised behavior. However, when there is more than one URL, they all say "url" rather than the last part of the URI. Look at this example from above:
Steven Egland, Doris Bartholomew, Saúl Cruz Ramos (1978) La inteligibilidad interdialectal de las lenguas indígenas de México: Resultado de algunos sondeos, Instituto Lingüístico de Verano, p. 58-59, Mexico City: Instituto Lingüístico de Verano, url, url

[mendeley type="groups" id="899061" groupby="year" grouporder="desc"]
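The naming behaviour I would expect is sketched below in Python. This is only an illustration of the idea, not the plugin’s actual code: derive each link label from the last part of its URI.

```python
# Sketch of the labelling behaviour I would expect: take the last path
# segment of each URI and use its extension (or name) as the link label.
# This is not the plugin's code, just an illustration of the idea.
from urllib.parse import urlparse
from pathlib import PurePosixPath

def link_label(uri):
    name = PurePosixPath(urlparse(uri).path).name
    suffix = PurePosixPath(name).suffix
    return suffix.lstrip(".") if suffix else (name or "url")

print(link_label("http://example.com/papers/egland1978.pdf"))  # -> "pdf"
print(link_label("http://example.com/record/899061"))          # -> "899061"
```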
Using Endnote X4 for Mac
One of the most popular citation management applications among academics is Endnote. Endnote has a long history, is published by a reputable company, and has some pretty cool features. I use it (version X4) primarily because it is the only citation software which integrates natively with Apple’s word processor, Pages. There is other citation management software for OS X which claims integration with Pages, but none of those solutions are endorsed or supported by Apple. Some of the other applications which claim integration with Pages are:
- Sente
- Bookends
- Papers – This is according to Wikipedia, but I own and use Papers 1.9.7 and have not seen how to integrate it with Pages. (However, Papers2, released March 8th, 2011 does say that it supports citation integration with Pages.)
Endnote boasts a bit of flexibility and quite a few useful features. Some of the really useful features I use are listed below.
- Customizing the output style of the bibliographies. There are several linguistics journals with style sheets on Endnote’s website. Among them are:
- Linguistic Inquiry
- Journal of the International Phonetic Association
- Phonology
- Lingua
- Journal of Phonetics
- Language: The Journal of the Linguistic Society of America
- Phonetica
Additionally, there is a version of the Unified Linguistics Style Sheet available for Endnote, from the University of Manchester: http://www.llc.manchester.ac.uk/intranet/ug/useful-links/computing-resources/wordprocessing/ [.ens file]
- Looking for PDF files.
- Attaching additional meta-data to each citation. (Like ISO 639-3 Language Codes)
- Adding additional types of resources like Rutgers Optimality Archive Documents with an ROA number.
- Smart Groups of files based on desired criteria.
- Integration with Apple’s word processor Pages.
- Research Notes section in the citation’s file for creating an annotated bibliography.
- Copy out all the selected works, so that they can be pasted as a bibliography in another document.
- XML output of citation data. The XML support in Endnote has not been hailed as the greatest implementation of XML, but there are tools out there to work with it (a small sketch of one way to start inspecting an export follows this list).
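As a starting point for working with that export, here is a minimal sketch that makes no assumptions about Endnote’s schema beyond the export being well-formed XML; the file name is a placeholder:

```python
# Sketch: peek inside an Endnote XML export without assuming its schema.
# "library.xml" is a placeholder name for an XML export of the library.
import xml.etree.ElementTree as ET
from collections import Counter

tree = ET.parse("library.xml")
tags = Counter(elem.tag for elem in tree.iter())

# Print the most common element names, as a first map of the structure.
for tag, count in tags.most_common(20):
    print(f"{count:6d}  {tag}")
```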
However, regardless of how many good features I find and use in Endnote there are several things about it which irk me to no end. This is sort of a laundry list of these problematic areas.
- Can not sort by resource type. For instance, if I wanted to sort or create a smart list of all my book references, or just journal articles. This can be done, but one has to create a smart list and then set Reference Type to Contains: “Book Section”; there is no drop-down list of reference types presented to the user.
- Can not sort by custom field. I think you can do this in the interface, though it was not obvious how.
- Can not view all the custom fields for a resource type across all resources. The sorting viewer seems to be limited to eight fields at a time.
- Can not view all entries without content in a specified field. It would be especially nice to create a smart list for this.
- No export of PDFs, or of PDFs with .ris files.
- There is no keyboard shortcut to bring up the Import function (or Export) under the File menu.
- Does not rename PDFs based on the metadata of the resource. This is possible with Papers and Mendeley, where the user has the option to rename the file based on things like author, date of publication, etc.
- Can not create a smart list based on a constant in the Issue data. I have volume and issue data, and some of the citation data pulled in has the issue set as 02, 03, etc. I want to be able to find all the issues which start with a zero so I can remove the zeros; most style sheets do not remove the zeros and also do not allow for them. (A sketch of the kind of clean-up I mean follows this list.)
- Can not export PDFs with embedded metadata in the PDF.
- Can not open the folder which contains a PDF included in an Endnote library.
- Modifying a resource type does not accept |Language| Subject Language|.
- There is no guide in any of Endnote’s documentation for how to create an export style sheet. What documentation there is lives in the Help menus; I was expecting it on the producer’s website or in a book.
- When editing an entry’s metadata (i.e. the author or the title of a work), pressing TAB does not always move the cursor to the next field. If I create a new entry as a journal article, it will tab as far as the issue field, but not beyond; it gets stuck.
- There is no LAN collaboration or sharing feature for a local network solution.
- There is no cloud-based collaborative solution.
- There is no way to create a smart group based on a subset of items in a normal group, i.e. I want to create a smart group of all the references with a PDF attached, but I only want it to pull from the items in a particular group (or set of groups).
- There is no PDF preview within the application. The existing Preview is for seeing the current citation in the selected citation style (a preview of the output); it would be helpful if there were also a preview pane for viewing the PDF or the attached file.
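Since Endnote will not do that issue-number clean-up for me, this is the sort of thing I end up wanting to script. The sketch below runs over made-up records with invented field names, not Endnote’s own data model:

```python
# Sketch: the clean-up I wish Endnote offered. Given exported records
# (here just made-up dicts), find issue numbers with leading zeros and
# strip them. Field names are placeholders, not Endnote's internal ones.
records = [
    {"title": "Some article", "volume": "12", "issue": "02"},
    {"title": "Another article", "volume": "7", "issue": "3"},
]

flagged = [r for r in records if r.get("issue", "").startswith("0")]
for r in flagged:
    print("fixing:", r["title"], r["issue"])
    r["issue"] = r["issue"].lstrip("0") or "0"

print(records)
```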
Learned or Innate
I presented on Jeff Mielke’s (his web page) The Emergence of Distinctive Features.
The two questions covered in this presentation are:
- Are features learned or innate?
- Do we have sound patterns from features or do we have features from sound patterns?
PDF of Slides: [download id=”2″]
Open Source Language Codes Meta-data
One of the projects I have been involved with published a paper this week in JIPA. It is a first for me, being published. Being the thoughtful person I am, I was considering how this paper will be categorized by librarians. For the most part, papers themselves are not catalogued; rather, journals are catalogued. In a sense this is reasonable, considering all the additional meta-data librarians would have to create in their tracking systems. However, in today’s world of computer catalogues it is really a shame that a user can’t go to a library catalogue and ask which resources are related to German [deu]. As a language and linguistics researcher I would like to quickly reference all the titles in a library or collection which reference a particular language. The use of the ISO 639-3 standard can and does help with this. OLAC also tries to help with this resource-location problem by aggregating the tagged contents of participating libraries. But in our case the paper makes reference to over 15 languages via ISO 639-3 codes, so our paper should have at least those 15 codes in its meta-data entry. Furthermore, there is no way for independent researchers to list their resource in the OLAC aggregation of resources. That is, I can not go to the OLAC website, add my citation, and connect it to a particular language code.
There is one more twist which I noticed today. One of the ISO codes is already out of date. This could be conceived of as a publication error, but even if the ISO had made its change after our paper was published, the issue would still persist.
During the course of the research and publication process of our paper, change request 2009-78 was accepted by the ISO 639-3 Registrar. This is actually a good thing. (I really am pro ISO 639-3.)
Basically, Buhi’non Bikol is now considered a distinct language and has been assigned the code [ubl]. It was formerly considered to be a variety of Albay Bicolano [bhk]. As a result of this change [bhk] has now been retired.
Here is where we use the old code. On page 208 we say:
voiced velar fricative [ɣ]
- Aklanon [AKL] (Scheerer 1920, Ryder 1940, de la Cruz & Zorc 1968, Payne 1978, Zorc 1995) (Zorc 1995: 344 considers the sound a velar approximant)
- Buhi’non [BHK] (McFarland 1974)
In reality McFarland did not reference the ISO code in 1974. (ISO 639-3 didn’t exist yet!) So the persistent information is that it was the language Buhi’non. I am not so concerned with errata or getting the publication to be corrected. What I want is for people to be able to find this resource when they are looking for it. (And that includes searches which are looking for a resource based on the languages which that resource references.)
The bottom line is that the ISO does change. And when it does change, we can start referencing our new publications and data to the current codes. But there are going to be thousands of libraries out there with out-dated language codes referencing older publications. A librarian’s perspective might say that they need to add both the old and the new codes to the card catalogues. This is probably the best way to go about it. But who will notice that the catalogues need to be updated with the new codes? What this change makes me think is that there needs to be an open-source vehicle where linguists and language researchers can contribute their knowledge about language resources to a community. Then librarians can pull that meta-data from that community. The community needs to be able to vet the meta-data so that librarians feel it is credible. In this way the quality and relevance of meta-data can always be improved upon.
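To make that concrete, here is a sketch of how a catalogue record could carry both the code a publication used and its current replacement. The mapping table below is seeded only with the one change discussed above (and splits like this one mean a real mapping would not always be one-to-one); a full version would be built from the ISO 639-3 retirements list:

```python
# Sketch: carry both the code a publication used and its current
# replacement. Only the one retirement discussed in this post is listed;
# a real table would be built from the full ISO 639-3 retirements list.
RETIRED = {
    "bhk": "ubl",  # the replacement relevant here: Buhi'non Bikol [ubl], change request 2009-78
}

def catalogue_codes(code):
    """Return the set of codes a catalogue entry should carry."""
    codes = {code}
    if code in RETIRED:
        codes.add(RETIRED[code])
    return codes

print(catalogue_codes("bhk"))  # {'bhk', 'ubl'} - old and new code
print(catalogue_codes("akl"))  # {'akl'} - unchanged
```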
Simple Linguistics software
I often see good (maybe not sexy) software, like iBable, designed on the Mac for scientific purposes. I often wonder, “Why hasn’t anyone done something for or with linguistics?” Linguistics is a big field. Don’t get me wrong: it is also a field with few standards for data interoperability, and even fewer standards for data description and markup. Just seeing something like iBable inspires me to want to learn Ruby and do something for linguistic data.
The Apple developer program is only $99 a year.
Tutorial on Ruby by Phusion.