Language maps like heat maps

There is a myriad of difficulties in overlaying language data with geographical data. But it has be done and can be done. While I was working in México on a language documentation project, I learned that some of the language mixing (not quite diglossia, rather the living of two people groups with different languages in the same spaces) was due geographical factors and economical factors pulling them into the same geographic locations. In the particular case I am thinking of there was a mountain pass and a valley on the way to the major center of trade. In this sort of context the interesting things are displayed not when a polygon is drawn showing a territorial overlay of where various language speakers living, but where something is drawn showing what the density or population dispersion per general population is. Some of the most detailed (in terms of global perspective) language maps can be found in the Ethnologue [1] Lewis, M. Paul (ed.). 2009. Ethnologue: Languages of the World, Sixteenth edition. Dallas, Tex.: SIL International. .

Western Central Mexico from the Ethnologue

Western Central Mexico from the Ethnologue

However, as I was working on the language documentation project I found out how much effort actually goes into that sort of map. ArcGIS, the software used to create the maps can not auto-generate a polygon a certain distance around a combined set of given points. A set of points can be selected and each point can get a 5 mile radius. What this means is that each polygon has to be hand drawn. This sort of graphical overly that is used in the the Ethnologue [2] Map of Languages in Western Mexico in the Ethnologue. [Accessed: 9 September 2011] http://www.ethnologue.com/show_map.asp?name=MX&seq=30. [Link] does not show the density of speakers of a language in an area relative to the total population (in the Ethnologue’s defense I am not sure it is supposed to). For instance, if I wanted to know “What is the density of speakers in the Me’phaa area of México relative to speakers of other languages?” that would show me some dispersion, and by implication the peopling of the area. This sort of geographical overlay may be closer to displaying social networks, not really bilingualism or diglossia. There might be some bilinguals or some average level of bilingualism there, but the heat map method of plotting is looking still at the density of speakers to an area. A simular map might be created of New York City where certain languages are given a color based on their distribution density in the area. Additionally, these sorts of data overlays are probably more prone to lend insights on language attrition patterns or language speaker migration patterns. Also these hand drawn polygons change (a little) from edition to edition. Because the data used to create the polygons is not referenced (cited) it is hard to tell if the change is keeping pace with language attrition and/or population movement or if the changes are due to a better linguistic understanding in a particular area. When looking at the large area maps in the Ethnologue, [3] Map of Languages in the Americas in the Ethnologue. [Accessed: 9 September 2011] http://www.ethnologue.com/show_map.asp?name=Americas&seq=10. [Link] it is hard to tell if the red dots represent “traditional” language area (or geographical center thereof) or if the points represent the current geographical center of the speaking area. Either way the plotting functions as if it were a heat map showing the diversity of languages over a geographical area.

Americas Map from the Ethnologue

Americas Map from the Ethnologue

gHeat

I am generally on the look out for web apps and APIs which can be used to overlay data to bring new insights to situations through graphical representations. I recently found a tool for overlaying data on Google Maps. This tool creates heat maps given data from another source. This tool is called gHeat. This tool was brough to my attention by Been O’Steen as he modified gHeat to display some prices for student properties [4] Ben O’Steen. 2011. Student Property Heatmap. Random Hacks: Hacks, code and other things. [Accessed: 2 September 2011] http://benosteen.wordpress.com/2011/07/26/student-property-heatmap . [Link] in the UK. My initial thought was: “Wow how can we do language maps like this?”

Student Property Heat Map

Student Property Heat Map

Obviously I still think that language based heat maps could prove to provide language workers world wide access to visualizations of data that could really add clarity to the language vitality situation.

References

References
1 Lewis, M. Paul (ed.). 2009. Ethnologue: Languages of the World, Sixteenth edition. Dallas, Tex.: SIL International.
2 Map of Languages in Western Mexico in the Ethnologue. [Accessed: 9 September 2011] http://www.ethnologue.com/show_map.asp?name=MX&seq=30. [Link]
3 Map of Languages in the Americas in the Ethnologue. [Accessed: 9 September 2011] http://www.ethnologue.com/show_map.asp?name=Americas&seq=10. [Link]
4 Ben O’Steen. 2011. Student Property Heatmap. Random Hacks: Hacks, code and other things. [Accessed: 2 September 2011] http://benosteen.wordpress.com/2011/07/26/student-property-heatmap . [Link]

Interoperability of online dictionary data: 
 A test case using WordPress as a CMS

Linked data is an effort to enhance applications and thereby lives with structured knowledge. This structure at its core is developed by human interaction. The challenge to consumers of linked data is to convince holders of unstructured data to structure it into actionable, manipulatable knowledge. Continue reading

Review of Garmin eTrex Venture HC for Language Documentation

In a recent (2010-2011) Language Documentation Project we decided to also collect GIS data (GPS Coordinates), about our consultants (place of origin and place of current dwelling), about our recording locations and for Geo-tagging Photos. We used a Garmin eTrex Venture HC to collect the data and then we compared this data with GIS information from Google maps and the National GIS information service. This write up and evaluation of the Garmin eTrex Venture HC is based on this experience.
Continue reading

Software I would load on my Windows machine, because I can’t on my Mac…

While I was in Mexico I realized that for the way I work, virtualization was not the best solution… so here is a list of applications I would use:

Scan Taylor http://sourceforge.net/projects/scantailor/
Qiqqa http://www.qiqqa.com/About/Features#Compare
StatPlanet http://www.sacmeq.org/statplanet
FLeX http://fieldworks.sil.org/flex/
SayMore http://saymore.palaso.org/about
Chrome http://www.google.com/chrome/intl/en/make/features.html
GSpot www.headbands.com/gspot/