Client-Side Content Restrictions for Archives and Content Providers

Twice since the launch of the new SIL.org website, colleagues of mine have contacted me about the new requirement on SIL.org to log in before downloading content from the SIL Language and Culture Archive. Both know that I relate to the website implementation team. I feel as if they expect me to be able to speak into this situation (as if I even had this sort of power), but I only work with the team in a loose affiliation (from a different sub-group within SIL). I don't make design decisions or social impact decisions, and I don't negotiate the politics of content distribution.

However, I think web users have some real concerns about being required to log in prior to downloading, and there are also some real considerations which web users are not recognizing.

I want to reply to these concerns.


Software Needs for a Language Documentation Project

In this post I take a look at some of the software needs of a language documentation team. One of my ongoing concerns about linguistic software development teams (like SIL International's Palaso or LSDev, or MPI's archive software group, or a host of other niche software products adapted from mainstream open-source projects) is the approach they take in communicating how to use the various elements of their software together to create useful workflows for linguists participating in field research on minority languages. Many of these teams do not recognize that potential users coming to their website want to be oriented to how these software solutions work together to solve specific problems in the language documentation problem space. Now, it is true that every language documentation program is different and will have different goals and outputs, but many of these goals are the same across projects. New users of software want to know the top-level organizational assumptions made by software developers. That is, they want to evaluate how software will work in a given scenario (problem space) and to make informed decisions based on the eco-system that the software will lead them into. This is not too unlike users asking which is better, Android or iPhone, and then deciding not just what works with a given device but where they will buy their music and their digital books, and how they will get those digital assets to a new device when the phone they are about to buy no longer serves them. These digital consequences are not in the mind of every consumer... but they are nonetheless real consequences.

To Protect and Serve…

In the U.S. we have a long tradition of citizenry, police, and military. For many years the citizenry has had distinct semantic categories for these social functions. However, I think there is evidence that at some levels these distinctions are merging. While not all citizens agree that the merger is useful, it is nevertheless happening at a political and managerial level. Terms like Law Enforcement extend beyond the traditional roles of police and bring the police into a larger, strategically orchestrated social movement. In addition to this, some of the traditional imagery surrounding police has changed. While this raises many questions about the order and structure of society in the U.S., one question which seems pertinent to ask is: Who is protected and who is served?

LA Police Shootings

More Traditional Police imagery.

Los Angeles Police Department manhunt for murder suspect

More modern police imagery.

Armed police officers search vehicles driving south in Yucaipa during the manhunt for fugitive former Los Angeles police officer Dorner

More modern police imagery.

Imagery is only one way to assess the conflation of semantic concepts. Another way to look at it would be to consider the concepts and terminology of detainees and prisoners as those concepts are practiced by Law Enforcement operators. One place to look is the terminology in the U.S. Army's Internment and Resettlement manual, FM 3-39.40, available at http://armypubs.army.mil/doctrine/19_Series_Collection_1.html or locally [PDF].

Audio Dominant Texts and Text Dominant Audio

As linguistics and language documentation interface with digital humanities, there has been a lot of effort to time-align texts and audio/video materials. At one level this is rather trivial to do and has the backing of commercial media processes like subtitles in movies. However, at another level this task is often done in XML slightly differently for every project (digital corpus curation). At the macro-scale the argument is that if the annotation of the audio is in XML and someone wants to do something else with it, then they can just convert the XML to whatever schema they desire. This is true.

However, one anecdotal point that I have not heard raised in discussions of time-aligned texts is the specification of Audio Dominant Text vs. Text Dominant Audio. This may not initially seem very important, so let me explain what I mean.

Engagement Strategy

In my work redesigning an NGO’s website, I have been recommending that the organization adopt and implement an engagement strategy. There are several challenges to this.

  1. There is the question: What is an engagement strategy that is different from what we are doing now? Basically, what is the difference between engagement strategy and operations? Certainly these are areas of an organization which need to have some symbiotic relationship.
  2. Another question has been: Why do we need an engagement strategy with our new website? – The new website is centrally managed, whereas operations are generally regionally managed.

So, the oversimplified answer is that if what a corporation presents itself as on its website is something which it operationally is not, then that presents certain discontinuities for persons viewing its operations and also viewing its website. This becomes ever more important as many potential clients first interact with an organization via the web. I first started blogging about engagement with regard to Language Development activities in a post titled: The Look of Language Development Websites.

However, engagement strategy goes beyond just presenting continuity. It gets into connecting potential clients with services offered or knowledge held by that organization. An engagement strategy for an NGO with a cause also speaks to how that NGO is going to target persons who are not aware of their cause and introduce them to the cause and provide options for those newly introduced persons to become part of a great solution for the problem just presented. This level of engagement is different than Public Relations, or brochure development (though both of these can be part of an engagement strategy).

InField

I have been working on describing the FLEx software eco-system (for both a blog post and an info-graphic). In the process I googled "language documentation" workflow and was promptly directed to resources created for InField and aggregated via ctldc.org. An amazing set of resources. The ctldc.org website is well put together and the content from InField 2010 and 2008 is amazing - I wish I could have been there. I am almost convinced that most SIL staff pursuing linguistic fieldwork should just go to InField... But it is true that InField seems to be targeted at someone who has had more than one semester of linguistics training.

Plugin Abandonment

In the open source development world there is a lot of emphasis on developing software to solve specific problems, but much less emphasis on solving those problems well. That is, solving those problems so that the most people are served, or so that users of software have the flexibility they need (there is also often a lack of commitment to User Experience Design, but this is a shameless side plug). And there is often a real lack of collaboration around competing solutions. This is evident in the software which is created for use by linguists (usually also coded by linguists for solving the linguists’ challenges), but it is also evident in a different sphere of programming: the WordPress eco-system. In the WordPress eco-system there is a plethora of plugins which are abandoned. WordPress is GPL’d and so these plugins are GPL’d too. However, the repository – the human visual interface to the repository – allows coders to grab code and modify it for their ends, but it doesn’t allow for merging once the plugin has been “updated”. (It is true that not all changes are “updates”; sometimes people need one-off solutions.) But the net result is that nearly 1/3rd of all plugins for WordPress are abandoned. Their developer has been paid and has now ended their relationship with the commissioning client, or the WordPress eco-system no longer requires the service options provided by that plugin. Matt Jones created an info-graphic to illustrate this point and to bring awareness to the problem. My comments below are my reply to him, with some minor corrections.

The SIL archive and its two sided markets

I have been thinking about the language data marketplace (exchange if one prefers), and the role of archives in a world where minority language speakers are also internet users and digital file consumers. In particular I have been thinking about SIL’s Language and Culture Archive and the economic model called a two-sided market. So, SIL as “Partners in Language Development” seems to be well situated for analysis using the two-sided market model (matching linguists and professionals with language development skills, and persons with language development skills with parties interested in developing their language). On the surface, it seems that the SIL archive would also benefit from being the center of exchange between these same two groups. This is the subject of one of my slides for an upcoming presentation, so I sketched out the interactions various SIL staff might have with the archive to see if I could diagram the social interactions around language data in SIL’s two-sided market. To my surprise, the two-sided nature of access to data in the archive is not supported, thereby blocking a data-centric archiving service. It makes me wonder what the perceived value of the archive really is, and if the perceived value is low, then why bother? What is the return on investment (ROI) for users on either side of the market?

I tried to summarize the relationships between the various clients of the archive in the following image.

Media and relationships among different roles in SIL projects.

