Software Needs for a Language Documentation Project

In this post I take a look at some of the software needs of a language documentation team. One of my ongoing concerns about linguistic software development teams (like SIL International's Palaso or LSDev, MPI's archive software group, or the makers of a host of other niche products adapted from mainstream open-source projects) is how they communicate the way the various elements of their software fit together to create useful workflows for linguists doing field research on minority languages. Many of these teams do not assume that potential users coming to their website want to be oriented to how these software solutions work together to solve specific problems in the language documentation problem space. It is true that every language documentation program is different and will have different goals and outputs, but many of these goals are the same across projects. New users want to know the top-level organizational assumptions made by the software developers. That is, they want to evaluate how the software will work in a given scenario (problem space) and to make informed decisions based on the ecosystem that the software will lead them into. This is not unlike users asking which is better, Android or iPhone, and then considering not just how well a given device works but where they will buy their music and digital books, and how they will get those digital assets onto a new device when the phone they are about to buy no longer serves them. These digital consequences are not in the mind of every consumer... but they are nonetheless real consequences.

Network Needs for Poly-lingual Language Documentation Project

Figure: Mephaa LangDoc Project Network Diagram

The diagram above roughly illustrates our network setup. This setup may be atypical for a language documentation field station for a couple of reasons: we had reasonable power (in both quality and quantity), despite some outages, and we had high-speed internet.

In terms of network setup, we needed a direct connection out to the internet for a team network, plus a separate network for language consultants, who would bring their own computers and share a “drop box” with us. To fill this need we could either open our network to each of the consultants or use an outside service like Dropbox. I am not sure why we did not use Dropbox. Eventually we did use Google spreadsheets for collecting word frames. Our consultants might have been atypical in that they had their own computers and some familiarity with computer use.

Single FLEx Datastore for all languages

We ran FLEx on the network against a Microsoft SQL Server backend. This was achieved by running Windows XP in a virtual machine via VirtualBox on the OS X server. We had multiple entry points for data into the “FLEx System”. We did not completely solve network access to the databases: only one person at a time could open a database with write access. Since this project, FLEx has moved from a Microsoft SQL Server backend to an XML backend. Perhaps it would have been better to use FLExBridge or LiftBridge.

Server and data store Backup

Best practice for backup calls for a three-way backup plan:

  • An onsite backup.
  • An “across town” backup, where a backup (at least weekly) is held by a friend or colleague across town.
  • An out-of-country backup.

This three-way backup is meant to:

  1. Protect from mistakes or equipment failure.
  2. Protect from theft.
  3. Protect from catastrophic events.

Our onsite backup was handled by Time Machine.

We would switch out our backup drive every week and give it to a colleague across town.

We attempted to use KKoncepts for our offsite backup. KKoncepts did not work out because it was based on a simple rsync script: every time we reorganized folders in our corpus, it would try to re-sync all of the gigabytes of data that lived under those folders. The Dropbox service is much more efficient. It looks at the block level (inside the file) and only uploads what has changed, and it mirrors the tree structure currently on the client's computer rather than re-uploading the content.
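To make the difference concrete, here is a minimal Python sketch of the two ideas: comparing files block by block, and recognizing a moved file by its content hash so that a folder reorganization does not trigger a re-upload. This is only an illustration under my own assumptions. It is not how Dropbox or KKoncepts is actually implemented, and the function names, the tiny block size, and the path-to-digest index are all hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # hypothetical; tiny so the demo is easy to follow


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Hash each fixed-size block of a file's contents."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return indexes of blocks in `new` that differ from `old` or are new.

    Blocks that only exist in `old` (the file got shorter) are not reported.
    """
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or old_h[i] != h]


def plan_sync(remote: Dict[str, str], local: Dict[str, bytes]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Decide what to move and what to upload.

    `remote` maps path -> content digest already stored offsite (hypothetical index).
    `local` maps path -> file contents on the client.
    """
    known = {digest: path for path, digest in remote.items()}
    moves, uploads = [], []
    for path, data in sorted(local.items()):
        digest = hashlib.sha256(data).hexdigest()
        if remote.get(path) == digest:
            continue  # same path, same content: nothing to do
        if digest in known:
            moves.append((known[digest], path))  # content already offsite, just relocate
        else:
            uploads.append(path)  # genuinely new or changed content
    return moves, uploads
```

With a scheme like this, moving `audio/a.wav` to `corpus/audio/a.wav` shows up as a move rather than an upload, which is the behavior we were missing with the rsync script.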

Not yet well defined are the network settings needed to run Windows XP in the virtual machine alongside OS X and Windows 7, and to establish a DNS server with the AirPort Extreme.

Note: Although the title/URL says “Multi-lingual”, this is to be understood to mean that multiple languages are being documented. The term poly-lingual also fits this particular project because the language of communication and authorship was Spanish, yet many of the network issues were resolved in English.

Chronicles of DNS & Sub-Domains on OSX Server

The Intended Setup:

We want to be able to run the OSX provided Wiki, Calendar and Blog features of the WebService. In addition we want to also run Mercurial (http://mercurial.selenic.com/) and RefBase (http://www.refbase.net/).
We want to run:

  • OSX services at mephaa.xyz
  • Mercurial at: hg.mephaa.xyz
  • Refbase at: ref.mephaa.xyz

These sites are for our workgroup only; they need not be accessible to the outside world. But if in the process we can set things up so that an invited guest could collaborate with us on our project and view our workgroup’s collaboration area, that would be OK. We will be using the MacPorts version/method of running Mercurial.
Aside: Since I originally set out to resolve this challenge, I have acquired mephaa.org as a real registered domain.

Network layout:

Figure: Mephaa LangDoc Program Network Diagram

We have a dynamic IP from our neighbor’s router. (We share the line and they are upstream. That is just the way things work in this location in Mexico. They are in turn connected to the ISP.)
The connection from the neighbor is hardwired to the WAN port on the AirPort Extreme. The AirPort Extreme is using NAT and DHCP (see settings below). I have 9 machines connected to the AirPort. One of them is the MacMini server, which is the only one hardwired to the AirPort. The rest are laptops that connect wirelessly.
The MacMini is assigned a stable IP address by the Airport Extreme based on its MAC address. The IP address for the server behind the firewall: 10.0.1.5.

The Settings on the AirPort Extreme:

The Challenge: As I presented and discussed it on Apple’s forums on Nov. 20th.

Status: (Nov 20th)

We have not purchased an outside domain name. We are just using the name of the computer as it was set up during the install of OSX.

I have the Wiki, Calendar and Blog features running at macminimarlett.local.
I can type macminimarlett.local in any web browser on the server side of the Airport Extreme and access the OSX provided WebServices (aka the wiki, blog and calendar.)

I would like to make the mercurial repository available at: hg.macminimarlett.local
I would like to make the refbase instance available at: ref.macminimarlett.local
These “additional” websites are hosted on the same machine as MacMiniMarlett.

§1. So what must I do to get hg.macminimarlett.local to resolve to anything at all?
§2. So what must I do to get hg.macminimarlett.local to resolve to my Mercurial instance?

Currently I cannot get hg.macminimarlett.local to resolve at all: “Safari can not find the server”. But browsers do find macminimarlett.local.
This leads me to think that it is a problem with my OSX server settings not with my install of Mercurial.

Suggestions offered on the 20th:

  1. Do not use .local.
  2. Do not use .private.
  3. Change the domain to something other than the computer name.
  4. Computers on the LAN can find macminimarlett.local because of Bonjour, not because of any special DNS entry.

We dropped the .local and the .private and switched to mephaa. instead of using macminimarlett.. I left the macminimarlett. zone in the DNS records just in case. This leads us to the server settings on Nov. 21st.

To this point I had been assuming that .private in the DNS registry was being translated to .local in the browsers. This was an erroneous assumption.

Server Settings: Nov 21st

MacMiniMarlett DNS Settings

Suggestions received from John on Nov. 21st:

Your DNS is incorrect. Run the terminal command:
sudo changeip -checkhostname
You need to get this sorted out because it affects a lot of services.

Here’s the crash course version: the . (period) at the end of the domain name means it is a fully qualified domain name (meaning that it is a real domain that real people use, like google.com.). Also, the primary domain record should be like this: macminimarlett.com. or macminimarlett.private. or macminimarlett.local. (Be aware that Microsoft Server 2008+ is dropping .local support and you need a real domain name and a public/dedicated IP, which means using .local isn’t future-proof.)

One thing to know, the primary domain record doesn’t have to be a fully qualified domain, but it should be as everything is heading that way in the future.

At the moment your server thinks that macminimarlett. and mepaa. are the .com part of the domain name.

Yeah there will be a lot of confusion in the mepaa domain record as there isn’t any reverse mapping for it. And the cname record is at the .com level (layer 1) which won’t resolve very well for clients.

Next, what is the forwarder settings set to? These should be set to the ISP DNS and then to the router (you can add as many DNS servers as you like for redundancy).

What is doing DHCP to the clients? What DNS are they getting? The clients need to know where your subdomains are in the network. For example if a pc is typing in hg.macminimarlett (which is a bad idea – it should be hg.macminimarlett.private or something like that) then the pc client checks the DNS server for which server (IP address) has the subdomain hg.macminimarlett – but if the DNS server doesn’t have a record of hg.macminimarlett then the DNS server will reply with not a real address (because it doesn’t know who that is).

Regards,

Nov. 22nd.

I now realize that the syntax of my DNS entries (when, and only when, I am not using a registered domain name) needs to be:

  • For a zone: <Some Name>.<Something Unique>.
  • Where the above corresponds to the following: (domain name level).(TLD level).

  • For the domain root, which is an entry in the zone: <Some Name>.<Something Unique>.
  • Where the above corresponds to the following: (domain name level).(TLD level).

  • For a subdomain, which is also an entry in the zone: <Some Name>.<Some Name>.<Something Unique>.
  • Where the above corresponds to the following: (subdomain prefix).(domain name level).(TLD level).
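As a way of checking my own understanding of that naming pattern, here is a small Python sketch that splits a name with a trailing dot into its levels. The function name `split_name` and the rule that a name must have at least two labels are my own assumptions for illustration. They reflect the pattern above, not anything Server Admin itself enforces.

```python
from typing import Dict, Optional


def split_name(name: str) -> Dict[str, Optional[str]]:
    """Split a dot-terminated DNS name into subdomain, domain and top level.

    Raises ValueError for names without a trailing dot or with fewer than
    two labels (for example "mephaa."), which is the shape my broken
    records had.
    """
    if not name.endswith("."):
        raise ValueError("expected a trailing dot: %r" % name)
    labels = name[:-1].split(".")
    if len(labels) < 2 or any(not label for label in labels):
        raise ValueError("expected at least <name>.<top level>.: %r" % name)
    return {
        "subdomain": ".".join(labels[:-2]) or None,  # None for the domain root
        "domain": labels[-2],
        "tld": labels[-1],
    }
```

For example, `split_name("hg.mephaa.private.")` separates `hg`, `mephaa` and `private`, while `split_name("mephaa.")` raises an error because there is no top level after the domain name.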

My current entries in my DNS are not set up this way. I need to change them. Before I do that I should likely run the changeip -checkhostname as suggested by John.

I ran sudo changeip -checkhostname
And this is what I got:

changeip Results

Now my question is: is this message saying I need to run this again? What am I to do with the results of the message from changeip? I read the manuals, but that did not yield any profound insights.

I added .private to all the DNS records in the DNS service, in order to fix the syntax of the DNS records as indicated by John. After that I ran changeip again. It now shows that there is nothing needing to be changed. I think this part is now resolved.

Resolved changeip Results

Now this is what the server settings are (Noon 22nd) :

Aside: I corrected a spelling error in the DNS records where mephaa was misspelled as mepaa. All the records now use the mephaa spelling, as indicated in the second picture.

I got hg.macminimarlett.private to resolve from the server to a test index.html page on the server. But I could not get it to resolve from a client on the network.

  • Is this because I have the wrong type of records?
  • Is this because I am not passing the DNS records to where the clients are looking for the records?
  • How do I pass these DNS entries to my clients?
  • Is this something I have to enter in the Airport Extreme? If so which entries on which lines?
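One way to narrow these questions down is to run the same lookup on the server and on a client and compare the answers. The Python sketch below is a hypothetical helper, not part of OSX Server. It uses the standard library call `socket.gethostbyname`, which asks whatever resolver the machine is configured to use, and the `lookup` parameter is only there so the function can be exercised without a network.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def check_names(
    names: Iterable[str],
    lookup: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, Optional[str]]:
    """Return each name with the IP it resolves to, or None if lookup fails."""
    results = {}
    for name in names:
        try:
            results[name] = lookup(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


if __name__ == "__main__":
    # Run this on the server and on a client laptop, then compare.
    for name, ip in check_names(
        ["macminimarlett.private", "hg.macminimarlett.private"]
    ).items():
        print(name, "->", ip if ip else "does not resolve")
```

If a name resolves on the server but comes back as "does not resolve" on the client, the records themselves are probably fine and the client is simply asking a different DNS server.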

Airport Extreme TCP/IP

From Camalot via the Apple forum post:

A hostname is a record of a host within the domain. For example, hg.macminimarlett.private is the hostname for the host hg within the macminimarlett.private domain.

I don’t see anything in the Server Admin titled “hostname”… There is one thing under the Primary zone that says “hostname” but what should this be set to? the IP of the computer on this LAN?

Server Admin doesn’t know what additional hostnames you want for your domain. It’s up to you to create them. You create additional records: either ‘A’ (address) records for physical machines, or ‘CNAME’ (alias) records for additional hostnames that you want to map to an existing machine.

§7. OK, so in what manner do I add hg.macminimarlett.private. to that zone? Do I add it as a CNAME, as a secondary zone, or as a Machine (A) record?

In this case it sounds like you want to add three records to your existing zone.

One A record for your server (call it whatever you want, but server.macminimarlett.private seems to make sense). Give this the IP address of your server.
Two CNAME records – one for hg and one for ref that both point to server.macminimarlett.private.

Now you’ll be able to resolve all three hostnames, and they’ll all point to the same physical IP address. From there it’s just Apache’s configuration telling it how to deal with the different requests.
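To picture what those three records do, here is a toy lookup in Python. The zone is just a dictionary and the `resolve` function is a hypothetical stand-in for what a DNS server does when it follows a CNAME to an A record. The record names follow the suggestion above, and the address is the server's LAN address mentioned earlier in this post.

```python
from typing import Dict, Optional, Tuple

# One A record for the server, two CNAME records that point at it.
ZONE: Dict[str, Tuple[str, str]] = {
    "server.macminimarlett.private.": ("A", "10.0.1.5"),
    "hg.macminimarlett.private.": ("CNAME", "server.macminimarlett.private."),
    "ref.macminimarlett.private.": ("CNAME", "server.macminimarlett.private."),
}


def resolve(name: str, zone: Dict[str, Tuple[str, str]] = ZONE, max_hops: int = 8) -> Optional[str]:
    """Follow CNAME records until an A record is found.

    Returns the IP address, or None if the name has no record in the zone.
    """
    for _ in range(max_hops):
        record = zone.get(name)
        if record is None:
            return None  # the server has no record for this name
        record_type, value = record
        if record_type == "A":
            return value
        name = value  # CNAME: look up the name it points to
    raise RuntimeError("CNAME chain too long, possible loop")
```

All three names end up at the same address, so it is then up to Apache's virtual host configuration to decide which site answers for which hostname.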

From John:

Yeah there will be a lot of confusion in the mepaa [sic] domain record as there isn’t any reverse mapping for it. And the cname record is at the .com level (layer 1) which won’t resolve very well for clients.

Next, what is the forwarder settings set to? These should be set to the ISP DNS and then to the router (you can add as many DNS servers as you like for redundancy).

What is doing DHCP to the clients? What DNS are they getting? The clients need to know where your subdomains are in the network. For example if a pc is typing in hg.macminimarlett (which is a bad idea – it should be hg.macminimarlett.private or something like that) then the pc client checks the DNS server for which server (IP address) has the subdomain hg.macminimarlett – but if the DNS server doesn’t have a record of hg.macminimarlett then the DNS server will reply with not a real address (because it doesn’t know who that is).

I am not sure what is giving DNS to the clients. I did have to put something (I think it is the IP of my neighbor’s router, see the image above) in the DNS settings of the AirPort Extreme in order to get the internet passed to the clients. So what I did was put the internal IP address of the MacMini server in the DNS field on the AirPort Extreme. I also found this interesting: http://www.dyndnscommunity.com/questions/4567/custom-dns-with-subdomain-and-airport-extreme

It seems that an AirPort Extreme will always identify itself as the DNS server. If I want the network to look for a DNS server elsewhere, then I need to follow one of these options: http://discussions.apple.com/thread.jspa?threadID=121990, http://wiki.amahi.org/index.php/Airport_express or http://discussions.apple.com/thread.jspa?threadID=2288123&tstart=0. (A restart might be required. Also, I might be looking for something called “split horizon DNS”.)

http://www.dyndns.com/support/kb/apple_airport_with_custom_dns.html
http://www.dyndnscommunity.com/questions/1087/apple-airport-does-not-create-global-dynamic-hostname-in-custom-dns-zone

Running and using MySQL On OSX

Installing
The best tutorial for running MySQL on OS X is actually found on the MySQL website.

However, there is a really cool System Preference pane that turns the MySQL server/service on or off. It reportedly works only on OSX 10.5 or in 32-bit mode on OSX 10.6.

I downloaded mysql-5.1.42-osx10.5-x86_64.dmg from MySQL.com and the included preference pane works on OSX 10.6.2. (Even though it says it is for OSX 10.5.)

I just installed it without uninstalling a previous version of MySQL. My data was migrated to a new MBP from an older MBP running OSX 10.5 (by an Apple Genius at the Apple store), and the older machine was running MySQL. So I don’t know if the older version is still there somewhere or if the /local/ folder was not brought over in the transfer.

It seems that I have avoided the issues mentioned here:

Using
As far as editing the MySQL databases goes, there used to be an app called CocoaMySQL. But as the link says, the project has been abandoned. I heard it rumored on O’Reilly that this was because the app didn’t keep up with changes made in MySQL past version 4.0. So CocoaMySQL can still be used on OSX 10.6 with a MySQL version 4 database, but not with MySQL version 5.

However, there is a new app called Sequel Pro. It is available on Google Code and boasts that it works on OSX 10.5 with MySQL versions 3-5. (I am about to test it on OS X 10.6; the application was last updated in Dec. ’09, so it should work on 10.6.)

Of course there is always phpMyAdmin.