I was recently looking at licenses for databases and discovered the Open Database License (ODbL). This license was pioneered by the OpenStreetMap Project. I read their introduction to why the change was needed, which outlined what the change was, what it would allow them to do, who agreed, who disagreed, and what the cost of the change would be, among other things. I thought it was a very open, engaging and confidence-building way to move a group of volunteers through change. The new license allows for more, and different, kinds of product use. It is well worth a look, not only if you are interested in the open licensing of data in databases and why the CC-BY-SA and CC0 licenses do not work for data [also as PDF], but also to see how they answer the community's questions as they move the community through change.
Social Meta-data collection
As part of my job I work with the archived materials created by the company I work for. We have several collections of photos by people from around the world. In fact we might have as many as 40,000 photos, slides, and negatives. Unfortunately most of these images have no meta-data associated with them. It just so happens that many of the retirees from our company still live near the offices or volunteer in them, and much of the meta-data for these images lives in their minds. Each image tells a story. As an archivist I want to be able to tell that story to many people, but I do not know what that story is. I need to sit down, listen to that story, and make notes on each photo. This is time consuming; it takes more time than I have.
Here is the minimal data I need to collect:
Photo ID Number: ______________________________
Who (photographer): ____________________________
Who (subject): ________________________________
People group:_________________________________
When (was the photo taken): _______________________
Where (Country): _______________________________
Where (City): _________________________________
Where (Place): ________________________________
What is in the Photo: ____________________________
Why was the photo taken (At what event):_________________________
Photo Description:__short story or caption___
Who (provided the Meta-data): _________________________
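As a rough sketch of what the app would have to store, the form above could be modelled as one small record per photo. The field names below are my own invention (they are not taken from any existing app or from our archive's schema), and the date is kept as free text on the assumption that a retiree may only remember something like "about 1975".

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PhotoRecord:
    """One filled-in form. All field names are hypothetical."""
    photo_id: str
    photographer: Optional[str] = None
    subject: Optional[str] = None
    people_group: Optional[str] = None
    date_taken: Optional[str] = None       # free text, e.g. "about 1975"
    country: Optional[str] = None
    city: Optional[str] = None
    place: Optional[str] = None
    content: Optional[str] = None          # what is in the photo
    event: Optional[str] = None            # why the photo was taken
    description: Optional[str] = None      # short story or caption
    metadata_provider: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Names of the questions that still have no answer."""
        return [f.name for f in fields(self)
                if getattr(self, f.name) in (None, "")]


# A volunteer has only captured the country and who supplied the answer so far.
record = PhotoRecord(photo_id="0001", country="Mexico",
                     metadata_provider="a retiree")
print(record.missing_fields())
```

Listing the unanswered questions per photo is one way the volunteers could see at a glance which images still need a session with a retiree.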
Here is my idea: Have 2 volunteers with iPads sit down with the retirees and show these pictures on the iPads to the retirees and then start collecting the data. The iPad app needs to be able to display the photos and then be able to allow the user to answer the above questions quickly and easily.
One app which has a really nice UI for editing photos is PhotoForge. [Review].
The iPad is only the first step though. The iPad works in one-on-one sessions, with one person at a time. Part of the overall strategy needs to be a crowdsourcing effort for meta-data collection. To implement this there needs to be a central point of access where interested parties can have a many-to-one relationship with the content. This community-added meta-data may have to be kept in a separate taxonomy until it can be verified by a curator, but there is no reason to expect that it would not be valid.
However, what the app needs to do is more in line with MetaEditor 3.0. MetaEditor actually edits the IPTC tags in the photos, allowing the meta-data to travel with the images. In one sense adding meta-data to an image is annotating an image, but this is something completely different from what Photo Annotate does to images.
Photosmith seems to be a move in the right direction, but it is focused on working with Lightroom, not with a social media platform like Gallery2 & Gallery3, Flickr or CopperMine.
While looking at open source photo CMSs, one of the things we have to be aware of is that meta-data needs to come back to the archive in a Dublin Core “markup”. That is, it needs to be mapped to and integrated with our current DC-aware meta-data schema. So I looked into modules that make Gallery and Drupal “DC aware”. One of the challenges is that there are many photo management modules for Drupal. None of them will do all we want, and some of them will do what we want more elegantly (in a Code is Poetry sense) than others. In Drupal it is possible that several modules together might do what we want. But what is still needed is a theme which elegantly and intuitively pulls together the users, the content, the questions and the answers. No theme will do what we want out of the box. This is where Form, Function, Design and Development all come together – and each case, especially ours, is unique.
- Adding Dublin Core Metadata to Drupal
- Dublin Core to Gallery2 Image Mapping
- Galleries in Drupal
- A Potential Gallery module for drupal – Node Gallery
- Embedding Gallery 3 into Drupal
- Embedding Gallery 2 into Drupal
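To make the idea of a Dublin Core mapping concrete, here is a minimal sketch of a crosswalk from the form questions to the fifteen basic Dublin Core elements. The mapping is only one plausible choice on my part, not our archive's actual schema, and the form field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical crosswalk: form field -> Dublin Core element.
FORM_TO_DC = {
    "photo_id": "identifier",
    "photographer": "creator",
    "subject": "subject",
    "people_group": "subject",
    "date_taken": "date",
    "country": "coverage",
    "city": "coverage",
    "place": "coverage",
    "content": "description",
    "event": "description",
    "description": "description",
    "metadata_provider": "contributor",
}


def to_dublin_core(form):
    """Group the non-empty form answers under their Dublin Core element."""
    dc = defaultdict(list)
    for field, value in form.items():
        if value and field in FORM_TO_DC:
            dc[FORM_TO_DC[field]].append(value)
    return dict(dc)


example = to_dublin_core({
    "photo_id": "0001",
    "photographer": "",          # unanswered questions are skipped
    "country": "Mexico",
    "city": "Example City",      # made-up value for illustration
})
print(example)
```

Because Dublin Core elements are repeatable, several form answers (country, city, place) can share one element without losing any of them.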
This crowdsourcing model of meta-data collection has been implemented by the Library of Congress in the Chronicling America project, where the Library of Congress is putting images out on Flickr and the public is annotating (or “enriching” or “tagging”) them. Flickr has something called Machine Tags, which are also used to enrich the content.
There are two challenges though which still remain:
- How do we sync offline iPad enriched photos with online hosted images?
- How do we sync the public face of the hosted images to the authoritative source for the images in the archive’s files?
Network Language Documentation File Management
This post is an open draft! It might be updated at any time…
Meta-data is not just for Archives
Bringing the usefulness of meta-data to the language project workflow
It has recently come to my attention that a language documentation project needs a network-accessible file management solution, and that finding one is a challenge. This comes from my first linguistic field experience and my first field setting for a language documentation project.
The project I was involved with was documenting four languages in the same language family. The location was in Mexico. We had high-speed Internet, a local area network, and stable electricity (more often than not). The heart of the language communities was a 2-3 hour drive from where we were staying, so we could make trips to different villages in the language community, and language consultants came to us from various villages. Those consultants who came to us were computer literate and were capable of writing in their language. The methods of the documentation project were motivated along the lines of: “we want to know ‘xyz’ so we can write a paper about ‘xyz’, so let’s elicit things about ‘xyz’”. In a sense, the project was product oriented rather than (anthropological) framework oriented.
We had a recording booth. Our consultants could log into a Google Doc and fill out a paradigm; we could then move the list of words they gave us from the Google Doc to a word processor, create a list to be recorded, give that list to the recording technician, and produce a recorded list. Our consultants could also create a story, and often did; we would then help them to revise it and record it. We had geo-social data from the Mexican government census and geo-spatial data from our own GPS units. During the course of the project massive amounts of data were created in a wide variety of formats. Additionally, in this project language description was happening concurrently with language documentation, with the result that additional data was desired and generated.
That is, language documentation and language description feed each other in a symbiotic relationship. Description helps us understand why this language is so important to document and which data to get; documenting it gives us the data for the analysis needed to describe the language. The challenge has been how to organize the data in meaningful and useful ways for current work and future work (archiving). People are evidently doing it, all over the world… maybe I just need to know how they are doing it. In our project there were two opposing needs for the data:
- Data organization for archiving.
- Data organization for current use in analysis and evaluation of what else to document.
It could be argued that a well-planned corpus would eliminate, or reduce, the need for flexibility to decide what else there is to document. This line of thought does have its merits. But flexibility is needed by those people who do not try to implement detailed plans.
Finding that Apple command symbol
I have always wanted to be able to type the ⌘ symbol for various reasons, including writing tutorials, but I have not known how to access it through my keyboard. A few general, related notes:
- There is a nice write-up, including some history, on the Command Key, ⌘, on Wikipedia.
- How Apple Keyboards Lost a Logo and Windows PCs Gained One
- PopChar is an application which helps users find obscure characters.
This functionality is built in to OS X with Character Viewer, though it is likely that PopChar extends the user experience in some way.
- This discussion on the Apple Forums talks about a way to put these symbols in Pages’ auto correction so that Pages will auto correct a set of characters typed to the symbol desired. I have seen this used in MS Word too.
- A table of Unicode characters corresponding to Macintosh keyboard symbols, as they commonly appear in menus.
- Special Key Symbols
- Apple Keyboard Symbols
- Multi-stroke Key Bindings
- Keystroke mapping explained by SIL’s NRSI.
The next two links are more detailed but similar to the above.
Marginally relevant:
It is Unicode point 2318 (the HTML hex code is &#x2318;) and so you can find it in the character palette under:
- Code Tables>Unicode>2300>2318
- All Characters>Symbols>Technical Symbols
or you can go into
.
On OS X, if you switch your keyboard to Unicode Hex Input, then holding down Option allows you to type the four digits for a Unicode symbol and get the ⌘ (2318).
The Alt/Option symbol has also been elusive. It can be found at Unicode point U+2325.
Unicode and Hex Keyboard symbols
⌘ – &#x2318; – &#8984; – the Command Key symbol
⌥ – &#x2325; – &#8997; – the Option Key symbol
⇧ – &#x21E7; – &#8679; – the Shift Key (really just an outline up-arrow, not Mac-specific)
⇥ – &#x21E5; – &#8677; – the Tab Key symbol
⏎ – &#x23CE; – &#9166; – the Return Key symbol
⌫ – &#x232B; – &#9003; – the Delete Key symbol
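If you would rather not look these up by hand, the hex and decimal HTML codes can be derived from the Unicode code point. This is a small sketch using only Python's standard library; the dictionary of key names is my own labelling, and the names printed come from the Unicode character database rather than from Apple.

```python
import unicodedata

# Code points for the symbols listed above (labels are mine).
KEY_SYMBOLS = {
    "Command": 0x2318,
    "Option": 0x2325,
    "Shift": 0x21E7,
    "Tab": 0x21E5,
    "Return": 0x23CE,
    "Delete": 0x232B,
}


def describe(code_point):
    """Return the character plus its hex and decimal HTML codes."""
    char = chr(code_point)
    return {
        "char": char,
        "hex_entity": f"&#x{code_point:04X};",
        "decimal_entity": f"&#{code_point};",
        "unicode_name": unicodedata.name(char),
    }


for key, code_point in KEY_SYMBOLS.items():
    info = describe(code_point)
    print(key, info["char"], info["hex_entity"],
          info["decimal_entity"], info["unicode_name"])
```

Note that the official Unicode name of ⌘ is not "command key" at all, which is part of why it is hard to find by searching a character palette.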
Meꞌphaa Bibliography
.pdf
the plugin re-codes the link name to "pdf". This is the advertised behavior. However, when there is more than one URL, they all say "url" rather than what is the last part of the URI. Look at this example from above:
Steven Egland, Doris Bartholomew, Saúl Cruz Ramos (1978) La inteligibilidad interdialectal de las lenguas indígenas de México: Resultado de algunos sondeos, Instituto Lingüístico de Verano, p. 58-59, Mexico City: Instituto Lingüístico de Verano, url, url
[mendeley type="groups" id="899061" groupby="year" grouporder="desc"]
Notes and a Bibliography
I have been looking for a way to create Posts with both Footnotes and a Bibliography section. I have wanted to make my post a little more professional looking, and let the information flow more easily with the way I write. What I have come to realize, is that Footnotes and Endnotes are different and function differently in respect to information processing. Traditionally, in print media Endnotes have occurred at the end of the article, whereas Footnotes have occurred at the end of the page on which the footnote is mentioned. This leads to a three way breakdown:
- Footnotes
- Endnotes
- Bibliography
The purpose of footnotes is to facilitate quick information processing without breaking the reader's flow. On web-based media, the end of the article and the end of the page are the same if pagination is not enabled. So this creates a sort of syncretism between Endnotes and Footnotes. However, the greater principle of quick reference to additional information still applies on the web. There are several strategies which have tried to fill this information processing niche. These include things like:
- Tooltips (The pop-up text which appears when your mouse cursor hovers over a link or some other text.)
- Lightbox (The darker shading of the background and the high-lighting of the content in focus.)
- Pop-up windows (which have been phased out of popular "good web design").
- Information (Text) balloons. An example of this is Wikipop, which is really a combination of the effects mentioned above to create an inline experience for the user. Some web-sites have a similar effect which is dependent on the mouse hovering over the "trigger".
With strategies for conveying information like Tooltips it is possible to meet the same information communication and information processing goals which were formerly achieved through footnotes. For web-based information which is intended to be consumed through a web medium, Wikipop makes a lot of sense. However, if the goal is a good printout of the content then footnotes are still needed; that is why I am using footnotes on this particular web presentation. (A solution which does both, tooltips or something like Wikipop on screen and footnotes when the content is printed, would be ideal.)
So here is a quick post on how I am doing it.
I am using two different "endnotes" plugins. One for the Bibliography section and the other for the Notes section.
Creating the Footnotes section:
To create the notes section I have elected to use a plugin called Footnotes by Rob Miller. (Big surprise on the name of the plugin...) There are other options for footnote plugins; one I know about is FD-Footnotes. Footnotes allows me to put what I want to show up as footnotes in <ref>something</ref> tags. (In order to get these tags to display inside of <code> and </code> tags I had to use HTML codes for the greater-than sign, less-than sign and slash. There is some additional good information about character encoding in HTML on Wikipedia.)
Additionally I can set a tag <reference /> anywhere in the post and produce a list of footnotes.
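For anyone who does not want to type the HTML codes by hand, the escaping can be scripted. This is just a sketch using Python's standard html module; it is not part of either plugin. Note that it only replaces the less-than sign, greater-than sign, ampersand and quotes, and leaves the slash alone.

```python
import html

# The literal tag text I want readers to see in the post.
snippet = "<ref>something</ref>"

# Replace the characters a browser would treat as markup.
escaped = html.escape(snippet)
print(escaped)

# Unescaping gives back the original text, so nothing is lost.
assert html.unescape(escaped) == snippet
```

The escaped text can then be pasted between the code tags in the post editor, and the browser will show the tag rather than interpret it.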
Creating the Bibliography:
To create the Bibliography Section I am using WP-Footnotes (in the WP plugins repository) by Simon Elvery. More information can be read about his plugin here. What this plugin allows me to do is to craft the citation of the item I want to cite. I have to figure out how I want to "code" the citation and then present the citation.
((Hand code the contents of the citation as it is to appear in the bibliography here, between a set of double parentheses.))
This will produce a citation marker (a number) as a superscript in line with the text, like this: [2]
And that will produce a citation in the bibliography section like the following:
[2] Nikolaus P. Himmelmann. 1998. Documentary and Descriptive Linguistics. Linguistics vol. 36:161-195. [PDF] [Accessed 24 Dec. 2010]
One interesting thing on the admin side of WordPress is that the plugin WP-Footnotes has an options page which shows up in the Settings menu, but in that menu it is called Footnotes, not WP-Footnotes.
The options for WP-Footnotes really make it flexible. It is these settings which have allowed me to rename the section from Notes to Bibliography.
Final solution?
Is this my final solution? No. One thing I really don't like is that the bibliography is not ordered alphabetically by last name and then by year of publication. Rather, citations are listed in order of appearance (as footnotes generally are). The plugin does not have any options for changing the order that things appear in (though the headings on the ordered list can be changed). There is also no way to structure the data in the bibliography for reuse (even if it is just within this site), so each use of each citation must be hand-crafted with love. There are some other solutions which I am looking at integrating with this one but have not had time to really explore. One option is to integrate with Mendeley and aggregate bibliography data from a Mendeley collection. Another option is to create bibliographies as BibTeX files and then use those to display the bibliography.
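The ordering I would like is easy to state in code once the citations are stored as structured data rather than hand-crafted strings. This is only a sketch of the sort itself, not something either plugin can do; the Citation type is hypothetical, and the two "Example" entries are made up to show the tie-break on year.

```python
from typing import List, NamedTuple


class Citation(NamedTuple):
    """A hypothetical structured citation."""
    last_name: str
    year: int
    title: str


def sort_bibliography(citations: List[Citation]) -> List[Citation]:
    """Order by author's last name (ignoring case), then by year."""
    return sorted(citations, key=lambda c: (c.last_name.casefold(), c.year))


in_order_of_appearance = [
    Citation("Himmelmann", 1998, "Documentary and Descriptive Linguistics"),
    Citation("Example", 2001, "A made-up later work"),
    Citation("Egland", 1978, "La inteligibilidad interdialectal de las lenguas indígenas de México"),
    Citation("Example", 1999, "A made-up earlier work"),
]

for c in sort_bibliography(in_order_of_appearance):
    print(f"{c.last_name} ({c.year}) {c.title}")
```

Storing the citations this way would also address the reuse problem, since the same record could be cited from more than one post.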
No hCite format defined
I am looking to re-skin Wikindex. I thought that I would add some CSS classes that would embed the meta-data in such a way that the citations could be picked up by Zotero quite easily. It seems to be a bit more difficult than I first anticipated, as a Microformat for citations has not yet been fully fleshed out. Obviously one way to go would be to embed everything in a span element as COinS does, but that is not really what I am looking for (mostly because I don’t have a way to generate the attributes in the span element automatically). I have thought of using RDFa, but I still need to do some more research and see which controlled vocabularies to use. I am hoping that this Lesson On RDFa will really help me out here. Finally, I do need to know something about OAI so that once the resources are put into Wikindex I can tell OLAC what language they belong to.
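For what it is worth, generating the COinS attributes does not have to be done by hand. Below is a minimal sketch, under the assumption that the citation is a journal article and that the usual OpenURL key/value pairs are wanted; the function name and its parameters are my own, it is not Wikindex code, and I have not tested the output against Zotero.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(article_title, journal, year, volume,
               start_page, end_page, author_last, author_first):
    """Build an empty COinS span for a journal article (hypothetical helper)."""
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", article_title),
        ("rft.jtitle", journal),
        ("rft.date", str(year)),
        ("rft.volume", str(volume)),
        ("rft.spage", str(start_page)),
        ("rft.epage", str(end_page)),
        ("rft.aulast", author_last),
        ("rft.aufirst", author_first),
    ]
    # URL-encode the pairs, then escape the ampersands for use in an attribute.
    title = escape(urlencode(pairs), quote=True)
    return f'<span class="Z3988" title="{title}"></span>'


span = coins_span("Documentary and Descriptive Linguistics", "Linguistics",
                  1998, 36, 161, 195, "Himmelmann", "Nikolaus P.")
print(span)
```

The span is empty on purpose: COinS carries the citation in the title attribute, so the visible citation can be styled however the new skin requires.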
Leadership in an OpenSource Project
In the past week I have been confronted with several issues related to project planning, task & time management and project execution. Just defining the “deliverables” has been a real challenge. Given that the workforce of the company I work for is largely made up of people who consider themselves to be volunteers, it makes for an interesting work environment. I naturally gravitate towards planning for tactical success and wanting to view things from the “big picture” perspective – knowing how the parts fit together. Project planning and project execution involve a lot of decision making and a lot of communicating about decisions.
Over the last year I have been watching with some interest the UI development of WordPress. UI design is an area that I really enjoy. So when I saw Jane presenting on “How decisions get made at WordPress” (on the Open Source part of the project), I thought I would watch it. I expected to be watching how a company does UI decision making, but the focus of the talk was broader than that. It was generally good to see a model at work in a company where there is a successful product. As I listened to the discussion I was struck by how their project deals with:
- Decision Making
- Community Involvement
- Consensus Building
- Project Planning
- Leadership
- Sustainability
In many respects the company I work with deals with these same issues. It was good to see how another company/project deals with these issues, and sees these kinds of issues as important to the success of their product.
Open Source Language Codes Meta-data
One of the projects I have been involved with published a paper this week in JIPA. Being published is a first for me. Being the thoughtful person I am, I was considering how this paper will be categorized by librarians. For the most part papers themselves are not catalogued. Rather, journals are catalogued. In a sense this is reasonable considering all the additional meta-data librarians would have to create in their meta-data tracking systems. However, in today’s world of computer catalogues it is really a shame that a user can’t go to a library catalogue and ask what resources are related to German [deu]. As a language and linguistics researcher I would like to quickly reference all the titles in a library or collection which reference a particular language. The use of the ISO 639-3 standard can and does help with this. OLAC also tries to help with this resource location problem by aggregating the tagged contents of participating libraries. But in our case the paper makes reference to over 15 languages via ISO 639-3 codes, so our paper should have at least those 15 codes in its meta-data entry. Furthermore, there is no way for independent researchers to list their resource in the OLAC aggregation of resources. That is, I can not go to the OLAC website, add my citation, and connect it to a particular language code.
There is one more twist which I noticed today too. One of the ISO codes is already out of date. This could be conceived of as a publication error. But even if the ISO had made its change after our paper was published then this issue would still be persistent.
During the course of the research and publication process of our paper, change request 2009-78 was accepted by the ISO 639-3 Registrar. This is actually a good thing. (I really am pro ISO 639-3.)
Basically, Buhi’non Bikol is now considered a distinct language and has been assigned the code [ubl]. It was formerly considered to be a variety of Albay Bicolano [bhk]. As a result of this change [bhk] has now been retired.
Here is where we use the old code, on page 208 we say:
voiced velar fricative [ɣ]
- Aklanon [AKL] (Scheerer 1920, Ryder 1940, de la Cruz & Zorc 1968, Payne 1978, Zorc 1995) (Zorc 1995: 344 considers the sound a velar approximant)
- Buhi’non [BHK] (McFarland 1974)
In reality McFarland did not reference the ISO code in 1974. (ISO 639-3 didn’t exist yet!) So the persistent information is that it was the language Buhi’non. I am not so concerned with errata or getting the publication to be corrected. What I want is for people to be able to find this resource when they are looking for it. (And that includes searches which are looking for a resource based on the languages which that resource references.)
The bottom line is that the ISO standard does change. When it does, we can start referencing our new publications and data to the current codes. But there are going to be thousands of libraries out there with out-dated language codes referencing older publications. A librarian’s perspective might say that they need to add both the old and the new codes to the card catalogues. This is probably the best way to go about this. But who will notice that the catalogues need to be updated with the new codes? What this change makes me think is that there needs to be an Open Source vehicle where linguists and language researchers can contribute their knowledge about language resources to a community. Then librarians can pull that meta-data from that community. The community needs to be able to vet the meta-data so that the librarians feel that it is credible. In this way the quality and relevance of meta-data can always be improved upon.
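The "add both the old and the new codes" approach can be sketched as a small lookup. The table below holds only the one change described in this post (a real table would have to come from the registrar's published list of retirements), and the function name is my own.

```python
from typing import Dict, List

# Retired code -> codes that replace it. Only the change discussed above is
# listed; treat the table as an illustrative assumption, not complete data.
RETIRED_CODES: Dict[str, List[str]] = {
    "bhk": ["ubl"],
}


def with_successors(codes: List[str],
                    retired: Dict[str, List[str]] = RETIRED_CODES) -> List[str]:
    """Keep every original code and append the successors of retired ones."""
    expanded: List[str] = []
    for code in codes:
        code = code.lower()
        for candidate in [code, *retired.get(code, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# The two codes quoted from page 208 of our paper.
print(with_successors(["AKL", "BHK"]))
```

Keeping the retired code alongside the new one means that a search on either code still finds the older publication.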
Better Extended Live Archives
A long time ago, when WordPress was young (like version 1.5) and K2 was young, there was a plugin called Extended Live Archive (ELA).
I love the organization that this plugin gave to a blog’s entries. It is still my preferred presentation of posts on a blog. Over the years all the software has developed: K2 is now at version 1.0.3, WordPress is at version 3.0, and ELA has become Better Extended Live Archive (BELA), thanks to Charles.
Here is a series of links – in no particular order – which talk about the development of ELA.
- http://74.125.95.132/search?q=cache:HamcenI2NhAJ:www.sonsofskadi.net/extended-live-archive/extended-live-archive-0-9/+extended-live-archive&cd=10&hl=en&ct=clnk&gl=us
- http://sonsofskadi.net/extended-live-archive/extended-live-archive-0-9/
- http://www.flickr.com/groups/ela-support/
- http://lemoned.me/archives/ela-for-wp23
- http://iwaruna.com/2008/11/03/extended-live-archives-ela-plugin-works-finally/
- http://sonsofskadi.net/extended-live-archive/
I have had a problem with how BELA presents the entries by date:
Notice how all the blog entries at the bottom are displayed on top of each other.
I can not figure out how to un-do that.
Notice also in the following two pictures of the sort by Tags and sort by Category list the entries are not displayed on top of each other.
For checking this live: you can look at the archive. This has been checked in several browsers:
- Safari
- Flock
- Firefox
- Cruz
All to no avail. (That is it does not appear to be a Browser based issue.)
The Offending Element: