Several months ago, I posted a question to Facebook about digital literacy.
What is the role or place of Digital Literacy in a company that values literacy as being vital to reaching its goals?
I have had several months to contemplate the question, and I realize that my question was a bit ambiguous, or rather that it could not have been understood precisely. Digital Literacy can be, and is, used to mean …
A document’s DOI (http://www.doi.org/ or on Wikipedia under Digital Object Identifier) is an important part of the citation of a document. Many style sheets allow just the DOI of a paper as the citation. Because DOIs are unique, they can act as URIs which are resolvable and look like URLs. However, a DOI is different from a URL: the DOI identifies the digital object, while a URL records where that object might be located. It might well be argued that a DOI should be tracked in the metadata schemes of archives which collect language and linguistic data.
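To make the distinction concrete, here is a minimal Python sketch of how an archive record might keep the DOI and the storage location as separate fields and derive a resolvable link from the DOI. The record layout, the `archive.example.org` address, and the function name `doi_to_url` are my own assumptions for illustration; the only outside fact relied on is that a DOI can be resolved by appending it to `https://doi.org/`.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI (or a 'doi:'-prefixed one) into a resolvable URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Every DOI starts with the "10." directory indicator and has a prefix/suffix slash.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep the slash; percent-encode anything that is unsafe in a URL path.
    return DOI_RESOLVER + quote(doi, safe="/")


# Hypothetical archive record: the DOI identifies the object,
# the location says where one copy currently lives.
record = {
    "title": "Example document",
    "doi": "10.1000/182",
    "location": "https://archive.example.org/files/1234.pdf",
}

print(doi_to_url(record["doi"]))
```

If the file moves, only `location` changes; the DOI, and any citation that uses it, stays the same.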
I have been trying to find out the best way to present audio on the web. This led me to look at how to present video too. I do not have any conclusions on the matter, but I have been looking at HTML5 without using JavaScript or Flash. Because my platform (CMS) is WordPress, …
Metadata is very important; everyone agrees. However, there is some discussion when it comes to how to develop metadata and also how to ensure that the metadata is accurate. Taxonomies are limited vocabularies (a set number of items) where each term has a predefined definition. A folksonomy is a vocabulary where people, usually users of data, assign their own useful words or metadata to an item. Folksonomies are like taxonomies in that both are sets, but they differ in that a folksonomy is an open set whereas a taxonomy is a closed set.
An example of a taxonomy might be the colors of a traffic light: Red, Yellow, and Green. If this were a folksonomy, people might also suggest the colors Amber, Orange, Blue-Green and Blue. These additional terms may be accurate to some viewers of traffic lights, or in some cases, but they do not fit the stereotypical model of the colors of traffic lights.
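The closed-set versus open-set difference can be sketched in a few lines of Python using the traffic light example. The function names are mine and purely illustrative; this is one way to model the idea, not a description of any particular cataloguing system.

```python
from collections import Counter

# Closed set: the only terms a cataloguer is allowed to use.
TRAFFIC_LIGHT_TAXONOMY = frozenset({"Red", "Yellow", "Green"})


def tag_with_taxonomy(term: str) -> str:
    """Accept a term only if it belongs to the predefined vocabulary."""
    if term not in TRAFFIC_LIGHT_TAXONOMY:
        raise ValueError(f"{term!r} is not in the taxonomy")
    return term


def tag_with_folksonomy(folksonomy: Counter, term: str) -> str:
    """Accept any term and remember how often it has been used."""
    folksonomy[term] += 1
    return term


user_tags = Counter()
for colour in ["Red", "Amber", "Green", "Blue-Green", "Amber"]:
    tag_with_folksonomy(user_tags, colour)

# The open set has grown to include terms the taxonomy would reject.
print(sorted(set(user_tags) - TRAFFIC_LIGHT_TAXONOMY))
```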
Some examples of taxonomies might be the keywords on a book record in a library. A library might have only certain keywords it uses. In contrast to curated records at libraries, websites like Flickr and Delicious allow users to tag (or keyword) their photos and links with the keywords which are useful to them. These are examples of folksonomies. However, the concept of user-generated metadata goes beyond the folksonomy to any and all user-generated metadata. In this scope, projects like LibraryThing and Bibsonomy deserve to be mentioned as sites where user-generated metadata plays a powerful part in the organizational presentation of the content on the site.
So the question becomes: how are managers of data, like webmasters or librarians, to ensure the quality of metadata, and also balance that quality with the usefulness of the metadata to the users of the data? If visitors to the library cannot find the book they are looking for because the way they are looking for it (the terms they are using) is not supported (those terms are not associated with the record for the book), then the cataloguing record is not as useful to that person. But if the library opens up its records for everyone to edit, then how is the library to know that the records are accurate?
In linguistics there are several important taxonomies.
In this context there is also a multilingual element: each term may have several variations across languages, e.g. Phonology in English is Phonologie in German.
And in library science there are also several important taxonomies.
And every company or institution is going to have their own special taxonomies for various purposes.
SIL International unique taxonomies
The challenge for “marketing”, or enabling the rapid and useful discovery and association of resources, is for the institution to spend as little effort as possible describing resources and to allow users to provide accurate metadata which is helpful to them. After all, their mental associations are very important to the use and discovery of relevant resources. So the question is: how can users add metadata value to objects in the archive? And how can the institution trust these proposed added-value elements? SIL International, as the host institution of the Language and Culture Archive, is not alone in this problem space.
Basically what is needed is an algorithm for turning unstructured data into valuable, valued, authoritative, structured data.
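As a rough illustration of what such an algorithm might look like, here is a minimal Python sketch of one possible heuristic: a user-supplied tag becomes a candidate for the controlled vocabulary once enough distinct users have independently applied it. The threshold, the function name `propose_taxonomy_terms`, and the sample data are all my own assumptions, and the output would still need review by a cataloguer before it could be called authoritative.

```python
from collections import defaultdict


def propose_taxonomy_terms(taggings, min_users=3):
    """Return normalized tags used by at least `min_users` distinct users.

    `taggings` is an iterable of (user, item, tag) tuples.
    """
    users_by_tag = defaultdict(set)
    for user, _item, tag in taggings:
        # Fold case and collapse whitespace so "Field Notes" and
        # "field  notes" count as the same term.
        normalized = " ".join(tag.lower().split())
        if normalized:
            users_by_tag[normalized].add(user)
    return sorted(
        tag for tag, users in users_by_tag.items() if len(users) >= min_users
    )


# Hypothetical taggings of archive items by four users.
sample = [
    ("ann", "item-1", "Phonology"),
    ("bob", "item-1", "phonology"),
    ("cat", "item-2", "PHONOLOGY "),
    ("ann", "item-2", "field notes"),
    ("dan", "item-3", "Field  Notes"),
    ("ann", "item-3", "phonology"),  # same user again: not counted twice
]

print(propose_taxonomy_terms(sample, min_users=3))
```

Counting distinct users rather than raw tag applications is a deliberate choice here: it keeps one enthusiastic tagger from promoting a term on their own.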
An algorithm for turning folksonomies into taxonomies
As I have stated above, SIL International is not alone in this problem space. Several studies and use cases on this very kind of problem have been carried out and published.
Analysis of User Generated Metadata in the Library Thing Folksonomy, Vincent Sterken
The use of social discovery systems is rapidly expanding, often building vibrant and interactive communities. Some public and academic libraries are trying out these systems, in which patrons can contribute ratings, reviews, and comments. While user-contributed metadata may not equal the quality of professional cataloging, it can enhance the catalog records with rich supplementary information and personal perspectives. The author’s examination of use of social features in two public libraries led to the discouraging observation that addition of user-generated metadata in these contexts was limited, in sharp contrast to other social sites. The question of motivation is key. People’s notions of library catalog records and their ownership by library staff may present an obstacle to contributing metadata. User-generated metadata has the potential to add value to records while conserving limited library resources. The challenge of promoting the active use of social discovery systems in libraries demands further research.
The Continuum of Metadata Quality: Defining, Expressing, Exploiting
http://www.ecommons.cornell.edu/handle/1813/7895
Like pornography, metadata quality is difficult to define. We know it when we see it, but conveying the full bundle of assumptions and experience that allow us to identify it is a different matter. For this reason, among others, few outside the library community have written about defining metadata quality. Still less has been said about enforcing quality in ways that do not require unacceptable levels of human effort.
Metadata creation system for mobile images
http://dl.acm.org/citation.cfm?id=990064.990072
User-Generated Metadata for ETDs: Added Value for Libraries Sharon Reeves
http://epc.ub.uu.se/etd2007/files/papers/paper-40.pdf
Making Use of User-Generated Content and Contextual Metadata Collected during Ubiquitous Learning Activities
During the last years significant research efforts have been conducted looking at how to standardize digital educational content. Due to better connectivity and computational power of mobile devices, new opportunities have emerged for collecting user-generated data based on the context and the environment where the content has been generated. While metadata standards for learning objects such as IEEE LOM make it possible to annotate digital content with pre-defined metadata tags, the ability to store custom user-generated or contextual metadata is not yet fully supported. The need for developing a flexible solution to deal with these problems motivated the design of our activity controller system (ACS), a rapid prototyping system and a task manager, which interprets, reacts to and stores contextual metadata and content extracted during learning activities. This paper presents how ACS facilitates coordination and reusability of user generated data, which we believe is as a valuable feature compared with existing standards and initiatives.
Annotea and Semantic Web Supported Collaboration
http://ceur-ws.org/Vol-137/01_koivunen_final.pdf
Mapping Entry Vocabulary to Unfamiliar Metadata Vocabularies.
Author-generated Dublin Core Metadata for Web Resources: A Baseline Study in an Organization
Jane Greenberg, Maria Cristina Pattuelli, Bijan Parsia and W. Davenport Robertson. Author-generated Dublin Core Metadata for Web Resources: A Baseline Study in an Organization. http://journals.tdl.org/jodi/article/viewArticle/42/45
This paper reports on a study that examined the ability of resource authors to create acceptable metadata in an organizational setting. The results indicate that authors can create good quality metadata when working with the Dublin Core, and in some cases they may be able to create metadata that is of better quality than a metadata professional can produce. This research suggests that authors think metadata is valuable for resource discovery, that it should be created for Web resources, and that they, as authors, should be involved in metadata production for their works. The study also indicates that a simple Web form, with textual guidance and selective use of features (e.g. pop-up windows, drop-down menus, etc.) can assist authors in generating good quality metadata.
However, the relationships between datasets and the data created by those data sets have been growing over the past few years.
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/ CC-BY-SA
I am becoming convinced that at some point there will be enough open data out there to reach a tipping point: if your data is not shared in this way, app producers will not process it (not without significant extra charge in home-grown apps, and not at all in externally produced data-consuming apps). This means that the social significance of open and Linked Data in RDF will outweigh that of more labor-intensive proprietary data sets.
I was watching this video, where several web apps and several mobile apps were developed and competed for a prize. What one can do with this data is incredible.
I particularly like the app which tells you how long it takes someone in London to travel from point A to point B.
So where does this come into play with SIL International? Well, SIL is an NGO. NGOs need engagement strategies. That is, non-profits and NGOs operate to effect change. They have a compelling story, they tell the story, and the hearers of the story are motivated to take some sort of action.
An engaged employee population is a strategic asset that enables organizations to inspire and mobilize their people to achieve specific business objectives. – http://engagementstrategies.com/
This has been the very nature of the Kony 2012 video and story. Their web presence is not about marketing, it is not about messaging, it is not about branding or color palettes. It is about engaging people to commit to a certain set of activities. The Kony campaign’s entire web presence, from the scripting of the YouTube film to the design of their website, is about getting people to commit to and carry out those suggested activities.
But how does this relate back to RDF and Linked Data? Well, if web apps and mobile apps are going to present data to users and work through the presentation challenges of User Experience and User Interface in multiple locations and contexts, then it becomes in the interest of NGOs, as data providers, to provide data which will affect users for their cause. Some NGOs, like SIL, are very involved in content production. Consider the 40,000-plus items in the SIL bibliography of academic and vernacular works produced over their 75+ year history. These bits of content, or resources, are describable in RDF for data consumers. The obvious question is “Why?” The answer is simple: so that when others use Linked Data, your resources are found and thereby promote awareness of your cause.
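To give a feel for what “describable in RDF” could mean for a single bibliography item, here is a minimal Python sketch that writes three statements about one resource as N-Triples, using the Dublin Core element set. The resource URI, the title, the author name, and the language code are invented for illustration and are not taken from the SIL bibliography; a real project would more likely use an RDF library than hand-built strings.

```python
# Dublin Core Metadata Element Set, version 1.1
DC = "http://purl.org/dc/elements/1.1/"


def literal(value: str, lang: str = "") -> str:
    """Format a string as an N-Triples literal, escaping special characters."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


def describe(resource_uri: str, title: str, creator: str, language: str) -> str:
    """Return N-Triples lines describing one bibliography item."""
    triples = [
        (resource_uri, DC + "title", literal(title, "en")),
        (resource_uri, DC + "creator", literal(creator)),
        (resource_uri, DC + "language", literal(language)),
    ]
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in triples)


# Hypothetical item; every value below is a placeholder.
print(
    describe(
        "http://example.org/bibliography/12345",
        'A "Sample" Grammar Sketch',
        "Jane Example",
        "xyz",
    )
)
```

Each line is a self-contained statement, which is what lets a data consumer merge descriptions from many providers without coordinating with any of them.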
Let’s say that the organization Invisible Children released 100,000 images of children who were carrying AK-47s and shooting their parents and were maimed or raped. Let’s also say that these images were geo-tagged for the locations they were taken in, and that this metadata and these images were made available as Linked Data. Then, when global leaders in internet mapping technologies like Google, Wikipedia, and Yahoo! create web-based applications which display geospatial content from Linked Data sources, whose content do you think is going to be displayed when someone is looking for pictures of Africa?
Image from BBC article about Kony. The BBC caption reads: "Some South Sudanese have already taken up arms against Kony and the LRA"
I have been looking for RDF ontologies for describing Bible portions, particularly so that I can reference sections of scripture like chapters and verses of the Bible (in addition to sections of books of the Bible like The Prophets or The New Testament). Does such an ontology already exist? I have found http://bibleontology.com but this does not seem to be deep enough. I have also found http://www.semanticbible.com/ but the ontologies offered there do not seem to fit the desired coverage.
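To show the level of granularity I am after, here is a small Python sketch of the structure such an ontology would need to express: a book, a chapter, and an optional verse or verse range, each addressable by its own URI. This does not model either of the sites above; the class, the parsing rules, and the `http://example.org/bible/` URI pattern are entirely hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass(frozen=True)
class ScriptureRef:
    book: str
    chapter: int
    verse_start: Optional[int] = None
    verse_end: Optional[int] = None

    def uri(self, base: str = "http://example.org/bible/") -> str:
        """Build a hypothetical URI for this reference."""
        path = f"{quote(self.book)}/{self.chapter}"
        if self.verse_start is not None:
            path += f"/{self.verse_start}"
            if self.verse_end != self.verse_start:
                path += f"-{self.verse_end}"
        return base + path


# Accepts forms like "John 3", "John 3:16", "1 Corinthians 13:4-7".
_REF = re.compile(
    r"^\s*((?:[1-3]\s)?[A-Za-z][A-Za-z ]*?)\s+(\d+)(?::(\d+)(?:-(\d+))?)?\s*$"
)


def parse_ref(text: str) -> ScriptureRef:
    match = _REF.match(text)
    if match is None:
        raise ValueError(f"unrecognized reference: {text!r}")
    book, chapter, first, last = match.groups()
    start = int(first) if first else None
    end = int(last) if last else start
    return ScriptureRef(" ".join(book.split()), int(chapter), start, end)


print(parse_ref("1 Corinthians 13:4-7").uri())
```

Larger groupings such as The Prophets or The New Testament would sit above this level as named collections of books, which is exactly the part I have not found covered.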
I was looking at the Wikipedia article for Language Documentation. The only reference cited was a thesis by Debbie Chang. I happen to know Debbie, so I thought I would take a look at her thesis and see what she said. I clicked the link and was delivered to a 404 error page on GIAL’s website. GIAL had recently renovated their website. By digging through the GIAL website, I was able to locate the thesis and fix the URL on Wikipedia. The new URL is: http://www.gial.edu/images/theses/Chang_Debbie-thesis.pdf
But then I looked at the URL and asked: why are PDFs in the images folder? What is the long-term infrastructure for this school? It seems that when PDFs (theses) are put into the images folder rather than into a digital repository, something is not quite right with the long-term planning for the school. Ironically, this is not too far from the main thrust of Debbie’s thesis.
It would seem that the long-term solution for this kind of problem would be for a small school like GIAL to either (A) have its library develop an infrastructure for permanently housing these kinds of materials, or (B) contract with another organization or archive which could take care of these sorts of issues for them and provide handles or stable URLs, and then link to the permanent location of these items from GIAL’s website. It is interesting to note that SIL International’s Language and Culture Archive is on the same campus as GIAL, yet GIAL has not taken advantage of this opportunity.
For one of the web projects I am working on, we have been throwing around the idea of having a world map as a navigation element. Each country would then be clickable. This kind of navigation has been done with hyperlinked bitmaps, as in the LL-Map project.
I have not seen any implementations in HTML5 canvas or in SVG. It occurs to me that these technologies could be used. I am not deeply familiar with either technology. So I did some googling.
I found some interesting articles on the matter.
I am not sure that I have any answers but this is my thought towards the problem space.
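For what it is worth, here is a minimal Python sketch of how the SVG approach might work: each country is a shape wrapped in a link, so the shape itself becomes the clickable navigation element. The two “countries” are placeholder rectangles rather than real outlines, and the codes, names, and URLs are invented. I use the plain `href` attribute on the SVG `<a>` element; older renderers may expect `xlink:href` instead.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_map(countries) -> str:
    """Return an SVG document in which every country shape is a hyperlink."""
    ET.register_namespace("", SVG_NS)  # serialize without an "ns0:" prefix
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"viewBox": "0 0 200 100"})
    for country in countries:
        link = ET.SubElement(svg, f"{{{SVG_NS}}}a", {"href": country["url"]})
        shape = ET.SubElement(
            link,
            f"{{{SVG_NS}}}path",
            {"id": country["code"], "d": country["outline"]},
        )
        # A <title> child gives the shape an accessible name and a tooltip.
        title = ET.SubElement(shape, f"{{{SVG_NS}}}title")
        title.text = country["name"]
    return ET.tostring(svg, encoding="unicode")


# Placeholder data: rectangles standing in for real country outlines.
countries = [
    {"code": "AA", "name": "Country A", "url": "/countries/aa",
     "outline": "M10 10 H90 V90 H10 Z"},
    {"code": "BB", "name": "Country B", "url": "/countries/bb",
     "outline": "M110 10 H190 V90 H110 Z"},
]

print(build_map(countries))
```

Because the links are ordinary markup, this kind of map would work without any scripting, which is one thing a canvas-based map could not offer.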
There is one map of languages I have found which deserves to be mentioned. I am not sure of the technology used but it seems it would be either of these methods. It is the map of the Languages of California hosted at Berkeley.
Umm, frankly, I am not sure anything out there right now is going to work to bring OAI-PMH services to WordPress. If something does, is it going to use WordPress to advertise things, or to aggregate things? If the former, then nothing out there has ever let the admin user dynamically choose which fields are matched to which attributes. But if it is also the former, then why would anyone actually want this functionality? What is the use case? If one is using WordPress as a bibliography reference system, as some libraries do, then this makes a lot of sense. However, there is another use case I would like to present: the website which is about a single language or about several languages. There are potentially two ways to conceptualize this:
I have been looking at different ways to make SIL’s digital research content more interactive, findable, and usable. Today I found http://research.microsoft.com/en-us/. It is interesting how they approach the facets of Location, Projects, Publications, and People up in the right-hand corner. I think they did a good job. The site feels balanced.
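Whichever conceptualization applies, the field-to-attribute matching mentioned above could look something like the following Python sketch. An admin-editable mapping decides which post field feeds which Dublin Core element, and the result is the `oai_dc` metadata payload that an OAI-PMH response would carry. This is not an existing WordPress plugin; the mapping, the `author_name` key, and the sample post are my own assumptions, and a full OAI-PMH response would wrap this payload in further protocol elements.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical admin setting: post field -> Dublin Core element.
FIELD_MAP = {
    "post_title": "title",
    "author_name": "creator",
    "post_date": "date",
    "post_excerpt": "description",
}


def post_to_oai_dc(post: dict, field_map: dict) -> str:
    """Serialize one post as an oai_dc metadata record."""
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field, dc_element in field_map.items():
        value = post.get(field)
        if value:  # skip fields the post does not have
            element = ET.SubElement(root, f"{{{DC_NS}}}{dc_element}")
            element.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Placeholder post; the values are invented.
post = {
    "post_title": "A sample word list",
    "author_name": "Jane Example",
    "post_date": "2012-03-01",
}

print(post_to_oai_dc(post, FIELD_MAP))
```

Changing the mapping changes the output without touching the code, which is the “dynamic” choice I have not seen any existing tool give the admin user.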