A User Experience look at Linguistic Archiving
In a recent paper Jeremy Nordmoe, a friend and colleague, states that:
Because most linguists archive documents infrequently, they will never be experts at doing so, nor will they be experts in the intricacies of metadata schemas. [1]Jeremy Nordmoe. 2011. Introducing RAMP: an application for packaging metadata and resources offline for submission to an institutional repository. In Proceedings of Workshop on Language Documentation … Continue reading
My initial reply is:
You are d@#n right! and it is because archives are not sexy enough!
To properly understand my perspective it might be helpful to understand that I often fluctuate between worlds, the linguistics and language documentation world, the Web-design and usability world, the human behavior and systems design world, and also the SIL International and academic/university world.
I would like to explore Nordmoe's statement a bit more from a reference point of human behavior considering issues like usability and user experience.
First, I will address a few terms. How I define them here is not necessarily how Nordmoe defines them in his statement, but I think that our definitions overlap. Following a discussion of terms, I will present an analysis of two types of interaction with metadata and an analysis of different classes of an archive's users. Finally, I will conclude with a discussion of some other kinds of tools in the same problem space as RAMP (Resource and Metadata Packager), SIL's newly introduced software for bundling electronic resources and metadata for submission to an electronic archive; that is, tools used in the submitter-archive interaction.
In discussing the terms, there are two which I would like to cover with some interest:
- linguists
- documents
As well as two propositions:
- Linguists archive documents infrequently.
- Linguists are not experts at intricate metadata schemas.
On defining linguists
What is a linguist? Is this an academic (with the feature [+professionalism] optionally invoked)? Is it someone who has had 4 or more college level courses in linguistics? Is this a humanitarian aid worker working with a community to develop materials in their language? Is it a multi-lingual education consultant helping to develop bi-lingual curriculum? Is it someone who fills out a questionnaire about their language? Is it someone who is doing conservationist linguistics? [2] Brenda H. Boerger. 2011. To BOLDly go where no one has gone before. Language Documentation and Conservation (5):208-233. http://hdl.handle.net/10125/4499 [PDF] Or is it someone who is amassing oral histories of a community? Suffice it to say that those whom different entities deem to be "linguists" have various levels of education and work in various specific areas of practice within the communities of linguistics, language documentation and education. What one entity holds to be a linguist is not necessarily a linguist by another entity's definition. What is certain, and transferable across contexts, is that the person Nordmoe presents as a linguist is a person entering into an archive-submitter relationship in the role of a submitter. But this does not mean that all submitters are linguists.
On defining documents
In traditional archiving (before the 1950s), archives were not dealing with digital items. Today digital is very much part of the picture. So a document, as Nordmoe references it, is not just a paper item; it could be a data set, a database, a video, or a complex arrangement of related files. So in all fairness, today's Language Documenters and Linguists have many more "things" to track (along with the complex relationships relating the "things" together) than previous generations did, and thus the actual complexity of archiving as a profession has grown.
Linguists archive documents infrequently
Now to consider the two propositions. The first might perhaps be true. I am not sure how many "linguists" actually do archive their materials. I have been working with linguists since 2005. Most of those whom I have had the privilege to work with are proud of their professionalism. And from what I have experienced, they are professionals in life, and this shows through to their work in linguistics. However, professional linguists do not necessarily archive their work, at least historically. And if linguists do "archive", it might be at an institutional repository or a library, and not an archive which has the capability or expertise in dealing with language data and language data issues. [3]Debbie, Chang. 2011. TAPS: Checklist for Responsible Archiving of Digital Language Resources. MA thesis: Graduate Institute of Applied Linguistics. … Continue reading Several examples might include: Picturing the Cayuse, Walla Walla, and Umatilla Tribes, a photographic collection documenting their cultures housed at the University of Oregon Library, [4]Major Lee Moorhouse (photographer). 1888-1916. Picturing the Cayuse, Walla Walla, and Umatilla Tribes. Digitized and held at the University of Oregon Digital Library. … Continue reading language documentation materials to be given to the library of University of North Texas, [5] Sadaf Munshi. 2011. Archive of Annotated Burushaski Texts. NSF grant proposal. http://www.neh.gov/grants/guidelines/pdf/DEL_NSF_Munshi.pdf. [PDF] [DEL Awards] [Accessed: 15 February 2011] and the Yagua [6] Thomas E. Payne and Doris L.Payne. 1983. Yagua language sound recordings. http://hdl.handle.net/1794/4125 [Link] and Panare [7] Thomas E. Payne and Doris L.Payne. 1989. Panare language sound recordings. http://hdl.handle.net/1794/4126 [Link] [Accessed: 15 February 2012] language sound recordings housed at the University of Oregon Library. It does occur to me that there deserve to be two remarks about the University of Oregon Library:
- It seems natural that items recording the heritage of indigenous people of the Northwest would be archived locally, and U of O is a more than qualified location, both in location and in technical savvy, for presenting the materials. But the institution might not be as helpful to the larger linguistic and anthropological communities as is possible. It could be submitting the U of O's language and culture based records to OLAC for aggregation.
- It would seem that the larger pattern is that professors and students archive locally; that is, in these cited cases the researchers are archiving with their home institutions.
A note on the term archive, which needs to be disambiguated: I have heard the term used several ways among linguists.
- Archive ≠ a corpus or collection
- Archive ≠ library
- Archive = Plan for longevity
Archives are places with a commitment to provide the greatest possible open access to data while also preserving data and enabling the conditions speakers give for its dissemination.
Three issues present themselves when considering LOTS (lots of copies keep things safe) along with access:
- How many places to access content is enough? Or otherwise stated, "How many places should there be?" if more is better... My contention is that more is not better. On the internet, considering web design and user experience [9] Hugh Paterson III. 10 November 2011. Diving into the UX World. https://hugh.thejourneyler.org/2011/diving-into-the-ux-world/ [Accessed: 18 January 2012] [Link] principles, where one downloads things from creates context. This context is part of the framing of the content and is also part of the user experience. The archive does not have to create a unique ethnocentric interface for each collection. Though this is possible, it is highly unrealistic in terms of sustainability. However, being good moralists, and only wanting the best for those we serve, as linguists, language documenters, and language community advocates, we should be asking the question: "Does throwing content in several locations around the web actually help communities find and use materials in their languages? Or does having the content in a single location actually help consolidate the web traffic and enable a synergy of interest to occur, and a social recognition that 'this site' is a trusted source for these files?" In other words, does having multiple places on the internet where communities can download the source documents sell the community short? I think in general it does. There may be a good use case for presenting the content in more than one place. But if this is the case then web site designers need to be able to cite and link the content to the original on a per-file basis. The crucial issue here is how we set the minority language speaker's expectation in relation to web access to content, and where the authoritative place is to which people can go and get the original content.
- Are we over obligating institutions by archiving things twice? As a community, if we have multiple people doing the same thing to the same files in different places then we lose because we waste man hours. If ELRA and PARADISEC both have copies of the same content, in principle that is good. But one archive should be indicated as primary.
  - Citations of data flow back to a single source.
  - One institution has primary responsibility for maintaining the digital accessibility/usability (migration to new file types when needed) of the files.
- When considering alternative access points to language data, enabling community use of language data on community run websites, archives should consider using technologies like RSS and CDNs to deliver content dynamically. This will add longevity and continuity to community websites where the community wants to access or present data in their personalized context without requiring the community website to have a large data store. (This kind of efficiency could be achieved with a specially designed Drupal install profile.)
In the fall of 2006, I went to the New York Public Library to look at the personal papers of a linguist who worked in the Philippines. The experience was as exciting as this short film depicts. https://youtu.be/xEIO4mWgS2E The long since deceased man's papers were off site in some box which had to be special ordered. When this "linguist" took those field notes on languages spoken in the Visayas and in Luzon in the early 1900s he had no idea that a young man in the 21st Century would be poring over the descriptions of words from over 100 years ago. Carlos Everett Conant had no knowledge that his papers and field notes were going to be archived. In fact, he did not even archive them himself. Someone else did it after his death. So though this is a testimony to the power of archives and libraries which function as archives, it is also a testimony to the fact that "linguists" often do not see archiving as part of "doing linguistics". The act of archiving, a "linguist" offering their data, notes and analysis to an archive, is beyond the act of collecting and analyzing language data.
Most professional linguists are academics and are caught in the rat-race of academia - publishing in peer reviewed journals, etc. And why not? It is often connected to the way linguists retain their position at their academic institution. It is the business they are in; it is what they do.
Linguists (especially some in SIL International, those in language documentation, language revitalization and in conservational linguistics) feel, and rightfully so, that they have a responsibility to give something back to the communities with which they interact. [8] Aleksandr E. Kibrik. 2005. Collective Field work: Advantages or disadvantages? Studies in Language vol. 30:2, p. 259-279. Part of giving back to the community is often acknowledged to be the archiving of language materials in a manner in which the community can access the materials.
Working with both SIL International language program and project administration and with SIL International's Language and Culture Archive, I hear "You have to make sure that the local community has access to the content, and if you also put it in an archive, that's great! It is another place they [people in communities] can access the content." What this comment does not distinguish is that there is a very distinct and professional difference between access to materials and archiving materials. However, this comment would reflect that the administrative perspective on the priority of SIL's obligation to communities is to grant access to materials, rather than to archive content. I do not think that this challenge of defining priorities is unique to SIL International. In fact every researcher, institution, and project needs to ask the question: "Which is more important: access to content or archiving content?" This breaks down to an operational question of: "Does one secure the files or does one present the files first?" While in some situations it might be possible to do both simultaneously, it seems that the better thing to do for the longevity of the data and the community is to archive the material first and then figure out how to best disseminate the material to the community.
However, the challenge to the researcher who views their research as a service to the community is that the community is the client and the researcher is the service provider. At the end of this Client-Service Provider exchange it should be the client who negotiates and arranges for the archival of products (including data) produced for their interests (or with their participation). This is not to say that the community can not also negotiate the archiving of materials with the same service provider. However, the Client-Service Provider framework is not the only framework for working with language data or used by language related researchers submitting things to archives. There is also the research framework, where the researcher pays for the data and then also "owns" the process by which the data is collected and has a tighter control, or influence on the terms of use of the data. This kind of framework might be used in a variety of use cases but can be imagined to be used very often in acoustic phonetics cases where specific words are elicited and tested.
In general it is often agreed that local depositing of data in some way is necessary for community access, even if local archival is not practical because the culture and community are not ready for the responsibility of being long term stewards and custodians of language data. I want to be sure to point out that local access is not archiving and archiving does not always mean local access. However, the two are not mutually exclusive, and one does not imply the other. A practice where materials can be accessed by the community and community members in diaspora via the internet is probably the best solution one can hope for given the current state of global politics, technology, and social settings.
This distinction between archival and access points to a class of customers serviced by archives: the end user in language communities. For this article it is important to distinguish this class of end user from the researcher end user who may want to access materials for a different kind of use. With this distinction made between both of these kinds of end users I think it is appropriate to consider usability in terms of the language community class of end user. The statement in the beginning of this note implies that lots of access points are better. While I would agree that LOTS (that is, lots of copies keep things safe) is a perspective on archiving, LOTS does not consider the long term preservation issues with digital media. That is, in digital archiving, there is a responsibility on the part of the archive to maintain the files in a currently readable/usable format. This is part of the digital preservation process. So there are three issues which present themselves when considering LOTS along with access issues.
However, access is not usability, and access is also not archival. Archiving language data (including consumer products, such as curriculum, and items often made in language revitalization projects) does not make the products usable, useful, or pleasurable for the target audience. However it is encouraging that language documenters, in contrast to linguists, set out with a mindset to archive their work. Perhaps this is another useful factor we can add to Himmelmann's distinctions [10]Nikolaus P. Himmelmann. 1998. Documentary and Descriptive Linguistics. Linguistics vol. 36:161-195. [PDF] [Accessed: 24 December 2010] to help us differentiate between Language Documenters and Linguists.
Himmelmann does mention that documentary linguistics does need to interact with an archive. However, my understanding, based on the discussion in pages 171-176, is that the discussion presents the need for thinking through archiving as an OpenAccess type of platform for distribution. This is different than thinking through preservation of language data, and yet again different than thinking through organization of language data in an archive. And yet once again, it's different from thinking through the dissemination for enrichment and reabsorption into the archive.
I am not saying that I think this was overlooked in the writing of Himmelmann's article, just that these factors mentioned here and the propensity for linguists to not deposit their materials in an archive wasn't explicitly stated.
This distinction, which Himmelmann draws, continues to be blurred by many researchers and funding institutions, which propagate a style of best practice linguistics which brings multi-media tools into the research environment and require this data, primary data, to be archived. (I want to be clear that I am not arguing against doing this sort of best practice linguistics. I do question whether this kind of linguistics should be called Language Documentation.) David Nathan describes this multi-media approach in linguistics when he says:
"... I drew the conclusion that linguists make field recordings to serve as evidence, not performance. Even as evidence audio was auxiliary, a kind of side effect; the principal fieldwork products being field notes and language knowledge absorbed by the researcher. It was as if the main role of the recorded tapes was to provide evidence that the fieldwork had actually taken place". [11]Nathan, David. 2010. Sound and unsound practices in documentary linguistics: towards an epistemology for audio. In Peter Austin (ed.), Language Documentation and Description, vol. 7, 262-84. London: … Continue reading
The noteworthy distinction here is that a best practice linguistics investigation with multimedia resources is not the same as conservational linguistics, or corpus creation. While all have some merit, the latter is the target of language documentation as defined by Himmelmann. Each of these cases interacts with an archive, but by nature of their target objectives each will create different types of outputs. This presents some operational challenges for archives when they "design" methods for describing the kinds of resources produced by both linguists and language documenters. The solution which Nordmoe (2011) is presenting, RAMP, is not just for language documentation. It is designed to fit the needs of several kinds of products held at SIL International's Language and Culture Archive. This archive also works closely with SIL's publication workflows. RAMP as a tool is most appropriately situated in a framework or problem space in which the researcher-archive interaction is considered.
This problem space is not too far from the publishing problem space either, especially when one considers that some think archiving to be publishing. [12]Gideon Burton. 27 January 2009. The coming change in Humanities 7: the online archive. … Continue reading [13]Heidi Johnson. 2005. Language Documentation & Archiving. Presentation at: The Open Language Archives Community Archiving and linguistic resources or How to keep your data from becoming … Continue reading
Linguists are not experts at intricate metadata schemas
At face value, Nordmoe's statement, and particularly the second proposition, highlights that developing metadata schemas is not the primary focus of linguistics. While I can generally agree with the proposition when metadata schemas like those used for news articles, [14] rNews. 7 October 2011. rNews version. International Press Telecommunications Council. http://dev.iptc.org/rNews [Accessed: 13 December 2011] [Link] photos, [15]Liu, Cjien-cheng & Chen, Chao-chen 2009. Archiving and Management of Digital Images Based on an Embedded Metadata Framework. Proc. Int’l Conf. on Dublin Core and Metadata Applications 2009. p. … Continue reading or video [16]Chris Lacinak. 2010. A Primer on Codecs for Moving Image and Sound Archives: 10 Recommendations for Codec Selection and Management. Edited by: Joshua Ranger. AudioVisual Preservation Solutions: New … Continue reading are referenced, I would like to point out that the proposition, in its general sense, is controversial at best because linguistics and linguistic descriptions are a meta description of language. Metadata schemas often come from information sciences like library science and computer science. However, when linguistics is viewed as a science, [17]Scott Farrar and Moran Steven. 2008. The e-linguistics toolkit. Proceedings of e-Humanities—an emerging discipline: Workshop in the 4th IEEE International Conference on e-Science. IEEE/Clarin, IEEE … Continue reading a view with which most linguists would probably agree, then it too is an information based science.
Linguistics has always been about information flow: the flow of information to and from the brain and connecting muscles, the flow of information as it travels with other information, the flow of information as it flows through processing centers, or between groups of people, etc. (It may sometimes be treated as a humanity rather than a hard science, but it still has an information based foundation.)
The view that linguistics is a science is supported by realities such as the fact that in linguistics it is important to be able to control or identify factors which affect the testability of a hypothesis. That is, even in linguistics we have testable hypotheses and "experiments". As linguists we search for knowledge and then seek to describe and explain it through analysis. Many typologies, taxonomies, and metadata schemas have been created by linguists (e.g. Project Gold, [18] Scott Farrar and D. Terence Langendoen. 2003. A linguistic ontology for the Semantic Web. GLOT International. vol. 7:3, p.97-100. [PDF] EARS Metadata Extraction project [19]Linguistic Data Consortium. 3 February 2004. Simple Metadata Annotation Specification Version 6.2. from the EARS Metadata Extraction project. … Continue reading and other projects relating to language resources [20]Christopher Cieri, Khalid Choukri, Nicoletta Calzolari, D. Terence Langendoen, Johannes Leveling, Martha Palmer, Nancy Ide, James Pustejovsky. 2010. A Road Map for Interoperable Language Resource … Continue reading [21]Jeff Good, Tom Myers, Alexander Nakhimovsky. 2010. Interoperability for Language Documentation: The Role of Semantic Web Tools. Manuscript. … Continue reading ). We, the linguists, use metadata in our analysis, representing it in complex models and relationships between various factors. So we do use metadata; we may not know it or call it metadata, but we use it. What is new to linguists who are crossing the bridge from Linguistics to Language Documentation or to a Best Practice Linguistics with multi-media tools is that the metadata schemas used with "new" digital tools are in some cases "new". This does not mean that linguists are not able to comprehend the complexities of these systems of organization, but rather that they would rather be users of the systems than designers, or learners, of the systems.
As language archive operators, what we can learn from this is that we have a class of users of archiving systems, linguists (in the role of submitters), who are familiar with metadata but are not desirous of learning new systems. So while the assumption that linguists will not be experts in metadata is not entirely true, it is true that linguists do not want to be experts in metadata, especially archival metadata; after all, they have other pursuits, and that is what archivists do (take care of metadata and organize things, right?). Linguists want to be system users and they want a pleasurable experience which makes sense to them while they do it (the it here is also ambiguous, it could be submitting the data or it could be collecting the data). (The phrase "which makes sense to them" is not to be underestimated. Users will not use systems voluntarily unless the approach to the system is comfortable to them and they can comprehend the approach. Since we are talking about researchers as users, the output of the system must continually meet the changing expectations of users. Researchers are constantly asking new questions. This would imply that systems must evolve to meet the changing answers of those research questions. It also implies that the systems must have a user feedback mechanism which gains the trust of the users.) Thorsten Trippel begins to approach this issue of user experience when he talks about usability in his paper The missing links in documentary linguistics: An approach to bridging the gap between annotation tools.
As Trippel says about annotation software:
usability: the non programming researcher cannot be required of going into the technical details. In fact it is already a form of hindrance for a lot of researchers to install and use an additional program they are not used to. The user interface for them has to have the so called look and feel of an already known program for them to be useful. The easiest would be to have an export and import feature in the preferred software, but a specialized transformation tool which is based on a known user interface would be acceptable to most users. [22]Thorsten Trippel. 2006. The missing links in documentary linguistics: An approach to bridging the gap between annotation tools. Paper for the E-MELD 2006 Workshop on Digital Language Documentation: … Continue reading
However, the term usability suffers from an unfortunate ambiguity in the software design world. Usability can mean that a system is functional and in that sense usable. But usability can also refer to the psychological effect of using a piece of software [23] Stephen P. Anderson. Seductive Interaction Design. http://www.slideshare.net/stephenpa/seductive-interactions-idea-09-version [Slides] [Accessed: 27 January 2012] - in the sense of how pleasurable or easy the software is for an unlearned user to use. [24]Stephen P. Anderson. 2011. Long After the Thrill: Sustaining Passionate Users. http://www.poetpainter.com/thoughts/article/4-new-presentations. [Video] [Slides] [Presentation] [Accessed: 27 January … Continue reading Consider the image [25] Des Traynor. 2011. Make It Meaningful. Contrast corporate blog. http://contrast.ie/blog/make-it-meaningful/ [Link] [Accessed: 18 January 2012] below from Stephen P. Anderson's presentation Creating Pleasurable Interfaces: Getting from Tasks to Experiences, [26]Stephen P. Anderson. 2007. Creating Pleasurable Interfaces: Getting from tasks to Experiences (Slide 15). … Continue reading which casts a contrast between usability and user experience.
What we can learn from systems like iTunes and Facebook is that they go beyond the analysis of what the tasks are to ask the question: what will make the user experience pleasurable to the end user? This means taking a systemic perspective, not just a task or interaction view. [27] Des Traynor. 16 January 2012. Copy the Fit, not the Features. The Intercom Blog. http://blog.intercom.io/copy-the-fit-not-the-features/ [Link]
Metadata and human behavior - What does metadata do for us?
Metadata allows for the grouping of items of like kinds. It allows for the comparing of apples to apples and fruit to fruit.
___ As archivists, we know this. As typologists, we know this. As linguists, we know this. Yet why, then, as linguists do we not want to create complex systems every time we want to answer a research question? It is because building the system we are going to be using is not the task we want to be "doing". We have other questions we would rather spend our time answering. In this regard we want to be users of systems, rather than builders. We want to get out there and play the game - that is, the game we want to play, not the game we have to play to get out there. ___
Relying on metadata, taxonomies, and schemes is what typologists do. WALS [28] Matthew S. Dryer & Martin Haspelmath (eds.). 2011. The World Atlas of Language Structures Online. Munich: Max Planck Digital Library. http://wals.info [Accessed: 12 December 2011] [Link] is great evidence for useful knowledge abstracted from typological work. The Ethnologue [29] M. Paul Lewis. (ed.), 2009. Ethnologue: Languages of the World, 16th Edn. Dallas, Tex.: SIL International. might also be considered another typological work, but one of socio-linguistic factors about languages. This kind of typological analysis relies on the classification of knowledge (invoking metadata) and the use of said metadata in comparison in order to turn facts into digestible, comprehensible knowledge.
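The grouping and comparing role of metadata described above can be shown in a few lines of Python. This is only a minimal sketch: the item titles, the field names (`type`, `language`), and the `group_by` helper are all invented for illustration and do not come from any archive's actual schema.

```python
from collections import defaultdict

# Hypothetical records: each item carries a small metadata dictionary.
# Titles and field names are made up for this sketch.
items = [
    {"title": "Wordlist A", "type": "wordlist", "language": "Yagua"},
    {"title": "Story 1", "type": "text", "language": "Panare"},
    {"title": "Wordlist B", "type": "wordlist", "language": "Panare"},
]


def group_by(records, field):
    """Group item titles by the value of one metadata field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record["title"])
    return dict(groups)


# The same items, grouped two different ways by two different attributes.
by_type = group_by(items, "type")
by_language = group_by(items, "language")
```

The items themselves never change; only the metadata field chosen for the comparison does, which is what lets apples be compared with apples.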
There are two socially significant applications which represent two different use cases of metadata. Most people probably have used each of these applications and are likely to be familiar with them.
- iTunes
- Facebook
There are several interesting aspects to these two applications as they approach the use, collection, and presentation of metadata for their users. I am going to explain the two use cases and then contrast them. Finally, I will talk about these two use cases in the language documentation workflow and the challenge of getting metadata out of linguists in the linguist-archive interaction.
iTunes
In iTunes there are several user experience elements for working with metadata. The two I will mention here are the Record Edit Pane and the Browse View. The record pane allows for the user to take an in depth look at some of the technical metadata and edit some of the associated metadata like composer, performer, track number, title, etc. The browse view allows for the user to sort, find, filter and play items based on metadata elements. One can sort their music by most any metadata element in the browse view.
At the core of the iTunes metadata management model is the music file and a record containing a list of attributes for the file. The record is displayed as a part of a list containing all the music files in one's collection. So the total object the user interacts with is the sum of each item in the collection and the record which goes with it. That record consists of metadata. iTunes has an interface for editing the metadata of objects in the collection, the Record Edit Pane.
From a record keeping point of view, this iTunes record may seem to be a massive record, but records for linguistic digital objects can have a lot more associated metadata. (This applies to both "born digital" objects and objects which are the digital manifestations of objects created prior to the digital age.)
Additionally the user interface allows for the grouping of items based on task type, and provenance. Games and Apps obviously come from the iTunes Store and are for use with mobile devices. Unlike some other file types, audio files can be imported from more sources than just the iTunes Music Store.
iTunes anticipates these differences and creates visual distinctions based on file types and end-user tasks.
The real beauty to the iTunes user is that the user can reap the benefit from the enriched records as soon as they enrich the record. These metadata elements can be used immediately by features like smart lists and play lists to present the desired sets of output.
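The immediate payoff of enriching a record can be sketched as follows. This is not how iTunes is implemented; the `Track` class, its fields, and the `smart_list` function are hypothetical names chosen to mirror the attributes mentioned above (title, performer, composer, track number).

```python
from dataclasses import dataclass


@dataclass
class Track:
    # A hypothetical record: one file plus a few metadata attributes.
    title: str
    performer: str
    composer: str
    track_number: int


def smart_list(collection, **criteria):
    """Return the tracks whose metadata matches every given criterion."""
    return [
        track
        for track in collection
        if all(getattr(track, key) == value for key, value in criteria.items())
    ]


collection = [
    Track("Song One", "Performer A", "", 1),
    Track("Song Two", "Performer B", "Composer X", 2),
]

# The list is computed from the metadata each time it is requested.
before = smart_list(collection, composer="Composer X")

# The user enriches one record...
collection[0].composer = "Composer X"

# ...and the same query reflects the enrichment straight away.
after = smart_list(collection, composer="Composer X")
```

Because the list is derived from the records rather than stored separately, the person who does the work of enriching the metadata is also the first to benefit from it.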
Facebook
The other kind of app that has altered the face of the internet and most people's interactions with each other is Facebook. Facebook has a very interesting method for developing and assimilating metadata into structured data. This is partially reflective of their corporate pursuit, that is, their business plan: how is it that they make money? Facebook's business question comes into clearer focus by contrasting it with Google. Google's focus is organizing the world's data. However, Facebook's business is about discovering how the world is connected. This is why their theme for TIMELINE comes out now (2011-2012). If you look at the history of Facebook they have introduced and retracted features (including UX redesigns) to systematically affect people's connections over content and how that content is connected to things, other people and places. Facebook has been doing this systematically since they started. TIMELINE is the second or third attempt at collecting data about people from past time, but it brings new elements, relationships, and content to the end user's focus, thereby providing Facebook with more data about the connections between people, places and things. In many regards, Facebook's strategy of acquiring a user and then implementing a strategy to develop data about the user is much like what archives do with progressive metadata enrichment. Facebook "elicits" more (meta)data by altering the dynamics (including frequency) and context of interactions, the visual display of information, and services which connect users to other people and things. For example: has Facebook ever verified your cellphone number for security reasons?
However, Facebook really pushes the envelope on the definition of metadata. That is, is metadata data about data or is data about data part of the data? Where does the record begin and where does the record end? Is metadata really part of the data collection and description process? Is the part of my Facebook profile which says that I speak German metdata about me or is it part of my record? Is it metadata about me or is my record metadata about one of the speakers of the German language?
___Insert Diagrams____
By taking metadata from the attribute level to the interactive level, Facebook has made metadata enrichment of objects (or "user profiles") part of the workflow for Facebook users. This also shortens the distance between metadata and data, and makes the distinction between them depend more on the point of view of the inquirer.
The contrasting and comparing of the two implementations of metadata.
___Bring these two models together and show where RAMP fits in the problem space. Also make clear how the problem space will change.____
Linguistics of today does have a lot to do with digital humanities as applied to linguistics.
What is even more true is that metadata is ever more important to linguistic analysis. Many linguists are, and many more should be, concerned with the metadata attributes of their archived items (texts, wordlists, speakers' identities, etc.).
Leave Typology to the Typologists
So should I resign myself to letting the typologists organize my data? No. The organization of language data, of materials created, and of materials used should be taught everywhere that linguistics is taught. (Surely there will be variation, as long as there are different training institutions.) Some of the value, and certainly the beauty, of language can be realized when we understand a language's unique place in the world because it is different from the typological norm. However, without understanding the power of metadata, linguists cannot reach useful analysis.
Group Response and Human Usability of Systems
John Wilbanks
The difference is that .005% of all web users gets us Wikipedia. .005% of geneticists gets us a table at T.G.I. Friday’s. My point was that the math breaks down for crowds and science. [30] John Wilbanks. 2009. Lions, Tigers, and Crowds. http://scienceblogs.com/commonknowledge/2009/03/lions_tigers_and_crowds.php
I take that to mean that a crowd of randomly sampled people will not behave the same way that a crowd of scientists behaves. This is probably true for the members of any organization. So, by extrapolation, it is probably true that a group of people working under the auspices of SIL International will also not behave the same way as a group of anyone else; good, bad or indifferent.
However, it is important to note that the problem space which birthed RAMP is not unique to SIL International. SIL International's use cases are unique because SIL's management structures, language development projects and personnel vary (significantly) by region. There are several other tools out there which address use cases in the same problem space.
http://sil.org/sil/news/2011/ramp.htm
https://github.com/edina-jorum/Jorum-DSpace/blob/master/PACKAGER_NOTES.txt
___ Citations for Social-collected metadata [31] [32] [33] [34] [35] ___
Dealing with corporate complications. Competencies, and technology.
An alternative analysis of Jeremy's second proposition is that linguists are most likely incapable of becoming experts in complex metadata schemas. I would like to present some reasons why I do not feel this interpretation should be considered.
Educational Statistics of SIL International on LinkedIn.
Many linguists cannot be bothered to learn or understand metadata schemas.
However, this claim could be degrading to linguists or reflect poorly on the intellectual prowess of linguists in general.
http://www.disc-uk.org/docs/data_sharing_continuum.pdf data-sharing image...
Autocratic.
Some other RAMP-like tools.
http://ptsefton.com/2009/03/05/desktop-eresearch-revolution.htm
http://ptsefton.com/2009/03/16/the-desktop-fascinator-aka-dterrev.htm
Field Helper, which is from Sydney
http://acl.arts.usyd.edu.au/fieldhelper/
Field Helper is a desktop application that enables you to quickly view and categorise groups of related digital files and then submit the resulting package to a repository for long term preservation and access. Digital repositories require a submission to be formatted in a specific way and be described according to a standard metadata encoding schema. Working with Field Helper results in a ZIP file containing compressed versions of your files along with a METS (Metadata Encoding and Transmission Standard) file which contains a detailed description of each file and its relationship to other files in the submission. METS is a standard that works with most repositories and – where required – can be easily translated into a form that non METS compliant repositories can work with.
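The shape of such a package can be sketched in a few lines. This is an illustration only, not Field Helper's actual code or output: the function names, the `mets.xml` file name, and the `TYPE="submission"` label are my assumptions, and the manifest is a bare-bones METS-style document that has not been validated against the METS schema. It lists each file once in a `fileSec` and points to it from a `structMap`.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_manifest(filenames):
    """Build a minimal METS-style manifest (bytes) listing each file once."""
    root = ET.Element(f"{{{METS_NS}}}mets")
    file_sec = ET.SubElement(root, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp")
    struct_map = ET.SubElement(root, f"{{{METS_NS}}}structMap")
    # "submission" is a hypothetical label for the top-level division.
    top_div = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "submission"})
    for i, name in enumerate(filenames, start=1):
        file_id = f"FILE{i}"
        file_el = ET.SubElement(file_grp, f"{{{METS_NS}}}file", {"ID": file_id})
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": name},
        )
        # The structural map refers back to the file by its ID.
        ET.SubElement(top_div, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_package(files):
    """Zip the given files (name -> bytes) together with a manifest."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
        # "mets.xml" is an assumed name; a real repository may expect another.
        archive.writestr("mets.xml", build_manifest(list(files)))
    return buffer.getvalue()


package_bytes = build_package({
    "wordlist.txt": b"placeholder wordlist",
    "story.wav": b"placeholder audio",
})
```

From the submitter's point of view, the manifest is written by the tool from information the submitter already gave it. The submitter never has to see the schema, which is exactly the gap that packaging tools in this problem space are meant to close.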
Vireo, a tool for depositing theses and dissertations into DSpace
http://sourceforge.net/projects/vireo/
The Fascinator on the desktop
http://ice.usq.edu.au/projects/fascinator/trac
https://github.com/edina-jorum/Jorum-DSpace/blob/master/PACKAGER_NOTES.txt
http://sil.org/sil/news/2011/ramp.htm
RAMP, a new resource for archiving language and culture research
Metadata Magic
Published on August 10, 2011
As a user tool.
It asks some really clear questions but is missing some much-needed features. The problem here is not in the design to meet the business plan but rather in the business plan itself.
References
↑1 | Jeremy Nordmoe. 2011. Introducing RAMP: an application for packaging metadata and resources offline for submission to an institutional repository. In Proceedings of Workshop on Language Documentation & Archiving 18 November 2011 at SOAS, London. Edited by: David Nathan. p. 27-32. [Preprint PDF] |
---|---|
↑2 | Brenda H. Boerger. 2011. To BOLDly go where no one has gone before. Language Documentation and Conservation (5):208-233. http://hdl.handle.net/10125/4499 [PDF] |
↑3 | Debbie Chang. 2011. TAPS: Checklist for Responsible Archiving of Digital Language Resources. MA thesis: Graduate Institute of Applied Linguistics. http://www.gial.edu/images/theses/Chang_Debbie-thesis.pdf [PDF] [Accessed: 17 February 2012] |
↑4 | Major Lee Moorhouse (photographer). 1888-1916. Picturing the Cayuse, Walla Walla, and Umatilla Tribes. Digitized and held at the University of Oregon Digital Library. http://oregondigital.org/digcol/mh/ [Link] [Accessed: 17 February 2012] |
↑5 | Sadaf Munshi. 2011. Archive of Annotated Burushaski Texts. NSF grant proposal. http://www.neh.gov/grants/guidelines/pdf/DEL_NSF_Munshi.pdf. [PDF] [DEL Awards] [Accessed: 15 February 2011] |
↑6 | Thomas E. Payne and Doris L. Payne. 1983. Yagua language sound recordings. http://hdl.handle.net/1794/4125 [Link] |
↑7 | Thomas E. Payne and Doris L. Payne. 1989. Panare language sound recordings. http://hdl.handle.net/1794/4126 [Link] [Accessed: 15 February 2012] |
↑8 | Aleksandr E. Kibrik. 2005. Collective Field work: Advantages or disadvantages? Studies in Language vol. 30:2, p. 259-279. |
↑9 | Hugh Paterson III. 10 November 2011. Diving into the UX World. https://hugh.thejourneyler.org/2011/diving-into-the-ux-world/ [Accessed: 18 January 2012] [Link] |
↑10 | Nikolaus P. Himmelmann. 1998. Documentary and Descriptive Linguistics. Linguistics vol. 36:161-195. [PDF] [Accessed: 24 December 2010] |
↑11 | Nathan, David. 2010. Sound and unsound practices in documentary linguistics: towards an epistemology for audio. In Peter Austin (ed.), Language Documentation and Description, vol. 7, 262-84. London: SOAS. |
↑12 | Gideon Burton. 27 January 2009. The coming change in Humanities 7: the online archive. http://gideonburton.typepad.com/gideon_burtons_blog/2009/01/the-coming-change-in-humanities-publishing-7-the-online-archive.html [Accessed: 12 December 2011] [Link] |
↑13 | Heidi Johnson. 2005. Language Documentation & Archiving. Presentation at: The Open Language Archives Community Archiving and linguistic resources or How to keep your data from becoming endangered. Special session at the annual meeting of the Linguistic Society of America, 2005, Oakland, California. www.language-archives.org/events/olac05/olac-lsa05-johnson.pdf. [Accessed: 12 December 2011] [PDF] [Link] |
↑14 | rNews. 7 October 2011. rNews version. International Press Telecommunications Council. http://dev.iptc.org/rNews [Accessed: 13 December 2011] [Link] |
↑15 | Liu, Cjien-cheng & Chen, Chao-chen 2009. Archiving and Management of Digital Images Based on an Embedded Metadata Framework. Proc. Int’l Conf. on Dublin Core and Metadata Applications 2009. p. 71-84. |
↑16 | Chris Lacinak. 2010. A Primer on Codecs for Moving Image and Sound Archives: 10 Recommendations for Codec Selection and Management. Edited by: Joshua Ranger. AudioVisual Preservation Solutions: New York. |
↑17 | Scott Farrar and Steven Moran. 2008. The e-linguistics toolkit. Proceedings of e-Humanities—an emerging discipline: Workshop in the 4th IEEE International Conference on e-Science. IEEE/Clarin, IEEE Press. http://www.clarin.eu/system/files/private/FarrarMoran08_eling.pdf [PDF] [Accessed: 14 January 2012] [Link] |
↑18 | Scott Farrar and D. Terence Langendoen. 2003. A linguistic ontology for the Semantic Web. GLOT International. vol. 7:3, p.97-100. [PDF] |
↑19 | Linguistic Data Consortium. 3 February 2004. Simple Metadata Annotation Specification Version 6.2. from the EARS Metadata Extraction project. http://www.ldc.upenn.edu/Projects/MDE/Guidelines/SimpleMDE_V6.2.pdf. [Accessed: 12 December 2011] [PDF] [Link] |
↑20 | Christopher Cieri, Khalid Choukri, Nicoletta Calzolari, D. Terence Langendoen, Johannes Leveling, Martha Palmer, Nancy Ide, James Pustejovsky. 2010. A Road Map for Interoperable Language Resource Metadata. LREC 2010 Proceedings. May 17-23 2010, Malta. p. 2506-2509. www.lrec-conf.org/proceedings/lrec2010/pdf/951_Paper.pdf [PDF] |
↑21 | Jeff Good, Tom Myers, Alexander Nakhimovsky. 2010. Interoperability for Language Documentation: The Role of Semantic Web Tools. Manuscript. http://www.acsu.buffalo.edu/~jcgood/GoodMyersNakhimovsky-Interoperability.pdf [PDF] [Accessed: 14 January 2012] |
↑22 | Thorsten Trippel. 2006. The missing links in documentary linguistics: An approach to bridging the gap between annotation tools. Paper for the E-MELD 2006 Workshop on Digital Language Documentation: Tools and Standards: The State of the Art. http://wwwhomes.uni-bielefeld.de/ttrippel/mip/trippel_missing_links.pdf [PDF] [Accessed: 14 January 2012] |
↑23 | Stephen P. Anderson. Seductive Interaction Design. http://www.slideshare.net/stephenpa/seductive-interactions-idea-09-version [Slides] [Accessed: 27 January 2012] |
↑24 | Stephen P. Anderson. 2011. Long After the Thrill: Sustaining Passionate Users. http://www.poetpainter.com/thoughts/article/4-new-presentations. [Video] [Slides] [Presentation] [Accessed: 27 January 2012] |
↑25 | Des Traynor. 2011. Make It Meaningful. Contrast corporate blog. http://contrast.ie/blog/make-it-meaningful/ [Link] [Accessed: 18 January 2012] |
↑26 | Stephen P. Anderson. 2007. Creating Pleasurable Interfaces: Getting from tasks to Experiences (Slide 15). http://www.slideshare.net/stephenpa/creating-pleasurable-interfaces-getting-from-tasks-to-experiences. [Link] [Accessed: 18 January 2012] |
↑27 | Des Traynor. 16 January 2012. Copy the Fit, not the Features. The Intercom Blog. http://blog.intercom.io/copy-the-fit-not-the-features/ [Link] |
↑28 | Matthew S. Dryer & Martin Haspelmath (eds.). 2011. The World Atlas of Language Structures Online. Munich: Max Planck Digital Library. http://wals.info [Accessed: 12 December 2011] [Link] |
↑29 | M. Paul Lewis. (ed.), 2009. Ethnologue: Languages of the World, 16th Edn. Dallas, Tex.: SIL International. |
↑30 | John Wilbanks. 2009. Lions, Tigers, and Crowds. http://scienceblogs.com/commonknowledge/2009/03/lions_tigers_and_crowds.php |
↑31 | Marieke van Erp, Johan Oomen, Roxane Segers, Chiel van den Akker, Lora Aroyo, Geertje Jacobs, Susan Legêne, Lourens van der Meij, Jacco van Ossenbruggen; and Guus Schreiber. 2011. Automatic Heritage Metadata Enrichment with Historic Events. In J. Trant and D. Bearman (eds). Museums and the Web 2011: Proceedings. Toronto: Archives & Museum Informatics. Published March 31, 2011. http://conference.archimuse.com/mw2011/papers/automatic_heritage_metadata_enrichment_with_historic_events. [Accessed: 18 January 2012] [Link] |
↑32 | Besiki Stvilia and Corinne Jörgensen. 2009. User-generated collection level metadata in an online photo-sharing system. Library & Information Science Research. Vol. 31 No. 1, 54-65. DOI:10.1016/j.lisr.2008.06.006 [Link] [Accessed: 18 January 2012] |
↑33 | Adam Mathes. 2004. Folksonomies - Cooperative Classification and Communication Through Shared Metadata. http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html [HTML] [PDF] |
↑34 | Karen Smith-Yoshimura. 2010. Social Metadata for Libraries, Archives, and Museums. http://www.diglib.org/wp-content/uploads/2011/01/SocialMetadataforLAMs.pdf [PDF] [Accessed: 18 January 2012] |
↑35 | Krystyna K. Matusiak. 2006. Towards user-centered indexing in digital image collections. OCLC Systems & Services: International digital library perspectives. Vol. 22 No. 4, Pages 283-294. DOI: 10.1108/10650750610706998 [PDF] [Accessed 18 January 2012] |