On January 4-5, 2012, I had the opportunity to participate in the LSA's Satellite Workshop for Sociolinguistic Archival Preparation in Portland, Oregon. I learned a great many things there; here are only a few thoughts.
Part of the discussion at the workshop was on how we can make corpora which are collected by Sociolinguists available to the larger Sociolinguistic community. In particular, the discussion I am referencing revolved around the standardization of metadata in the corpora. (In the discussion it was established that there are two levels of metadata, "event level" and "corpus level".) While OLAC gives us some standardization of the corpus level metadata, the event metadata is still unique to each investigation, and arguably this is necessary. However, it was also pointed out that not all "event level" metadata needs to be encoded or tracked uniquely. That is, data like date of recording, names of participants, location of recording, and gender (male/female) of participant can all be regularized across the community.
With the above as preface, it is important to realize that there are still various kinds of metadata which need to be collected. In the workshop it was acknowledged that the field of language documentation was about 10 years ahead of this community of sociolinguists. What was not well defined in the workshop was the distinction between a language documentation corpus and a sociolinguistics corpus. It seems to me, as a new practitioner, that the chief difference between these two types of corpora is how the researcher self-identifies: as a Sociolinguist or as a Language Documenter. Both types of corpora attempt to get at the vernacular, and both types of corpora collect sociolinguistic facts. It would seem that both corpora are essentially the same (give or take a few metadata attributes). So, I will take an example from the metadata write-up I did for the Meꞌphaa language documentation project. In that project we collected metadata about:
- Equipment settings during recording
- Recording Environments
- Linguistic Dynamics
- Sociolinguistic Attitudes
In the following diagram I illustrate the cross cutting of a corpus with these "kinds" of metadata. The heavier, darker line represents the corpus, while the medium heavy lines represent the "kinds" of metadata. Finally, the lighter lines represent the sub-kinds of metadata, where the sub-kinds might be the latitude, longitude, altitude, datum, country, and place name of the location.
This does not mean that the corpus does not also need to be cross cut with these other "sub-kinds". However, these sub-kinds are significantly greater in number and will vary from project to project. Some of these metadata kinds will be collected in a speaker profile questionnaire. But some of these metadata can only be provided with reflection on the event. To demonstrate the cross cutting of these metadata elements on a corpus I have provided the following diagram. It uses categories which were mentioned in the workshop and is not intended to be comprehensive. In this second diagram, the cross cutting elements might themselves be taxonomies. They may have controlled vocabularies, they may have an open set of possible values, or they may represent a scale.
Both of these diagrams tend to illustrate what in this workshop was referred to as "event level" metadata, rather than "corpus level" metadata.
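The three kinds of value domain mentioned above (controlled vocabulary, open set, scale) can be sketched in a few lines of Python. This is only an illustration: the class name `MetadataKind` and the example kinds and values are assumptions, not categories fixed by the workshop.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class MetadataKind:
    """One cross-cutting 'kind' of event-level metadata (hypothetical model)."""
    name: str
    vocabulary: Optional[Sequence[str]] = None  # controlled vocabulary, if any
    scale: Optional[Tuple[int, int]] = None     # inclusive (low, high), if any

    def accepts(self, value) -> bool:
        """Check a value against this kind's value domain."""
        if self.vocabulary is not None:
            return value in self.vocabulary
        if self.scale is not None:
            low, high = self.scale
            return isinstance(value, int) and low <= value <= high
        # Open set: any non-empty string is accepted.
        return isinstance(value, str) and value.strip() != ""


# Illustrative kinds only; names and values are assumptions.
ENVIRONMENT = MetadataKind("recording_environment", vocabulary=("indoor", "outdoor"))
PLACE_NAME = MetadataKind("place_name")
FORMALITY = MetadataKind("formality", scale=(1, 5))
```

A project would define its own kinds; the point is only that each cross-cutting element carries its own rule for what counts as a valid value.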
A note on corpus level metadata vs. descriptive metadata
There is one more thing which I would like to say about "corpus level" metadata. Metadata is often separated out by function. That is, what does the metadata allow us to do, or why is the metadata there?
I have been exposed to the following taxonomy of metadata types through course work and in working with photographs and images. [1] These classes of metadata are also similar to those posted by JISC Digital Media as they approach issues with metadata for digital audio. [2]
- Descriptive metadata: supports discovery, attribution and identification of resources created.
- Administrative metadata: supports management, preservation, and appropriate usage of resources created.
- Technical: About the machinery used to create the resource and the technical aspects of the resource.
- Use and Rights: Copyright, license and moral ownership of the items.
- Structural metadata: maintains relationships between the parts of complex, multi-part resources (Spanne 2008). [3]
- Situational: this is metadata which describes the events around the creation of the work, asking questions about the social setting or the precursory events. It follows ideas put forward by Bergqvist (2007). [4]
- Use metadata: metadata collected from or about the users themselves (e.g. user annotations, number of people accessing a particular resource). [5]
I think it is only fair to point out to archivists and to librarians that linguists and language documenters do not see a difference between descriptive and non-descriptive metadata in their workflows. That is, sometimes we want to search all the corpora by license or by a technical attribute. This elevates these attributes to the function of discovery metadata. It does not remove descriptive metadata from its role in finding things, but it does functionally mean that the other metadata is also viable as discovery metadata.
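To make the point concrete, here is a small Python sketch in which every class of metadata is searchable, not just the descriptive one. The record layout, field names, and values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Each record groups its fields by metadata class (hypothetical layout).
Record = Dict[str, Dict[str, str]]

RECORDS: List[Record] = [
    {
        "descriptive": {"title": "Session A", "speaker": "Participant 1"},
        "technical": {"sample_rate_hz": "48000", "format": "BWF"},
        "rights": {"license": "CC-BY"},
    },
    {
        "descriptive": {"title": "Session B", "speaker": "Participant 2"},
        "technical": {"sample_rate_hz": "44100", "format": "MP3"},
        "rights": {"license": "restricted"},
    },
]


def search(records: List[Record], field: str, value: str) -> List[Record]:
    """Return records where `field` equals `value` in ANY metadata class."""
    hits = []
    for record in records:
        for fields in record.values():
            if fields.get(field) == value:
                hits.append(record)
                break  # one match per record is enough
    return hits
```

With this layout, `search(RECORDS, "license", "CC-BY")` and `search(RECORDS, "format", "MP3")` work the same way as a search on `title`, which is what it means for rights and technical metadata to function as discovery metadata.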
|↑1||Photometadata.org. 2011. Classes Of Metadata. http://www.photometadata.org/node/46. [Accessed: 18 January 2012]|
|↑2||JISC Digital Media. 07 January 2010. Metadata and Audio Resources. http://www.jiscdigitalmedia.ac.uk/audio/advice/metadata-and-audio-resources [Accessed: 19 March 2012]|
|↑3||Spanne, Joan. 2008. Metadata: Why, What and How (the “Who” is You). Presentation for Audio and Video Techniques. Dallas: GIAL. 29 July 2008.|
|↑4||Bergqvist, Henrik. 2007. The role of metadata for translation and pragmatics in language documentation. In Peter K. Austin (ed.), Language Documentation and Description, vol. 4, 163-73. London: SOAS.|
|↑5||JISC Digital Media. 07 January 2010. An Introduction to Metadata. http://www.jiscdigitalmedia.ac.uk/crossmedia/advice/an-introduction-to-metadata/ [Accessed: 19 March 2012]|
Over the last several months I have been looking for and comparing digitization services for audio, film, and images (slides and more). I have been doing this as part of the ongoing work at the Language and Culture Archive to preserve the linguistic and cultural heritage of the people groups SIL International has encountered and served. I have not come to any hard and fast conclusions on “what is the best service provider”. This is partially because we are still looking at various outsourcing options, and looking at multiple media is time consuming. Then there is also the issue of looking for archival standards and the creation of corporate policy for the digitization of these materials. I am presenting several names here as the results of several searches for digitization service providers.
Last month I was passed a short film on the BBC highlighting one of these providers. The short is well worth the watch because it highlights the reason and madness behind some of the work of digitization.
Several companies have come to the top of the list:
- http://dijifi.com/ – does the UN’s collections
- http://www.digmypics.com/ – does work for National Geographic
- http://www.scancafe.com/ – great consumer grade service
Doing it on our own
Another option the Archive has been looking at is to determine whether the quantity of work makes it cost prohibitive to have it professionally done, meaning that we would be better served by buying the equipment and doing the work in house. So in the process I have also been looking at people’s experience with various kinds of equipment and technology used in scanning.
What is an archival version of an audio file?
An archival version of an audio file is a file which represents the original sound faithfully. In archiving we want to keep a version of the audio which can be used to make other products and can also be used directly itself if needed. This is usually done through PCM (pulse-code modulation). There are several file types associated with PCM, or RAW, uncompressed digital audio that is faithful to the original signal. These are:
- Standard Wave
- Wave 64
- Broadcast Wave Format (BWF)
One way to understand the difference between audio file formats is to understand how different formats are used. One place which has been helpful to me has been the DOBBIN website, as they explain their software and how it can change audio from one PCM based format to another.
Each one of these file types has the flexibility to have various kinds of components, i.e., several channels of audio can be in the same file, or one can have .wav files with different bit depths or sampling rates. But they are each an archive friendly format. Before one says that a file is suitable for archiving simply based on its file format, one must also consider things like sample rate, bit depth, embedded metadata, channels in the file, etc. I was introduced to DOBBIN as an application resource for audio archivists by a presentation by Rob Poretti. [1] One additional thing that is worth noting in terms of archival versions of digital audio pertains to born digital materials. Sometimes audio is recorded directly to a lossy compressed audio format. It would be entirely appropriate to archive such a born-digital filetype based on its content. However, it should be noted that in this case the recordings should ideally have been made in a PCM file format.
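As a minimal sketch of looking past the file extension, Python's standard `wave` module can report the channels, bit depth, and sample rate of an uncompressed PCM .wav file. The thresholds in `meets_policy` are assumptions for illustration, not an archival standard, and the `wave` module will not read compressed or otherwise unusual WAVE variants.

```python
import wave


def describe_wav(path):
    """Read the basic technical properties of a PCM WAVE file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def meets_policy(info, min_rate_hz=44100, min_bit_depth=16):
    """Compare a file against a (hypothetical) minimum archive policy."""
    return (info["sample_rate_hz"] >= min_rate_hz
            and info["bit_depth"] >= min_bit_depth)
```

Two files with the same .wav extension can give very different answers here, which is why the extension alone does not establish that a file is suitable as an archival version.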
What is a presentation version? (of an audio file)
A presentation version is a file created with a content use in mind. There are several general characteristics of this kind of file:
- It is one that does not retain the whole PCM content.
- It is usually designed for a specific application. (Use on a portable device, or personal audio player)
- It can be thought of as a derivative product from an original audio or video stream.
In terms of file formats, there is not just one file format which is a presentation format. There are many formats. This is because there are many ways to use audio. For instance there are special audio file types optimized for various kinds of applications like:
- 3G and WiFi Audio and A/V services
- Internet audio for streaming and download
- Digital Radio
- Digital Satellite and Cable
- Portable players
A brief look at an explanation by Cube-Tec might help to get the gears moving. It is part of the inspiration for this post.
This means there is a long list of potential audio formats for the presentation form.
- AAC (aac)
- AC3 (ac3)
- Amiga IFF/SVX8/SV16 (iff)
- Apple/SGI (aiff/aifc)
- Audio Visual Research (avr)
- Berkeley/IRCAM/CARL (irca)
- CDXA, like Video-CD (dat)
- DTS (dts)
- DVD-Video (ifo)
- Ensoniq PARIS (paf)
- FastTracker2 Extended (xi)
- Flac (flac)
- Matlab (mat)
- Matroska (mkv/mka/mks)
- Midi Sample dump Format (sds)
- Monkey’s Audio (ape/mac)
- Mpeg 1&2 container (mpeg/mpg/vob)
- Mpeg 4 container (mp4)
- Mpeg audio specific (mp2/mp3)
- Mpeg video specific (mpgv/mpv/m1v/m2v)
- Ogg (ogg/ogm)
- Portable Voice format (pvf)
- Quicktime (qt/mov)
- Real (rm/rmvb/ra)
- Riff (avi/wav)
- Sound Designer 2 (sd2)
- Sun/NeXT (au)
- Windows Media (asf/wma/wmv)
Aside from just the file format difference in media files (.wav vs. .mp3) there are three other differences to be aware of:
- Media stream quality variations
- Media container formats
- Possibilities with embedded metadata
Media stream quality variations
Within the same file type there might be a variation in the quality of audio. For instance, MP3 files can have a variable rate of encoding or they can have a steady rate of encoding. When they have a steady rate of encoding, that rate can be high or low. WAV files can also have a high or a low bit depth and a high or a low sample rate. Some file types can have more channels than others. For instance, AAC files can have up to 48 channels, whereas MP3 files can only have up to 5.1 channels. [2]
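The effect of these variations on file size is simple arithmetic. The sketch below assumes uncompressed PCM with a bit depth that is a multiple of 8, and a constant bitrate for the compressed case; the function names are my own.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of the raw PCM audio data (headers and metadata not counted)."""
    bytes_per_sample = bit_depth // 8  # assumes bit depth is a multiple of 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


def cbr_bytes(bitrate_bps, seconds):
    """Size of a constant-bitrate stream, e.g. a steady-rate MP3."""
    return bitrate_bps * seconds // 8


one_minute_cd_quality = pcm_bytes(44100, 16, 2, 60)   # 10,584,000 bytes
one_minute_high_res = pcm_bytes(96000, 24, 2, 60)     # 34,560,000 bytes
one_minute_mp3_128k = cbr_bytes(128000, 60)           # 960,000 bytes
```

The same minute of sound therefore ranges from under 1 MB to over 34 MB depending on the choices made inside the file, not on the extension.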
One argument I have heard in favor of saving disk space is to use lossless compression rather than WAV files for archive quality (and as archive version) recordings. As far as archiving is concerned, these lossless compression formats are still product oriented file formats. One thing to realize is that not every file format can hold the same kind of audio. Some formats have limits on the bit depth of the samples they can contain, or they have a limit on the number of audio channels they can have in a file. This is demonstrated in the table below, taken from Wikipedia. [3] This is where understanding the relationship between a file format, a file extension and a media container format is really important.
|Audio compression format||Algorithm||Sample Rate||Bits per sample||Latency||Stereo||Multichannel|
|ALAC||Lossless||44.1 kHz to 192 kHz||16, 24||?||Yes||Yes|
|FLAC||Lossless||1 Hz to 655350 Hz||8, 16, 20, 24, (32)||4.3ms - 92ms (46.4ms typical)||Yes||Yes: Up to 8 channels|
|Monkey's Audio||Lossless||8, 11.025, 12, 16, 22.05, 24, 32, 44.1, 48 kHz||?||?||Yes||No|
|RealAudio Lossless||Lossless||Varies (see article)||Varies (see article)||Varies||Yes||Yes: Up to 6 channels|
|True Audio||Lossless||0–4 GHz||1 to > 64||?||Yes||Yes: Up to 65535 channels|
|WavPack Lossless||Lossless, Hybrid||1 Hz to 16.777216 MHz||varies in lossless mode; 2.2 minimum in lossy mode||?||Yes||Yes: Up to 256 channels|
|Windows Media Audio Lossless||Lossless||8, 11.025, 16, 22.05, 32, 44.1, 48, 88.2, 96 kHz||16, 24||>100ms||Yes||Yes:Up to 6 channels|
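The table can be read as a set of constraints. As a rough sketch, the two rows with definite bit depth and channel limits are transcribed below (the table shows 32 bits for FLAC in parentheses, so treat that value with caution); the dictionary layout and function name are assumptions for illustration.

```python
# Limits transcribed from the table above; only rows with definite values.
LIMITS = {
    "FLAC": {"bits": {8, 16, 20, 24, 32}, "max_channels": 8},
    "Windows Media Audio Lossless": {"bits": {16, 24}, "max_channels": 6},
}


def can_hold(fmt, bit_depth, channels):
    """True if the format's listed limits allow this bit depth and channel count."""
    limits = LIMITS[fmt]
    return bit_depth in limits["bits"] and channels <= limits["max_channels"]
```

For example, `can_hold("FLAC", 24, 2)` is true, while an 8-bit recording does not fit the listed limits of Windows Media Audio Lossless.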
Media container formats
Media container formats can look like file types but they really are containers of file types (think of a folder with an extension). Often they allow for the bundling of audio and video files with metadata and then enable this set of data to act like a single file. On Wikipedia there is a really nice comparison of container formats.
MP4 is one such container format. Apple Lossless data is stored within an MP4 container with the filename extension .m4a – this extension is also used by Apple for AAC audio data in an MP4 container (same container, different audio encoding). However, Apple Lossless is not a variant of AAC (which is a lossy format), but rather a distinct lossless format that uses linear prediction similar to other lossless codecs such as FLAC and Shorten. [4] Files with a .m4a extension generally do not have a video stream even though MP4 containers can also have a video stream.
MP4 can contain:
- Video: MPEG-4 Part 10 (H.264) and MPEG-4 Part 2
  - Other compression formats are less used: MPEG-2 and MPEG-1
- Audio: Advanced Audio Coding (AAC)
  - Also MPEG-4 Part 3 audio objects, such as Audio Lossless Coding (ALS), Scalable Lossless Coding (SLS), MP3, MPEG-1 Audio Layer II (MP2), MPEG-1 Audio Layer I (MP1), CELP, HVXC (speech), TwinVQ, Text To Speech Interface (TTSI) and Structured Audio Orchestra Language (SAOL)
  - Other compression formats are less used: Apple Lossless
- Subtitles: MPEG-4 Timed Text (also known as 3GPP Timed Text).
  - Nero Digital uses DVD Video subtitles in MP4 files [5]
This means that MP3-encoded audio can be contained inside of an .mp4 file. This also means that audio files are not always what they seem to be on the surface. This is why I advocate that an archive of digital files which serves a digital publishing house should also use technical metadata as discovery metadata. Filetype alone is not enough to know about a file.
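One way to look beneath the surface is to read the first bytes of a file instead of trusting its extension. The sketch below checks a handful of well-known signatures; it is far from complete, it identifies only the container (not the codec inside it), and the function name and return labels are my own.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container from the first 12 bytes of a file."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "RIFF/WAVE"
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "RIFF/AVI"
    if header[4:8] == b"ftyp":
        return "MP4 family"          # covers .mp4, .m4a, .mov-style files
    if header[:4] == b"fLaC":
        return "FLAC"
    if header[:4] == b"OggS":
        return "Ogg"
    if header[:3] == b"ID3":
        return "ID3v2-tagged stream"  # often, but not always, MP3
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of the file to sniff it."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(12))
```

A file named `recording.mp3` that sniffs as "MP4 family" is exactly the kind of surprise that technical metadata in the catalog would expose.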
Possibilities with embedded metadata
Audio files also vary greatly in what kinds of embedded metadata and metadata formats they support. MPEG-7, BWF and MP4 all support embedded metadata. But this does not mean that audio players in the consumer market or prosumer market respect this embedded metadata. ARSC has an interesting report on the support for embedded metadata in audio recording software. [6] Aside from this disregard for embedded metadata, there are various metadata formats which are embedded in different file types. One common type, ID3, is popular with .mp3 files. But even ID3 comes in different versions.
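The ID3 versions can be told apart from the bytes themselves. An ID3v2 tag starts the file with "ID3", a major version byte, a revision byte, a flags byte, and a four-byte "synchsafe" size (seven bits per byte); an ID3v1 tag is the last 128 bytes of the file and starts with "TAG". The sketch below relies only on those layout facts; the function names are my own.

```python
def id3v2_info(header: bytes):
    """Parse the 10-byte ID3v2 header, or return None if there is none."""
    if len(header) < 10 or header[:3] != b"ID3":
        return None
    major, revision = header[3], header[4]
    size = 0
    for byte in header[6:10]:
        size = (size << 7) | (byte & 0x7F)  # synchsafe: 7 useful bits per byte
    return {"version": f"2.{major}.{revision}", "tag_size": size}


def has_id3v1(tail: bytes) -> bool:
    """Check the last 128 bytes of a file for an ID3v1 tag."""
    return len(tail) >= 128 and tail[-128:-125] == b"TAG"
```

A single .mp3 file can carry both kinds of tag at once, with different contents, which is one more reason players and archives can disagree about what a file "says" about itself.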
In archiving Language and Culture Materials our complete package often includes audio but rarely is just audio. However, understanding the audio components of the complete package helps us understand what it needs to look like in the archive. In my experience in working with the Language and Culture Archive, most contributors are not aware of the difference between Archival and Presentation versions of audio formats, and those who think they are generally are not aware of the differences in codecs used (sometimes with the same file extension). From the archive’s perspective this is a continual point of user/submitter education. This past week I have taken the time to listen to a few presentations by audio archivists from the 2011 ARSC convention. These in general show that the kinds of issues that I have been dealing with in the Language and Culture Archive are not unique to our context.
- Anthony Seeger, Maureen Russell, David Martinelli. Ethnographic Sound Archives. http://www.arsc-audio.org/conference/audio2011/mp3/14.mp3 [Accessed 24 Oct. 2011]
- Wendy Sistrunk, Sandy Rodriguez. The Goldin Transcription Collection at UMKC. http://www.arsc-audio.org/conference/audio2011/mp3/16.mp3 [Accessed 24 Oct. 2011] [PDF visual of presentation]
- Birgitta Johnson. Gospel music in L.A. http://www.arsc-audio.org/conference/audio2011/mp3/39.mp3 [Accessed 24 Oct. 2011]
|↑1||Rob Poretti. 2011. Audio Analysis and Processing in Multi-Media File Formats. ARSC 2011. http://www.arsc-audio.org/conference/audio2011/extra/48-Poretti.pptx [Accessed: 24 October 2011]|
|↑2||Various Contributors. 21 October 2011 at 21:44. Wikipedia: Advanced Audio Coding, AAC’s improvements over MP3. http://en.wikipedia.org/wiki/Advanced_Audio_Coding#AAC.27s_improvements_over_MP3|
|↑3||Various Contributors. 21 October 2011 at 10:26. Wikipedia: Comparison of audio formats, Technical Details of Lossless Audio Compression Formats. http://en.wikipedia.org/wiki/Comparison_of_audio_codecs#Technical_Details_of_Lossless_Audio_Compression_Formats|
|↑4||Various Contributors. 6 October 2011 at 03:11. Wikipedia: Apple Lossless. http://en.wikipedia.org/wiki/Apple_Lossless|
|↑5||Various Contributors. 11 October 2011 at 15:00. Wikipedia: MPEG-4 Part 14. http://en.wikipedia.org/wiki/.m4a|
|↑6||Chris Lacinak, Walter Forsberg. 2011. A Study of Embedded Metadata Support in Audio Recording Software: Summary of Findings and Conclusion. ARSC Technical Committee. http://www.arsc-audio.org/pdf/ARSC_TC_MD_Study.pdf|
I have recently been reading the blog of Martin Fenner and came upon the article Personal names around the world (Martin Fenner. 14 August 2011. Personal names around the world. PLoS Blog Network. http://blogs.plos.org/mfenner/2011/08/14/personal-names-around-the-world [Accessed: 16 September 2011]). His post is in fact a reflection on a W3C paper of the same title, Personal Names around the World (several other reflections are here: http://www.w3.org/International/wiki/Personal_names). This is apparently coming out of the i18n effort and is an effort to help authors and database designers make informed decisions about names on the web.
I read Martin’s post with some interest because in Language Documentation getting someone’s name as a source or for informed consent is very important (from a U.S. context). Working in an archive dealing with language materials, I see a lot of names. One of the interesting situations which came to me from an Ecuadorian context was different from what I have seen in the w3.org paper or in the w3.org discussion. The naming convention went like this:
The elder was known by the younger’s name plus a relationship.
My suspicion is that it is a taboo to name the dead. So to avoid possibly naming the dead, the younger was referenced and the relationship was invoked. This affected me in the archive as I am supposed to note who the speaker is on the recordings. In lieu of the speaker’s name, I have the young son’s first name, who is well known in the community and is in his 30s or so, and I have the relationship. So in English this might sound like
John’s mother.
Now what am I supposed to put in the metadata record for the audio recordings I am cataloging? I do not have a name but I do have a relationship to a known (to the community) person.
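One possible shape for such a record, offered only as a sketch and not as an answer the archive has adopted, is to let a speaker be identified either by name or by a relationship to a named person. The class and field names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Speaker:
    """A speaker identified by name, or by relation to a known person."""
    name: Optional[str] = None           # may legitimately be unknown
    relation: Optional[str] = None       # e.g. "mother"
    relative_name: Optional[str] = None  # the person the community does name

    def label(self) -> str:
        """A display label for catalog records."""
        if self.name:
            return self.name
        if self.relation and self.relative_name:
            return f"{self.relation} of {self.relative_name}"
        return "unidentified speaker"


speaker = Speaker(relation="mother", relative_name="John")
```

Here `speaker.label()` gives "mother of John" without forcing anyone to invent or record a name that the community avoids using.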
I inquired with a literacy consultant who has worked in Ecuador with indigenous people for some years. She informed me that in one context she was working in, everyone knew what family line they were from and all the names were derived from that family line by position. It was such that to call someone by their name was an insult.
It sort of reminds me of this sketch by Fry and Laurie.
Working in an archive, I deal with a lot of metadata. Some of this metadata is from controlled vocabularies. Sometimes they show up in lists. Sometimes these controlled vocabularies can be very large, like for the names of languages, where there is a limited number of languages but the number is just over 7,000. I like to keep an eye out for how websites optimize the options for users. Facebook has a pretty cool feature for narrowing down the list of possible family relationships someone has to you, i.e., a sibling could be a brother/sister, step-brother/step-sister, or a half-brother/half-sister. But if the sibling is male, it can only be a brother, step-brother, or a half-brother.
Facebook narrows the logical selection down based on attributes of the person mentioned in the relationship.
That is, if I select Becky, my wife, as a person to be in a relationship with me, then Facebook determines, based on her gender attribute, that she can only be referenced by the female relationships.
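The same narrowing idea can be sketched for any controlled vocabulary: filter the list of options by an attribute that is already known. The data and function below are illustrative and are not a description of how Facebook implements the feature.

```python
# Sibling terms grouped by the gender of the person being described.
SIBLING_TERMS = {
    "male": ["brother", "step-brother", "half-brother"],
    "female": ["sister", "step-sister", "half-sister"],
}


def sibling_options(gender=None):
    """Offer only the terms that are logically possible for this person."""
    if gender in SIBLING_TERMS:
        return list(SIBLING_TERMS[gender])
    # Gender unknown: fall back to the full vocabulary.
    return [term for terms in SIBLING_TERMS.values() for term in terms]
```

With a known attribute the user chooses from three options instead of six; with a vocabulary of 7,000 language names, narrowing by country or language family would save far more.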
I have had some ideas I wanted to try out for using the iPad as a tool for collecting photo metadata. Working in a corporate archive, I have become aware of several collections of photos without much metadata associated with them.
The photos are the property of (or are in the custodial care of) the company I work at (in their corporate archive).
The subjects of the photos fall into two general areas:
- The minority language speaking people groups that the employees of a particular company worked with, including anthropological topics like ways of life, etc.
- Photos of operational events significant to telling the story of the company holding the photos.
Archives in more modern contexts are trying to show their relevance to not only academics, but also to general members of communities. In the United States there is a whole movement of social history. There are community preservation societies which take on the task of collecting old photographs and their stories and preserving, and presenting them for future generations.
The challenge at hand is: "How do we enrich photos by adding metadata to photos in the collections of archives?" There are many solutions to this kind of task. The refining, distilling, and reduction of stories and memories to writing and even to metadata fields is no easy task, nor is it a task that one person can do on their own. One solution, which is often employed by community historians, is the personal interview. Interviewing the photographers or people who were at an event and asking them questions about a series of photos creates an atmosphere of inquisitiveness, one where the story-teller is valued because they have a story-listener. This basic personal connection allows for interactions to occur across generational and technological barriers.
The crucial question is: "How do we facilitate an interaction which is positive for all the parties involved?" The effort and thinking behind answering this question has more to do with shaping human interactions than with anything else. We are also talking about using technology in this interaction. This is true UX (User Experience).
This past summer I have had several experiences with facilitating one-on-one interactions between knowledgeable parties working with photographs and with someone acting on behalf of the corporate archive. To facilitate this interaction a GoogleDoc Spreadsheet was set up and the person acting on the behalf of the archive was granted access to the spreadsheet. The individual conducting the interview and listening to the stories brought their own netbook (small laptop) from which to enter any collected data. They were also given a photo album full of photos, which the interviewee would look through. This set-up required overcoming several local environmental challenges. As discussed below, some of these challenges were better addressed than others.
Association of Data to a Given Photo
The challenge was keeping up to 150 photos organized during an interview so that metadata about any given photo could be collected and associated with only that photo. This was addressed by adhering an inventory sticker to the back of each photo and assigning each photo a single row in the GoogleDoc Spreadsheet. Using GoogleDocs was not the ideal solution, but rather a solution of some compromises:
Strengths of GoogleDocs
- One of the great things about GoogleDocs is that the capability exists for multiple people to edit the spreadsheet simultaneously.
- Another strength of GoogleDocs is that there is a side bar chat feature, so that if there is a question during the interview, help can be had very quickly from management (me, who was offsite).
- The data can be exported in the following formats: .xlsx, .xls, .csv, .pdf.
- There was no cost to deploy the technology.
- It is accessible through a web-browser in an OS neutral manner.
- The document is available wherever the internet is available.
- A single solution could be deployed and used by people digitizing photos, recording written metadata on the photos, and gathering metadata during an interview.
- Most people acting on behalf of the archive were familiar with the technology.
Pitfalls of GoogleDocs
- More columns exist in the spreadsheet than can be practically managed (the columns are presented below in a table). There are about 48 values in a record and there are about 40,000 records.
- Does not display the various levels of data as levels in the user interface.
- Cannot remove unnecessary fields from the UI of various people. (No role-based support.)
- Only available when there is internet.
Maximizing of Interview Time
To maximize time spent with the interviewee, the photos and any metadata written or known about a photo were put into the GoogleDoc Spreadsheet prior to the interview. Sometimes this was not done by the interviewer but rather by someone else working on behalf of the archive. During the interview the interviewer could tell which data fields were empty by looking for the gray cells in the spreadsheet. However, just because the cells were gray did not mean that the interviewee was more prone to provide the desired, unknown, information.
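The "gray cells" can also be listed ahead of an interview from a .csv export of the spreadsheet. The sketch below assumes one row per photo and an identifier column named `photo_id`; both the column name and the sample data are hypothetical.

```python
import csv
import io


def missing_fields(csv_text, id_column="photo_id"):
    """Map each photo ID to the list of columns that are still empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    gaps = {}
    for row in reader:
        empty = [
            column
            for column, value in row.items()
            if column is not None
            and column != id_column
            and not (value or "").strip()
        ]
        if empty:
            gaps[row[id_column]] = empty
    return gaps


SAMPLE = (
    "photo_id,people,date,place\n"
    "P001,Person A,,Village X\n"
    "P002,,,\n"
    "P003,Person B,1975,Village Y\n"
)
```

Running `missing_fields(SAMPLE)` reports that P001 still needs a date and P002 needs everything, which gives the interviewer a short list of questions per photo before sitting down with anyone.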
Data Input Challenges
One unanticipated challenge encountered in the interviews was that when the interviewer brought out an album or two of photos, the interviewees would cover more photos than the interviewer could record.
Let me spell it out. There is one interviewer and two interviewees, and there are 150 photos in an album lying open on the table. All three participants are looking at the photo album. Interviewee A says, "Look, that is so-and-so," and then interviewee B (because the other page is closer to them) says, "And this is so-and-so!" This happens for about 8 of the 12 facing photos. Because the interviewer is still typing the first name mentioned, they ask, "And when do you think that was?" But the metadata still comes in faster, as the second interviewee did not hear the question and the first one did but is still thinking. The bottom line is that more photos are viewed and commented on faster than can be recorded.
Something that could help this process would be to in some way slow down (or moderate) the interviewees' access to the photos: something that could synchronize the processing times with the viewing times. Scanning the photos and then displaying them on a tablet slows down the viewing process and integrates the recording of data with the viewing of photos.
Positional Interaction Challenges
An interview is, at some level, an interaction. One question which comes up is: "How does the technology used affect that interaction?" What we found was that a laptop usually was situated between the interviewer and the interviewees. This positioned the parties in an opposing manner. Rather than the content becoming the central focus of both parties, the content was either in front of the interviewer or in front of the interviewees. A tablet changes this dynamic in the interaction. It brings both parties together over a single set of content, both positionally and cognitively. When the photo is displayed on the laptop, the laptop has to be rotated so that the interviewees can see the image and then turned back so that the interviewer can input the data. This is not the case for a tablet.
Content Management Challenges
When paper is used for collecting metadata it is ideal to have one piece of paper for each photo. Sometimes this method is preferable to using a single computer. I used this method when I had a photo display and about 20 albums and about 200 people all filling out details at once. People came and went as they pleased. When someone recognized someone or someplace they knew, they wrote down the picture ID and the info they were contributing along with their name. However, when carrying around photo albums and paper there is the challenge of keeping all the photos from getting damaged and of maintaining the order of the photos and associated papers.
When there is no internet there is no access to GoogleDocs. We encountered this when we went to someone's apartment expecting internet, because the internet is available on campus and this apartment was also on campus. Fortunately we did have a backup plan, and paper and pen were used. But this meant that we then had to type out the data which was written down on the paper, in effect doing the same recording work twice.
Size of Devices
Photo albums have a certain bulk and cumbersome-ness which is multiplied when carrying more than one album at a time. Add to this a computer laptop and one might as well add to the list of required items, a hand truck with which to carry everything. A tablet is all in all a lot smaller and lighter.
Proof of Concept Technology
As I mentioned before, I had an iPad in my possession for a few days. To capitalize on the opportunity, I bought a few apps from the App Store, as I had said I would, and tried them out.
Software which does not work for our purposes
The first app I tried was Photoforge2. It is a highly rated app in the app store. I found that it delivered as promised. One could add or edit the IPTC and EXIF metadata. One could even edit where the photo was taken with a pin drop interface.
Meta Editor, another highly acclaimed iPad app, performed the task almost as well. Photoforge2 has some photo editing features which were not needed in our project, whereas Meta Editor is focused only on metadata elements.
- Both applications edit the standards-based IPTC and EXIF metadata fields in photos. We have some custom metadata which does not fit into either of these sets of fields. One aspect of the technology which might be helpful for readers to understand is that these iPad applications actually embed the metadata into the photos, so when the photos are taken off the iPad the metadata travels with them. This is a desirable feature for presentation photos.
- Even if we do embed the metadata with these apps, the version of the photo being enriched is not the archival version of the photo; it is the presentation version. We still need the data to become associated with the archival version of the photo.
Software with some really functional features
So we needed something with a mechanism for capturing our customized data. Two options were found which seemed suitable for the task. One was ideal, the other rapidly deployable. Understanding the iPad's place in the larger corporate architecture, its relationship to the digital repository, and the flow of data from the point of collection to dissemination will help us to visualize the particular challenges that the iPad presents solutions for. Once we see where the iPad sits in relationship to the rest of the digital landscape, I think it will be fairly obvious why one solution is ideal and the other rapidly deployable.
Placement of the iPad in the Information Architecture Picture
In my previous post on Social Metadata Collection (Hugh J. Paterson III, 29 June 2011, The Journeyler, http://hugh.thejourneyler.org/social-meta-data-collection) I used the image below to show where the iPad was used in the metadata collection process.
Since that time, as I have shown this image when I talk about this idea, I have become aware that the image is not detailed enough. Because it is not detailed enough it can lead to some wrong assumptions on how the iPad use being proposed actually works. So, I am presenting a new image with a greater level of detail to show how the iPad interacts with other corporate systems and workflows.
There are several things to note here:
- Member Diaspora as represented here is not just members. It is their families, the people with whom these members worked, the members currently working, and the members living close at hand on campus, not just those in diaspora.
- It is a copy of the presentation file which is pushed out to the iPad or the website for the Member Diaspora. This copy of the file does not necessarily need to be brought back to the archive as long as the metadata is synced back appropriately.
- The Institutional Repository for other corporate items is currently in a DSpace instance. However, it has not been decided for sure that photos will be housed in this same instance, or even in DSpace.
That said, it is important that the metadata be embedded in the presentation file of the image, as well as accessible to the Main container for the archival of the photos. The metadata also needs to sync between the iPad application and the Member Diaspora website. Metadata truly needs to flow through the entire system.
FileMaker Pro with File Maker Go
FileMaker Pro is a powerful database app. It could drive the Member Diaspora website and then also sync with the iPad. This would be a one-stop solution and therefore an ideal solution. It is also complex and takes more skill to set up than I currently have or can currently spare the time to acquire. Both FileMaker Pro and its younger cousin Bento enable photos to be embedded in the actual database.
Several tips from the Bento forums on syncing photos which are part of the database:
Syncing pictures from Bento-Mac to Bento-iPad
Sync multiple photos or files from desktop to IPad
This is something which is important with regards to syncing with the iPad. To the best of my knowledge (and googling) no other database apps for the iPad or Android platforms allow for the syncing of photos within the app.
Bento is the rapidly deployable option. (See also: What are the differences between Bento 4 for Mac, Bento for iPad 1.1.x, and Bento for iPhone/iPod touch 1.1.x?)
It took me about 2 hours (while doing other stuff) to download a trial version, find out how it worked, import my data from the GoogleDoc and then sync my database with the iPad.
Here is a YouTube video demonstrating my proof of concept using Bento.
Here is a series of iPad Screen shots.
Some outstanding issues
- Geo-location of photos in Bento. Bento version 4 does have location fields which can be used with a pin drop interface to add location data to the appropriate fields in the database. My proof of concept demo does not demonstrate this feature. Two resources on using geo-location fields in Bento: Working with Location Fields in Bento, and How to use Location fields in Bento for iPhone/iPad 1.1.1.
- Rapid reuse of data. Because the interview process naturally lends itself to eliciting the same kind of data over a multitude of photos, a UX/UI element which allows the rapid reuse of data would be very practical. The kinds of data which would lend themselves to rapid reuse are people's names, locations, dates, photographer, etc. This may mean being able to query a table of already entered data values with an auto-suggest type function.
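The rapid reuse idea in the last item can be sketched in a few lines. This is only an illustration of an auto-suggest lookup over values that have already been entered, not something Bento provides; the class name `SuggestionIndex` and the field names are assumptions made up for the example.

```python
from collections import Counter


class SuggestionIndex:
    """Remembers values already typed into a field and offers them again.

    Hypothetical helper for illustration; not part of Bento or FileMaker.
    """

    def __init__(self):
        # One counter per field name, e.g. "photographer" or "country".
        self._seen = {}

    def record(self, field, value):
        """Store a value that an interviewer has just entered."""
        value = value.strip()
        if value:
            self._seen.setdefault(field, Counter())[value] += 1

    def suggest(self, field, prefix, limit=5):
        """Return earlier values for this field that start with the prefix,
        most frequently used first (ties broken alphabetically)."""
        prefix = prefix.strip().lower()
        counts = self._seen.get(field, Counter())
        matches = [v for v in counts if v.lower().startswith(prefix)]
        matches.sort(key=lambda v: (-counts[v], v.lower()))
        return matches[:limit]


index = SuggestionIndex()
for country in ["Mexico", "Mexico", "Mali"]:
    index.record("country", country)

print(index.suggest("country", "m"))
```

Keeping a separate counter per field means a name typed as a photographer is not offered when filling in a country, which seems closer to how the interview questions are grouped.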
Custom iPad App
Of course there is also the option to develop a custom iPad app for just our purposes. This entails some other kinds of planning, including but not limited to:
- Custom App development
- Support plan
- Deploy or develop possible Web-backend - if needed.
Kinds of custom metadata being collected.
The table in this section shows the kinds of questions we are asking in our interviews. It is provided only for reference, as a discussion of the Information Architecture for the storage and elements of the metadata schema is out of the scope of this discussion. The list of questions and values presented in the table was derived as a minimal set of questions based on issues of Image Workflow Processing, Intellectual Property and Permissions, Academic Merit, and input from Controlled Vocabulary's Caption and Keywording Guidelines (http://www.controlledvocabulary.com/metalogging/ck_guidelines.html), which is part of their series on metalogging. The table also shows corresponding IPTC and EXIF data fields. (Though they are currently empty because I have not filled them in.) Understanding the relationships of XMP, IPTC, and EXIF also helps us to understand why and how the iPad tool needs to interact with other archiving solutions. However, it is not within the scope of this post to discuss these differences.
Some useful resources on these issues are noted here:
- Photolinker Metadata Tags (Early Innovations, LLC. 2011. http://www.earlyinnovations.com/photolinker/metadata-tags.html) has a nice display outlining where XMP, IPTC and EXIF data overlap. This is not authoritative, but rather practical.
- List of IPTC fields. However, a list is not enough; we also need to know what the fields mean so that we know that we are using them correctly.
- EXIF and IPTC Header Comments. Here is another list of IPTC fields. This list also includes a list of EXIF fields. (Again without definitions.)
- Various programs and applications also add their own metadata fields in the IPTC section. Here is a mapping of some of the most popular ones: http://www.controlledvocabulary.com/imagedatabases/iptc_core_mapped.pdf
- IPTC Standard Photo Metadata (David Riecks. 2010. International Press Telecommunications Council): http://www.iptc.org/std/photometadata/documentation/IPTC-PLUS-Metadata-Panel-UserGuide_6.pdf
- Dublin Core with Photographs: http://makeit.digitalnz.org/askaquestion/questions/26
- Dublin Core Metadata Element Set, Version 1.1: http://dublincore.org/documents/dces/
- DCMI Type Vocabulary: http://dublincore.org/documents/dcmi-type-vocabulary/
- Describing Digital Content: http://makeit.digitalnz.org/guidelines/describing-digital-content/
|Metadata Element||Purpose||Explanation||Dublin Core||IPTC Tags||EXIF Tags|
|Photo Collection||This is the name of the collection in which the photos reside|
|Sub Collection||This is the name of the sub collection in which the photos reside|
|Letter of Collection||Each collection is given an alpha character or a series of alpha characters. If the collection pertains to one people group, then the alpha characters given to that collection are the three-letter ISO 639-3 code|
|Who input the Meta-data||This is the name of the person inputting the metadata|
|Photo Number||This is the number of the photo as we have inventoried the photo|
|Negative Number||This is the number of the photo as it appears on the negative (film strip)|
|Roll||This is the ID of the roll||Most sets of negatives are cut into strips of 5 or fewer; this allows us to group these strips together to ID a “set” of photos|
|Section Number||If the items are in a book or a scrap book and that scrap book has a section, this is where that is recorded|
|Page#||If a scrap book has a set of pages, then this is where they are recorded|
|Duplicates||This is where the Photo ID of a duplicate item is referenced.|
|Old Inventory Number(s)||This is the inventory number of an item if it was part of another inventory system|
|Photographer||This is the name of the photographer|
|Subject 1 (who)||Who is in the photo. This should be an unlimited field; that is, several names should be able to be added to it.|
|Subject 2||Who is in the photo. This should be an unlimited field; that is, several names should be able to be added to it.|
|Subject 3||Who is in the photo. This should be an unlimited field; that is, several names should be able to be added to it.|
|People group||This is the name of the people group mentioned in the ISO 639-3 codes|
|ISO 639-3 Code||This is the ISO 639-3 code of the people group being photographed|
|When was the photo Taken?||The date the photo was taken|
|Country||The country in which the photo was taken|
|District/City||This is the City where the photo was taken|
|Exact Place||The exact place name where the photo was taken|
|What is in the Photo (what)||This is an item in the photo|
|What is in the Photo||Additional what is in the photo|
|What is in the Photo||Additional what is in the photo|
|Why was the Photo Taken?||This is to help metadata providers think about how events get communicated|
|Description||This is a description of the photo’s contents||This is not a caption but could be used as a caption|
|Who Provided This Meta-Data? And when?||We need to keep track of who is the source of certain metadata to understand its authority|
|Who Provided This Meta-Data? And when?||We need to keep track of who is the source of certain metadata to understand its authority|
|Who Provided This Meta-Data? And when?||We need to keep track of who is the source of certain metadata to understand its authority|
|Who Provided This Meta-Data? And when?||We need to keep track of who is the source of certain metadata to understand its authority|
|Who Provided This Meta-Data? And when?||We need to keep track of who is the source of certain metadata to understand its authority|
|I am in this photo and I approve it to be on the internet. Put in "yes" or "No" and write your name in the next column.||Permission to distribute|
|Name:||Name of the person releasing the photo|
|How was this photo digitized?||Method of digitization and the tools used in digitization|
|Who digitized This photo||This is the name of the person who did the digitization|
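To make the shape of the table above a little more concrete, here is a minimal sketch of how one photo's record might be held in code. It is not the schema we use: the class and attribute names are assumptions invented for illustration, only a subset of the rows is shown, and the repeated rows (Subject 1 to 3, What is in the Photo, Who Provided This Meta-Data) are modelled as open-ended lists because the table says those should be unlimited fields.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Contribution:
    """Who supplied a piece of metadata and when, kept so that the
    authority of the information can be judged later."""
    contributor: str
    date: str  # kept as text, e.g. "2011-09-13"


@dataclass
class PhotoRecord:
    """One row of interview answers for a single photo (hypothetical names)."""
    photo_number: str
    collection: str = ""
    photographer: str = ""
    subjects: List[str] = field(default_factory=list)   # who is in the photo
    contents: List[str] = field(default_factory=list)   # what is in the photo
    date_taken: str = ""
    country: str = ""
    city: str = ""
    exact_place: str = ""
    description: str = ""
    contributions: List[Contribution] = field(default_factory=list)
    web_release: Optional[bool] = None  # None until someone answers yes or no
    released_by: str = ""

    def add_subject(self, name, contributor, date):
        """Add a person to the photo and note who identified them."""
        self.subjects.append(name)
        self.contributions.append(Contribution(contributor, date))


record = PhotoRecord(photo_number="A-0001")
record.add_subject("Jane Doe", contributor="John Roe", date="2011-09-13")
print(record.subjects, len(record.contributions))
```

Using lists instead of numbered columns avoids having to guess in advance how many people or items a photo might contain.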
|↑1||Alia Haley. 31 August 2011. Tablet vs. Laptop. Church Mag. [Accessed: 11 September 2011] http://churchm.ag/tablet-vs-laptop. [Link]|
|↑2||Hugh J. Paterson III. 29 June 2011. The Journeyler. [Accessed: 13 September 2011] http://hugh.thejourneyler.org/social-meta-data-collection. [Link]|
|↑3||Controlled Vocabulary. Caption and Keywording Guidelines. [Accessed: 13 September 2011] http://www.controlledvocabulary.com/metalogging/ck_guidelines.html. [Link]|
|↑4||Early Innovations, LLC. 2011. Photolinker Metadata Tags. [Accessed: 13 September 2011] http://www.earlyinnovations.com/photolinker/metadata-tags.html. [Link]|
|↑5||David Riecks. 2010. IPTC Standard Photo Metadata (July 2010). International Press Telecommunications Council. [Accessed: 13 September 2011] http://www.iptc.org/std/photometadata/documentation/IPTC-PLUS-Metadata-Panel-UserGuide_6.pdf [Link]|
|↑6||Metadata Working Group. 2009 Guidelines for Handling Image Metadata, Version 1.0.1. [Accessed: 13 September 2011] http://www.metadataworkinggroup.org/pdf/mwg_guidance_v101.pdf. [Link]|
Working in an archive, one can imagine that letting go of materials is a real challenge, both because policy makes it hard and because of the emotional “pack-rat” nature of archivists. This is no less the case at the archive where I work. We were recently working through a set of items and getting rid of the duplicates. (Physical space has its price, and the work should soon be available via JSTOR.) However, one of the items we were getting rid of was a journal issue on a people group/language. The journal issue has three articles, and only one of them was written by someone who worked for the same organization I am working for now. So the “employer” and owner-operator of the archive only has rights to one of the three works. (Rights by virtue of “work-for-hire” laws.) We have the off-print, which is what we have rights to share, so we keep and share that. It all makes sense. However, what we keep is catalogued and inventoried, and our catalogue is shared with the world via OLAC. With this tool someone can search for resources on a language, by language. It occurs to me that the other two articles on this people group/language will not show up in the aggregated OLAC results. This is a shame, as it would be really helpful in many ways. I wish there were a groundswell, open source, grassroots, web-facilitated effort where various researchers could go and put in metadata (citations) of articles, which would then be added to the OLAC search.
In the course of my experience I have been asked about PDFs and OCR several times. The questions usually follow the main two questions of this post.
So is OCR built into PDFs? Or is there a need for independent OCR?
In particular, is an image-based PDF searchable?
The short answer is Yes. Adobe Acrobat Pro has an OCR function built in. And to the second part: No, an image is not searchable. But what can happen is that Adobe Acrobat Pro can perform OCR on an image such as a .tiff file and then add a layer of text (the output of the OCR process) behind the image. Then when the PDF is searched, it is actually the text layer behind the image which is searched for a match. The OCR process is usually between 80% and 90% accurate on texts in English. This is usually good enough for finding words or partial words.
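To show why an imperfect text layer can still be useful, here is a small sketch of a search that tolerates OCR mistakes. It is not how Adobe Acrobat searches a PDF; it simply uses `difflib` from Python's standard library to find words in some already extracted OCR text that are close to the search term. The function name and the sample text are made up for the example.

```python
import difflib
import re


def find_in_ocr_text(query, ocr_text, cutoff=0.8):
    """Return words from the OCR output that closely resemble the query.

    A cutoff of 0.8 means a word must be roughly 80% similar to count.
    """
    # Break the text layer into a set of distinct lowercase words.
    words = sorted(set(re.findall(r"\w+", ocr_text.lower())))
    return difflib.get_close_matches(query.lower(), words, n=5, cutoff=cutoff)


# "rn" misread for "m" is a typical kind of OCR error.
sample = "The rnetadata was collected in the village"
print(find_in_ocr_text("metadata", sample))
```

An exact search for "metadata" would miss the misread word in the sample, while the tolerant search still finds it.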
The Data Conversion Laboratory has a really nice and detailed write up on the process of converting from images to text with Adobe Acrobat Pro.
The University of Illinois at Chicago explains how to use Adobe Acrobat Pro and OCR with a scanner using a TWAIN driver.
The better OCR option
Since I work in an industry where we are dealing with multiple languages and the need to professionally OCR thousands of documents I thought I would provide a few links on the comparison of OCR software on the market.
Lifehacker has a short write-up of the top five OCR tools.
Two of those top five, ABBYY FineReader and Adobe Acrobat, are compared side by side in this article on both OS X and Windows.
Are all files used to create an original PDF included in the PDF?
The short answer is No. But the long answer is Yes. Depending on the settings of the PDF creator, the original files might be altered before they are wrapped in a PDF wrapper.
So the objection, usually in the form of a question, sometimes comes up:
Is the PDF file just using the PDF framework as a wrapper around the original content? Therefore, to archive things “properly” do I still need to keep the .tiff images if they are included in the PDF document?
The answer is: “it depends”. It depends on several things, one of which is, what program created the PDF and how it created the PDF. – Did it send the document through PostScript first? Another thing that it depends on is what else might one want to do with the .tiff files?
In an archiving mentality, the real question is: “Should the .tiff files also be saved?” The best practice answer is Yes. The reason is that the PDF is viewed as a presentation version and the .tiff files are viewed as the digital “originals”.
As part of my job I work with materials created by the company I work for, that is, the archived materials. We have several collections of photos by people from around the world. In fact we might have as many as 40,000 photos, slides, and negatives. Unfortunately most of these images have no meta-data associated with them. It just happens to be the case that many of the retirees from our company still live nearby or volunteer in the offices. Much of the meta-data for these images lives in the minds of these retirees. Each image tells a story. As an archivist I want to be able to tell that story to many people, but I do not know what that story is. I need to be able to sit down and listen to that story and make notes on each photo. This is time consuming, and it takes more time than I have.
Here is the Data I need to minimally collect:
Photo ID Number: ______________________________
Who (photographer): ____________________________
Who (subject): ________________________________
When (was the photo taken): _______________________
Where (Country): _______________________________
Where (City): _________________________________
Where (Place): ________________________________
What is in the Photo: ____________________________
Why was the photo taken (At what event):_________________________
Photo Description:__short story or caption___
Who (provided the Meta-data): _________________________
Here is my idea: Have 2 volunteers with iPads sit down with the retirees and show these pictures on the iPads to the retirees and then start collecting the data. The iPad app needs to be able to display the photos and then be able to allow the user to answer the above questions quickly and easily.
The iPad is only the first step though. The iPad works in one-on-one sessions, working with one person at a time. Part of the overall strategy needs to be a crowdsourcing effort of meta-data collection. To implement this there needs to be a central point of access where interested parties can have a many-to-one relationship with the content. This community-added meta-data may have to be kept in a separate taxonomy until it can be verified by a curator, but there should be no reason that this community-added meta-data cannot be expected to be valid.
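As a rough sketch of what keeping community-added meta-data separate until a curator verifies it could look like, the example below holds every contribution with a `verified` flag and only hands back the verified ones. The class and attribute names are assumptions for illustration and do not describe any existing system of ours.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """One community-contributed value for one field of one photo."""
    photo_id: str
    field_name: str
    value: str
    contributor: str
    verified: bool = False  # stays False until a curator approves it


class CommunityMetadata:
    """Holds community contributions apart from the curated record."""

    def __init__(self):
        self.suggestions = []

    def submit(self, photo_id, field_name, value, contributor):
        suggestion = Suggestion(photo_id, field_name, value, contributor)
        self.suggestions.append(suggestion)
        return suggestion

    def verify(self, suggestion):
        """Called by a curator once the contribution has been checked."""
        suggestion.verified = True

    def verified_for(self, photo_id):
        """Only verified contributions are offered to the archive record."""
        return [s for s in self.suggestions
                if s.photo_id == photo_id and s.verified]


pool = CommunityMetadata()
first = pool.submit("A-0001", "country", "Mexico", "John Roe")
pool.submit("A-0001", "photographer", "Unknown", "Jane Doe")
pool.verify(first)
print([s.value for s in pool.verified_for("A-0001")])
```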
However, what the app needs to do is more in line with MetaEditor 3.0. MetaEditor actually edits the IPTC tags in the photos, allowing the meta-data to travel with the images. In one sense adding meta-data to an image is annotating an image. But this is something completely different from what Photo Annotate does to images.
Photosmith seems to be a move in the right direction, but it is focused on working with Lightroom, not with a social media platform like Gallery2 & Gallery3, Flickr or CopperMine.
While looking at open source photo CMSs, one of the things we have to be aware of is that meta-data needs to come back to the archive in a Dublin Core “markup”. That is, it needs to be mapped to and integrated with our current DC-aware meta-data schema. So I looked into modules that make Gallery and Drupal “DC aware”. One of the challenges is that there are many photo management modules for Drupal. None of them will do all we want, and some of them will do what we want more elegantly (in a Code is Poetry sense) than others. In Drupal it is possible that several modules together might do what we want. But what is still needed is a theme which elegantly and intuitively pulls together the users, the content, the questions and the answers. No theme will do what we want out of the box. This is where Form, Function, Design and Development all come together, and each case, especially ours, is unique.
- Adding Dublin Core Metadata to Drupal
- Dublin Core to Gallery2 Image Mapping
- Galleries in Drupal
- A Potential Gallery module for drupal – Node Gallery
- Embedding Gallery 3 into Drupal
- Embedding Gallery 2 into Drupal
This crowdsourcing model of meta-data collection has been implemented by the Library of Congress in the Chronicling America project, where the Library of Congress is putting images out on Flickr and the public is annotating (or “enriching” or “tagging”) them. Flickr has something called Machine Tags, which are also used to enrich the content.
There are two challenges though which still remain:
- How do we sync offline iPad enriched photos with online hosted images?
- How do we sync the public face of the hosted images to the authoritative source for the images in the archive’s files?
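I do not have an answer to either question yet, but one common approach to this kind of sync problem can be sketched: key every record by photo ID, stamp every field with the time it was last edited, and let the newer value win when two copies disagree. The function below is only that sketch. The record layout (a dictionary of field name to value and timestamp) and the "newest wins" rule are assumptions, not a description of how Bento, Flickr or our archive actually behave.

```python
def merge_records(archive, offline):
    """Merge offline edits into archive records, keyed by photo ID.

    Both arguments map photo ID -> {field name: (value, timestamp)}.
    Timestamps are ISO 8601 strings in the same format, so comparing
    them as text is the same as comparing them in time.
    The inputs are left unchanged; a new merged dictionary is returned.
    """
    merged = {photo_id: dict(fields) for photo_id, fields in archive.items()}
    for photo_id, fields in offline.items():
        target = merged.setdefault(photo_id, {})
        for name, (value, stamp) in fields.items():
            # Keep the offline value if the field is new or edited later.
            if name not in target or stamp > target[name][1]:
                target[name] = (value, stamp)
    return merged


archive = {"A-0001": {"country": ("Mexico", "2011-09-01T10:00:00")}}
offline = {"A-0001": {"city": ("Example City", "2011-09-13T15:31:00")}}
print(merge_records(archive, offline))
```

A rule like this is easy to implement but silently discards the older value, so a curator review step would probably still be needed before anything reaches the authoritative archive copy.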