It would be great to exemplify Martin Mous's CV in OLAC.
I attended several paper presentations and presented my own paper at NASKO 2023.
I was impressed with the paper presented by Julia Bullard.
Thesaurus Construction for Community-Centered Metadata [long paper] by Julia Bullard, Nigel Town, Sarah Nocente, Aleha McCauley and Heather O'Brien.
There were several things that I appreciated about it. While my observations and impressions are not directly related to the paper's subject, the paper helped me think about other sorts of things as I struggle through my own thoughts and contexts.
Another thing was that Julia used the term "extractive", as in the scholars of the university had an "extractive relationship" with the community. For me the term "extractive", with regard to "extractive research", has never been very clear. It has always seemed to be a highly charged term with lots of finger pointing and without a clear definition. Therefore it seemed to be one of those general accusations which could never be defended against nor proven false. My first exposure to the term was at an ICLDC plenary where the speaker was asked questions and the term came up in either the plenary or the discussion. In reflecting on the ICLDC conversation, I think the speaker was from Canada, so maybe the term has wider use in that geographical context than I am used to. However, in Julia's case I really appreciated the definition of "extractive relationship" that she provided. She defined it as the non-accessibility of research results, specifically with respect to the ways that researched peoples would think to access those results. This is an interesting dynamic to explore. For example, is it still extractive research if one collects information from individuals and does provide the information back to those individuals, but then doesn't provide the community access to the sum of the participants' information? What about a summary of the information rather than the raw information? Would that still be extractive? Does extractive only apply in academic contexts, or does it also apply in corporate contexts? Can non-profits be extractive? What if the research information was collected but there was no permission to share, and the collecting organization can point back to that lack of permission? Would that be extractive? In that case the information serves the purpose of the organization but not the diverse purposes possible in the community or for other actors within the community.
Finally, there was the topic of the creation of the thesaurus. There were a variety of terms, presumably subject terms, that they sought to recontextualize. I assessed these in a four-part grid based on the kind of management practice needed. Top-left is severity, top-right is degree of offensiveness, and bottom-right is null results or no change, while the bottom-left holds addressable terms, where they were able to bring in a subject matter expert to engage with the materials and provide alternative terminology.
I found Carlin Soos's paper addressing issues in generative-AI-based author attribution very interesting. I need to follow up with Carlin on these issues. He addressed it in terms of attribution and plagiarism, arguing strongly that it is not plagiarism but that there are other trace stakeholders in the mix. This has certain links to linguistic applications in information annotation. There are other sorts of links to how universities craft policies. At UNT plagiarism includes the idea that an author can plagiarize their own work. This is crazy in my opinion. The administrative goal is to limit creative output to certain classes of creative efforts. Therefore anything outside the KO acknowledged by the administration is plagiarism. Since there is socially supported opposition to plagiarism, it is seen as evil. We see a similar approach in how governments define "terrorist organization". Different governments apply "security measures" for different reasons.
In the context of my own paper, Thomas Dousa asked a very important, and not unanticipated, question regarding the types of bonds in the archival bond. Specifically, what types of bonds exist, and do these types of bonds imply that different series should be established within a collection of language resources? The clear answer is yes, there are different kinds of bonds between resources, but it is less clear whether there are any kinds of bonds which don't also occur in other kinds of archival collections. Establishing why something should be split remains an open area of research.
Finally, there was an interesting comment which came out in a discussion and which I think deserves some research: the phrase "metadata is cataloging for men". Where did this phrase get its first use? Is that documented?
I attended a totally fascinating presentation on Web Archiving by Matt Kelly of Drexel today.
Here are some resources I need to follow up on:
Defining the aboutness of a collection is a challenge. From a philosophical point of view, this is even harder for collections in anthropological linguistics. These kinds of collections are not assembled for the sake of their "about-ness" but rather for the sake of their "is-ness". A collection in a museum might be about 19th century trains, but such collections rarely contain the trains themselves. So, does this mean that linguistic collections are really about the people groups the speech is representing, and that the of-ness is the speech? Then linguists come along and write about the grammar of the language, and is that about the language? Often original stories will have an aboutness meaning which is never recorded in metadata. This needs to change.
This thought needs to be explored with MARC 655 $x and $v sub-fields. see: https://www.loc.gov/marc/bibliographic/bd655.html
see email: https://mail.google.com/mail/u/0/#sent/FMfcgzGsmrDLzSSBqXVPfKphwmdGhcZC
https://meta.discourse.org/t/mentionables/192948 <-- content in OLAC
Position conversations within the OLAC search space.
This might be a way forward to an OAI-PMH repo: https://github.com/discourse/discourse-sitemap Another option is to use a query mechanism in the JSON API to get all threads and treat these threads as resources for description. https://meta.discourse.org/t/discourse-rest-api-documentation/22706
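The JSON option could be sketched roughly as follows. This is only a sketch under assumptions: it assumes the forum exposes the public `/latest.json` listing with a `topic_list.topics` array, that each topic carries `id`, `slug`, `title`, `created_at`, and a `tags` list of strings, and that topic pages live at `/t/<slug>/<id>`. The function names, the flat record layout, and the `forum.example.org` host are hypothetical and not part of Discourse or OLAC.

```python
import json
from urllib.request import urlopen


def fetch_latest_topics(base_url, page=0):
    """Fetch one page of topics from a Discourse site's JSON listing.

    Assumes the site allows anonymous access to /latest.json.
    """
    with urlopen(f"{base_url}/latest.json?page={page}") as response:
        payload = json.load(response)
    return payload["topic_list"]["topics"]


def topic_to_record(topic, base_url):
    """Map one topic to a flat, Dublin Core style record (hypothetical layout)."""
    return {
        "identifier": f"{base_url}/t/{topic['slug']}/{topic['id']}",
        "title": topic["title"],
        "date": topic["created_at"][:10],  # keep only YYYY-MM-DD
        "subject": list(topic.get("tags", [])),
        "type": "Text",
    }


# Offline example with a hand-written topic, so no network call is needed.
sample_topic = {
    "id": 42,
    "slug": "olac-types",
    "title": "OLAC types",
    "created_at": "2023-06-09T12:00:00.000Z",
    "tags": ["metadata"],
}
record = topic_to_record(sample_topic, "https://forum.example.org")
print(record["identifier"])
```

Records shaped like this would still need to be serialized as OLAC XML and served through an OAI-PMH endpoint; that step is not shown here.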
I wonder how many layers a tag-group can have... https://docs.discourse.org/#tag/Tags/operation/updateTagGroup
Subject analysis is very interesting. In a recent investigation into a theory of subject analysis, I was introduced to the concepts of: "about-ness", "is-ness", "of-ness".
Sometimes I wonder if linguists defy standard practices in subject representation, or if they define what a general population holds as a challenge with subject analysis in cataloging.
I harken to the OLAC application profile, which is based on Dublin Core. Dublin Core does not scope the subject element to "about-ness" analysis. The UNT curriculum is informed by, and based (in structure) on, Steven J. Miller's Metadata for Digital Collections: A How-To-Do-It Manual. The issue at hand is that for linguists, about-ness is only relevant for information resources representing analysis. For other kinds of resources, such as primary oral texts or narratives captured via video, which are often the object analyzed and discussed in information resources representing analysis, the primary view on subjecthood is through of-ness. As far as I know, no one has discussed audio and of-ness descriptions of audio.
It also makes me wonder if genre is mostly about utility and not about a binding style. To this end, a scholar looking for a phonology corpus is looking for what? A combination of things: a MIMEType, with a relationship to another MIMEType, with an of-ness of a kind and a subject of "phonology".
Splitting up the concepts of "about-ness", "is-ness", and "of-ness" provides analytical space for more articulate descriptions in the dc:description field. But when it comes to language materials, the question is: is language a subject by virtue of "of-ness" or by virtue of "about-ness"? There are several implications here:
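To make the "combination of things" concrete, here is a toy sketch of such a search. Everything in it is an assumption made for illustration: the `Resource` fields, the of-ness value "speech", and the rule that a phonology corpus is an audio resource related to a text resource are hypothetical and do not come from OLAC or Dublin Core.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    identifier: str
    mime_type: str
    of_ness: str = ""
    subjects: list = field(default_factory=list)
    related_to: list = field(default_factory=list)


def find_phonology_corpora(resources):
    """Return identifiers of audio resources that look like phonology corpora.

    A hit needs all four facets: an audio MIME type, a relation to a text
    resource, an of-ness of "speech", and the subject "phonology".
    """
    by_id = {r.identifier: r for r in resources}
    hits = []
    for r in resources:
        is_audio = r.mime_type.startswith("audio/")
        has_text_companion = any(
            by_id[i].mime_type.startswith("text/")
            for i in r.related_to
            if i in by_id
        )
        if (is_audio and has_text_companion
                and r.of_ness == "speech" and "phonology" in r.subjects):
            hits.append(r.identifier)
    return hits


catalog = [
    Resource("rec-1", "audio/wav", "speech", ["phonology"], ["txt-1"]),
    Resource("txt-1", "text/plain", "transcription", ["phonology"]),
    Resource("rec-2", "audio/wav", "speech", ["folklore"]),
]
print(find_phonology_corpora(catalog))  # ['rec-1']
```

In this sketch no single field identifies the "genre"; it only emerges from the facets taken together.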
One of the frequent things I hear about OLAC is a critique of its Resource Type vocabulary. The OLAC application profile adds linguistics resource types in addition to DCMITypes and an unqualified DC type value. What I don't hear from these same cries for additional descriptive power is for a structured way to use any of the existing resource type vocabularies. Let me list a few:
It has been argued that the Dublin Core Type field is an example of a genre field. This may be true in some sense, but I have a tendency to think of it in terms of an interactivity type field; more of a modality field.
There is a gap in the subject content of OLAC related to legal theory of artifact ownership. Maybe there is an LCSH heading for this...
From time to time we might wonder where archival materials might be located. There might be a way to discover these using OLAC and FOAF.
Original hand drawn source image.