This post gives an overview of the options available within the language archiving enterprise with respect to the discoverability and accessibility of resources.
All Archivable Resources
Un-archived Resources
May be:
- known (common knowledge, grant funded, etc.)
- unknown (e.g. individual research projects that only a select few know exist)
- discoverable (posted on a personal or departmental website)
- privately kept (without public discovery)
However, no instance (and therefore no record) of these resources exists in the curated catalogues of professional libraries or institutional archives dedicated to the care and stewardship of language resources. Furthermore, in the above scenarios there is no long-term preservation plan for these resources, even if a redundant backup copy of the data exists.
Archived but Private Resources
These resources are severely restricted. Most people (including specialists in the language family, some archive staff and even some community members) do not know about them.
- Meta-data is hidden (not shared publicly).
- Archived objects have restricted access.
While archives may not be able to directly report on these objects, they can indirectly report what percentage of the archive's total content these items comprise.
Example of an indirect report: 10% of XYZ archive's total contents are severely restricted. Most corpora contain less than 0.1% severely restricted content.
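The arithmetic behind such an indirect report can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a corpus name paired with an access level), the label "severely_restricted", and the function name restricted_share are all hypothetical and do not reflect any particular archive's catalogue. The point is that the report needs only counts, so no meta-data about the restricted items themselves is exposed.

```python
from collections import Counter

# Hypothetical holdings: one (corpus name, access level) pair per archived item.
# The names and labels here are assumptions made for illustration only.
items = [
    ("corpus_a", "open"),
    ("corpus_a", "open"),
    ("corpus_a", "open"),
    ("corpus_a", "severely_restricted"),
    ("corpus_b", "open"),
    ("corpus_b", "open"),
    ("corpus_b", "open"),
    ("corpus_b", "open"),
]


def restricted_share(records, level="severely_restricted"):
    """Return (archive-wide percentage, {corpus: percentage}) of items at `level`."""
    totals = Counter()      # items per corpus
    restricted = Counter()  # items at the given access level per corpus
    for corpus, access in records:
        totals[corpus] += 1
        if access == level:
            restricted[corpus] += 1
    per_corpus = {c: 100.0 * restricted[c] / totals[c] for c in totals}
    grand_total = sum(totals.values())
    overall = 100.0 * sum(restricted.values()) / grand_total if grand_total else 0.0
    return overall, per_corpus


overall, per_corpus = restricted_share(items)
print(f"{overall:.1f}% of the archive's total contents are severely restricted.")
for corpus, share in sorted(per_corpus.items()):
    print(f"  {corpus}: {share:.1f}%")
```

Only the percentages leave the archive; the identities of the restricted items and the communities they relate to stay out of the report.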
Such reporting is healthy for:
- Funders - to help them understand how various communities view language data. It also communicates that the archiving institution is being as transparent as possible about the data it does have - a mark of faithful stewardship.
- Archive administrators - to monitor basic trends across individual corpora, across their entire archive's submissions, and across the larger language archiving community.
- Language and linguistic specialists - to realize that these options exist and that, when they need to be exercised, they are used within industry "norms". To this end, linguists also need some example use cases.
- Communities - to realize that archives have not forgotten their connection with communities that are not listed in more public places.
Some restrictions are necessary. They help to build trust in archiving institutions and appropriate expectations for various stakeholders.
Note: The reasons for these restrictions should be documented so that when archive staff change, the rationale for the restrictions is not lost. Additionally, the archive staff and the depositors should be in contact at pre-determined intervals to establish whether this level of resource suppression is still necessary. The frequency of communication can vary (but 3-5 years is a long time in today's world).
Archived and Open Resources
Meta-data is publicly advertised, and the resource is openly available.
- Meta-data is open and discoverable.
- Items are open to public access, either through direct click and download or after an automated human-verification step (like a login or reCAPTCHA).