The OLAC validator runs on a piece of software which has the Heartbleed security vulnerability. When thinking about implementing a validator, the following software comes to mind: https://github.com/zimeon/oaipmh-validator There was also an online OAI-PMH validator from a former engineer on the Europeana project. I think he is based in Greece. His solution is not open source, but he mentioned that he would consider adding the OLAC profile. https://validator.oaipmh.com/
It would be good to see what other OAI-PMH validators look like and how submitters expect to interact with them.
I was looking at the maturity of Go for data science and at Go projects which enable interaction with OAI-PMH feeds. In my case working with XML is fairly important. I don't see in this XML example how to extract attributes and put those in the struct.
This week we had a lecture on metadata interoperability. Interoperability is a major theme of Gary Simons's work on OLAC. It was the keyword or concept that he used to push the social behavior requirements related to the activities around, in, and at language archives.
I think that across the history of OLAC there have been various understandings of the kinds of metadata needed to describe language resources. That is, discovery is the architectural goal of OLAC, but other requirements also exist. In the beginning of OLAC, many of the participants were looking to OLAC for a complete solution to the kinds of metadata they should be collecting and using. The other requirements placed upon resource stewards have always meant additional fields in diverse institutional contexts. The freedom to explore these other requirements has not always been embraced by stewards. Some have seen OLAC as an all-or-nothing involvement. Maybe the fear has been that there will be divergence from a communal norm.
However, my perspective is that it is quite normal for each institution to have its own metadata schema or application profile, some portion of which gets shared with OLAC.
With this as background, and with the assumption that different management practices will produce different metadata schemes, it seems reasonable that each institution should update its schema from time to time. This implies that metadata quality, in terms of coverage or "encoding", is a moving target. Another implication is that even fields which are shared with the OLAC aggregator and defined in the OLAC metadata application profile may have different internal syntax at different providers, or at different time depths of a record's creation.
The ISO 639-3 field is one piece of evidence of evolutionary change. This standard has codes which split and merge from time to time. Associating a record's time of creation with a version of an institution's metadata schema is a useful dynamic when evaluating a record's quality.
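As a minimal sketch of that association, assuming a provider publishes the release date of each version of its schema, the version in force when a record was created can be looked up by date. The version labels and dates below are hypothetical and are not taken from any real archive.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical release history of one provider's metadata schema,
# sorted by release date (assumption: illustrative values only).
SCHEMA_VERSIONS = [
    (date(2008, 1, 1), "1.0"),
    (date(2014, 6, 1), "1.1"),
    (date(2021, 3, 15), "2.0"),
]


def schema_version_for(created, versions=SCHEMA_VERSIONS):
    """Return the schema version in force on the day a record was created."""
    release_dates = [released for released, _ in versions]
    index = bisect_right(release_dates, created)
    if index == 0:
        return None  # the record predates the first documented schema
    return versions[index - 1][1]


print(schema_version_for(date(2015, 9, 30)))
```

A record's quality could then be evaluated against the rules of the returned version rather than against the current schema.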
The question is how should a record and the version of its applicable metadata profile be associated in the OLAC context? How should this information be communicated to record viewers?
The answer is rather straightforward, but requires two parts. The first part requires a modification to the archive profile to add two pieces of information:
* The name of the native application profile at the data provider
* A link to the native metadata application profile documentation
The documentation should be in a publicly accessible place so that the provided metadata makes sense. There are several ways this could be accomplished. One way is to create a manifestation record for each iteration of the application profile. These could be related into a collection or they could have a single relation.
which in the listSet
The OLAC OAI record should have in its source, in the first harvest, the name and version of the native metadata schema used for the generation of the record. The link to the documentation for the native version of the provider's metadata schema should be provided in the archive section of the OAI describer.
Some utilities in OAI can modify data; some are servers only, some are harvesters only, and some are both harvesters and servers.
Some OAI providers are
Using record sets:
OLAC could allow end-users to dynamically create sets of records for export using the setSpec part of OAI. Playing with this and audience interest might create some social interest.
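On the harvesting side, a set is selected with the `set` argument of an OAI-PMH `ListRecords` request. The following is a minimal sketch of building such a request URL; the base URL and the set name are hypothetical placeholders, and how OLAC would name user-created sets is an open question.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL, optionally limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # setSpec values are colon-separated hierarchies, e.g. "parent:child"
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository and user-created set
print(list_records_url("https://example.org/oai", set_spec="user:export1"))
```

The colon in the set name is percent-encoded in the query string, which is the usual encoding for OAI-PMH request arguments.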
A Django application to collect submitted DOIs, acquire their API-provided metadata (bibliographic metadata and citation graph metadata), allow limited (specified) annotation, and then make those records harvestable via OAI-PMH. Language Resource tagger: adding a layer of language-related metadata to published resources.
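To make the idea concrete, here is a rough sketch of the conversion step only, using just the standard library. It assumes the DOI metadata arrives as a dictionary shaped like a Crossref work (`title` as a list, `author` as a list of `given`/`family` pairs, `DOI` as a string), and it treats the annotation as a plain list of language codes. The function name `to_oai_dc` and the sample values are hypothetical, and a real OLAC record would need the OLAC schema rather than plain Dublin Core.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)


def to_oai_dc(work, language_codes=()):
    """Map a Crossref-style work dict plus language annotations to oai_dc XML."""
    root = ET.Element(f"{{{OAI_DC}}}dc")

    def add(name, text):
        ET.SubElement(root, f"{{{DC}}}{name}").text = text

    for title in work.get("title", []):
        add("title", title)
    for author in work.get("author", []):
        parts = (author.get("family"), author.get("given"))
        add("creator", ", ".join(p for p in parts if p))
    if work.get("DOI"):
        add("identifier", f"https://doi.org/{work['DOI']}")
    for code in language_codes:  # the annotation layer added by the submitter
        add("language", code)
    return ET.tostring(root, encoding="unicode")


# Hypothetical submission; the DOI is a placeholder, not a real work
sample = {
    "title": ["A Grammar"],
    "author": [{"given": "A.", "family": "Author"}],
    "DOI": "10.0000/example",
}
print(to_oai_dc(sample, language_codes=["eng"]))
```

Keeping the conversion separate from the fetching and the Django models would make it easy to test against the stored data examples.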
Some Django modules for OAI-PMH
https://github.com/saw-leipzig/foaipmh
https://github.com/jnphilipp/django_oai_pmh
https://pypi.org/user/jnphilipp/ his topic extraction module looks interesting.
Also look at the xsd schema here https://github.com/saw-leipzig/foaipmh/blob/5b15d5cc4700a3cccf497c47218c2fba6b3421d5/entrypoint.prod.sh#L5
Database Versioning
This depends on how the DB is set up. If we only have one record per item or one record per state... This needs more definition.
https://djangopackages.org/grids/g/versioning/
https://www.wpbeginner.com/beginners-guide/complete-guide-to-wordpress-post-revisions/
Form Builders
https://djangopackages.org/grids/g/form-builder/
Some Javascript tools for creating the specific forms needed:
https://github.com/HughP/dublin-core-generator
https://nsteffel.github.io/dublin_core_generator/generator.html
Markdown for documentation
https://neutronx.github.io/django-markdownx/
Bibtex
https://bibtexparser.readthedocs.io/en/master/
https://github.com/sciunto-org/python-bibtexparser
https://github.com/jnphilipp/bibliothek
https://github.com/lucastheis/django-publications <-- also check the network as "improvements" are all over the place.
Other names include:
* Babybib
* Pybtex
* Pybibliographer
ISSNs
ISSN.org is supposed to have an API, but I am not sure if they do.
https://portal.issn.org/resource/ISSN/1904-0008
Any request to the portal may be automated thanks to the use of REST protocol. The download of results is also automated. This service is restricted to subscribing users. Please contact sales [at] issn.org for more information.
https://portal.issn.org/node/170
https://portal.issn.org/resource/ISSN/2549-5089#
https://portal.issn.org/resource/ISSN/2549-5089?format=json
We could also slurp the HTML for the sameAs links to other DBs if needed.
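Before querying the portal it may be worth checking a submitted ISSN locally. The sketch below validates the standard ISSN check digit and then builds a URL following the `?format=json` pattern seen above. Whether that endpoint is stable or open to automated use is an assumption, given the subscription note quoted earlier.

```python
def is_valid_issn(issn):
    """Check the ISSN check digit (weights 8..2, modulo 11, 'X' stands for 10)."""
    digits = issn.replace("-", "").upper()
    if len(digits) != 8 or not digits[:7].isdecimal():
        return False
    total = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[7] == expected


def issn_json_url(issn):
    """Build a portal URL in the '?format=json' pattern for a valid ISSN."""
    if not is_valid_issn(issn):
        raise ValueError(f"not a valid ISSN: {issn!r}")
    digits = issn.replace("-", "").upper()
    normalized = f"{digits[:4]}-{digits[4:]}"
    return f"https://portal.issn.org/resource/ISSN/{normalized}?format=json"


print(issn_json_url("2549-5089"))
```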
Views:
1. login with ORCID
2. query APIs (DOIs, ISBNs, ISSNs, ORCID, WikiData, etc.)
3. results display and annotation
4. submission
5. List of past submissions
6. update past submission screen (same as #3?)
If we ran a module like this:
https://pybliometrics.readthedocs.io/en/latest/classes/SerialTitle.html
Then we could take a reading on where the least spoken languages appear in the most highly ranked journals and determine if there was a bias or a loss to science.
Data Examples:
Have been moved to:
https://github.com/HughP/CrossRef-to-OLAC-data-examples
PDF Extraction:
https://levelup.gitconnected.com/scrap-data-from-website-and-pdf-document-for-django-app-fa8f37010085
https://towardsdatascience.com/how-to-extract-pdf-data-in-python-876e3d0c288
https://stackoverflow.com/questions/71850349/download-a-pdf-from-url-edit-it-an-render-it-in-django
https://stackoverflow.com/questions/48882768/django-reading-pdf-files-content
https://www.geeksforgeeks.org/working-with-pdf-files-in-python/
PDF Creation:
https://docs.djangoproject.com/en/4.1/howto/outputting-pdf/
https://jeltef.github.io/PyLaTeX/current/examples/header.html
So, in a recent OLAC presentation I talked about enabling Omeka or Drupal via recipes for OAI harvesting. Here are some links to internet chatter on these issues.
Umm, frankly, I am not sure anything out there right now is going to work to bring OAI-PMH services to WordPress. If it does, then is it going to be able to use WordPress to advertise things or is it going to use WordPress to aggregate things? If the former, then nothing out there ever let the admin user choose which fields were matched to which attributes, dynamically. But if it is also the former, then why would anyone actually want this functionality? What is the use case? If one is using WordPress as a bibliography reference system like some libraries do, then this makes a lot of sense. However, there is another use case I would like to present. That is, the website which is about a single language or several languages. There are potentially two ways to conceptualize this:
Consider these three resources for more info on OAI:
unAPI Server for WordPress [2]
WordPress, now with added unAPI! [3]
I think there is a second question here too: why does one need OAI-PMH for WordPress? Is it as a provider or as a consumer? If one needs a PHP app for OAI-PMH, maybe they can use: https://github.com/caseyamcl/phpoaipmh
[1] Peter Binkley. 9 December 2005. COinS-PMH (unAPI) WordPress Plugin. http://www.wallandbinkley.com/quaedam/2005/12_09_coins-pmh-unapi-wordpress-plugin.html [Accessed: 5 March 2012]
[2] Mike Giarlo. 19 May 2006. unAPI Server for WordPress. Technosophia. http://lackoftalent.org/michael/blog/unapi-wordpress-plug-in/ [Accessed: 5 March 2012]
[3] Peter Binkley. 18 February 2006. WordPress, now with added unAPI!. http://www.wallandbinkley.com/quaedam/2006/02_18_wordpress-now-with-added-unapi.html [Accessed: 5 March 2012]