Django modules and links

A Django application that collects submitted DOIs, acquires their API-provided metadata (bibliographic metadata and citation-graph metadata), allows limited (specified) annotation, and then makes those records harvestable via OAI-PMH. Language Resource tagger: adding a layer of language-related metadata to published resources.
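As a rough illustration of the "acquire their API provided metadata" step, here is a minimal sketch against the public Crossref REST API (`https://api.crossref.org/works/{doi}`). The function names and the choice of fields to keep are assumptions made for this note, not a settled design; only the standard library is used.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the Crossref REST API URL for a single DOI."""
    return CROSSREF_WORKS + urllib.parse.quote(doi.strip(), safe="/")


def author_name(author: dict) -> str:
    """People have given/family parts; organisations usually have 'name'."""
    if author.get("name"):
        return author["name"]
    return " ".join(p for p in (author.get("given"), author.get("family")) if p)


def summarize_work(payload: dict) -> dict:
    """Reduce a Crossref work response to the fields we might store.

    The selection of fields is hypothetical: bibliographic basics plus
    two citation-graph hints (cited-by count and outgoing reference DOIs).
    """
    msg = payload.get("message", {})
    return {
        "doi": msg.get("DOI"),
        "type": msg.get("type"),
        "title": (msg.get("title") or [None])[0],
        "authors": [author_name(a) for a in msg.get("author", [])],
        "cited_by": msg.get("is-referenced-by-count"),
        "references": [r["DOI"] for r in msg.get("reference", []) if "DOI" in r],
    }


def fetch_work(doi: str, timeout: float = 10.0) -> dict:
    """Fetch and summarize one DOI (this performs a network request)."""
    with urllib.request.urlopen(crossref_url(doi), timeout=timeout) as resp:
        return summarize_work(json.load(resp))
```

Keeping `summarize_work` separate from `fetch_work` means the parsing can be tested against saved responses without touching the network.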

Some Django modules for OAI-PMH

User Authentication


Introducing Crossref, the basics

Database Versioning
This depends on how the DB is set up: whether we keep only one record per item, or one record per state of each item. This needs more definition.
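To make the "one record per state" option concrete, here is a small sketch using an in-memory SQLite table where each save appends a new numbered state instead of overwriting the old one. The table and column names are hypothetical, and a real Django setup would express this as models rather than raw SQL.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE record_state (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        doi      TEXT    NOT NULL,
        version  INTEGER NOT NULL,
        metadata TEXT    NOT NULL,              -- JSON snapshot of the record
        created  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
        UNIQUE (doi, version)
    )
    """
)


def save_state(conn, doi, metadata):
    """Append a new state for this DOI; earlier states are never modified."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM record_state WHERE doi = ?",
        (doi,),
    ).fetchone()
    conn.execute(
        "INSERT INTO record_state (doi, version, metadata) VALUES (?, ?, ?)",
        (doi, latest + 1, json.dumps(metadata, sort_keys=True)),
    )
    conn.commit()
    return latest + 1


def current_state(conn, doi):
    """Return (version, metadata) for the newest state, or None if unknown."""
    row = conn.execute(
        "SELECT version, metadata FROM record_state "
        "WHERE doi = ? ORDER BY version DESC LIMIT 1",
        (doi,),
    ).fetchone()
    return None if row is None else (row[0], json.loads(row[1]))
```

The "one record per item" alternative would be the same table with `doi` as the primary key and an `UPDATE` in place of the `INSERT`, at the cost of losing the history.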

Form Builders

Some JavaScript tools for creating the specific forms needed:

Markdown for documentation

BibTeX <-- also check the network, as "improvements" are all over the place.
Other names include:
* Babybib
* Pybtex
* Pybibliographer



API Tutorial: Searching the ORCID registry

Crossref API doc
Crossref types:

Others, mostly citations and references
InternetArchive Scholar
Thor project
Semantic Scholar API --> see:


For generating and ingesting MARC records


Overview see:

The ISSN portal is supposed to have an API, but I am not sure that it does. Their site says:
Any request to the portal may be automated thanks to the use of REST protocol. The download of results is also automated. This service is restricted to subscribing users. Please contact sales [at] for more information.
We could also slurp the HTML for the sameAs links to other DBs if needed.
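A rough sketch of the HTML-scraping fallback follows. It assumes, without having checked the portal's actual markup, that the sameAs links appear as elements whose `rel`, `property`, or `itemprop` attribute mentions "sameAs" and whose target is in `href` or `resource`. It uses the standard library's `html.parser` so it runs without extra installs; Beautiful Soup would do the same job with less code.

```python
from html.parser import HTMLParser


class SameAsCollector(HTMLParser):
    """Collect link targets from elements marked as sameAs (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # Any of these attributes may carry the sameAs marker.
        marker = " ".join(
            filter(None, (attr.get("rel"), attr.get("property"), attr.get("itemprop")))
        )
        target = attr.get("href") or attr.get("resource")
        if target and "sameas" in marker.lower():
            self.links.append(target)


def same_as_links(html: str) -> list:
    """Return every sameAs target found in an HTML string, in document order."""
    parser = SameAsCollector()
    parser.feed(html)
    return parser.links
```

If the portal turns out to embed its links as JSON-LD rather than as attributes, this approach would need to be replaced by parsing the embedded JSON instead.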



Beautiful Soup

There is the issue of how to add, to a Dublin Core OAI record, information about how it was changed over time. I need to architect this out.
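One possible direction, sketched here only to have something to react to: keep the Dublin Core record as the current state and store a separate field-level change log beside it. The entry layout (`field`, `before`, `after`, `agent`, `when`) is a hypothetical structure, not an established provenance vocabulary.

```python
from datetime import datetime, timezone


def diff_record(old: dict, new: dict, agent: str, when: str = None) -> list:
    """List the field-level changes between two versions of a record.

    `agent` is whoever made the change (for example an ORCID iD);
    `when` defaults to the current UTC time in ISO 8601 form.
    """
    when = when or datetime.now(timezone.utc).isoformat()
    changes = []
    for field in sorted(set(old) | set(new)):
        before, after = old.get(field), new.get(field)
        if before != after:
            changes.append(
                {
                    "field": field,
                    "before": before,  # None means the field was added
                    "after": after,    # None means the field was removed
                    "agent": agent,
                    "when": when,
                }
            )
    return changes
```

How such a log would be exposed through OAI-PMH (a second metadata format, or extra elements in the record) is still the open design question.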

Record Provenance:

Exposing DOI metadata provenance

1. Log in with ORCID
2. Query APIs (DOIs, ISBNs, ISSNs, ORCID, Wikidata, etc.)
3. Results display and annotation
4. Submission
5. List of past submissions
6. Update past submission screen (same as #3?)
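For step 2 in the workflow above, the app first has to decide which API a submitted identifier belongs to. The sketch below guesses the identifier type from its shape alone. The patterns are simplified assumptions (no checksum validation, a loose DOI pattern), and the `classify` name is hypothetical.

```python
import re

# Ordered so that the longer ORCID pattern is tried before the ISSN pattern.
PATTERNS = [
    ("orcid", re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")),
    ("issn", re.compile(r"^\d{4}-\d{3}[\dX]$")),
    ("doi", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("wikidata", re.compile(r"^Q\d+$")),
]

# Common resolver prefixes that users may paste along with the identifier.
URL_PREFIXES = ("https://doi.org/", "https://orcid.org/")


def classify(identifier: str) -> tuple:
    """Return (kind, normalized_value) for a submitted identifier."""
    value = identifier.strip()
    for prefix in URL_PREFIXES:
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
    for kind, pattern in PATTERNS:
        if pattern.match(value):
            return kind, value
    # ISBNs are written with varying hyphenation, so compare digits only.
    digits = value.replace("-", "").replace(" ", "")
    if re.fullmatch(r"\d{9}[\dX]|97[89]\d{10}", digits):
        return "isbn", digits
    return "unknown", value
```

The returned kind can then be used to pick the API client to call; anything classified as `"unknown"` would go back to the user for correction.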

If we ran a module like this:

Then we could take a reading on where the least-spoken languages appear in the most highly ranked journals and determine whether there is a bias or a loss to science.

Data Examples:

Have been moved to:

PDF Extraction:

PDF Creation:


NLP Application: Named Entity Recognition (NER) in Python with Spacy