Mastodon communities
https://hcommons.org/docs/mastodon-quick-start-guide-for-humanities-scholars/
https://scholar.social/about
https://vis.social/about
https://mapstodon.space/about
Django modules and links
Django application to collect submitted DOIs, acquire their API-provided metadata (bibliographic metadata and citation graph metadata), allow limited (specified) annotation, and then make those records harvestable via OAI-PMH. Language Resource tagger: adding a layer of language-related metadata to published resources.
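The last step in that description, making a record harvestable via OAI-PMH, mostly comes down to serializing it as unqualified Dublin Core in the oai_dc container. Here is a minimal sketch using only the standard library. The function name to_oai_dc and the sample record are assumptions for illustration, and the "language" value stands in for whatever annotation the application would allow. A real deployment would more likely lean on one of the Django OAI-PMH modules listed below.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OAI-PMH and Dublin Core specifications.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_oai_dc(fields):
    """Serialize {dc element name: [values]} as an oai_dc:dc XML string."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hand-written sample record (hypothetical annotation in "language").
record = {
    "title": ["A Tone Orthography Typology"],
    "creator": ["Roberts, David"],
    "identifier": ["doi:10.1075/wll.14.1.05rob"],
    "language": ["en"],
}
xml_text = to_oai_dc(record)
print(xml_text)
```

This only builds the metadata payload; the OAI-PMH envelope (header, datestamp, setSpec) is left to the serving module.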
Some Django modules for OAI-PMH
https://github.com/saw-leipzig/foaipmh
https://github.com/jnphilipp/django_oai_pmh
https://pypi.org/user/jnphilipp/ his topic extraction module looks interesting.
Also look at the xsd schema here https://github.com/saw-leipzig/foaipmh/blob/5b15d5cc4700a3cccf497c47218c2fba6b3421d5/entrypoint.prod.sh#L5
Metadata utility for OAI-PMH
https://combine.readthedocs.io/en/master/configuration.html
User Authentication
https://github.com/ubffm/django-orcid
https://django.fun/en/docs/social-docs/0.1/backends/orcid/
Crossref
https://github.com/fabiobatalha/crossrefapi
Database Versioning
This depends on how the DB is set up. If we only have one record per item or one record per state... This needs more definition.
https://djangopackages.org/grids/g/versioning/
https://www.wpbeginner.com/beginners-guide/complete-guide-to-wordpress-post-revisions/
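To make the "one record per item or one record per state" question concrete, here is a rough sketch of the per-state option in plain Python (no Django, so it runs on its own). The class names Revision and VersionedRecord are made up for this note, and the sample data is hand-written.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    """One stored state of a record (the 'one record per state' option)."""
    number: int
    saved_at: datetime
    data: dict


@dataclass
class VersionedRecord:
    """An item plus its append-only list of states."""
    identifier: str
    revisions: list = field(default_factory=list)

    def save(self, data):
        """Append a new state; earlier states are never overwritten."""
        revision = Revision(
            number=len(self.revisions) + 1,
            saved_at=datetime.now(timezone.utc),
            data=dict(data),  # copy so later edits do not alter history
        )
        self.revisions.append(revision)
        return revision

    @property
    def current(self):
        """The latest state, which is all a one-record-per-item design keeps."""
        return self.revisions[-1] if self.revisions else None


record = VersionedRecord("doi:10.1075/wll.14.1.05rob")
record.save({"title": "A Tone Orthography Typology"})
record.save({"title": "A Tone Orthography Typology", "language": "en"})
print(record.current.number, len(record.revisions))  # prints: 2 2
```

The trade-off is storage and query complexity versus being able to answer "what did this record look like when it was first harvested?" The versioning packages in the grid above handle the same idea at the model level.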
Form Builders
https://djangopackages.org/grids/g/form-builder/
Some JavaScript tools for creating the specific forms needed:
https://github.com/HughP/dublin-core-generator
https://nsteffel.github.io/dublin_core_generator/generator.html
Markdown for documentation
https://neutronx.github.io/django-markdownx/
Bibtex
https://bibtexparser.readthedocs.io/en/master/
https://github.com/sciunto-org/python-bibtexparser
https://github.com/jnphilipp/bibliothek
https://github.com/lucastheis/django-publications <-- also check the network as "improvements" are all over the place.
Other names include:
* Babybib
* Pybtex
* Pybibliographer
APIs
ORCID
https://github.com/ORCID/python-orcid
Crossref API doc
https://github.com/CrossRef/rest-api-doc/blob/master/demos/crossref-api-demo.ipynb
Crossref types: https://www.crossref.org/documentation/register-maintain-records/
https://api.crossref.org/swagger-ui/index.html#/Types/get_types__id__works
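A minimal sketch of pulling a few fields out of a Crossref works response with only the standard library. The helper names (works_url, summarize, fetch) are assumptions for this note. The sample payload is hand-written in the shape of a works response and is not real API output. The fetch function makes a network call and is not run here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

CROSSREF_WORKS = "https://api.crossref.org/works/"


def works_url(doi):
    """Build the Crossref REST API URL for a single DOI."""
    return CROSSREF_WORKS + quote(doi, safe="/")


def summarize(payload):
    """Pick a few bibliographic fields out of a Crossref works response."""
    message = payload["message"]
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in message.get("author", [])
    ]
    return {
        "doi": message.get("DOI"),
        "type": message.get("type"),
        # Crossref returns titles as a list; take the first if present.
        "title": (message.get("title") or [None])[0],
        "authors": authors,
    }


def fetch(doi):
    """Network call; not exercised in the sample below."""
    with urlopen(works_url(doi), timeout=30) as response:
        return summarize(json.load(response))


# Hand-written sample in the shape of a works response, not real API output.
sample = {
    "status": "ok",
    "message": {
        "DOI": "10.1075/wll.14.1.05rob",
        "type": "journal-article",
        "title": ["A Tone Orthography Typology"],
        "author": [{"given": "David", "family": "Roberts"}],
    },
}
print(summarize(sample))
```

The "type" field is what would need mapping against the Crossref types list linked above.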
Others — Mostly citation and references
http://www.scholix.org/
https://scholexplorer.openaire.eu/#/query/page=5/q=language
https://crossref.gitlab.io/knowledge_base/products/event-data/
FatCat https://fatcat.wiki/
InternetArchive Scholar https://scholar.archive.org/
Thor project https://project-thor.readme.io/docs/introduction-for-integrators
Crosscite.org
Semantic Scholar API https://api.semanticscholar.org/api-docs/graph
https://core.ac.uk/
https://opencitations.net/
https://unpaywall.org/ --> see: http://musingsaboutlibrarianship.blogspot.com/2017/11/using-oadoi-crossref-event-data-api-to.html
https://openalex.org/
https://arxiv.org/help/api/index
https://www.aminer.org/citation
https://www.aminer.org/download
https://open.aminer.cn/
https://analytics.hathitrust.org/datasets#top
https://pro.dp.la/developers/api-codex
https://pro.europeana.eu/page/apis
LCSH
https://github.com/edsu/id
MARC
For generating and ingesting MARC records
https://pymarc.readthedocs.io/en/latest/
Zotero
https://github.com/urschrei/pyzotero
Overview see: https://researchguides.smu.edu.sg/api-list/scholarly-metadata-api
ISSNs
ISSN.org is supposed to have an API, but I am not sure that it does.
https://portal.issn.org/resource/ISSN/1904-0008
Any request to the portal may be automated thanks to the use of REST protocol. The download of results is also automated. This service is restricted to subscribing users. Please contact sales [at] issn.org for more information.
https://portal.issn.org/node/170
https://portal.issn.org/resource/ISSN/2549-5089#
https://portal.issn.org/resource/ISSN/2549-5089?format=json
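Before requesting the JSON form of a portal record, it would be cheap to check the ISSN's check digit locally. A small sketch follows; the check-digit rule (weights 8 down to 2, modulus 11, X for 10) is the standard ISSN one, while the function names are assumptions for this note. The URL pattern is simply the one shown above.

```python
import re

ISSN_PATTERN = re.compile(r"^(\d{4})-?(\d{3})([\dXx])$")


def is_valid_issn(issn):
    """Check the ISSN check digit (weights 8..2, modulus 11)."""
    match = ISSN_PATTERN.match(issn.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return match.group(3).upper() == expected


def portal_json_url(issn):
    """Build the portal URL that asks for the record as JSON."""
    if not is_valid_issn(issn):
        raise ValueError(f"not a valid ISSN: {issn!r}")
    compact = issn.strip().upper().replace("-", "")
    return (
        "https://portal.issn.org/resource/ISSN/"
        f"{compact[:4]}-{compact[4:]}?format=json"
    )


print(portal_json_url("2549-5089"))
```

Whether unauthenticated requests to that URL are permitted at volume is a separate question, given the subscription note above.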
We could also slurp the HTML for the sameAs links to other DBs if needed.
JATS
https://pypi.org/project/jatsgenerator/
https://stackoverflow.com/questions/42084165/extracting-text-from-jats-xml-file-using-python
https://github.com/sibils/jats-parser
Pandas
https://pypi.org/project/django-pandas/
Beautiful Soup
There is the issue of how to record, in a Dublin Core OAI record, how that record was changed over time. I need to architect this out.
Record Provenance:
[]Explore
https://www.w3.org/TR/prov-dc/
https://www.w3.org/2011/prov/track/issues/607?changelog
http://www.ukoln.ac.uk/metadata/dcmi/collection-provenance/
https://edoc.hu-berlin.de/bitstream/handle/18452/2727/332.pdf?sequence=1&isAllowed=y
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4177195/
https://www.loc.gov/standards/mods/userguide/recordinfo.html
https://tsl.access.preservica.com/tslac-digital-preservation-framework/qualified-dublin-core-schema/
https://dl.acm.org/doi/10.5555/2770897.2770924
https://blog.datacite.org/exposing-doi-metadata-provenance/
https://dgarijo.com/papers/dc2011.pdf
https://ceur-ws.org/Vol-670/paper_3.pdf
https://ecommons.cornell.edu/bitstream/handle/1813/55327/Encoding%20Provenance%20for%20Social%20Science%20Data-final.pdf?sequence=3&isAllowed=y
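One possible starting point for the "how was it changed over time" problem is to compute a diff between the harvested record and the annotated one and store that with an agent and a timestamp. This is only a sketch under assumptions: the entry layout is hypothetical and not taken from any of the provenance specifications above, the added subject value is made up, and the agent is the example ORCID iD used in ORCID's own documentation.

```python
from datetime import datetime, timezone


def diff_records(old, new):
    """List the Dublin Core values removed from or added to a record.

    Both arguments map an element name (e.g. "subject") to a list of values.
    """
    changes = []
    for element in sorted(set(old) | set(new)):
        before = old.get(element, [])
        after = new.get(element, [])
        for value in before:
            if value not in after:
                changes.append(("removed", element, value))
        for value in after:
            if value not in before:
                changes.append(("added", element, value))
    return changes


def provenance_entry(old, new, agent):
    """Bundle a diff with who made it and when (hypothetical entry layout)."""
    return {
        "agent": agent,
        "changed_at": datetime.now(timezone.utc).isoformat(),
        "changes": diff_records(old, new),
    }


harvested = {"title": ["A Tone Orthography Typology"], "language": ["en"]}
annotated = {
    "title": ["A Tone Orthography Typology"],
    "language": ["en"],
    "subject": ["Orthography"],  # hypothetical annotation
}
entry = provenance_entry(
    harvested, annotated, agent="https://orcid.org/0000-0002-1825-0097"
)
print(entry["changes"])  # prints: [('added', 'subject', 'Orthography')]
```

How such an entry would then be expressed inside or alongside the OAI record is exactly the part that still needs designing.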
Views:
1. login with ORCID
2. query APIs (DOIs, ISBNs, ISSNs, ORCID, WikiData, etc.)
3. results display and annotation
4. submission
5. List of past submissions
6. update past submission screen (same as #3?)
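For view 2, the query screen needs to guess what kind of identifier was typed before it can pick an API. A rough sketch follows, assuming shape-only matching with regular expressions (no checksum validation). The function name, the prefix list, and the patterns are my own simplifications rather than anything official.

```python
import re

# Shape-only patterns; none of these verify check digits.
PATTERNS = {
    "doi": re.compile(r"10\.\d{4,9}/\S+"),
    "orcid": re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]"),
    "issn": re.compile(r"\d{4}-\d{3}[\dX]"),
    "wikidata": re.compile(r"Q\d+"),
}
ISBN = re.compile(r"(?:97[89])?\d{9}[\dX]")
PREFIXES = ("https://doi.org/", "doi:", "https://orcid.org/")


def classify_identifier(raw):
    """Return (kind, normalized value), or None if nothing matches."""
    text = raw.strip()
    for prefix in PREFIXES:
        if text.lower().startswith(prefix):
            text = text[len(prefix):]
            break
    for kind, pattern in PATTERNS.items():
        if pattern.fullmatch(text):
            return kind, text
    # ISBNs are often typed with hyphens or spaces, so compare a compact form.
    compact = text.replace("-", "").replace(" ", "")
    if ISBN.fullmatch(compact):
        return "isbn", compact
    return None


for value in ("https://doi.org/10.1075/wll.14.1.05rob", "1904-0008", "Q42"):
    print(value, "->", classify_identifier(value))
```

Anything returning None would fall through to a free-text search or an error message on the form.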


If we ran a module like this:
https://pybliometrics.readthedocs.io/en/latest/classes/SerialTitle.html
Then we could take a reading on where the least spoken languages appear in the most highly ranked journals and determine if there was a bias or a loss to science.
Data Examples:
Have been moved to:
https://github.com/HughP/CrossRef-to-OLAC-data-examples
PDF Extraction:
https://levelup.gitconnected.com/scrap-data-from-website-and-pdf-document-for-django-app-fa8f37010085
https://towardsdatascience.com/how-to-extract-pdf-data-in-python-876e3d0c288
https://stackoverflow.com/questions/71850349/download-a-pdf-from-url-edit-it-an-render-it-in-django
https://stackoverflow.com/questions/48882768/django-reading-pdf-files-content
https://www.geeksforgeeks.org/working-with-pdf-files-in-python/
PDF Creation:
https://docs.djangoproject.com/en/4.1/howto/outputting-pdf/
https://jeltef.github.io/PyLaTeX/current/examples/header.html
NER:
https://johnfraney.github.io/django-ner-trainer/settings/
Other:
https://prodi.gy/
https://realpython.com/testing-in-django-part-1-best-practices-and-examples/
Here is a Django app for controlling URIs for linked data vocabularies.
https://github.com/unt-libraries/django-controlled-vocabularies
as seen here https://digital2.library.unt.edu/vocabularies/agent-qualifiers/
And here is one for source authority records.
https://github.com/unt-libraries/django-name
as seen here: https://digital2.library.unt.edu/name/nm0000001/
Link Checker
https://github.com/Kaltsoon/dead-link-checker
https://pypi.org/project/django-linkcheck/
https://github.com/bartdag/pylinkvalidator
https://stackoverflow.com/questions/43264291/in-django-how-can-i-unit-test-all-links-recursively-every-view-check-for-200-o
Abstract and Table of Contents
If an abstract is a sample of about-ness, then a table of contents is a sample of is-ness. Some have said that journal articles should not have a table of contents (instructional staff at the UNT program teaching the Metadata I course). I disagree, and so do Habing et al. (2001). Sometimes a table of contents, more than an abstract, can deliver a substantial understanding of what an article is and is about by displaying its structure. In fact, many law review articles include a table of contents before the main part of the article. Law review articles can be over 70 pages in length, so an outline offers useful information to the potential reader.
An example of an outline from a linguistics article.
Roberts, David. 2011. “A Tone Orthography Typology.” Written Language & Literacy 14 (1): 82–108. doi:10.1075/wll.14.1.05rob.
- Introduction
- The six parameters
2.1 First parameter: Domain
2.2 Second parameter: Target
2.2.1 Tones
2.2.2 Grammar
2.2.3 Lexicon
2.2.4 Dual strategies
2.3 Third parameter: Symbol
2.3.1 Phonographic representations
2.3.2 Semiographic representations
2.4 Fourth parameter: Position
2.5 Fifth parameter: Density
2.5.1 Introduction
2.5.2 Zero density
2.5.3 Partial density
2.5.4 Exhaustive density
2.6 Sixth parameter: Depth
2.6.1 Introduction
2.6.2 Surface representation
2.6.3 Deep representation
2.6.4 Shallow (transparent) representation
- Conclusion
Abbreviations
Notes
Bibliographical references
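If an outline like this were to travel with the record, one option is the DCMI term dcterms:tableOfContents, which takes the whole outline as a single value. Below is a small sketch that turns numbered headings into an indented string suitable for such a value. The function names and the choice of two-space indentation are assumptions, and depth is inferred purely from the section number.

```python
import re

NUMBERED = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(.*)$")


def parse_outline(lines):
    """Turn headings into (depth, number, title) tuples.

    Depth is the count of dot-separated parts in the section number;
    unnumbered headings (e.g. "Notes") get depth 1 and number None.
    """
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = NUMBERED.match(line)
        if match:
            number, title = match.groups()
            entries.append((number.count(".") + 1, number, title))
        else:
            entries.append((1, None, line))
    return entries


def as_table_of_contents(entries):
    """Render the entries as one indented string for a single metadata value."""
    return "\n".join(
        "  " * (depth - 1) + (f"{number} {title}" if number else title)
        for depth, number, title in entries
    )


headings = [
    "2.5 Fifth parameter: Density",
    "2.5.1 Introduction",
    "2.5.2 Zero density",
    "Notes",
]
toc = as_table_of_contents(parse_outline(headings))
print(toc)
```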
References
Thomas G. Habing, Timothy W. Cole, and William H. Mischo. 2001. Qualified Dublin Core using RDF for Sci-Tech Journal Articles. https://dli.grainger.uiuc.edu/Publications/metadatacasestudy/HabingDC2001.pdf
Creative Commons Non-Commercial
Here is an interesting example of what people thought the "Non-Commercial" clause meant in 2007...
http://web.archive.org/web/20070705213116/http://www.audioscrobbler.net/data/microformats/
UNT courses to look at
Some of these UNT pages have groups of courses I am interested in.
https://informationscience.unt.edu/digital-curation-and-data-management
Opinions on OCLC and metadata ownership
https://librarytechnology.org/document/7266
https://librarytechnology.org/document/7266/ownership-of-machine-readable-records-a-neglected-consideration-in-retrospective-conversion
https://www.oclc.org/en/worldcat/cooperative-quality/policy.html
https://repository.law.uic.edu/cgi/viewcontent.cgi?article=1557&context=jitpl
https://dltj.org/article/oclc-records-use-policy-1/
https://wiki.harvard.edu/confluence/display/LibraryStaffDoc/OCLC+Institution+records+discontinuation
Matching algorithms
https://www.oclc.org/en/news/announcements/2022/worldcat-quality-enhancements.html
https://www.ohiolink.edu/content/matching_bibliographic_records_central_site
Subjects for images
Somebody told me once that pictures don't have subjects because of the is-ness/about-ness separation.
I disagree. Here are some things from the literature.
https://drum.lib.umd.edu/bitstream/handle/1903/15063/Describing_Visual_Materials_in_the_Digital_Age_Hamburger.pdf
http://duspeccoll.github.io/local_authority
https://journals.ala.org/index.php/lrts/article/viewFile/7564/10462
https://listserv.loc.gov/cgi-bin/wa?A2=ind0501&L=MARC&P=4254
https://inevermetadataididntlike.wordpress.com/category/library-of-congress-genreform-terms/
http://netanelganin.com/projects/lcgft/lcgftType.html
https://cornerstone.lib.mnsu.edu/cgi/viewcontent.cgi?article=1000&context=olac-publications
https://www.isko.org/cyclo/subject
MODS and element order
Is element order a thing in XML? That is, is the order of appearance of sibling elements within an XML document significant?
https://stackoverflow.com/questions/28268696/is-the-order-of-two-siblings-implementation-dependent
https://xmltutorial.info/xml/node-relationships/
Here is the response from the XSD author:
I haven’t read the entire thread, but I take it the question is whether elements in a mods record need to be in a particular order (i.e. in the order that they are listed in the schema). They don’t.
In the MODS schema, look for:
***********************************************************************
** Definition of a single MODS record **
***********************************************************************
And following that:
<xs:element name="mods" type="modsDefinition"/>
<!-- -->
<xs:complexType name="modsDefinition">
<xs:group ref="modsGroup" maxOccurs="unbounded"/>
……….
This says: a MODS record consists of one or more elements from the “modsGroup” (at least one, because that is the default if there is no minOccurs, and as many as you want because maxOccurs="unbounded") enclosed within a <mods> element. Next, look for:
***********************************************************************
** These are the "top level" MODS elements **
***********************************************************************
-->
Prior to that:
<xs:group name="modsGroup">
<xs:choice>
….
and following it is the list of elements:
<xs:element ref="abstract"/>
<xs:element ref="accessCondition"/>
<xs:element ref="classification"/>
<xs:element ref="extension"/>
<xs:element ref="genre"/>
<xs:element ref="identifier"/>
<xs:element ref="language"/>
<xs:element ref="location"/>
…………….
and so on.
“Choice” says “choose any one of these elements.”
So all together, it says choose an element from the list. Any element. And then repeat as desired.
So you could choose “genre”, and then choose “classification”, and so on. Chosen in no particular order.
And then enclose your list within a <mods> record, in the order in which you chose the elements.
Ray
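A practical consequence for our own code: if the schema accepts top-level elements in any order, then comparing two MODS records should not depend on order either. The sketch below does not validate against the XSD (the standard library cannot do that). It only shows that a parser preserves document order while an order-insensitive comparison treats the two records as equal. The function names and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

MODS_NS = "http://www.loc.gov/mods/v3"


def top_level_signature(xml_text):
    """Count top-level children by (tag, serialized content), ignoring order."""
    root = ET.fromstring(xml_text)
    return Counter(
        (child.tag, ET.tostring(child, encoding="unicode").strip())
        for child in root
    )


def same_record_any_order(a, b):
    """True if both records hold the same top-level elements in any order."""
    return top_level_signature(a) == top_level_signature(b)


# Hand-written sample records that differ only in sibling order.
first = (
    f'<mods xmlns="{MODS_NS}">'
    "<genre>article</genre><classification>P</classification>"
    "</mods>"
)
second = (
    f'<mods xmlns="{MODS_NS}">'
    "<classification>P</classification><genre>article</genre>"
    "</mods>"
)
print(same_record_any_order(first, second))  # prints: True
```

Note that this treats order as irrelevant everywhere at the top level; repeated elements where order carries meaning (for example, several name elements) might need a stricter comparison.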
Schema.org templates
Some links to schema.org templates and documentation.
Sometimes these are more useful than the official site.