Chepang Bibliography

While working on my thesis I was also compiling a Chepang bibliography.

Some leftover links to include are: Unknown resource.
Chepang text: Descriptive, Personal Jungle encounters Introduction & set 1

Gurung, Tamang, Thakali, Sherpa, and Chepang prosodies (82 items)

Clause, sentence, and discourse patterns in selected languages of Nepal 4: Word lists

Bibliography notes:

ELAR roles and OLAC

While working on my Master's thesis, I took a look at the contributor roles declared for various works. One thing I noticed is that, even though Stuart McGill contributed two corpora to ELAR, the translation of those records to OLAC garbles the metadata so that only one resource shows up under his name.

I asked the archive director about this, and my understanding/recollection from that conversation is that the metadata was piped through the TLA. I think the above record was also at the previous link, but it doesn't resolve currently, and there have been technology stack changes at ELAR since my thesis was released. Here is the interface in the Internet Archive for a different record.;jsessionid=00127741134CA14440824DA736655134?0&openhandle=2196/00-0000-0000-0012-D580-4

Massively Multi-lingual Medical Mayhem

When COVID-19 started making its rounds in 2020, a particular NGO started translating a short phrase, but as I watched their progress it became clear that some issues are not perceivable in text alone. Culture bounds the concepts needed, which inhibits the broad application of machine learning and machine translation.

Leftover links on Philippine area research from 2020

In 2020 I was looking at some Philippine language issues, and the following is a list of links that remained open tabs for two years...

The links related to language planning and the term language development. They also related to Linguistic Cartography of the Philippines and digital libraries.

Map of Mindanao

A google link

Some Hugo things to follow up on… -- this one looked like a static OAI harvester.

Part of my reference materials on how I got my fonts to work

Understanding Web Fonts and Getting the Most Out of Them

Hugo Image Gallery
I wonder whether a GeoJSON file of Mexico, if I could get one, would make a good base on which to display my Mexico photos.

Django modules and links

Django application to collect submitted DOIs, acquire their API provided metadata (Bibliographic metadata and citation graph metadata), allow limited (specified) annotation, and then make those records harvestable via OAI-PMH. Language Resource tagger—Adding a layer of language related metadata to published resources.
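
As a rough sketch of the last step (making an annotated record harvestable), the snippet below serializes a flat record to `oai_dc`, the Dublin Core format that OAI-PMH requires every repository to support. The function name `to_oai_dc`, the dict layout, and the sample `subject` annotation are my own placeholders and not part of any existing Django module; only the two namespace URIs come from the OAI-PMH and Dublin Core specifications. A real response would also wrap this element in the OAI-PMH envelope and add an `xsi:schemaLocation`, which is left out here.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OAI-PMH and Dublin Core specifications.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def to_oai_dc(record):
    """Serialize a flat dict of Dublin Core fields to an oai_dc XML string.

    `record` maps unqualified DC element names (title, creator, ...)
    to either a single string or a list of strings.
    """
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field, values in record.items():
        # Every Dublin Core element is repeatable, so accept str or list.
        if isinstance(values, str):
            values = [values]
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{field}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Bibliographic fields as they might come back from a DOI lookup,
    # plus one hand-added annotation (hypothetical) in `subject`.
    record = {
        "title": "A Tone Orthography Typology",
        "creator": "Roberts, David",
        "date": "2011",
        "identifier": "doi:10.1075/wll.14.1.05rob",
        "subject": ["tone orthography"],
    }
    print(to_oai_dc(record))
```

Keeping the annotation layer in ordinary DC elements means any existing OAI-PMH harvester can read it, at the cost of not being able to tell which values came from the publisher and which were added later.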

Some Django modules for OAI-PMH

User Authentication


Introducing Crossref, the basics

Database Versioning
This depends on how the DB is set up. If we only have one record per item or one record per state... This needs more definition.
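
To make the "one record per state" option concrete, here is a minimal sketch using the standard-library `sqlite3` module rather than the Django ORM. The table name `record_state`, its columns, and the helper functions are all hypothetical; the point is only that each save appends a new row instead of overwriting the old one, so the history comes for free.

```python
import json
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a database with an append-only state table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS record_state (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               doi TEXT NOT NULL,
               version INTEGER NOT NULL,
               metadata_json TEXT NOT NULL,
               saved_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
               UNIQUE (doi, version)
           )"""
    )
    return conn


def save_state(conn, doi, metadata):
    """Append a new state for `doi` and return its version number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM record_state WHERE doi = ?",
        (doi,),
    ).fetchone()
    version = row[0] + 1
    conn.execute(
        "INSERT INTO record_state (doi, version, metadata_json) VALUES (?, ?, ?)",
        (doi, version, json.dumps(metadata, sort_keys=True)),
    )
    conn.commit()
    return version


def current_state(conn, doi):
    """Return the latest metadata dict for `doi`, or None if unknown."""
    row = conn.execute(
        "SELECT metadata_json FROM record_state "
        "WHERE doi = ? ORDER BY version DESC LIMIT 1",
        (doi,),
    ).fetchone()
    return json.loads(row[0]) if row else None


def history(conn, doi):
    """Return every saved state for `doi` as (version, metadata), oldest first."""
    rows = conn.execute(
        "SELECT version, metadata_json FROM record_state "
        "WHERE doi = ? ORDER BY version",
        (doi,),
    ).fetchall()
    return [(version, json.loads(blob)) for version, blob in rows]
```

The "one record per item" alternative would replace the `INSERT` with an `UPDATE` and lose the earlier states unless a separate audit table is kept.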

Form Builders

Some JavaScript tools for creating the specific forms needed:

Markdown for documentation

Bibtex <-- also check the network as "improvements" are all over the place.
Other names include:
* Babybib
* Pybtex
* Pybibliographer



API Tutorial: Searching the ORCID registry

Crossref API doc
Crossref types:

Others — Mostly citation and references
InternetArchive Scholar
Thor project
Semantic Scholar API --> see:


For generating and ingesting MARC records


Overview see:

ISSN is supposed to have an API, but I am not sure whether it does.
Any request to the portal may be automated thanks to the use of REST protocol. The download of results is also automated. This service is restricted to subscribing users. Please contact sales [at] for more information.
We could also slurp the HTML for the sameAs links to other DBs if needed.



Beautiful Soup

There is the issue of how to record, within a Dublin Core OAI record, how that record has changed over time. I need to architect this out.

Record Provenance:

Exposing DOI metadata provenance

1. login with ORCID
2. query APIs (DOIs, ISBNs, ISSNs, ORCID, WikiData, etc.)
3. results display and annotation
4. submission
5. List of past submissions
6. update past submission screen (same as #3?)
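
For the DOI case, step 2 could start out as something like the following. It builds a request against the Crossref REST API's `works` route and reduces the JSON response to the few fields the results screen in step 3 would show. The helper names (`crossref_url`, `summarize`, `fetch`) and the choice of fields are my own assumptions; the response keys used (`message`, `DOI`, `type`, `title`, `container-title`, `author`) are the ones Crossref returns for a work.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi):
    """Build the Crossref lookup URL for a DOI (slashes are kept as-is)."""
    return CROSSREF_WORKS + quote(doi, safe="/")


def summarize(payload):
    """Reduce a Crossref work response to the fields shown for annotation."""
    msg = payload.get("message", {})
    authors = [
        ", ".join(part for part in (a.get("family"), a.get("given")) if part)
        for a in msg.get("author", [])
    ]
    return {
        "doi": msg.get("DOI"),
        "type": msg.get("type"),
        # Crossref returns titles as lists; take the first one if present.
        "title": (msg.get("title") or [None])[0],
        "container": (msg.get("container-title") or [None])[0],
        "authors": authors,
    }


def fetch(doi, timeout=10):
    """Query Crossref over the network and return the summarized record."""
    with urlopen(crossref_url(doi), timeout=timeout) as resp:
        return summarize(json.load(resp))


if __name__ == "__main__":
    # Requires network access; the DOI is the Roberts (2011) article cited below.
    print(fetch("10.1075/wll.14.1.05rob"))
```

Keeping `summarize` separate from `fetch` means the parsing can be checked against saved responses without hitting the network, and the same shape could be reused for the other sources in step 2.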

If we ran a module like this:

Then we could take a reading on where the least spoken languages appear in the most highly ranked journals and determine if there was a bias or a loss to science.

Data Examples:

Have been moved to:

PDF Extraction:

PDF Creation:


NLP Application: Named Entity Recognition (NER) in Python with Spacy


Abstract and Table of Contents

If an abstract is a sample of about-ness, then a table of contents is a sample of is-ness. Some have said that journal articles should not have tables of contents. I disagree. Sometimes a table of contents, more than an abstract, can deliver a substantial understanding of what an article is and is about by displaying its structure. In fact, many law review articles include a table of contents before the main part of the article. Law review articles can be over 70 pages in length. An outline offers useful information to the potential reader.

An example of an outline from a linguistics article.

Roberts, David. 2011. “A Tone Orthography Typology.” Written Language & Literacy 14 (1): 82–108. doi:10.1075/wll.14.1.05rob.

1. Introduction
2. The six parameters
2.1 First parameter: Domain
2.2 Second parameter: Target
2.2.1 Tones
2.2.2 Grammar
2.2.3 Lexicon
2.2.4 Dual strategies
2.3 Third parameter: Symbol
2.3.1 Phonographic representations
2.3.2 Semiographic representations
2.4 Fourth parameter: Position
2.5 Fifth parameter: Density
2.5.1 Introduction
2.5.2 Zero density
2.5.3 Partial density
2.5.4 Exhaustive density
2.6 Sixth parameter: Depth
2.6.1 Introduction
2.6.2 Surface representation
2.6.3 Deep representation
2.6.4 Shallow (transparent) representation
3. Conclusion
Bibliographical references