lgessler/microbert

A tiny BERT for low-resource monolingual models

37 / 100 (Emerging)

This tool provides compact language models for processing text in languages with limited digital resources, such as Ancient Greek or Coptic. Given raw text in one of these languages, it helps researchers, linguists, and cultural heritage professionals build smaller, more efficient models for tasks such as part-of-speech tagging or identifying grammatical relationships, even when training data is scarce. The output is a monolingual language model tailored to that specific low-resource language.

Use this if you are a linguist or researcher working with languages that have very few digital texts available and need to build effective language processing models without extensive computational resources.

Not ideal if you are working with well-resourced languages like English or Spanish, as standard, larger BERT models are likely more suitable.

linguistics-research low-resource-languages ancient-language-processing natural-language-processing digital-humanities
No License, No Package, No Dependents
Maintenance 6 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 16 / 25

Stars: 31
Forks: 6
Language: HTML
License: none
Last pushed: Dec 24, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/lgessler/microbert"

Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
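
The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes that the URL pattern in the curl example generalizes to `<category>/<owner>/<repo>` and that the endpoint returns JSON; neither is confirmed by this page, so the helper names `build_quality_url` and `fetch_quality` are hypothetical and the code falls back to raw text if the body is not JSON.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL (assumed pattern: category/owner/repo)."""
    parts = (category, owner, repo)
    # Percent-encode each path segment so unusual names stay valid.
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data; returns parsed JSON, or raw text otherwise."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    try:
        # Assumption: the response body is JSON.
        return json.loads(body)
    except json.JSONDecodeError:
        return body
```

Calling `fetch_quality("nlp", "lgessler", "microbert")` issues the same request as the curl command above. Without a key, each call counts against the 100 requests/day limit, so caching the result locally is advisable.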