asahala/BabyLemmatizer
State-of-the-art neural tagger and lemmatizer for ancient languages
This tool helps ancient language scholars and researchers analyze transliterated texts from languages like Akkadian, Sumerian, or Ancient Greek. It takes a transliterated text as input and identifies the root form (lemma) and part-of-speech (POS) tag for each word, making the text searchable and useful for further study. The primary user is anyone working with historical linguistic data who needs to systematically categorize words.
No commits in the last 6 months.
Use this if you need to automatically identify lemmas and part-of-speech tags for words in transliterated ancient texts, particularly texts in languages written in cuneiform, to make them searchable and analyzable.
Not ideal if you are working with modern languages or if you require a simple, out-of-the-box solution without any command-line setup.
Stars: 14
Forks: 2
Language: Python
License: none listed
Category:
Last pushed: Mar 09, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/asahala/BabyLemmatizer"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
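The same request can be made from Python using only the standard library. This is a minimal sketch, not a documented client. The helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that other projects follow the same `category/owner/repo` path pattern is inferred from the single curl example above. The response format is not documented here, so the body is returned as raw text rather than parsed.

```python
import urllib.request

# Base path taken from the curl example; everything after it is
# assumed to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper, pattern is an assumption)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text.

    The response format is not assumed, so nothing is parsed here.
    Network and HTTP errors propagate to the caller as urllib exceptions.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (performs a network request and counts against the daily limit):
# print(fetch_quality("nlp", "asahala", "BabyLemmatizer"))
```

Keeping URL construction separate from the network call makes it possible to check the request target without spending any of the daily request quota.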
Higher-rated alternatives
chakki-works/seqeval
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
Hironsan/anago
Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging and so on.
jbesomi/texthero
Text preprocessing, representation and visualization from zero to hero.
hamelsmu/ktext
Utilities for preprocessing text for deep learning with Keras
asahi417/tner
Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An...