thespectrewithin/joint_align
Cross-lingual Alignment vs Joint Training: A Comparative Study and A Simple Unified Framework
This project helps natural language processing practitioners create word embeddings that accurately capture word meanings across different languages. By combining unsupervised training with alignment methods, it takes monolingual text data or pre-trained multilingual models and produces improved cross-lingual word embeddings. NLP engineers and researchers working on multilingual applications would find this useful.
No commits in the last 6 months.
Use this if you need to build or enhance cross-lingual word embeddings for tasks like bilingual lexicon induction or cross-lingual named entity recognition.
Not ideal if your main goal is strictly monolingual text processing or if you only need pre-trained embeddings without further refinement.
Stars
52
Forks
8
Language
Python
License
—
Category
Last pushed
Feb 01, 2020
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/thespectrewithin/joint_align"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
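For programmatic use, the endpoint above can be called from Python instead of curl. A minimal sketch, assuming the path pattern `/api/v1/quality/{category}/{owner}/{repo}` generalizes from the single example shown (the `nlp` category is the only one confirmed here):

```python
import urllib.request  # only needed for the optional fetch below

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-API endpoint URL for a repository.

    The path pattern is inferred from the documented example;
    categories other than "nlp" are an assumption.
    """
    return f"{BASE}/{category}/{owner}/{repo}"

url = quality_url("nlp", "thespectrewithin", "joint_align")
print(url)
# Uncomment to actually fetch (keyless access is rate-limited to 100 requests/day):
# data = urllib.request.urlopen(url).read()
```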
Higher-rated alternatives
luheng/deep_srl
Code and pre-trained model for: Deep Semantic Role Labeling: What Works and What's Next
sileod/tasksource
Datasets collection and preprocessings framework for NLP extreme multitask learning
loomchild/maligna
Bilingual sentence aligner
CK-Explorer/DuoSubs
Semantic subtitle aligner and merger for bilingual subtitle syncing.
coastalcph/lex-glue
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English