TalSchuster/CrossLingualContextualEmb
Cross-Lingual Alignment of Contextual Word Embeddings
This project helps natural language processing (NLP) researchers and practitioners work with text across languages. It takes text in multiple languages and produces 'aligned' contextual word embeddings: numerical representations of words that can be compared directly and used interchangeably across languages. As a result, a model trained on English data can be applied to, say, Spanish text without being retrained from scratch.
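To make the idea of 'aligned' embeddings concrete, here is a minimal sketch in plain Python. It is not this project's API: the two-dimensional vectors, the `es_to_en` matrix, and the helper names are all made up for illustration. It only shows how applying a linear alignment map can make vectors from two languages directly comparable.

```python
import math

def apply_alignment(vec, matrix):
    """Map a vector into the target space by computing matrix @ vec."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy, hand-made embeddings (assumption: real embeddings have far more dimensions).
en_cat = [1.0, 0.0]   # English "cat"
es_gato = [0.0, 1.0]  # Spanish "gato", living in a rotated vector space

# Hypothetical alignment matrix that rotates the Spanish space onto the English one.
es_to_en = [[0.0, 1.0],
            [-1.0, 0.0]]

before = cosine(en_cat, es_gato)                            # unaligned spaces
after = cosine(en_cat, apply_alignment(es_gato, es_to_en))  # after alignment

print(f"similarity before alignment: {before:.2f}")
print(f"similarity after alignment:  {after:.2f}")
```

In this toy setup the two translations look unrelated until the alignment map is applied, after which a model that expects English-space vectors could consume the mapped Spanish vector unchanged.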
No commits in the last 6 months.
Use this if you need to perform natural language processing tasks, like text classification or sentiment analysis, on data in multiple languages without extensive retraining for each new language.
Not ideal if you are a non-developer and are looking for an off-the-shelf application to process text, as this requires some programming knowledge.
Stars: 99
Forks: 8
Language: Python
License: MIT
Category:
Last pushed: Feb 12, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/TalSchuster/CrossLingualContextualEmb"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
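The same request can be sketched in Python using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path pattern is inferred from the single URL above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; anything beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category, owner, repo):
    """Build the endpoint URL (path pattern inferred from the one example shown)."""
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(repo)}"

def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)

url = quality_url("nlp", "TalSchuster", "CrossLingualContextualEmb")
print(url)
# To perform the actual request (needs network access):
# data = fetch_quality("nlp", "TalSchuster", "CrossLingualContextualEmb")
```

Keeping URL construction separate from the network call makes it easy to check the URL without spending any of the daily request quota.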
Higher-rated alternatives
luheng/deep_srl
Code and pre-trained model for: Deep Semantic Role Labeling: What Works and What's Next
sileod/tasksource
Datasets collection and preprocessings framework for NLP extreme multitask learning
loomchild/maligna
Bilingual sentence aligner
CK-Explorer/DuoSubs
Semantic subtitle aligner and merger for bilingual subtitle syncing.
coastalcph/lex-glue
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English