roeeaharoni/unsupervised-domain-clusters

Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters in Pretrained Language Models".

Quality score: 24 / 100 (Experimental)

This project provides tools for natural language processing researchers to study how different subject areas are represented inside large pretrained language models. It embeds parallel text from multiple domains (e.g., medical or legal documents) and reveals emergent groups, or "clusters", of those domains in the model's representation space. Researchers working with multilingual text or pretrained language models can use it to analyze relationships between domains.
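The core idea can be sketched in a few lines: obtain sentence embeddings from a pretrained model, then fit a Gaussian Mixture Model and read off the cluster assignments as domains. The snippet below is a minimal illustration, not the repository's actual code; it uses random vectors as stand-ins for real LM embeddings, and the domain count and dimensions are arbitrary assumptions.

```python
# Hypothetical sketch: cluster sentence embeddings with a GMM to
# recover domain clusters. Random vectors stand in for embeddings
# that would normally come from a pretrained language model.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Fake "embeddings" for sentences from 3 synthetic domains, each
# domain centered at a different point in a 16-dim embedding space.
centers = rng.normal(size=(3, 16)) * 5
embeddings = np.vstack([c + rng.normal(size=(100, 16)) for c in centers])

# Fit a GMM with as many components as the presumed number of domains,
# then assign each sentence to its most likely cluster.
gmm = GaussianMixture(n_components=3, random_state=0).fit(embeddings)
labels = gmm.predict(embeddings)
print(labels.shape)  # one cluster label per sentence
```

With real data, the labels can then be compared against known domain annotations to see how well the model's representation space separates domains.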

No commits in the last 6 months.

Use this if you are a researcher studying how language models handle information from diverse real-world topics and want to identify inherent groupings of these topics.

Not ideal if you are looking for a ready-to-use translation tool or a way to train new language models from scratch.

Topics: natural-language-processing, computational-linguistics, multilingual-data, domain-adaptation, text-analysis
Flags: No License · Stale (6 months) · No Package · No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 8 / 25
Community 8 / 25


Stars: 58
Forks: 4
Language: Jupyter Notebook
License: none
Last pushed: Aug 22, 2020
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/roeeaharoni/unsupervised-domain-clusters"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.