kajyuuen/daaja

This repository provides implementations of data augmentation methods for Japanese NLP.


Experimental

This tool helps Japanese NLP practitioners expand their limited datasets for tasks like text classification and named entity recognition. It takes Japanese sentences or sequences of words with their labels as input and generates variations by swapping, inserting, deleting, or replacing words with synonyms. This is for data scientists or machine learning engineers working with Japanese text who need more data to train robust models.
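The four operations named above (swap, insert, delete, synonym replacement) can be illustrated with a minimal sketch. This is not daaja's actual API: the function names, the `SYNONYMS` table, and the assumption that sentences arrive already tokenized are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy synonym table; a real pipeline would draw on a Japanese thesaurus.
SYNONYMS = {"美味しい": ["うまい"], "食べる": ["食う"]}


def random_swap(tokens, rng):
    """Return a copy with two randomly chosen positions exchanged."""
    out = list(tokens)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, rng, p=0.1):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def synonym_replace(tokens, rng, synonyms=SYNONYMS):
    """Replace one token that has a known synonym; no-op if none does."""
    out = list(tokens)
    candidates = [i for i, t in enumerate(out) if t in synonyms]
    if candidates:
        i = rng.choice(candidates)
        out[i] = rng.choice(synonyms[out[i]])
    return out


def random_insert(tokens, rng, synonyms=SYNONYMS):
    """Insert a synonym of one existing token at a random position."""
    out = list(tokens)
    candidates = [t for t in out if t in synonyms]
    if candidates:
        word = rng.choice(synonyms[rng.choice(candidates)])
        out.insert(rng.randrange(len(out) + 1), word)
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = ["私", "は", "美味しい", "寿司", "を", "食べる"]
    for op in (random_swap, random_delete, synonym_replace, random_insert):
        print(op.__name__, op(sentence, rng))
```

Each function returns a new list and leaves its input untouched, so one source sentence can yield several variants. For labeled sequences such as named entity recognition data, swapping or deleting tokens would also need to keep the label sequence aligned, which this sketch does not attempt.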

No commits in the last 6 months.

Use this if you are building machine learning models for Japanese text and need to artificially increase the size and diversity of your training data.

Not ideal if you are working with languages other than Japanese, or if you already have a very large and diverse dataset.

Japanese-NLP text-classification named-entity-recognition data-preparation machine-learning-engineering

No License · Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 8 / 25

Community 9 / 25


Language: Python

License: none

Higher-rated alternatives

dsfsi/textaugment

TextAugment: Text Augmentation Library

425776024/nlpcda

One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda

google-research/uda

Unsupervised Data Augmentation (UDA)

searchableai/KitanaQA

KitanaQA: Adversarial training and data augmentation for neural question-answering models

SanghunYun/UDA_pytorch

UDA (Unsupervised Data Augmentation) implemented in PyTorch
