kajyuuen/daaja

This repository provides implementations of data augmentation methods for Japanese NLP.

Overall score: 25 / 100 (Experimental)

This tool helps Japanese NLP practitioners expand limited datasets for tasks such as text classification and named entity recognition. It takes Japanese sentences, or sequences of words with their labels, as input and generates variations by swapping, inserting, or deleting words, or by replacing them with synonyms. It is aimed at data scientists and machine learning engineers who work with Japanese text and need more data to train robust models.
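The four operations described above can be illustrated with a minimal, self-contained sketch. This is not daaja's API: the function names, the toy `SYNONYMS` table, and the assumption that the input is already tokenized are all hypothetical and chosen only for illustration. A real pipeline would use a Japanese tokenizer and a proper thesaurus.

```python
import random

# Hypothetical toy synonym table (an assumption for this sketch only).
SYNONYMS = {
    "猫": ["ネコ", "にゃんこ"],
    "好き": ["大好き"],
}


def random_swap(tokens, rng):
    """Return a copy with two randomly chosen positions swapped."""
    out = list(tokens)
    if len(out) < 2:
        return out
    i, j = rng.sample(range(len(out)), 2)
    out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, rng, p=0.1):
    """Drop each token with probability p, always keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def synonym_replace(tokens, rng, synonyms=SYNONYMS):
    """Replace one randomly chosen token that has a synonym entry."""
    out = list(tokens)
    candidates = [i for i, t in enumerate(out) if t in synonyms]
    if candidates:
        i = rng.choice(candidates)
        out[i] = rng.choice(synonyms[out[i]])
    return out


def synonym_insert(tokens, rng, synonyms=SYNONYMS):
    """Insert a synonym of one existing token at a random position."""
    out = list(tokens)
    candidates = [t for t in out if t in synonyms]
    if candidates:
        word = rng.choice(synonyms[rng.choice(candidates)])
        out.insert(rng.randint(0, len(out)), word)
    return out


def augment(tokens, seed=0):
    """Apply each operation once and return the four variants."""
    rng = random.Random(seed)
    ops = (random_swap, random_delete, synonym_replace, synonym_insert)
    return [op(tokens, rng) for op in ops]


# Pre-tokenized input; tokenization itself is outside this sketch.
tokens = ["私", "は", "猫", "が", "好き", "です"]
for variant in augment(tokens):
    print(" ".join(variant))
```

For labeled sequences such as named entity recognition data, the same operations would have to move or drop each token together with its label so that the two stay aligned.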

No commits in the last 6 months.

Use this if you are building machine learning models for Japanese text and need to artificially increase the size and diversity of your training data.

Not ideal if you are working with languages other than Japanese, or if you already have a very large and diverse dataset.

Tags: Japanese-NLP, text-classification, named-entity-recognition, data-preparation, machine-learning-engineering
Status: No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 8 / 25
Community 9 / 25


Stars: 64
Forks: 5
Language: Python
License: None
Last pushed: Feb 16, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/kajyuuen/daaja"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
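The same request can be made from Python using only the standard library. The sketch below rests on two assumptions that the page does not confirm: that the endpoint returns JSON, and that the `nlp` path segment is a category that sits before the owner and repository name. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the category/owner/repo layout is assumed."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record, assuming the response body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("nlp", "kajyuuen", "daaja")
# print(json.dumps(data, indent=2, ensure_ascii=False))
```

With the keyless limit of 100 requests/day, caching responses locally is advisable when checking many repositories.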