osoleve/glitchlings

Enemies for your LLM

/ 100

Emerging

This project offers tools to intentionally introduce realistic text corruptions into your data. You provide original text, and it outputs versions with typos, homophone substitutions, or confusable characters. It's designed for machine learning engineers and researchers who are building and testing robust language models.
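To make the three corruption types concrete, here is a minimal, self-contained sketch of the idea. It does not use the glitchlings package or reflect its API; the function names (`corrupt`, `swap_typo`), the `rate` and `seed` parameters, and the tiny lookup tables are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical, deliberately tiny lookup tables for illustration only.
HOMOPHONES = {"their": "there", "to": "too", "your": "you're"}
# Latin letters mapped to visually similar Cyrillic letters.
CONFUSABLES = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}


def swap_typo(word, rng):
    """Swap two adjacent characters to imitate a transposition typo."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def corrupt(text, rate=0.3, seed=0):
    """Corrupt roughly `rate` of the whitespace-separated words in `text`.

    A seeded random.Random keeps the output reproducible, which matters
    when the corrupted text feeds an evaluation or training run.
    """
    rng = random.Random(seed)
    out = []
    for word in text.split():
        if rng.random() >= rate:
            out.append(word)  # leave this word untouched
            continue
        kind = rng.choice(("typo", "homophone", "confusable"))
        if kind == "homophone" and word.lower() in HOMOPHONES:
            out.append(HOMOPHONES[word.lower()])
        elif kind == "confusable":
            out.append("".join(CONFUSABLES.get(c, c) for c in word))
        else:
            # Also the fallback when no homophone is known for the word.
            out.append(swap_typo(word, rng))
    return " ".join(out)


if __name__ == "__main__":
    print(corrupt("send the report to their office today", rate=0.5, seed=7))
```

Seeding the generator is the main design choice here: the same input, rate, and seed always yield the same corrupted text, so a robustness score can be compared across model versions.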

Available on PyPI.

Use this if you need to test how well your language models handle real-world data imperfections or want to train models to be more resilient to errors.

Not ideal if you're looking for tools to clean or correct existing noisy text data.

natural-language-processing model-robustness data-augmentation machine-learning-engineering language-model-evaluation

Maintenance 10 / 25

Adoption 7 / 25

Maturity 24 / 25

Community 0 / 25

Language

Python

License

Apache-2.0

Higher-rated alternatives

thunlp/OpenAttack

An Open-Source Package for Textual Adversarial Attack.

thunlp/TAADpapers

Must-read Papers on Textual Adversarial Attack and Defense

jind11/TextFooler

A Model for Natural Language Attack on Text Classification and Inference

thunlp/OpenBackdoor

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

thunlp/SememePSO-Attack

Code and data of the ACL 2020 paper "Word-level Textual Adversarial Attacking as Combinatorial...
