uw-swag/tokdrift

Repository for TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar.

26 / 100 Experimental

TokDrift is a research framework for evaluating how changes in code style or grammar (like using `snake_case` vs. `camelCase` for variables, or modifying punctuation) affect the performance of large language models on coding tasks. It takes a specific code transformation rule and a code-related task (e.g., code generation, fixing tests) as input, and outputs metrics showing how the LLM's accuracy is impacted. This tool is for researchers and developers working on code-generating LLMs.

Use this if you need to systematically evaluate how semantics-preserving code rewrites, such as changes in naming conventions or operator spacing, affect the accuracy and robustness of large language models on coding tasks.

Not ideal if you are a practitioner looking for a tool to refactor your existing codebase or automatically improve code style; this is a research framework for analyzing LLM behavior, not a production code refactoring tool.
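To make "semantics-preserving rewrite" concrete, here is a minimal Python sketch of the `snake_case` to `camelCase` renaming mentioned above. It is an illustration only and does not use TokDrift's own code or API: `snake_to_camel` and `rename_identifiers` are hypothetical helper names. It assumes a self-contained snippet, since renaming attributes or imported names from other modules would change behavior.

```python
import io
import keyword
import re
import tokenize

# Hypothetical helpers for illustration; these are not TokDrift functions.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)+")


def snake_to_camel(name: str) -> str:
    """Return the camelCase form of a snake_case name; other names pass through."""
    if not SNAKE_CASE.fullmatch(name):
        return name
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def rename_identifiers(source: str) -> str:
    """Rewrite snake_case NAME tokens in place; comments are never NAME tokens."""
    lines = source.splitlines(keepends=True)
    names = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)
    ]
    # Edit right to left so earlier column offsets on the same line stay valid.
    for tok in reversed(names):
        row, start_col = tok.start
        end_col = tok.end[1]
        line = lines[row - 1]
        lines[row - 1] = line[:start_col] + snake_to_camel(tok.string) + line[end_col:]
    return "".join(lines)


original = (
    "def add_two(first_value, second_value):\n"
    "    return first_value + second_value  # first_value is kept here\n"
)
rewritten = rename_identifiers(original)
print(rewritten)
```

Both versions compute the same result, but the identifiers differ, so a subword tokenizer will generally split them differently. That difference is the kind of input variation whose effect on model accuracy the framework is described as measuring.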

LLM evaluation, code generation, programming language research, software engineering research, natural language processing
No License, No Package, No Dependents
Maintenance 6 / 25
Adoption 5 / 25
Maturity 7 / 25
Community 8 / 25

Stars 9
Forks 1
Language Python
License None
Last pushed Jan 06, 2026
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/uw-swag/tokdrift"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
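The same request can be made from Python with only the standard library. This is a sketch under assumptions: the page does not state the response format, so JSON is assumed here, and reading the path segment `transformers` as a category is a guess. The names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = (category, owner, repo)
    return BASE_URL + "/" + "/".join(urllib.parse.quote(s, safe="") for s in segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the body is JSON (not confirmed by the page)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("transformers", "uw-swag", "tokdrift")` issues the same GET request as the curl command, and each call counts against the daily request limit.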