lenML/tokenizers

A lightweight, dependency-free fork of transformers.js (tokenizers only).

Score: 46 / 100 (Emerging)

This project lets developers add text tokenization for various large language models (LLMs) to their applications, including fully offline. It takes raw text and splits it into tokens (words or sub-word units), the form LLMs require as input, without relying on external servers or heavy model dependencies.
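As a rough illustration of what "splitting text into sub-word tokens" means, here is a toy greedy longest-match tokenizer over a hypothetical vocabulary. This is only a sketch of the general idea; the actual library ports the full transformers.js tokenizer implementations (BPE, WordPiece, Unigram, etc.), and none of the names below come from its API.

```javascript
// Toy greedy longest-match sub-word tokenizer (illustration only;
// NOT this library's algorithm or API).
function tokenize(text, vocab) {
  const tokens = [];
  let i = 0;
  while (i < text.length) {
    // Find the longest vocabulary entry matching at position i.
    let match = null;
    for (let len = Math.min(text.length - i, 10); len > 0; len--) {
      const piece = text.slice(i, i + len);
      if (vocab.has(piece)) { match = piece; break; }
    }
    if (match) {
      tokens.push(match);
      i += match.length;
    } else {
      tokens.push("<unk>"); // unknown-character fallback
      i += 1;
    }
  }
  return tokens;
}

// Hypothetical toy vocabulary for demonstration.
const vocab = new Set(["token", "izer", "s", " ", "off", "line"]);
console.log(tokenize("tokenizers offline", vocab));
// → [ 'token', 'izer', 's', ' ', 'off', 'line' ]
```

Real tokenizers then map each sub-word string to an integer ID from the model's vocabulary; those IDs are what the LLM actually consumes.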

Available on npm.

Use this if you are a developer building an application that needs to tokenize text for various LLMs and requires offline functionality or a lightweight solution without full model dependencies.

Not ideal if you need to use the full ONNX models alongside the tokenizers, as this project focuses solely on tokenization without model inference.

Tags: AI-application-development, NLP-implementation, offline-AI, text-preprocessing, LLM-tooling
No dependents

Score breakdown:
Maintenance 10 / 25
Adoption 7 / 25
Maturity 25 / 25
Community 4 / 25


Stars: 32
Forks: 1
Language: JavaScript
License: MIT
Last pushed: Jan 21, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/lenML/tokenizers"

Open to everyone: 100 requests/day with no key; a free key raises the limit to 1,000/day.