sugarme/tokenizer

NLP tokenizers written in Go language

Quality score: 54 / 100 (Established)

This library helps Go developers prepare text for Natural Language Processing (NLP) models. It takes raw text and splits it into words or sub-word units, recording each token's position in the input, so the result is ready to feed into a machine learning model. It is aimed at Go developers building AI and deep-learning applications.
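To make "tokens along with their positions" concrete, here is a minimal sketch using only the Go standard library. It does not use the sugarme/tokenizer API; the `Token` type and `Tokenize` function are hypothetical names chosen for this illustration, and the splitting rule (runs of letters and digits, single punctuation marks) is a simplification compared with a real sub-word tokenizer.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// Token is a hypothetical type for this sketch: a piece of text plus its
// byte offsets [Start, End) in the original input.
type Token struct {
	Text  string
	Start int
	End   int
}

// Tokenize splits text into runs of letters/digits and single punctuation
// marks, skips whitespace, and records where each token came from.
func Tokenize(text string) []Token {
	var tokens []Token
	start := -1 // start offset of the word being read, or -1 if none

	// flush closes the word being read, if any, at byte offset end.
	flush := func(end int) {
		if start >= 0 {
			tokens = append(tokens, Token{text[start:end], start, end})
			start = -1
		}
	}

	for i := 0; i < len(text); {
		r, size := utf8.DecodeRuneInString(text[i:])
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			if start < 0 {
				start = i
			}
		case unicode.IsSpace(r):
			flush(i)
		default:
			// Any other character becomes its own one-rune token.
			flush(i)
			tokens = append(tokens, Token{text[i : i+size], i, i + size})
		}
		i += size
	}
	flush(len(text))
	return tokens
}

func main() {
	for _, t := range Tokenize("Hello, world!") {
		fmt.Printf("%q [%d,%d)\n", t.Text, t.Start, t.End)
	}
	// Prints:
	// "Hello" [0,5)
	// "," [5,6)
	// "world" [7,12)
	// "!" [12,13)
}
```

Keeping the offsets lets downstream code map a model's per-token output back to the exact span of the original text.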

316 stars.

Use this if you are a Go developer building NLP applications and need to preprocess text by converting it into tokens for tasks like training or inference.

Not ideal if you do not work in Go, or if you want a ready-made NLP solution rather than a tokenizer to integrate into your own Go application.

Go-programming NLP-development text-preprocessing AI-application-development machine-learning-engineering
No Package, No Dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 22 / 25

Stars: 316
Forks: 61
Language: Go
License: Apache-2.0
Last pushed: Nov 27, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/sugarme/tokenizer"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
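For Go developers, the curl request above can be issued from code. The sketch below is an assumption-laden illustration: `qualityURL` is a hypothetical helper, the category/owner/repo path pattern is inferred from the single URL shown above, and the response format is not documented here, so the body is printed as-is rather than decoded.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"time"
)

// baseURL is taken from the curl example above.
const baseURL = "https://pt-edge.onrender.com/api/v1/quality"

// qualityURL builds the endpoint for one repository. The
// category/owner/repo layout is an assumption based on the one URL shown.
func qualityURL(category, owner, repo string) string {
	return baseURL + "/" + url.PathEscape(category) +
		"/" + url.PathEscape(owner) + "/" + url.PathEscape(repo)
}

// fetch performs a GET request and returns the raw response body.
func fetch(client *http.Client, target string) ([]byte, error) {
	resp, err := client.Get(target)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// A timeout keeps the program from hanging on a slow server.
	client := &http.Client{Timeout: 10 * time.Second}
	body, err := fetch(client, qualityURL("nlp", "sugarme", "tokenizer"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	// The response schema is not specified here, so print it unparsed.
	fmt.Println(string(body))
}
```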