nlpodyssey/gotokenizers

Go implementation of today's most used tokenizers

Quality score: 35 / 100 (Emerging)

This is a foundational tool for Go developers building applications that process human language. It converts raw text into numerical tokens, the form machine learning models require for tasks like translation or sentiment analysis, and produces a structured sequence of tokens ready for further natural language processing. It targets Go developers who want to integrate modern text processing directly into Go-based systems without calling out to another runtime.

No commits in the last 6 months.

Use this if you are a Go developer building an application that needs to break down natural language text into discrete tokens for machine learning or advanced text analysis, and you prefer a pure Go implementation.

Not ideal if you are looking for a high-performance library for production-ready NLP systems today, as this is an early-stage project focused on functionality parity rather than optimization.

Tags: Go development, natural language processing, text tokenization, machine learning infrastructure, AI application development
Badges: Stale (6m), No Package, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 11 / 25


Stars: 44
Forks: 5
Language: Go
License: BSD-2-Clause
Last pushed: Dec 12, 2020
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/nlpodyssey/gotokenizers"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.