tokenizer and go-tokenizer

These are competitors with overlapping functionality: both are standalone tokenization libraries for Go, with no clear differentiation in scope or features. A user would therefore pick one based on maturity rather than use them together; sugarme/tokenizer's higher star count (316 vs. 11) suggests broader adoption.

Metric          tokenizer (sugarme)     go-tokenizer (euskadi31)
Overall score   54 (Established)        39 (Emerging)
Maintenance     6/25                    6/25
Adoption        10/25                   5/25
Maturity        16/25                   16/25
Community       22/25                   12/25
Stars           316                     11
Forks           61                      2
Commits (30d)   0                       0
Downloads       —                       —
Language        Go                      Go
License         Apache-2.0              MIT
Package         none tracked            none tracked
Dependents      none tracked            none tracked

About tokenizer

sugarme/tokenizer

NLP tokenizers written in Go language

This tool helps Go developers prepare text data for Natural Language Processing (NLP) models. It takes raw text and breaks it into individual words or sub-word units, along with their positions in the original text, making it ready for use in machine-learning models. It's designed for Go developers building AI and deep-learning applications.
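To illustrate the kind of output described above (tokens plus their positions), here is a minimal stdlib sketch of word tokenization with byte offsets. This is an illustration of the concept, not sugarme/tokenizer's actual API; the `Token` type and `tokenize` function are hypothetical names.

```go
package main

import (
	"fmt"
	"unicode"
)

// Token pairs a surface form with its byte offsets in the input,
// mirroring the token-plus-position output an NLP tokenizer produces.
// (Hypothetical type for illustration; not the library's API.)
type Token struct {
	Text       string
	Start, End int
}

// tokenize splits text on whitespace and punctuation, recording each
// token's byte offsets in the original string.
func tokenize(text string) []Token {
	var tokens []Token
	start := -1
	for i, r := range text {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			if start < 0 {
				start = i
			}
			continue
		}
		if start >= 0 {
			tokens = append(tokens, Token{text[start:i], start, i})
			start = -1
		}
	}
	if start >= 0 {
		tokens = append(tokens, Token{text[start:], start, len(text)})
	}
	return tokens
}

func main() {
	for _, t := range tokenize("Hello, NLP world!") {
		fmt.Printf("%q [%d:%d]\n", t.Text, t.Start, t.End)
	}
	// "Hello" [0:5]
	// "NLP" [7:10]
	// "world" [11:16]
}
```

Real sub-word tokenizers (BPE, WordPiece) go further and split words into learned fragments, but the token-with-offsets shape of the output is the same.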

Go programming, NLP development, text preprocessing, AI application development, machine-learning engineering

About go-tokenizer

euskadi31/go-tokenizer

A Text Tokenizer library for Golang

This is a fundamental tool for Go developers working with text. It takes a block of text and breaks it down into individual words or meaningful units, removing punctuation and separating contractions. Developers building applications that process or analyze human language, like search engines, chatbots, or content analysis tools, will find this project useful.
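The behavior described above (dropping punctuation, splitting contractions) can be sketched in a few lines of stdlib Go. This is a hedged illustration of the concept, not euskadi31/go-tokenizer's actual API; the `tokenize` function name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenize splits text into word tokens, dropping punctuation and
// splitting contractions at the apostrophe ("can't" -> "can", "t").
// Stdlib sketch of the described behavior, not the library's API.
func tokenize(text string) []string {
	isSep := func(r rune) bool {
		alnum := (r >= 'a' && r <= 'z') ||
			(r >= 'A' && r <= 'Z') ||
			(r >= '0' && r <= '9')
		return !alnum // any non-alphanumeric rune separates tokens
	}
	return strings.FieldsFunc(text, isSep)
}

func main() {
	fmt.Println(tokenize("I can't wait: let's go!"))
	// [I can t wait let s go]
}
```

Treating the apostrophe as a separator is the simplest contraction-splitting policy; other tokenizers instead keep units like "n't" intact, which matters for downstream linguistic analysis.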

Go development, text processing, natural language programming, data preprocessing, search engine development

Scores updated daily from GitHub, PyPI, and npm data.