eliben/go-sentencepiece
Go implementation of the SentencePiece tokenizer
A Go library for developers working with large language models, specifically Google's Gemma and Gemini models. It encodes raw text into numerical token IDs and decodes token IDs back into text. Developers integrate it into Go applications to prepare input for these models or to interpret their output.
Use this if you are a Go developer building applications that need to process text for Google's Gemma or Gemini language models.
Not ideal if you are not a Go developer, or if your project uses a tokenizer or language model other than Gemma/Gemini.
Stars: 47
Forks: 12
Language: Go
License: Apache-2.0
Category
Last pushed: Dec 10, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/eliben/go-sentencepiece"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Related tools
sefineh-ai/Amharic-Tokenizer
Syllable-aware BPE tokenizer for the Amharic language (አማርኛ) – fast, accurate, trainable.
mdabir1203/BPE_Tokenizer_Visualizer
A Visualizer to check how BPE Tokenizer in an LLM Works
franciszekparma/GBPET
GPT-style language model with Byte Pair Encoding tokenizer, built from scratch in PyTorch.
BobMcDear/minbpe-hs
Byte-level byte pair encoding (BPE) in Haskell
sajjadh47/bpe-encoder-php
BPE (Byte-Pair Encoding) Encoder Decoder for OpenAI's GPT-2 / GPT-3 Implemented In Pure PHP,...