eliben/go-sentencepiece

Go implementation of the SentencePiece tokenizer

Emerging

This is a library for developers working with large language models, specifically Google's Gemma or Gemini models. It encodes raw text into numeric token IDs and decodes token IDs back into text. Developers integrate it into their Go applications to prepare input for these models or to interpret their output.
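The encode/decode round trip described above can be illustrated with a small, self-contained Go sketch. This is not the library's API: `toyVocab`, `encode`, and `decode` are hypothetical names, and the vocabulary and scores are made up for illustration. A real SentencePiece tokenizer takes its pieces from the model's configuration file. The sketch assumes a BPE-style scheme in which the adjacent pair whose merged piece has the highest score is merged first, and spaces are represented by the "▁" (U+2581) marker.

```go
package main

import (
	"fmt"
	"strings"
)

// piece is one vocabulary entry; its index in toyVocab is its token ID.
type piece struct {
	text  string
	score float64 // higher score means the piece is merged earlier
}

// toyVocab is a hypothetical vocabulary, invented for this example only.
var toyVocab = []piece{
	{"<unk>", 0}, {"▁", 0}, {"h", 0}, {"e", 0}, {"l", 0}, {"o", 0},
	{"w", 0}, {"r", 0}, {"d", 0},
	{"ll", -1}, {"he", -2}, {"llo", -3}, {"hello", -4},
	{"▁w", -5}, {"or", -6}, {"ld", -7},
}

// pieceID maps piece text to its token ID.
var pieceID = func() map[string]int {
	m := make(map[string]int, len(toyVocab))
	for id, p := range toyVocab {
		m[p.text] = id
	}
	return m
}()

// encode converts text to token IDs by repeatedly merging the adjacent
// pair of symbols whose concatenation is the best-scoring vocabulary piece.
func encode(text string) []int {
	// Replace spaces with the "▁" whitespace marker.
	normalized := strings.ReplaceAll(text, " ", "▁")

	// Start from one symbol per Unicode code point.
	var symbols []string
	for _, r := range normalized {
		symbols = append(symbols, string(r))
	}

	for {
		best, bestScore := -1, 0.0
		for i := 0; i+1 < len(symbols); i++ {
			id, ok := pieceID[symbols[i]+symbols[i+1]]
			if ok && (best == -1 || toyVocab[id].score > bestScore) {
				best, bestScore = i, toyVocab[id].score
			}
		}
		if best == -1 {
			break // no adjacent pair forms a known piece
		}
		// Merge symbols[best] and symbols[best+1] into one symbol.
		symbols[best] += symbols[best+1]
		symbols = append(symbols[:best+1], symbols[best+2:]...)
	}

	ids := make([]int, len(symbols))
	for i, s := range symbols {
		ids[i] = pieceID[s] // unknown symbols map to 0, the "<unk>" ID
	}
	return ids
}

// decode concatenates the pieces and restores spaces.
func decode(ids []int) string {
	var sb strings.Builder
	for _, id := range ids {
		sb.WriteString(toyVocab[id].text)
	}
	return strings.ReplaceAll(sb.String(), "▁", " ")
}

func main() {
	ids := encode("hello world")
	fmt.Println(ids)         // [12 13 14 15]
	fmt.Println(decode(ids)) // hello world
}
```

With this toy vocabulary, "hello world" becomes four pieces ("hello", "▁w", "or", "ld"), and decoding the IDs restores the original string. A production tokenizer also handles normalization, byte fallback for unknown characters, and special tokens, all of which this sketch omits.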

Use this if you are a Go developer building applications that need to process text for Google's Gemma or Gemini language models.

Not ideal if you are not working in Go, or if your project targets a tokenizer or language model other than Gemma/Gemini.

Go-development NLP-engineering LLM-integration Gemma-models Gemini-models

No Package No Dependents

Maintenance 6 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 18 / 25

License

Apache-2.0

Related tools

sefineh-ai/Amharic-Tokenizer

Syllable-aware BPE tokenizer for the Amharic language (አማርኛ) – fast, accurate, trainable.

mdabir1203/BPE_Tokenizer_Visualizer

A Visualizer to check how BPE Tokenizer in an LLM Works

franciszekparma/GBPET

GPT-style language model with Byte Pair Encoding tokenizer, built from scratch in PyTorch.

BobMcDear/minbpe-hs

Byte-level byte pair encoding (BPE) in Haskell

sajjadh47/bpe-encoder-php

BPE (Byte-Pair Encoding) Encoder Decoder for OpenAI's GPT-2 / GPT-3 Implemented In Pure PHP,...
