microsoft/Tokenizer
TypeScript and .NET implementation of a BPE tokenizer for OpenAI LLMs.
This library helps developers prepare text for OpenAI's large language models (LLMs) in Node.js or .NET applications. It takes raw text as input and converts it into numerical tokens, the format LLMs operate on, so the text can be processed efficiently before it is sent to the model.
210 stars. No commits in the last 6 months.
Use this if you are a developer building an application with Node.js or .NET and need to efficiently prepare text prompts for OpenAI's LLMs.
Not ideal if you work outside Node.js or .NET, or if your application doesn't interact with OpenAI's LLMs.
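To make the text-to-token conversion described above more concrete, here is a minimal sketch of byte-pair encoding (BPE) in TypeScript. It is a toy illustration, not this library's API or implementation: the merge table, the vocabulary, and the names `pairKey`, `bpeSymbols`, and `encodeToy` are all hypothetical, and real OpenAI encodings operate on bytes with far larger tables.

```typescript
// Toy BPE sketch. All tables and names here are hypothetical and exist only
// for illustration; they do not come from microsoft/Tokenizer.

// Build a lookup key for an adjacent symbol pair. A NUL separator avoids
// clashes with symbols that contain spaces.
function pairKey(left: string, right: string): string {
  return left + "\u0000" + right;
}

// Hypothetical merge rules in priority order (earlier entries merge first).
const toyMergeList: Array<[string, string]> = [
  ["l", "l"],
  ["h", "e"],
  ["ll", "o"],
  ["he", "llo"],
];

const toyMerges = new Map<string, number>(
  toyMergeList.map(([a, b], rank): [string, number] => [pairKey(a, b), rank]),
);

// Hypothetical vocabulary: every symbol that can survive merging gets an id.
const toyVocab = new Map<string, number>([
  ["h", 0], ["e", 1], ["l", 2], ["o", 3], [" ", 4],
  ["ll", 5], ["he", 6], ["llo", 7], ["hello", 8],
]);

// Repeatedly merge the adjacent pair with the lowest rank until none applies.
function bpeSymbols(text: string, merges: Map<string, number>): string[] {
  let symbols = Array.from(text);
  while (symbols.length > 1) {
    let bestRank = Infinity;
    let bestIndex = -1;
    for (let i = 0; i < symbols.length - 1; i++) {
      const rank = merges.get(pairKey(symbols[i], symbols[i + 1]));
      if (rank !== undefined && rank < bestRank) {
        bestRank = rank;
        bestIndex = i;
      }
    }
    if (bestIndex === -1) break; // no mergeable pair left
    symbols = [
      ...symbols.slice(0, bestIndex),
      symbols[bestIndex] + symbols[bestIndex + 1],
      ...symbols.slice(bestIndex + 2),
    ];
  }
  return symbols;
}

// Map the merged symbols to integer token ids.
function encodeToy(text: string): number[] {
  return bpeSymbols(text, toyMerges).map((symbol) => {
    const id = toyVocab.get(symbol);
    if (id === undefined) {
      throw new Error(`Symbol not in toy vocabulary: ${JSON.stringify(symbol)}`);
    }
    return id;
  });
}

console.log(encodeToy("hello hello")); // [8, 4, 8] under the toy tables above
```

With these toy tables, "hell" stops at the symbols "he" and "ll" because no rule merges that pair, so it encodes to two ids rather than one.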
Stars: 210
Forks: 35
Language: C#
License: MIT
Category:
Last pushed: Apr 25, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/microsoft/Tokenizer"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
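The same request can be sketched in TypeScript. This assumes a runtime with a global `fetch` (for example Node.js 18 or later) and assumes the endpoint returns JSON; the response shape is not documented here, so it is left as `unknown`. The helper names `buildQualityUrl` and `fetchQuality` are hypothetical, and the category/owner/repo path layout is inferred only from the single URL shown above.

```typescript
// Base path taken from the curl example above.
const QUALITY_BASE_URL = "https://pt-edge.onrender.com/api/v1/quality";

// Assemble the endpoint URL. The segment order (category, owner, repo) is an
// assumption based on the one example URL on this page.
function buildQualityUrl(category: string, owner: string, repo: string): string {
  const segments = [category, owner, repo].map((s) => encodeURIComponent(s));
  return `${QUALITY_BASE_URL}/${segments.join("/")}`;
}

// Fetch the record for one repository. The payload is assumed to be JSON and
// is returned untyped because its fields are not described on this page.
async function fetchQuality(
  category: string,
  owner: string,
  repo: string,
): Promise<unknown> {
  const response = await fetch(buildQualityUrl(category, owner, repo));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Example call (not executed here, since it needs network access):
// fetchQuality("llm-tools", "microsoft", "Tokenizer").then(console.log);
```

Keyless access is limited to 100 requests/day, so a caller polling many repositories would likely want to cache responses.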
Higher-rated alternatives
aiqinxuancai/TiktokenSharp
Token calculation for OpenAI models, using the `o200k_base`, `cl100k_base`, and `p50k_base` encodings.
dqbd/tiktokenizer
Online playground for OpenAI tokenizers
pkoukk/tiktoken-go
Go version of tiktoken
lenML/tokenizers
A lightweight, no-dependency fork of transformers.js (tokenizers only)
tryAGI/Tiktoken
This project implements token calculation for OpenAI's gpt-4 and gpt-3.5-turbo model,...