dariush-bahrami/character-tokenizer

A character tokenizer for Hugging Face Transformers

Score: 41 / 100 (Emerging)

This tool is for developers working with natural language processing (NLP) models. It splits text into individual characters and maps each character to a numerical ID that a machine learning model can consume. The output is in a format suitable for Hugging Face Transformer models, so text can be processed at the character level rather than the word or subword level. It suits machine learning engineers and NLP researchers who need fine-grained control over text processing.
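To make the character-to-number step concrete, here is a minimal sketch of character-level tokenization in plain Python. It is not the repository's code or API: the class name `SimpleCharTokenizer`, the `[PAD]` and `[UNK]` special tokens, and the ID layout are all assumptions chosen for illustration.

```python
import string


class SimpleCharTokenizer:
    """Hypothetical character-level tokenizer, for illustration only."""

    def __init__(self, characters):
        # Reserve the first IDs for special tokens (names are assumptions).
        self.specials = ["[PAD]", "[UNK]"]
        self.vocab = {tok: i for i, tok in enumerate(self.specials)}
        # Give every distinct character the next free integer ID.
        for ch in characters:
            if ch not in self.vocab:
                self.vocab[ch] = len(self.vocab)
        self.inv_vocab = {i: tok for tok, i in self.vocab.items()}

    def encode(self, text):
        # Characters outside the vocabulary map to the [UNK] ID.
        unk_id = self.vocab["[UNK]"]
        return [self.vocab.get(ch, unk_id) for ch in text]

    def decode(self, ids):
        # Special tokens are dropped when turning IDs back into text.
        tokens = (self.inv_vocab[i] for i in ids)
        return "".join(t for t in tokens if t not in self.specials)


tokenizer = SimpleCharTokenizer(string.ascii_lowercase + " ")
ids = tokenizer.encode("hello world")
print(ids)
print(tokenizer.decode(ids))
```

Each character becomes exactly one ID, so the encoded sequence is as long as the input string. A tokenizer meant for Hugging Face Transformers would typically also handle padding, attention masks, and model-specific special tokens, which this sketch leaves out.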

No commits in the last 6 months.

Use this if you are an NLP developer who needs to process text character by character for a Hugging Face Transformer model.

Not ideal if you are an end user looking for a ready-to-use application rather than a developer tool.

natural-language-processing machine-learning-engineering text-tokenization transformer-models model-preprocessing
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 18 / 25


Stars: 32
Forks: 13
Language: Python
License: MIT
Last pushed: Jun 21, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/dariush-bahrami/character-tokenizer"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
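The same request can be sketched in Python using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the JSON response format is an assumption that the page does not confirm.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is inferred.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the endpoint URL for a repo given as 'owner/name' (assumed layout)."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10):
    """Fetch the quality record, assuming the endpoint returns JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


print(quality_url("nlp", "dariush-bahrami/character-tokenizer"))
```

Calling `fetch_quality("nlp", "dariush-bahrami/character-tokenizer")` performs the network request. With the keyless limit of 100 requests/day, caching the result locally is a sensible precaution.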