kaiidams/voice100

Voice100 includes neural TTS and ASR models. Inference is inexpensive because the models are small and use only convolutional networks (CNNs), with no autoregressive decoding.

32 / 100 (Emerging)

This project generates natural-sounding speech from text (Text-to-Speech, TTS) and transcribes spoken audio into text (Automatic Speech Recognition, ASR): you supply text to get speech, or audio to get text. It is aimed at creators and businesses that need to add voiceovers to content, create audio messages, or automatically subtitle videos, including on less powerful devices such as smartphones.

No commits in the last 6 months.

Use this if you need efficient, high-quality text-to-speech or speech-to-text capabilities that can run on standard personal computers or mobile devices without requiring expensive hardware.

Not ideal if you require extremely specialized voice cloning, real-time transcription of very noisy audio in complex environments, or support for a vast array of less common languages.

audio-content-creation voice-synthesis speech-transcription media-localization accessibility
Stale 6m No Package No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 9 / 25

Stars: 28
Forks: 3
Language: Python
License: MIT
Last pushed: Nov 23, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/kaiidams/voice100"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
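
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns JSON and that the path follows a category/owner/repo pattern, which is inferred from the single URL above; neither is confirmed here. The helper names `build_quality_url` and `fetch_quality` are hypothetical and used only for illustration.

```python
import json
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    The category/owner/repo layout is an assumption inferred from the
    single example URL shown above.
    """
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data for one repository.

    Assumes the response body is JSON; the fields it contains are not
    documented here, so the parsed value is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (performs a network request, so it is left commented out):
# data = fetch_quality("voice-ai", "kaiidams", "voice100")
# print(json.dumps(data, indent=2))
```

Because unauthenticated access is limited to 100 requests/day, it may be worth caching the result locally rather than fetching on every run.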