met4citizen/HeadTTS

HeadTTS: Free neural text-to-speech (Kokoro) with timestamps and visemes for lip-sync. Runs in-browser (WebGPU/WASM) or on a local Node.js WebSocket/REST server (CPU).

54 / 100 (Established)

This tool helps animators, game developers, or content creators bring digital characters to life by generating natural-sounding speech directly from text. You provide text input, and it outputs audio along with precise timing data for phonemes and visemes (mouth shapes). It's designed for anyone needing to synchronize character lip movements with spoken dialogue, making animated characters speak realistically without manual effort.
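As a rough illustration of how viseme timing data can drive lip-sync, the sketch below picks the mouth shape that is active at a given playback time. The data layout (three parallel lists of viseme labels, start times, and durations in milliseconds), the function name `active_viseme`, and the sample labels are all assumptions made for this example; they are not taken from the HeadTTS output format, which should be checked in the project's own documentation.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def active_viseme(
    visemes: Sequence[str],
    start_ms: Sequence[float],
    duration_ms: Sequence[float],
    t_ms: float,
) -> Optional[str]:
    """Return the viseme active at playback time t_ms, or None if no viseme is active.

    Assumes (hypothetically) that start_ms is sorted ascending and that the
    three sequences are parallel: entry i describes one viseme.
    """
    # Index of the last viseme whose start time is <= t_ms.
    i = bisect_right(start_ms, t_ms) - 1
    if i < 0:
        return None  # playback time is before the first viseme
    if t_ms >= start_ms[i] + duration_ms[i]:
        return None  # that viseme has already ended (gap or end of speech)
    return visemes[i]


# Illustrative data only, not real synthesizer output.
sample_visemes = ["PP", "aa", "SS"]
sample_starts = [0, 120, 300]
sample_durations = [100, 180, 90]
```

An animation loop could call `active_viseme` once per frame with the current audio playback position and blend the character's mouth toward the returned shape, falling back to a neutral pose when the result is `None`.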

112 stars. Available on npm.

Use this if you need to generate high-quality, free English speech with detailed lip-sync data for animated characters in a browser-based application or a local Node.js environment.

Not ideal if you need text-to-speech in languages other than English, or if you target older browsers without WebGPU support, where in-browser synthesis will be significantly slower.

character-animation game-development digital-content-creation virtual-assistants lip-sync
Maintenance 6 / 25
Adoption 9 / 25
Maturity 24 / 25
Community 15 / 25


Stars: 112
Forks: 16
Language: JavaScript
License: MIT
Last pushed: Dec 08, 2025
Commits (30d): 0
Dependencies: 2

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/met4citizen/HeadTTS"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
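The same request can be made from a script. The following is a minimal sketch using only the Python standard library. It assumes the response body is JSON and that the path follows a `category/owner/repo` pattern, which is inferred from the single URL in the curl example rather than from any API documentation. The helper names `quality_url` and `fetch_quality` are made up for this example.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path pattern is an assumption based on one example)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "met4citizen", "HeadTTS")` issues the same GET request as the curl command. Because the response fields are not documented here, inspect the returned object before relying on any particular key, and keep the daily request limit in mind when polling.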