modelscope/FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications such as text-to-speech synthesis and music generation.

/ 100

Emerging

This toolkit encodes raw audio files (such as WAVs) into compact "audio codes," which can then be used to reconstruct the original audio or to generate new speech. It is aimed at researchers and developers building audio applications such as text-to-speech systems or music generation.
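To make the idea of "audio codes" concrete, here is a minimal sketch of plain vector quantization in pure Python. It is an illustration only, not FunCodec's API or its actual model. The function names (`make_codebook`, `encode`, `decode`), the frame size, and the random codebook are all assumptions chosen for the example. A real neural codec would learn both the encoder and the codebooks from data.

```python
import math
import random


def make_codebook(num_codes, dim, seed=0):
    """Build a hypothetical random codebook; real codecs learn these vectors."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_codes)]


def frame_signal(samples, frame_size):
    """Split samples into non-overlapping frames, zero-padding the last one."""
    frames = []
    for start in range(0, len(samples), frame_size):
        frame = list(samples[start:start + frame_size])
        frame += [0.0] * (frame_size - len(frame))
        frames.append(frame)
    return frames


def encode(samples, codebook, frame_size):
    """Map each frame to the index of its nearest codebook vector."""
    codes = []
    for frame in frame_signal(samples, frame_size):
        best = min(
            range(len(codebook)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(frame, codebook[i])),
        )
        codes.append(best)
    return codes


def decode(codes, codebook, length):
    """Rebuild an approximate signal by concatenating the chosen vectors."""
    out = []
    for code in codes:
        out.extend(codebook[code])
    return out[:length]


# Toy input: 0.1 s of a 440 Hz sine wave sampled at 16 kHz.
samples = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
codebook = make_codebook(num_codes=256, dim=4)

codes = encode(samples, codebook, frame_size=4)   # one integer per 4-sample frame
recon = decode(codes, codebook, len(samples))     # lossy reconstruction
```

Each group of four samples is replaced by a single integer index, which is where the compression comes from. The reconstruction is only as good as the codebook, so a random one like this gives a rough approximation at best.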

442 stars. No commits in the last 6 months.

Use this if you need to compress, reproduce, or synthesize speech and audio, especially for developing AI models like text-to-speech.

Not ideal if you are an end-user simply looking to convert text to speech or generate music without developing underlying models.

Speech Synthesis, Audio Processing, Voice AI, Deep Learning Research, Neural Audio

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 15 / 25


Stars: 442

Forks

Language: Python

License: MIT

Higher-rated alternatives

kan-bayashi/ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

fatchord/WaveRNN

WaveRNN Vocoder + TTS

shangeth/wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...

rishikksh20/iSTFTNet-pytorch

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...

seungwonpark/melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)
