modelscope/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications such as text-to-speech synthesis and music generation.
This toolkit helps you process speech and audio data efficiently. It takes raw audio files (such as WAVs) and converts them into compact "audio codes", which can then be used to reconstruct the original audio or to generate new speech. This is ideal for researchers and developers working on advanced audio applications such as text-to-speech systems or music generation.
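The encode-to-codes and decode-from-codes idea can be illustrated with a toy vector quantizer. This is only a sketch of the general technique and does not use FunCodec's API: the codebook values, the two-sample frame length, and the function names `encode` and `decode` are all hypothetical, and a real neural codec learns its codebooks from data rather than fixing them by hand.

```python
from typing import List, Sequence

# Hypothetical codebook: 4 entries, each a 2-sample frame.
# A real codec learns these vectors; they are hard-coded here for illustration.
CODEBOOK: List[List[float]] = [
    [0.0, 0.0],
    [0.5, 0.5],
    [-0.5, -0.5],
    [0.5, -0.5],
]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(samples: Sequence[float], codebook: List[List[float]] = CODEBOOK) -> List[int]:
    """Split samples into frames and map each frame to its nearest codebook index."""
    frame_len = len(codebook[0])
    codes = []
    # Trailing samples that do not fill a whole frame are dropped in this sketch.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        nearest = min(
            range(len(codebook)),
            key=lambda i: squared_distance(frame, codebook[i]),
        )
        codes.append(nearest)
    return codes


def decode(codes: Sequence[int], codebook: List[List[float]] = CODEBOOK) -> List[float]:
    """Rebuild an approximate signal by concatenating the codebook frames."""
    out: List[float] = []
    for code in codes:
        out.extend(codebook[code])
    return out


samples = [0.1, -0.1, 0.4, 0.6, -0.6, -0.4, 0.45, -0.55]
codes = encode(samples)          # one small integer per 2-sample frame
reconstructed = decode(codes)    # lossy approximation of `samples`
```

Here eight floating-point samples are reduced to four small integers, and decoding recovers an approximation of the input rather than an exact copy, which is the trade-off any quantizing codec makes.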
442 stars. No commits in the last 6 months.
Use this if you need to compress, reproduce, or synthesize speech and audio, especially for developing AI models like text-to-speech.
Not ideal if you are an end-user simply looking to convert text to speech or generate music without developing underlying models.
Stars: 442
Forks: 33
Language: Python
License: MIT
Category:
Last pushed: Jan 25, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/modelscope/FunCodec"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
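The same request can be made from Python using only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the URL path follows the pattern category/owner/repository (inferred from the single example above), and that the endpoint returns JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is an assumed pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. If it is not, json.loads raises
    a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "modelscope", "FunCodec")` issues the same GET request as the curl command. Unauthenticated use is limited to 100 requests per day, so repeated calls are worth caching.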
Higher-rated alternatives
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch
fatchord/WaveRNN
WaveRNN Vocoder + TTS
shangeth/wavencoder
WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...
rishikksh20/iSTFTNet-pytorch
iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...
seungwonpark/melgan
MelGAN vocoder (compatible with NVIDIA/tacotron2)