yl4579/PitchExtractor
Deep Neural Pitch Extractor for Voice Conversion and TTS Training
This project helps developers train deep neural networks for voice conversion and text-to-speech. It processes audio recordings to extract fundamental frequency (F0, or pitch) contours, which are crucial for natural-sounding synthetic voices. It is intended for machine learning engineers and researchers working on speech synthesis and voice conversion.
147 stars. No commits in the last 6 months.
Use this if you are a machine learning engineer building a voice conversion system or a text-to-speech model and need to accurately extract pitch contours from audio data to train your neural network.
Not ideal if you are an end-user looking for a ready-to-use application to convert voices or generate speech; this is a component for building such systems.
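To make "extracting fundamental frequency" concrete, here is a minimal sketch of a classical autocorrelation pitch estimator for a single audio frame. It is not the neural method this repository trains; it only illustrates the quantity such a network predicts. The function name `estimate_f0` and the `fmin`/`fmax` search bounds are assumptions chosen for illustration.

```python
import math


def estimate_f0(frame, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Classical baseline for illustration only, not the repository's neural model.
    Returns 0.0 when no positive autocorrelation peak is found (e.g. silence).
    """
    # Search only lags whose period falls inside the [fmin, fmax] range.
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)

    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame with itself at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    return sample_rate / best_lag if best_lag else 0.0


# Synthetic test signal: a 200 Hz sine sampled at 16 kHz (hypothetical values).
sample_rate = 16000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1024)]
f0 = estimate_f0(frame, sample_rate)  # expected to land close to 200 Hz
```

Running such an estimator over successive frames gives a pitch contour. Neural extractors are typically preferred for training data because simple autocorrelation is prone to octave errors and degrades on noisy or unvoiced speech.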
Stars: 147
Forks: 34
Language: Python
License: MIT
Category:
Last pushed: Aug 22, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/yl4579/PitchExtractor"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
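The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single URL shown above, the response is assumed to be JSON, and the helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to follow a category/owner/repo layout.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (assumed path layout)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the repository data, assuming the endpoint returns JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("voice-ai", "yl4579", "PitchExtractor")` issues the same GET request as the curl command. The fields in the returned data are not documented here, so inspect the result before relying on any particular key.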
Higher-rated alternatives
bshall/Tacotron
A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Kyubyong/dc_tts
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
DemisEom/SpecAugment
An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation
Kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model