pragyak412/Improving-Voice-Separation-by-Incorporating-End-To-End-Speech-Recognition
Implementing the paper -
This project helps audio engineers and researchers improve the clarity of individual voices in mixed audio recordings, such as interviews or conference calls. You feed it an audio file containing multiple overlapping speakers, and it outputs separated audio tracks, with each track isolating a single speaker's voice. This is valuable for professionals working with noisy or complex audio environments who need to analyze or process individual speech.
No commits in the last 6 months.
Use this if you need to cleanly isolate individual speech from mixed audio sources to improve transcription accuracy, speaker diarization, or other downstream audio analysis tasks.
Not ideal if your primary goal is simply to transcribe clear, single-speaker audio; this project targets the harder task of separating overlapping voices.
Stars: 19
Forks: 2
Language: Python
License: not specified
Category: (none listed)
Last pushed: Jul 06, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/pragyak412/Improving-Voice-Separation-by-Incorporating-End-To-End-Speech-Recognition"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
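The same request can be made from Python, the repository's own language. The sketch below uses only the standard library and reuses the URL shape from the curl example. The helper names `build_quality_url` and `fetch_quality` are hypothetical. Treating the response as JSON is an assumption, since the page does not document the response format.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path copied from the curl example; everything else is illustrative.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example."""
    # Percent-encode each path segment so unusual characters stay valid.
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{API_BASE}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and parse the body.

    Assumption: the body is JSON; the source page does not say so.
    """
    request = Request(
        build_quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "pragyak412", "Improving-Voice-Separation-by-Incorporating-End-To-End-Speech-Recognition")` issues the same request as the curl command. Unauthenticated use is limited to 100 requests/day, so it is worth caching the result rather than polling.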
Higher-rated alternatives
TensorSpeech/TensorFlowASR
TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....
dangvansam/viet-asr
VietASR - Vietnamese Automatic Speech Recognition
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
xinjli/allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
srvk/eesen
The official repository of the Eesen project