harmlessman/PAFTS
PAFTS : Library That Preprocessing Audio For TTS.
This tool helps content creators, educators, or researchers prepare audio for text-to-speech (TTS) systems. It takes raw audio files, cleans them by removing background noise and music, separates individual speakers, and then transcribes their speech into text. The output is a collection of clean, speaker-separated audio clips and a corresponding JSON file with their transcriptions, ready for TTS training.
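The workflow described above (clean, separate speakers, transcribe, write JSON) can be pictured with a short sketch. It does not use the PAFTS API: the function names, the stub stages, and the record layout are all hypothetical, chosen only to show the stage order and the kind of per-clip record a TTS training set typically needs.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class Clip:
    """One speaker-separated, transcribed clip (hypothetical layout)."""
    audio_path: str
    speaker: str
    text: str


def remove_noise(path: Path) -> Path:
    # Stub (assumption): a real stage would write a denoised, music-free copy.
    return path


def split_speakers(path: Path) -> list[tuple[Path, str]]:
    # Stub (assumption): a real stage would cut one clip per speaker turn.
    return [(path, "speaker_0")]


def transcribe(path: Path) -> str:
    # Stub (assumption): a real stage would run a speech-to-text model.
    return ""


def build_manifest(raw_files: list[Path]) -> list[dict]:
    """Run the three stages in order and collect one record per clip."""
    records = []
    for raw in raw_files:
        cleaned = remove_noise(raw)
        for clip_path, speaker in split_speakers(cleaned):
            clip = Clip(clip_path.as_posix(), speaker, transcribe(clip_path))
            records.append(asdict(clip))
    return records


if __name__ == "__main__":
    manifest = build_manifest([Path("raw/interview.wav")])
    print(json.dumps(manifest, indent=2))
```

Replacing each stub with a real denoising, speaker-separation, or speech-to-text step leaves `build_manifest` unchanged, which is why the stages are kept as separate functions here.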
No commits in the last 6 months. Available on PyPI.
Use this if you need to process large batches of raw audio recordings containing multiple speakers and background noise into a clean, transcribed dataset for text-to-speech model training.
Not ideal if you only need basic audio trimming or single-speaker transcription, or if your audio is already perfectly clean and separated.
Stars: 27
Forks: 5
Language: Python
License: MIT
Last pushed: Nov 15, 2024
Commits (30d): 0
Dependencies: 25
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/harmlessman/PAFTS"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
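The same request can be made from Python with the standard library. This is a minimal sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, the category/owner/repo path layout is inferred from the single curl example above, and the response is returned as raw text because its format is not documented here.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str,
                  timeout: float = 10.0) -> str:
    """Send an unauthenticated GET request and return the body as text."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("voice-ai", "harmlessman", "PAFTS")` requests the same URL as the curl command; no key is sent, so the 100 requests/day limit applies.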
Higher-rated alternatives
nateshmbhat/pyttsx3
Offline Text To Speech synthesis for python
KoljaB/RealtimeTTS
Converts text to speech in realtime
pndurette/gTTS
Python library and CLI tool to interface with Google Translate's text-to-speech API
n1teshy/yapper-tts
offline text to speech and free SOTA LLM APIs to let your programs speak to you
dputhier/pygtftk
A python package and a set of shell commands to handle GTF files