EMUNES/Auto-Subtitle-File-Generation
Generate subtitle files with timelines in an automatic way.
Archived
Leverages deep learning-based Sound Event Detection with PyTorch to identify speech segments and their precise timelines from audio/video inputs, outputting subtitles in .ass or .srt formats. The architecture uses pretrained PANNs audio neural networks combined with weak-label training, and integrates the Vosk API for offline speech-to-text recognition to populate subtitle content. Includes an open-source training pipeline with automated dataset generation from existing videos and subtitle files, enabling custom model fine-tuning.
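The step from detected speech segments to a subtitle file can be sketched as follows. This is a minimal illustration, not the repository's own code: it assumes the detector yields one speech probability per fixed-length frame, and the names `frames_to_segments`, `segments_to_srt`, `frame_seconds`, and `threshold` are hypothetical. Only the .srt block layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) follows the standard format.

```python
from typing import List, Sequence, Tuple


def frames_to_segments(
    probs: Sequence[float],
    frame_seconds: float = 0.01,  # assumed frame hop, not taken from the repo
    threshold: float = 0.5,       # assumed decision threshold
) -> List[Tuple[float, float]]:
    """Turn per-frame speech probabilities into (start, end) times in seconds."""
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # speech begins at this frame
        elif p < threshold and start is not None:
            segments.append((start * frame_seconds, i * frame_seconds))
            start = None
    if start is not None:
        # speech runs until the end of the clip
        segments.append((start * frame_seconds, len(probs) * frame_seconds))
    return segments


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Sequence[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) triples as the contents of an .srt file."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

In a full pipeline the text for each segment would come from the speech-to-text stage; here it is simply passed in as a string alongside the start and end times.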
No commits in the last 6 months.
Stars
62
Forks
17
Language
Jupyter Notebook
License
Apache-2.0
Category
Last pushed
Aug 10, 2022
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/EMUNES/Auto-Subtitle-File-Generation"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
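The same request can be made from Python with the standard library. This is a sketch under assumptions: the path layout (category, owner, repository) is inferred from the single URL shown above, the response is assumed to be JSON, and the helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the three-part path layout is an assumption."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "EMUNES", "Auto-Subtitle-File-Generation")` would issue the same GET request as the curl command and count against the daily request limit.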
Related frameworks
zerounintezaragler/whisper_python
Whisper Python: getting text from an audio file no longer requires manual conversion, no...
Dicklesworthstone/franken_whisper
Agent-first Rust ASR orchestration stack: Bayesian backend routing across...
gopiashokan/Voice-AI-Automatic-Speech-Recognition
Developed a Marathi speech-to-text application using the Hugging Face whisper ASR models....
Donny-Hikari/realtime-transcribe
Transcribe your speech or the audio playing on your computer with Whisper in realtime, and show...
papi-el/theinsyeds-whisper-analysis
Analyze OpenAI's Whisper on Mac M4 with performance benchmarks and quality assessments. Discover...