tiefenauer/ip9
Code for my master's thesis at FHNW
This project helps linguists, speech researchers, and educators match words or phrases in a written text to their locations in an audio recording. Given an audio file and its full transcription, it outputs timing information for each word, indicating when that word is spoken. This is useful for analyzing speech patterns or creating synchronized captions.
No commits in the last 6 months.
Use this if you need to determine the exact start and end times of words or phrases in an audio recording, given its complete text transcript.
Not ideal if you need a solution for operating systems other than Linux or require a pre-built, easy-to-install application without technical setup.
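As an illustration of how per-word timing output could be used for synchronized captions, here is a minimal sketch that groups word timings into SRT caption blocks. The `WordTiming` record, the `to_srt` helper, and the timing values are hypothetical and invented for this example; they are not the project's actual API or output format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordTiming:
    """Hypothetical record: one aligned word with start/end times in seconds."""
    word: str
    start: float
    end: float


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(words: List[WordTiming], words_per_caption: int = 4) -> str:
    """Group consecutive words into fixed-size captions and render them as SRT."""
    blocks = []
    for i in range(0, len(words), words_per_caption):
        chunk = words[i:i + words_per_caption]
        text = " ".join(w.word for w in chunk)
        # A caption spans from its first word's start to its last word's end.
        start = format_timestamp(chunk[0].start)
        end = format_timestamp(chunk[-1].end)
        blocks.append(f"{len(blocks) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


# Made-up timings, for illustration only.
example = [
    WordTiming("hello", 0.00, 0.42),
    WordTiming("world", 0.50, 0.95),
    WordTiming("this", 1.20, 1.40),
    WordTiming("is", 1.45, 1.55),
    WordTiming("a", 1.60, 1.65),
    WordTiming("test", 1.70, 2.10),
]
print(to_srt(example))
```

Grouping by a fixed word count keeps the sketch short; a real captioning step would more likely split on pauses or punctuation.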
Stars: 7
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Jul 29, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/tiefenauer/ip9"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
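The same request could be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the path follows the pattern category/owner/repository (inferred from the single URL above), and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (assumed pattern: /quality/<category>/<owner>/<repo>)."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Only the URL is printed here; no network request is made.
print(quality_url("voice-ai", "tiefenauer", "ip9"))
```

Calling `fetch_quality("voice-ai", "tiefenauer", "ip9")` would perform the actual request and count against the daily limit.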
Higher-rated alternatives
Picovoice/rhino: On-device Speech-to-Intent engine powered by deep learning
yandexdataschool/speech_course: YSDA course in Speech Processing.
MycroftAI/adapt: Adapt Intent Parser
Picovoice/speech-to-intent-benchmark: benchmark for Speech-to-Intent engines
IBM/BigLittleNet: Official repository for Big-Little Net