guxm2021/MM_ALT
[MM 2022] MM-ALT: A Multimodal Automatic Lyric Transcription System (Oral, Top paper award)
This project helps music producers, researchers, and archivists automatically transcribe sung lyrics from performances. It takes three inputs: audio recordings of the singing, video of lip movements, and earbud sensor data. The output is a text transcription of the lyrics that is designed to stay accurate even when instrumental music makes the vocals hard to distinguish. It suits anyone who needs precise lyric data from multimodal sources.
No commits in the last 6 months.
Use this if you need to accurately transcribe lyrics from singing performances where traditional audio-only methods struggle due to accompanying music.
Not ideal if you only have audio data for lyric transcription, or if you need to transcribe spoken word rather than sung lyrics.
Stars: 21
Forks: —
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 16, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/guxm2021/MM_ALT"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
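The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the `transformers` path segment is simply copied from the curl example, and the code assumes the endpoint returns JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "transformers") -> str:
    """Build the endpoint URL in the same shape as the curl example.

    The default category mirrors the curl example; other values are untested.
    """
    parts = (quote(category, safe=""), quote(owner, safe=""), quote(repo, safe=""))
    return f"{API_BASE}/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, category: str = "transformers",
                  timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. If it is not, json.loads raises
    a ValueError and the caller should inspect the raw response instead.
    """
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to make the network request.
    print(build_quality_url("guxm2021", "MM_ALT"))
```

Calling `fetch_quality("guxm2021", "MM_ALT")` performs the request. Unauthenticated use counts against the 100 requests/day limit mentioned above, so caching the result locally is a reasonable precaution.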
Higher-rated alternatives
guxm2021/ALT_SpeechBrain
[ISMIR 2022] Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
subhasis-ai/Hindi-ASR-Wav2Vec2
This repository demonstrates development of Hindi ASR model using transformers.
jvel07/wav2vec2_patho
Fine-tuning wav2vec2 for Pathological Speech Processing
hammaad2002/ASRAdversarialAttacks
An ASR (Automatic Speech Recognition) adversarial attack repository.
maximkm/DLA_ASR_HW
ASR pytorch project