guxm2021/MM_ALT

[MM 2022] MM-ALT: A Multimodal Automatic Lyric Transcription System (Oral, Top paper award)

22 / 100 (Experimental)

This project helps music producers, researchers, and archivists automatically transcribe sung lyrics from performances. It takes three inputs: audio recordings of singing, video of lip movements, and earbud sensor data. The output is a text transcription of the lyrics, which the extra modalities are meant to keep accurate even when instrumental music makes the vocals hard to distinguish. It suits anyone who needs precise lyric data from multimodal sources.

No commits in the last 6 months.

Use this if you need to accurately transcribe lyrics from singing performances where traditional audio-only methods struggle due to accompanying music.

Not ideal if you only have audio data for lyric transcription, or if you need to transcribe spoken word rather than sung lyrics.

music-transcription lyric-analysis vocal-performance music-information-retrieval multimedia-content-analysis
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 0 / 25


Stars: 21
Forks: (not shown)
Language: Python
License: Apache-2.0
Last pushed: Mar 16, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/guxm2021/MM_ALT"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
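
The same request can be made from Python. The following is a minimal sketch using only the standard library. It rests on assumptions: the helper names build_quality_url and fetch_quality are hypothetical, the category/owner/repo URL pattern is inferred from the single curl example above, and the response is assumed to be JSON. The response fields are not documented here, so the sketch returns the parsed body as-is without naming any of them.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path taken from the curl example; everything after it is assumed
# to follow a <category>/<owner>/<repo> pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the quality endpoint URL (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(url, opener=urlopen, timeout=10):
    """Fetch the URL and parse the body as JSON (assumed response format).

    `opener` is injectable so the function can be exercised without
    touching the network.
    """
    request = Request(url, headers={"Accept": "application/json"})
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To use it, call fetch_quality(build_quality_url("transformers", "guxm2021", "MM_ALT")) and inspect the returned object. With no key, the 100 requests/day limit applies, so callers that poll may want to cache the result.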