guxm2021/MM_ALT

[MM 2022] MM-ALT: A Multimodal Automatic Lyric Transcription System (Oral, Top paper award)

Experimental

This project helps music producers, researchers, and archivists automatically transcribe sung lyrics from performances. It takes three inputs: audio recordings of singing, video of the singer's lip movements, and earbud sensor data. The output is a text transcription of the lyrics, designed to stay accurate even when instrumental music makes the vocals hard to distinguish. It suits anyone who needs lyric transcriptions from multimodal recordings.

No commits in the last 6 months.

Use this if you need to accurately transcribe lyrics from singing performances where traditional audio-only methods struggle due to accompanying music.

Not ideal if you only have audio data for lyric transcription, or if you need to transcribe spoken word rather than sung lyrics.

music-transcription lyric-analysis vocal-performance music-information-retrieval multimedia-content-analysis

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 0 / 25

Language: Python

License: Apache-2.0

Higher-rated alternatives

guxm2021/ALT_SpeechBrain

[ISMIR 2022] Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription

subhasis-ai/Hindi-ASR-Wav2Vec2

This repository demonstrates the development of a Hindi ASR model using transformers.

jvel07/wav2vec2_patho

Fine-tuning wav2vec2 for Pathological Speech Processing

hammaad2002/ASRAdversarialAttacks

An ASR (Automatic Speech Recognition) adversarial attack repository.

maximkm/DLA_ASR_HW

ASR pytorch project
