haoxiangsnr/llm-tse

Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)

Score: 21 / 100 (Experimental)

This helps you isolate a specific speaker's voice from a noisy audio recording, much like how people focus on one conversation in a crowded room. You provide an audio recording with multiple speakers and a text description of the speaker you want to hear. The output is a clear audio track of only that speaker's voice. This is for anyone who needs to extract specific voices from complex sound environments.

No commits in the last 6 months.

Use this if you need to cleanly separate one person's voice from a chaotic audio recording using only a text description, without needing a prior voice sample of that person.

Not ideal if you already have a high-quality, pre-recorded voice sample (voiceprint) of the target speaker you wish to extract.

audio-transcription meeting-minutes forensic-audio-analysis broadcast-editing podcast-production
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 8 / 25
Community 5 / 25
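
The four sub-scores above add up to the headline figure of 21 / 100. The short check below makes that arithmetic explicit; it assumes the overall score is a plain sum of the four categories, which matches the numbers shown here but is not stated on the page.

```python
# Sub-scores as listed on this page, each out of 25.
# Assumption: the overall score is the unweighted sum of the four categories.
subscores = {"Maintenance": 0, "Adoption": 8, "Maturity": 8, "Community": 5}
MAX_PER_CATEGORY = 25

total = sum(subscores.values())
maximum = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {maximum}")  # prints "21 / 100"
```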

Stars: 42
Forks: 2
Language: JavaScript
License: None
Last pushed: Oct 13, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/haoxiangsnr/llm-tse"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
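
For scripted access, the same request can be made from Python's standard library. This is a minimal sketch rather than an official client: the helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path is <category>/<owner>/<repo>, inferred from
    the one example URL; other layouts may exist.
    """
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data and decode it.

    Assumption: the body is JSON. No API key is sent, so the
    100 requests/day anonymous limit applies.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to hit the network.
    print(quality_url("llm-tools", "haoxiangsnr", "llm-tse"))
```

Keeping URL construction separate from the network call makes it easy to check the request target without spending any of the daily quota.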