minguinho26/Prefix_AAC_ICASSP2023
Official implementation of "Prefix tuning for Automated Audio Captioning" (ICASSP 2023)
This tool helps researchers and audio content creators automatically generate descriptive text captions for audio recordings. You input an audio file, and it outputs a human-readable sentence or phrase describing the sounds within. It's designed for anyone working with large collections of audio who needs to quickly understand or catalog their content without manually listening to every file.
No commits in the last 6 months.
Use this if you need to automatically create textual descriptions for sound events or environmental audio recordings.
Not ideal if you need to caption spoken dialogue or music compositions, as this focuses on general sound events.
Stars: 31
Forks: 2
Language: Jupyter Notebook
License: —
Category:
Last pushed: Dec 06, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/minguinho26/Prefix_AAC_ICASSP2023"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
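The same request can be made from a script. The sketch below mirrors the curl call above using only the Python standard library. It assumes that the path pattern `/api/v1/quality/<category>/<owner>/<repo>` generalizes beyond this one repository and that the response body is JSON; neither is confirmed by this page. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble an endpoint URL with the same shape as the curl example.

    Assumption: other repositories follow the same path pattern.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a keyless GET request and decode the body.

    Assumption: the endpoint returns JSON. Keyless access is limited to
    100 requests/day according to this page, so call sparingly.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the URL; no network request is made here.
    print(build_quality_url("ml-frameworks", "minguinho26", "Prefix_AAC_ICASSP2023"))
```

Calling `fetch_quality("ml-frameworks", "minguinho26", "Prefix_AAC_ICASSP2023")` would issue the request itself; the fields of the returned object are not documented here, so inspect the result before relying on any key.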
Higher-rated alternatives
iver56/audiomentations
A Python library for audio data augmentation. Useful for making audio ML models work well in the...
Rikorose/DeepFilterNet
Noise suppression using deep filtering
torchsynth/torchsynth
A GPU-optional modular synthesizer in PyTorch, 16200x faster than realtime, for audio ML researchers.
marl/openl3
OpenL3: Open-source deep audio and image embeddings
archinetai/audio-data-pytorch
A collection of useful audio datasets and transforms for PyTorch.