YuanGongND/psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
This project helps audio engineers and researchers automatically identify and label sounds in audio recordings, including very long ones. It takes raw audio files as input and outputs tags describing the sounds present. Anyone who needs to categorize or analyze sound events across large audio datasets can use it.
149 stars. No commits in the last 6 months.
Use this if you need to accurately identify and tag sound events in extensive audio collections, such as environmental recordings or broadcast archives, and want state-of-the-art performance with efficient resource use.
Not ideal if your primary goal is to analyze human speech, musical content, or very specific, niche audio events that require highly specialized models.
Stars: 149
Forks: 15
Language: Python
License: BSD-3-Clause
Category:
Last pushed: Jul 13, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/YuanGongND/psla"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
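The same request can be made from Python. The sketch below is a minimal illustration using only the standard library. It assumes the endpoint returns JSON and that the path follows the category/owner/repository layout seen in the single URL above; neither is confirmed by this page. The names `quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request

# Base path taken from the curl example above; everything after it is
# assumed to be "<category>/<owner>/<repo>".
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository such as 'YuanGongND/psla'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError (json.JSONDecodeError).
    """
    url = quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "YuanGongND/psla")` would issue the same GET request as the curl command. Without a key, requests count against the 100 per day limit noted above.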
Higher-rated alternatives
iver56/audiomentations
A Python library for audio data augmentation. Useful for making audio ML models work well in the...
Rikorose/DeepFilterNet
Noise suppression using deep filtering
torchsynth/torchsynth
A GPU-optional modular synthesizer in pytorch, 16200x faster than realtime, for audio ML researchers.
marl/openl3
OpenL3: Open-source deep audio and image embeddings
archinetai/audio-data-pytorch
A collection of useful audio datasets and transforms for PyTorch.