YuanGongND/ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

50 / 100 (Established)

This project provides an attention-based model for classifying audio recordings, such as environmental sounds, spoken commands, or musical genres. You provide audio data, and it outputs labels describing what the recording contains. It suits researchers and developers building systems that need to recognize and react to different kinds of audio.

1,432 stars. No commits in the last 6 months.

Use this if you need to accurately classify diverse audio inputs, like environmental sounds, speech commands, or musical snippets, and want to leverage state-of-the-art, attention-based models.

Not ideal if your primary goal is real-time audio synthesis, voice manipulation, or purely acoustic signal processing without a classification objective.

audio-classification sound-recognition speech-commands acoustic-analysis audio-event-detection
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 24 / 25

Stars: 1,432
Forks: 244
Language: Jupyter Notebook
License: BSD-3-Clause
Last pushed: May 21, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/YuanGongND/ast"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
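
The same request can be made from a script. The following is a minimal Python sketch that uses only the standard library. The endpoint URL is copied from the curl example above; the helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Request the quality data and decode it.

    Assumption: the server returns a JSON body. If it does not,
    json.load raises a ValueError and the raw response should be
    inspected instead.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("voice-ai", "YuanGongND", "ast")` issues the same request as the curl command. Without a key, the limit of 100 requests/day stated above applies, so it is worth caching the result rather than polling.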