mravanelli/SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
SincNet identifies who is speaking in an audio recording by working directly on the raw waveform. You provide audio files, and during training its first layer learns a bank of band-pass filters tuned to the frequency ranges that best distinguish one speaker's voice from another. It suits researchers and engineers working on voice authentication or personalized voice interfaces.
1,235 stars. No commits in the last 6 months.
Use this if you need to build a system that can accurately identify individual speakers from raw audio recordings.
Not ideal if your primary goal is general speech-to-text transcription, as this tool is specifically designed for speaker identification.
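The filter bank mentioned in the description can be illustrated with a single band-pass filter built from sinc functions. The following is a minimal sketch, not the repository's code: it assumes a Hamming window, a 16 kHz sample rate, and a 251-tap kernel, and the names `sinc_bandpass` and `gain_at` are hypothetical. It uses only the standard library so the arithmetic stays visible.

```python
import cmath
import math


def sinc_bandpass(low_hz, high_hz, kernel_size=251, sample_rate=16000):
    """Return a Hamming-windowed band-pass FIR kernel set by two cutoffs (Hz)."""

    def lowpass_tap(cutoff, n):
        # Ideal low-pass impulse response; cutoff is in cycles per sample.
        if n == 0:
            return 2.0 * cutoff
        return math.sin(2.0 * math.pi * cutoff * n) / (math.pi * n)

    f1 = low_hz / sample_rate
    f2 = high_hz / sample_rate
    half = (kernel_size - 1) / 2
    kernel = []
    for i in range(kernel_size):
        n = i - half  # tap position relative to the kernel centre
        # The difference of two low-pass filters is a band-pass filter.
        tap = lowpass_tap(f2, n) - lowpass_tap(f1, n)
        # Hamming window to reduce ripple caused by truncating the sinc.
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (kernel_size - 1))
        kernel.append(tap * window)
    return kernel


def gain_at(kernel, freq_hz, sample_rate=16000):
    """Magnitude of the kernel's frequency response at one frequency."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    return abs(sum(h * cmath.exp(-1j * w * i) for i, h in enumerate(kernel)))


# Example cutoffs (chosen for illustration only).
kernel = sinc_bandpass(300.0, 3000.0)
```

Calling `gain_at(kernel, 1500.0)` should give a value close to 1 (inside the band), while `gain_at(kernel, 5000.0)` should be close to 0 (outside it). The point of the design is that each filter is described by just two numbers, its low and high cutoff, rather than by every tap of the kernel.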
Stars: 1,235
Forks: 270
Language: Python
License: MIT
Category:
Last pushed: Apr 28, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/mravanelli/SincNet"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
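The same request can be made from Python. This is a minimal sketch using only the standard library; the helper names `quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body, which the page does not state. How an API key would be passed is not documented here, so the sketch covers only the keyless case.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks"


def quality_url(repo):
    """Build the endpoint URL for an 'owner/name' repository slug."""
    owner, name = repo.split("/")
    return f"{BASE_URL}/{owner}/{name}"


def fetch_quality(repo, timeout=10):
    """Fetch the record for a repository (assumes the response is JSON)."""
    with urllib.request.urlopen(quality_url(repo), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

`fetch_quality("mravanelli/SincNet")` would issue the same GET request as the curl command; the structure of the returned data is not described on this page, so inspect it before relying on any field.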
Related frameworks
iver56/audiomentations
A Python library for audio data augmentation. Useful for making audio ML models work well in the...
Rikorose/DeepFilterNet
Noise suppression using deep filtering
torchsynth/torchsynth
A GPU-optional modular synthesizer in PyTorch, 16200x faster than realtime, for audio ML researchers.
marl/openl3
OpenL3: Open-source deep audio and image embeddings
archinetai/audio-data-pytorch
A collection of useful audio datasets and transforms for PyTorch.