alibaba-damo-academy/SpokenNLP
A wide variety of research projects developed by the SpokenNLP team of Speech Lab, Alibaba Group.
This collection of projects helps analyze spoken language and long documents. It takes lecture videos, spoken dialogue, or long text as input and produces outputs such as topic boundaries, keyphrases, or segmented documents. It is aimed at researchers and data scientists who work with large volumes of conversational data or long-form text.
Use this if you need to extract meaningful segments or information from recordings of speeches or meetings, or from long written documents.
Not ideal if you are looking for a simple, out-of-the-box application for general speech-to-text transcription or basic text summarization.
Stars: 124
Forks: 12
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 11, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/alibaba-damo-academy/SpokenNLP"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
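The same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single curl example above, and the response is only assumed to be JSON (the page does not document its format), so the code falls back to returning the raw text.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository.

    Assumes the path is <category>/<owner>/<repo>, as in the one example shown.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data for a repository without an API key.

    Returns parsed JSON if the body is valid JSON (an assumption),
    otherwise the raw response text.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("nlp", "alibaba-damo-academy/SpokenNLP")
```

Because the keyless tier is limited to 100 requests per day, it is probably worth caching results locally rather than calling `fetch_quality` repeatedly for the same repository.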
Higher-rated alternatives
nltk/nltk
NLTK Source
explosion/spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
undertheseanlp/underthesea
Underthesea - Vietnamese NLP Toolkit
stanfordnlp/stanza
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many...
flairNLP/flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)