neocl/speach
Python 3 library for managing, annotating, and converting natural language corpora using popular formats (CoNLL, ELAN, Praat, CSV, JSON, SQLite, VTT, Audacity, TTL, TIG, ISF, etc.)
This tool helps researchers, linguists, and anyone else working with spoken language data manage and analyze audio/video recordings and their transcriptions. It reads transcription formats such as ELAN .eaf, Praat, and VTT, along with the corresponding media files, and outputs the data transformed, merged, or converted into formats such as CSV, JSON, or SQLite. It is designed for professionals who collect and annotate speech for research or analysis.
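To make the kind of conversion described above concrete, here is a minimal sketch that turns a small WebVTT transcript into CSV using only the Python standard library. It does not use speach's own API; the function names `vtt_to_rows` and `rows_to_csv` and the sample transcript are hypothetical, and the parser assumes cue timestamps in the full `hh:mm:ss.ttt` form.

```python
import csv
import io
import re

# Matches a cue timing line such as "00:00:01.000 --> 00:00:04.000".
# Assumption: timestamps always include the hours field.
CUE_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3})\s+-->\s+(\d{2}:\d{2}:\d{2}\.\d{3})"
)


def vtt_to_rows(vtt_text):
    """Return a list of (start, end, text) tuples, one per cue."""
    rows = []
    # Blocks (header, cues) are separated by blank lines.
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIMING.match(line.strip())
            if match:
                # Everything after the timing line is the cue text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                rows.append((match.group(1), match.group(2), text))
                break
    return rows


def rows_to_csv(rows):
    """Serialize cue rows to a CSV string with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["start", "end", "text"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample transcript, for illustration only.
SAMPLE_VTT = """WEBVTT

00:00:01.000 --> 00:00:04.000
Hello there.

00:00:05.000 --> 00:00:07.500
How are you?
"""

if __name__ == "__main__":
    print(rows_to_csv(vtt_to_rows(SAMPLE_VTT)))
```

A real corpus would also need to handle cue identifiers, cue settings, and short-form timestamps, which is the sort of detail a dedicated library takes care of.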
No commits in the last 6 months. Available on PyPI.
Use this if you need to standardize, convert, or preprocess your audio, video, and transcription files from diverse formats for linguistic analysis or data management.
Not ideal if you're looking for an interactive transcription tool or a simple audio editor for non-linguistic purposes.
Stars: 21
Forks: 6
Language: Python
License: MIT
Category:
Last pushed: Jun 26, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/neocl/speach"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
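The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl call above; the helper names `build_quality_url` and `fetch_quality` are hypothetical, treating the `nlp` path segment as a category is an assumption, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return API_BASE + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the data for one repository.

    Assumption: the endpoint returns a JSON body; adjust the decoding
    if the actual response format differs.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request and counts toward the daily limit):
# data = fetch_quality("nlp", "neocl", "speach")
```

Keeping URL construction separate from the network call makes it possible to check the URL without spending any of the daily request quota.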
Higher-rated alternatives
chrismattmann/tika-python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called...
sloria/TextBlob
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase...
cltk/cltk
The Classical Language Toolkit
allenai/scispacy
A full spaCy pipeline and models for scientific/biomedical documents.
wi2trier/cbrkit
Customizable Case-Based Reasoning (CBR) toolkit for Python with a built-in API and CLI.