neocl/speach

🐍🍑 Python 3 library for managing, annotating, and converting natural language corpuses using popular formats (CoNLL, ELAN, Praat, CSV, JSON, SQLite, VTT, Audacity, TTL, TIG, ISF, etc.)

/ 100

Emerging

This tool helps researchers, linguists, and others working with spoken-language data manage and analyze audio/video recordings and their transcriptions. It reads transcription formats such as ELAN (.eaf), Praat, and VTT, along with the corresponding media files, and transforms, merges, or converts the data into formats such as CSV, JSON, or SQLite. It is designed for professionals who collect and annotate speech for research or analysis.
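To make the kind of conversion described above concrete, here is a minimal sketch that turns WebVTT cues into CSV rows. It uses only the Python standard library and does not use or reproduce the speach API; the names `vtt_to_rows`, `rows_to_csv`, and `SAMPLE_VTT` are hypothetical and exist only for this illustration. It also assumes simple cues without styling or settings blocks.

```python
import csv
import io
import re

# Matches a WebVTT cue timing line such as "00:00:01.000 --> 00:00:04.000".
# The hours component is optional in WebVTT, so it is optional here too.
CUE_TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def vtt_to_rows(vtt_text):
    """Return a (start, end, text) tuple for each cue in a WebVTT string."""
    rows = []
    # Blocks (header, cues) are separated by blank lines.
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIMING.match(line)
            if match:
                # Everything after the timing line is the cue text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                rows.append((match.group(1), match.group(2), text))
                break
    return rows


def rows_to_csv(rows):
    """Serialize cue rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["start", "end", "text"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data, not taken from any real corpus.
SAMPLE_VTT = """WEBVTT

00:00:01.000 --> 00:00:04.000
Hello there.

00:00:04.500 --> 00:00:06.250
How are you
today?
"""

if __name__ == "__main__":
    print(rows_to_csv(vtt_to_rows(SAMPLE_VTT)))
```

A dedicated library would typically also handle cue identifiers, cue settings, and the other formats listed above; this sketch covers only the plain timing-plus-text case.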

No commits in the last 6 months. Available on PyPI.

Use this if you need to standardize, convert, or preprocess your audio, video, and transcription files from diverse formats for linguistic analysis or data management.

Not ideal if you're looking for an interactive transcription tool or a simple audio editor for non-linguistic purposes.

linguistic-research discourse-analysis corpus-management qualitative-data-analysis speech-transcription

Stale 6m No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 25 / 25

Community 16 / 25


Language: Python

License: MIT

Higher-rated alternatives

chrismattmann/tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called...

sloria/TextBlob

Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase...

cltk/cltk

The Classical Language Toolkit

allenai/scispacy

A full spaCy pipeline and models for scientific/biomedical documents.

wi2trier/cbrkit

Customizable Case-Based Reasoning (CBR) toolkit for Python with a built-in API and CLI.
