reazon-research/ReazonSpeech
Massive open Japanese speech corpus
This project provides Japanese speech recognition: it takes audio recordings, including noisy ones, and outputs text transcripts. It suits researchers and developers building applications that need to understand spoken Japanese, particularly in human-robot interaction or other noisy environments.
Use this if you need to accurately transcribe Japanese audio, even if it's mixed with English or contains background noise, for applications like voice assistants or robotics.
Not ideal if your primary need is for speech recognition in languages other than Japanese, or if you require an off-the-shelf, no-code transcription service.
Stars: 373
Forks: 34
Language: Python
License: Apache-2.0
Last pushed: Jan 19, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/reazon-research/ReazonSpeech"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
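The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns JSON (the page does not show the response format) and that the path segments after `/quality/` are a category followed by the `owner/repo` slug, as in the curl example. The helper names `build_url` and `fetch_quality` are hypothetical, and no API-key handling is shown because the page does not say how a key is passed.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, repo: str) -> str:
    """Build the request URL for a repository.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    single curl example on this page.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository and decode it.

    Assumption: the response body is JSON; the page does not document
    the response schema, so the decoded value is returned unchanged.
    """
    url = build_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "reazon-research/ReazonSpeech")` requests the same URL as the curl command. With the keyless limit of 100 requests/day, it is worth caching results locally rather than re-fetching on every run.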
Related tools
ynop/audiomate
Python library for handling audio datasets.
common-voice/cv-dataset
Metadata and versioning details for the Common Voice dataset
davidmartinrius/speech-dataset-generator
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset...
EgorLakomkin/KTSpeechCrawler
Automatically constructing corpus for automatic speech recognition from YouTube videos
coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies