CanCLID/sentences

Cantonese conversational corpus


Experimental

This project helps people gather and clean Cantonese sentences for speech recognition. You provide raw Cantonese text, and it guides you through turning that text into clean, standardized sentences suitable for training AI models. It is aimed at language enthusiasts, researchers, and AI developers building Cantonese voice applications.

No commits in the last 6 months.

Use this if you need high-quality, standardized Cantonese sentence data to train speech recognition systems or other natural language processing tools.

Not ideal if you need a dataset that includes mixed English and Cantonese, numbers, abbreviations, or extensive punctuation.
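As a rough illustration of what "clean, standardized" could mean given the exclusions above, the sketch below keeps only sentences made of Chinese characters plus a small set of full-width punctuation marks, and rejects anything with Latin letters or digits. The function names (`is_clean_sentence`, `filter_sentences`), the character ranges, and the allowed punctuation are assumptions for illustration, not the project's actual rules.

```python
import re

# Assumption: CJK Unified Ideographs (Extension A and the basic block) count as valid text.
_CJK = r"\u3400-\u4dbf\u4e00-\u9fff"
# Assumption: only these full-width sentence punctuation marks are tolerated.
_ALLOWED_PUNCT = "，。？！"
# A sentence is "clean" if it consists solely of the characters above.
_CLEAN = re.compile(f"[{_CJK}{_ALLOWED_PUNCT}]+")


def is_clean_sentence(text: str) -> bool:
    """Return True if text has no Latin letters, digits, or other symbols."""
    return bool(_CLEAN.fullmatch(text.strip()))


def filter_sentences(lines):
    """Strip surrounding whitespace and keep only the clean sentences."""
    return [s.strip() for s in lines if is_clean_sentence(s)]


if __name__ == "__main__":
    raw = ["你食咗飯未？", "我買咗3個apple", " 佢喺邊度？ "]
    for sentence in filter_sentences(raw):
        print(sentence)
```

A whitelist is used here rather than a blacklist so that unexpected symbols are rejected by default; a real pipeline would likely need a broader character set and a normalization step before filtering.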

Tags: Cantonese-language, speech-recognition, natural-language-processing, linguistic-data, AI-training-data

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 9 / 25


Language: Jupyter Notebook

License: none

Higher-rated alternatives

wx-chevalier/NLP-Notes

AI and Deep Learning in Practice: Natural Language Processing volume

wx-chevalier/DeepLearning-Notes

AI and Deep Learning in Practice: Deep Learning volume

TingFree/NLPer-Arsenal

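A collection of NLP competition strategy implementations, baselines for various tasks, competition experience posts (current, past, and practice competitions), NLP conference dates, popular media channels, GPU recommendations, and more; continuously updated.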

hscspring/All4NLP

All For NLP, especially Chinese.

duoergun0729/nlp

Produced by 兜哥 (Dou Ge): an open-source introductory book on NLP
