NVIDIA-NeMo/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
This project helps AI researchers and PyTorch developers build, customize, and deploy advanced speech and multimodal AI models. It takes speech and text data as input and lets users create custom automatic speech recognition (ASR), text-to-speech (TTS), and speech-enabled large language models (LLMs). Its primary users are machine learning engineers and researchers developing conversational AI applications.
16,894 stars. Actively maintained with 51 commits in the last 30 days.
Use this if you are an AI researcher or developer working with PyTorch and need a robust framework to build, customize, and deploy scalable speech and multimodal generative AI models.
Not ideal if you are looking for a pre-built, out-of-the-box solution without needing to customize or train models, or if you are not comfortable with deep learning development frameworks like PyTorch.
Stars: 16,894
Forks: 3,365
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 13, 2026
Commits (30d): 51
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/NVIDIA-NeMo/NeMo"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
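The same request can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single example URL above, the response is assumed to be JSON (the page does not document the response format), and the names `quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client. How an API key is passed is not described here, so the sketch covers only the keyless request.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name' (hypothetical helper)."""
    owner, name = repo.split("/", 1)
    parts = [category, owner, name]
    # Percent-encode each path segment separately so the slashes between them survive.
    return BASE_URL + "/" + "/".join(urllib.parse.quote(p, safe="") for p in parts)


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Send the GET request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("generative-ai", "NVIDIA-NeMo/NeMo")
```

With the keyless limit of 100 requests/day, it may be worth caching the decoded result locally instead of calling `fetch_quality` repeatedly for the same repository.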
Related tools
alexiglad/EBT
PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning
vlm-run/vlmrun-hub
A hub for various industry-specific schemas to be used with VLMs.
HyperGAI/HPT
HPT - Open Multimodal LLMs from HyperGAI
yash9439/Falcon-Local-AI-Model
Explore this GitHub repository housing 3 versions of Falcon code for text generation. Each...
bastien-muraccioli/svlr
SVLR: Scalable, Training-Free Visual Language Robotics: a modular multi-model framework for...