ParCIS/Chimera

Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.

Emerging

This project helps machine learning researchers and engineers train very large neural networks more efficiently. It takes a prepared dataset (for example, Wikipedia text for BERT models) and uses bidirectional pipeline parallelism to distribute the training workload across multiple GPUs. The result is a large-scale model trained in less time, ready for deployment or further research. It is aimed at professionals working with state-of-the-art deep learning models.
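To make "bidirectional" concrete, here is a minimal sketch of the placement idea: two pipelines run over the same workers in opposite directions, so each worker hosts one stage from each. The function names below are hypothetical and used only for illustration; this is not the project's API, and it assumes an even number of stages and micro-batches.

```python
from __future__ import annotations


def stage_placement(num_stages: int) -> dict[str, list[int]]:
    """Map each stage index to a worker for both pipeline directions.

    Hypothetical helper for illustration only, not part of the Chimera code base.
    """
    # "Down" pipeline: stage i runs on worker i.
    down = list(range(num_stages))
    # "Up" pipeline: stage i runs on worker (num_stages - 1 - i).
    up = [num_stages - 1 - i for i in range(num_stages)]
    return {"down": down, "up": up}


def stages_on_worker(worker: int, num_stages: int) -> tuple[int, int]:
    """Return the (down-pipeline stage, up-pipeline stage) hosted by one worker."""
    return (worker, num_stages - 1 - worker)


def split_micro_batches(num_micro_batches: int) -> dict[str, list[int]]:
    """Split micro-batch indices evenly between the two directions (assumed even count)."""
    half = num_micro_batches // 2
    return {
        "down": list(range(half)),
        "up": list(range(half, num_micro_batches)),
    }


if __name__ == "__main__":
    print(stage_placement(4))
    print(stages_on_worker(1, 4))
    print(split_micro_batches(4))
```

Because the two directions start from opposite ends of the pipeline, workers that would otherwise wait for the first micro-batch to arrive can begin work on the other direction, which is the intuition behind the efficiency claim.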

No commits in the last 6 months.

Use this if you are a machine learning researcher or engineer struggling with the time and computational resources required to train extremely large neural networks.

Not ideal if you are working with smaller models, do not have access to a multi-GPU cluster managed by SLURM, or are not already comfortable with advanced distributed training concepts.

deep-learning-research large-scale-model-training distributed-machine-learning neural-network-efficiency AI-model-development

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 13 / 25

Language: Python

License: GPL-3.0

Higher-rated alternatives

openvinotoolkit/nncf

Neural Network Compression Framework for enhanced OpenVINO™ inference

huggingface/optimum

🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers...

NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

huggingface/optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools

eole-nlp/eole

Open language modeling toolkit based on PyTorch
