Santosh-Gupta/SpeedTorch
Library for faster pinned CPU <-> GPU transfer in Pytorch
This project helps machine learning engineers and researchers speed up deep learning workflows, particularly when training models with very large parameter counts such as embedding tables. It accelerates tensor transfers between main memory (CPU RAM) and graphics card memory (GPU RAM), so you can train larger models faster and make better use of your hardware.
683 stars. No commits in the last 6 months. Available on PyPI.
Use this if you are training large deep learning models in PyTorch, especially those with numerous embeddings, and are encountering performance bottlenecks due to slow data transfer between CPU and GPU memory.
Not ideal if your deep learning models are small, or if you are not experiencing significant CPU-GPU data transfer bottlenecks in your PyTorch training.
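The bottleneck SpeedTorch targets can be seen in plain PyTorch: copies from pageable host memory to the GPU are slow and synchronous, while copies from pinned (page-locked) memory can be asynchronous. The sketch below uses only standard PyTorch calls (`pin_memory`, `to(..., non_blocking=True)`), not SpeedTorch's own API, to illustrate the pattern.

```python
import torch

# A tensor in ordinary (pageable) host memory.
cpu_tensor = torch.randn(1000, 128)

if torch.cuda.is_available():
    # Pinning the tensor places it in page-locked memory, which allows the
    # host-to-device copy below to run asynchronously (non_blocking=True).
    pinned = cpu_tensor.pin_memory()
    gpu_tensor = pinned.to("cuda", non_blocking=True)
else:
    # No GPU available; keep the tensor on the CPU.
    gpu_tensor = cpu_tensor

print(tuple(gpu_tensor.shape))
```

SpeedTorch builds on this idea to make repeated CPU<->GPU shuttling of large embedding blocks cheaper during training.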
Stars: 683
Forks: 40
Language: Python
License: MIT
Category:
Last pushed: Feb 21, 2020
Commits (30d): 0
Dependencies: 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/Santosh-Gupta/SpeedTorch"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
MinishLab/model2vec
Fast State-of-the-Art Static Embeddings
AnswerDotAI/ModernBERT
Bringing BERT into modernity via both architecture changes and scaling
tensorflow/hub
A library for transfer learning by reusing parts of TensorFlow models.
Embedding/Chinese-Word-Vectors
100+ pretrained Chinese word vectors
twang2218/vocab-coverage
Analysis of language models' Chinese comprehension ability