hpcaitech/CachedEmbedding
A memory efficient DLRM training solution using ColossalAI
This project helps machine learning engineers and researchers train deep learning recommendation models, especially when embedding tables grow too large to fit in GPU memory. It takes large categorical datasets (like Criteo 1TB) and trains a recommendation model more efficiently by dynamically managing embedding data between CPU and GPU memory. This makes it possible to train models on a single GPU that would otherwise not fit, since it significantly reduces the required GPU memory.
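The core idea can be illustrated with a minimal pure-Python sketch (a hypothetical simplification, not CachedEmbedding's actual API): the full embedding table lives in host memory, and only a small, recently-used subset of rows is kept in a fixed-size cache standing in for GPU memory, with LRU eviction writing cold rows back.

```python
from collections import OrderedDict

class CachedEmbeddingSketch:
    """Hypothetical sketch of a CPU/GPU cached embedding table."""

    def __init__(self, num_embeddings, dim, gpu_capacity):
        # Full table stays in host (CPU) memory.
        self.cpu_table = [[0.0] * dim for _ in range(num_embeddings)]
        self.gpu_capacity = gpu_capacity
        # OrderedDict used as an LRU cache: row id -> row "resident on GPU".
        self.gpu_cache = OrderedDict()

    def lookup(self, row_id):
        if row_id in self.gpu_cache:
            # Cache hit: mark the row as most recently used.
            self.gpu_cache.move_to_end(row_id)
            return self.gpu_cache[row_id]
        # Cache miss: evict the least-recently-used row back to CPU first.
        if len(self.gpu_cache) >= self.gpu_capacity:
            evicted_id, evicted_row = self.gpu_cache.popitem(last=False)
            self.cpu_table[evicted_id] = evicted_row  # write back
        # "Copy" the requested row from CPU into the GPU cache.
        self.gpu_cache[row_id] = self.cpu_table[row_id]
        return self.gpu_cache[row_id]

emb = CachedEmbeddingSketch(num_embeddings=1000, dim=4, gpu_capacity=2)
emb.lookup(3)
emb.lookup(7)
emb.lookup(3)   # hit: row 3 becomes most recently used
emb.lookup(42)  # miss at capacity: evicts the LRU row (7) back to CPU
print(sorted(emb.gpu_cache))  # → [3, 42]
```

The real project layers batching, prefetching, and GPU tensor storage on top of this idea, but the hit/evict/write-back loop above is the essence of why GPU memory usage stays bounded regardless of table size.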
107 stars. No commits in the last 6 months.
Use this if you are a machine learning engineer or researcher training deep learning recommendation models and hitting out-of-memory errors because your embedding tables exceed GPU memory.
Not ideal if your model's embedding tables fit comfortably in GPU memory, since the CPU-GPU cache management adds slight overhead compared to a GPU-only solution.
Stars
107
Forks
14
Language
Python
License
Apache-2.0
Category
Last pushed
Nov 22, 2022
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/hpcaitech/CachedEmbedding"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
ContextualAI/gritlm
Generative Representational Instruction Tuning
xlang-ai/instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
liuqidong07/LLMEmb
[AAAI'25 Oral] The official implementation code of LLMEmb
ritesh-modi/embedding-hallucinations
This repo shows how foundational model hallucinates and how we can fix such hallucinations using...
ritesh-modi/fine-tuning-embeddings-template
This repo is a template to fine-tune embedding models using sentencetransformers based on...