cli99/llm-analysis
Latency and Memory Analysis of Transformer Models for Training and Inference
This tool helps machine learning engineers and researchers estimate the latency and memory usage of Large Language Models (LLMs). Given a model, GPU, data type, and parallelism configuration, it predicts how long training or inference will take and how much memory it will consume, so you can compare LLM deployment options before running costly experiments.
479 stars. No commits in the last 6 months.
Use this if you need to quickly assess different LLM configurations (model, GPU, parallelism) to predict training or inference time and memory requirements without running actual hardware tests.
Not ideal if you need precise, real-world measurements of your specific LLM implementation on actual hardware, as this provides theoretical estimates rather than empirical benchmarks.
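To illustrate the kind of analytical estimate this tool automates, here is a minimal sketch using two standard rules of thumb: training compute of roughly 6 FLOPs per parameter per token, and mixed-precision Adam state of about 16 bytes per parameter (fp16 weights and gradients plus fp32 master weights and optimizer moments). These formulas and the function names are illustrative assumptions, not llm-analysis's actual API, and they omit activation memory and parallelism effects that the tool models in detail.

```python
def training_memory_bytes(num_params: float, dtype_bytes: int = 2) -> float:
    """Model-state memory for mixed-precision Adam training.

    dtype_bytes low-precision weights + same-size gradients,
    plus fp32 master weights (4 B) and Adam m, v moments (8 B).
    """
    return num_params * (dtype_bytes + dtype_bytes + 4 + 8)


def training_time_seconds(num_params: float, num_tokens: float,
                          peak_flops: float, mfu: float = 0.4) -> float:
    """Rough training time: ~6 FLOPs per parameter per token,
    discounted by model FLOPs utilization (MFU) of the hardware."""
    return 6.0 * num_params * num_tokens / (peak_flops * mfu)


if __name__ == "__main__":
    # Hypothetical example: a 7B-parameter model, 1T tokens,
    # one A100 GPU (~312 TFLOPS bf16 peak), 40% MFU.
    print(f"{training_memory_bytes(7e9) / 1e9:.0f} GB of model state")
    print(f"{training_time_seconds(7e9, 1e12, 312e12) / 86400:.0f} GPU-days")
```

Real estimates from the tool additionally account for activation memory, KV caches during inference, and tensor/pipeline/data parallelism, which is why a dedicated analysis tool beats hand calculations like these.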
Stars: 479
Forks: 56
Language: Python
License: Apache-2.0
Category:
Last pushed: Apr 19, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/cli99/llm-analysis"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
TsinghuaC3I/MARTI
A Framework for LLM-based Multi-Agent Reinforced Training and Inference
zjunlp/KnowLM
An Open-sourced Knowledgable Large Language Model Framework.
tanyuqian/redco
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to...
stanleylsx/llms_tool
A large language model training and testing tool built on HuggingFace. Supports a web UI and terminal inference for various models, low-parameter and full-parameter training (pretraining, SFT, RM, PPO, DPO), as well as model merging and quantization.
slp-rl/slamkit
SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for...