huggingface/text-generation-inference
Large Language Model Text Generation Inference
This project helps deploy and serve large language models (LLMs) for generating text efficiently. It takes a chosen LLM and a text prompt as input, then generates a natural language response or completion. This is ideal for machine learning engineers or developers looking to integrate powerful text generation into their applications or services.
10,802 stars. Used by 3 other packages. Actively maintained with 1 commit in the last 30 days. Available on PyPI.
Use this if you are a machine learning engineer or developer needing to deploy and serve a large language model for text generation with high performance and specific features like streaming or structured output.
Not ideal if you are looking for a pre-built application that directly solves an end-user problem, as this is an infrastructure tool for developers.
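As a rough illustration of the "LLM plus prompt in, completion out" flow described above, here is a minimal sketch of a client for a running server. It assumes a text-generation-inference instance is already reachable at a base URL such as http://localhost:8080 (an assumed address), and that it exposes a `/generate` endpoint accepting `inputs` and `parameters.max_new_tokens` and returning a `generated_text` field. The helper names `build_generate_request` and `generate` are hypothetical and not part of the project.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt, max_new_tokens=64):
    """Build (but do not send) a POST request for an assumed /generate endpoint."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url, prompt, max_new_tokens=64, timeout=30):
    """Send the request and return the completion text.

    Assumes the server replies with a JSON object that has a
    "generated_text" field; adjust if your deployment differs.
    """
    req = build_generate_request(base_url, prompt, max_new_tokens)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    return body["generated_text"]


# Example call (requires a running server at the assumed address):
# print(generate("http://localhost:8080", "What is deep learning?"))
```

Building the request separately from sending it keeps the payload easy to inspect or test without a live server.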
Stars: 10,802
Forks: 1,261
Language: Python
License: Apache-2.0
Category:
Last pushed: Jan 08, 2026
Commits (30d): 1
Dependencies: 3
Reverse dependents: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/huggingface/text-generation-inference"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related models
OpenMachine-ai/transformer-tricks
A collection of tricks and tools to speed up transformer models
poloclub/transformer-explainer
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
IBM/TabFormer
Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
tensorgi/TPA
[NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6)...
lorenzorovida/FHE-BERT-Tiny
Source code for the paper "Transformer-based Language Models and Homomorphic Encryption: an...