huggingface/text-generation-inference

Large Language Model Text Generation Inference

68 / 100

Established

This project deploys and serves large language models (LLMs) for efficient text generation. Given a chosen LLM and a text prompt, it returns a natural-language response or completion. It suits machine learning engineers and developers who want to integrate text generation into their applications or services.

10,802 stars. Used by 3 other packages. Actively maintained with 1 commit in the last 30 days. Available on PyPI.

Use this if you are a machine learning engineer or developer needing to deploy and serve a large language model for text generation with high performance and specific features like streaming or structured output.

Not ideal if you are looking for a pre-built application that directly solves an end-user problem, as this is an infrastructure tool for developers.
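As a rough illustration of the prompt-in, completion-out flow described above, here is a minimal Python sketch of a client. It assumes a server is already running and reachable at http://localhost:8080 (a placeholder address) and that it exposes the /generate route, which accepts a JSON body with "inputs" and "parameters" and returns a "generated_text" field. The helper names build_generate_payload and generate are hypothetical and not part of the project.

```python
import json
import urllib.request

# Assumption: a text-generation-inference server is already running here.
BASE_URL = "http://localhost:8080"


def build_generate_payload(prompt, max_new_tokens=20):
    """Build the JSON body for a single text generation request."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}


def generate(prompt, max_new_tokens=20, base_url=BASE_URL, timeout=30):
    """POST the prompt to the /generate route and return the completion text."""
    body = json.dumps(build_generate_payload(prompt, max_new_tokens)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The response is expected to be a JSON object with a "generated_text" key.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["generated_text"]
```

With a server running, calling generate("What is deep learning?") should return the completion as a string. The sketch uses only the standard library so it carries no extra dependencies; a real integration would likely add error handling and retries.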

LLM deployment, MLOps, text generation, AI infrastructure, model serving

Maintenance 9 / 25

Adoption 13 / 25

Maturity 25 / 25

Community 21 / 25


Stars

10,802

Forks

1,261

Language

Python

License

Apache-2.0

Recent Releases

v3.3.7, 19 Dec 2025
v3.3.6, 17 Sep 2025
v3.3.5, 02 Sep 2025
v3.3.4, 19 Jun 2025
v3.3.3, 18 Jun 2025

Related models

OpenMachine-ai/transformer-tricks

A collection of tricks and tools to speed up transformer models

poloclub/transformer-explainer

Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization

IBM/TabFormer

Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)

tensorgi/TPA

[NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6)...

lorenzorovida/FHE-BERT-Tiny

Source code for the paper "Transformer-based Language Models and Homomorphic Encryption: an...
