lucidrains/simple-hierarchical-transformer

Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT

Overall score: 59 / 100 (Established)

This project offers an experimental approach to training large language models (LLMs) more efficiently by introducing multiple levels of data compression. It takes text or token sequences as input and produces logits for next-token prediction, aiming to maintain predictive quality while reducing computational cost. This is for machine learning researchers or practitioners who build and experiment with new LLM architectures.

225 stars. Available on PyPI.

Use this if you are a machine learning researcher exploring novel, more efficient transformer architectures for language modeling and want to experiment with hierarchical predictive coding.

Not ideal if you are looking for a production-ready, off-the-shelf language model or a tool for general natural language processing tasks without deep architectural experimentation.
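
For a concrete feel of the interface, here is a minimal usage sketch in Python. The class name HierarchicalTransformer, its constructor arguments, and the return_loss flag are assumptions based on how other lucidrains packages are typically structured, not a verified copy of this repository's API; consult the project README for the actual interface.

import torch
# NOTE: the import path, class name, and every argument below are assumptions,
# not confirmed against the repository.
from simple_hierarchical_transformer import HierarchicalTransformer

model = HierarchicalTransformer(
    num_tokens = 256,   # vocabulary size
    dim = 512,          # model width
    depth = 6,          # number of transformer layers
    seq_len = 2048      # maximum sequence length
)

tokens = torch.randint(0, 256, (1, 2048))  # a batch of dummy token ids
loss = model(tokens, return_loss = True)   # assumed next-token prediction loss
loss.backward()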

Tags: large-language-models, transformer-architecture, neural-network-research, predictive-modeling, computational-efficiency
Maintenance 13 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 11 / 25


Stars: 225
Forks: 13
Language: Python
License: MIT
Last pushed: Mar 25, 2026
Commits (30d): 0
Dependencies: 5

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/lucidrains/simple-hierarchical-transformer"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
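
If you prefer Python to curl, here is a minimal sketch of the same request using the requests library. The endpoint is the one shown above, but the JSON field names are not documented here, so the example simply prints the raw payload.

import requests

# Same endpoint as the curl example above; no API key needed for 100 requests/day.
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/lucidrains/simple-hierarchical-transformer")

response = requests.get(URL, timeout=10)
response.raise_for_status()   # fail loudly on rate limits or server errors

data = response.json()
print(data)                   # inspect the payload to see which fields are available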