kyegomez/Mixture-of-MQA

An implementation of a Switch Transformer-style multi-query attention (MQA) model

20 / 100
Experimental

A neural network architecture aimed at developers building large-scale AI models. It processes sequential data, such as text, to produce learned representations or predictions, with a focus on efficiency and scalability. Most useful for AI/ML engineers and researchers working on natural language processing or other sequence-modeling tasks.

No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher developing transformer-based models and need to process very long sequences more efficiently.

Not ideal if you are a data scientist looking for an off-the-shelf model for immediate use, or if you are not comfortable working with deep-learning model architectures.

deep-learning natural-language-processing sequence-modeling large-scale-ai machine-learning-engineering
Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 16 / 25
Community 0 / 25


Stars: 8
Forks:
Language: Python
License: MIT
Last pushed: Feb 20, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/kyegomez/Mixture-of-MQA"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
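The same endpoint can also be called from Python. A minimal sketch using only the standard library, assuming the endpoint returns JSON (the response schema and the `quality_url`/`fetch_quality` helper names are illustrative, not part of the documented API):

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository."""
    return f"{API_BASE}/{ecosystem}/{owner}/{repo}"


def fetch_quality(ecosystem: str, owner: str, repo: str) -> dict:
    """Fetch the quality report as a dict (requires network access;
    assumes a JSON body, which is not guaranteed by this page)."""
    with urllib.request.urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.load(resp)


# URL construction alone needs no network:
url = quality_url("transformers", "kyegomez", "Mixture-of-MQA")
print(url)
```

Unauthenticated calls are limited to 100 requests/day, so a production caller would likely add the API key (per the note above) and basic error handling.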