xmarva/transformer-architectures
Teaching transformer-based architectures
This collection of notebooks helps machine learning engineers and researchers understand how Transformer neural networks work by guiding them through building one from scratch. You'll work with raw text data, learn how to prepare it for models, implement the core components of Transformers such as attention mechanisms, and then train these models. The result is a deep, practical understanding of advanced natural language processing architectures.
No commits in the last 6 months.
Use this if you are a machine learning practitioner looking to build a foundational understanding of Transformer architectures, from basic components to training complete models like BERT or GPT.
Not ideal if you are looking for a plug-and-play solution or an application-focused tool to solve a specific NLP problem without delving into the underlying model architecture.
Stars: 7
Forks: 1
Language: Jupyter Notebook
License: —
Category:
Last pushed: May 04, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/xmarva/transformer-architectures"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
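The endpoint path appears to follow the pattern `/api/v1/quality/{category}/{owner}/{repo}`, inferred from the curl example above. A minimal Python sketch for building and fetching such a URL (the `quality_url` and `fetch_quality` helper names are our own, not part of the API):

```python
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    # Build the endpoint URL; the path pattern is inferred
    # from the single curl example above, not official docs.
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    # Free tier: 100 requests/day without a key (per the note above).
    # The response is assumed to be JSON; the schema is not documented here.
    with urlopen(quality_url(category, owner, repo), timeout=10) as resp:
        return json.load(resp)

# Example:
# fetch_quality("nlp", "xmarva", "transformer-architectures")
```

With an API key (1,000 requests/day), you would presumably attach it to the request; the example above sticks to the keyless free tier shown in the curl command.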
Higher-rated alternatives
xv44586/toolkit4nlp
Transformer implementations (architecture, task examples, serving, and more)
luozhouyang/transformers-keras
Transformer-based models implemented in TensorFlow 2.x (using Keras).
ufal/neuralmonkey
An open-source tool for sequence learning in NLP built on TensorFlow.
graykode/xlnet-Pytorch
Simple XLNet implementation with a PyTorch wrapper
uzaymacar/attention-mechanisms
Implementations for a family of attention mechanisms, suitable for all kinds of natural language...