tech-srl/layer_norm_expressivity_role
Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL 2023)
This repository provides the code to reproduce the paper's experiments on how Layer Normalization affects the expressivity of attention in Transformer models. It includes experimental setups for tasks such as 'Majority' and 'Unselectable Keys', whose results demonstrate the expressivity role of Layer Normalization. It is aimed at researchers working on deep learning architectures and natural language processing.
No commits in the last 6 months.
Use this if you are a machine learning researcher investigating the fundamental properties and architectural choices within Transformer networks.
Not ideal if you are looking for an off-the-shelf solution for an applied NLP task or a general-purpose Transformer library.
Stars
57
Forks
3
Language
Python
License
—
Category
Transformers
Last pushed
Sep 27, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/tech-srl/layer_norm_expressivity_role"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
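For programmatic access, here is a minimal Python sketch of the same request. It assumes the endpoint returns JSON; the response schema is not documented on this page, so the sketch only fetches and prints the payload for inspection.

import requests

# Endpoint quoted above; assumes a JSON response (undocumented schema).
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/tech-srl/layer_norm_expressivity_role")

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # fail loudly on rate limiting or server errors
data = resp.json()       # assumption: the body parses as JSON
print(data)              # inspect the schema before relying on field names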
Higher-rated alternatives
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features...
kanishkamisra/minicons
Utility for behavioral and representational analyses of Language Models
lucidrains/simple-hierarchical-transformer
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
lucidrains/dreamer4
Implementation of Danijar's latest iteration for his Dreamer line of work
Nicolepcx/Transformers-in-Action
This is the corresponding code for the book Transformers in Action