RobertCsordas/transformer_generalization

The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We significantly improve the systematic generalization of transformer models on a variety of datasets using simple tricks and careful considerations.


Emerging

This project helps machine learning researchers improve how well transformer models apply learned rules to new, unseen examples. It takes datasets designed to test systematic generalization (such as mathematical problems or natural language queries) and produces trained transformer models with improved generalization, along with performance plots and tables. It is most useful to researchers working on natural language processing or logical reasoning who want more robust models.

No commits in the last 6 months.

Use this if you are a machine learning researcher focused on improving the systematic generalization of transformer models and you need to benchmark novel approaches on established datasets.

Not ideal if you are looking for a plug-and-play solution for applying pre-trained transformers to standard tasks like sentiment analysis or machine translation, as this is a research toolkit for model development.

machine-learning-research natural-language-processing model-generalization transformer-models academic-research

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 18 / 25


Language: Python

License: MIT

Higher-rated alternatives

transformerlab/transformerlab-app

The open source research environment for AI researchers to seamlessly train, evaluate, and scale...

naru-project/naru

Neural Relation Understanding: neural cardinality estimators for tabular data

neurocard/neurocard

State-of-the-art neural cardinality estimators for join queries

danielzuegner/code-transformer

Implementation of the paper "Language-agnostic representation learning of source code from...

salesforce/CodeTF

CodeTF: One-stop Transformer Library for State-of-the-art Code LLM
