LennartKeller/roberta2longformer

Convert pretrained RoBERTa models to various long-document transformer models

Experimental

This tool helps machine learning engineers and NLP researchers adapt existing RoBERTa language models to documents much longer than the standard architecture can handle. You provide a pretrained RoBERTa model and its tokenizer, and the tool converts them into a Longformer or Nyströmformer model, ready for continued pre-training or fine-tuning on long-document tasks. It suits work on long legal texts, research papers, or large-scale content analysis.
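To give a feel for what such a conversion involves, here is a minimal sketch of one step: extending a short position-embedding table to a longer sequence length by tiling the pretrained rows. This is an assumption about how a RoBERTa-to-long-model conversion can work, not this tool's confirmed method. The function name `extend_position_embeddings` and the toy table are hypothetical, and details found in real models (such as padding offsets in the position table) are ignored.

```python
import random


def extend_position_embeddings(short_table, new_length):
    """Tile a short position-embedding table until it covers new_length rows.

    Hypothetical helper for illustration only; it is not part of any library.
    """
    if not short_table:
        raise ValueError("short_table must contain at least one row")
    old_length = len(short_table)
    # Row i of the long table is a copy of row (i mod old_length) of the short one,
    # so every new position starts from an already-trained embedding.
    return [list(short_table[i % old_length]) for i in range(new_length)]


# Toy stand-in for a pretrained table: 4 positions, hidden size 3.
random.seed(0)
short_table = [[random.random() for _ in range(3)] for _ in range(4)]

# Extend from 4 positions to 10 positions.
long_table = extend_position_embeddings(short_table, new_length=10)
```

Because the tiled positions repeat the same vectors, the longer model cannot yet tell distant positions apart, which is consistent with the note below that converted models need additional training.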

No commits in the last 6 months.

Use this if you need to apply the knowledge from a short-text RoBERTa model to tasks involving very long documents, such as analyzing full articles, books, or extensive conversation logs.

Not ideal if you expect immediate, out-of-the-box performance on long-document tasks, as the converted models require additional training to become competitive.

Natural Language Processing, Large Document Analysis, Text Classification, Information Extraction, Machine Learning Engineering

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 8 / 25

Community 12 / 25

Language: Python

License: none

Higher-rated alternatives

Tongjilibo/bert4torch

An elegant PyTorch implementation of transformers

nyu-mll/jiant

jiant is an NLP toolkit

lonePatient/TorchBlocks

A PyTorch-based toolkit for natural language processing

monologg/JointBERT

PyTorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"

grammarly/gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite"...
