LennartKeller/roberta2longformer

Convert pretrained RoBERTa models to various long-document transformer models

Score: 25 / 100 · Experimental

This tool helps machine learning engineers and NLP researchers adapt existing RoBERTa language models to handle documents far longer than the standard 512-token limit. You provide a pretrained RoBERTa model and its tokenizer, and the tool converts them into a Longformer or Nyströmformer model, ready for continued pre-training or fine-tuning on long-document tasks. This is ideal for work with extensive legal texts, research papers, or large-scale content analysis.
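The core trick behind this kind of conversion, following the original Longformer recipe, is to grow RoBERTa's learned position-embedding table by copying it until the longer context is filled. A minimal numpy sketch of that step (the 16-dim embeddings, the 4096-token target, and the 2 reserved offset rows are illustrative assumptions, not this repo's exact code):

```python
import numpy as np

# Toy stand-in for roberta.embeddings.position_embeddings.weight.
# RoBERTa reserves 2 rows for its padding offset, leaving 512 usable positions.
hidden, short_len, long_len = 16, 512, 4096
rng = np.random.default_rng(0)
short_emb = rng.normal(size=(short_len + 2, hidden))

# Keep the 2 special rows, then tile the 512 learned rows
# until the long model's 4096 positions are filled.
special, learned = short_emb[:2], short_emb[2:]
reps = long_len // short_len  # 8 copies
long_emb = np.concatenate([special, np.tile(learned, (reps, 1))], axis=0)

assert long_emb.shape == (long_len + 2, hidden)
# Each 512-position block repeats the original learned embeddings.
assert np.array_equal(long_emb[2:2 + short_len],
                      long_emb[2 + short_len:2 + 2 * short_len])
```

Copying (rather than randomly initializing) the new positions is what lets the converted model reuse the short model's knowledge, though it still needs additional training to adapt.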

No commits in the last 6 months.

Use this if you need to apply the knowledge from a short-text RoBERTa model to tasks involving very long documents, such as analyzing full articles, books, or extensive conversation logs.

Not ideal if you expect immediate, out-of-the-box performance on long-document tasks, as the converted models require additional training to become competitive.
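For context on why a Longformer can handle such inputs at all: it replaces full self-attention with a sliding window, so each token attends only to its neighbors and cost grows linearly with length instead of quadratically. A toy numpy sketch of such a window mask (seq_len=12 and window=2 are illustrative values):

```python
import numpy as np

seq_len, window = 12, 2  # window = 2 tokens on each side (toy values)

idx = np.arange(seq_len)
# mask[i, j] is True when token i may attend to token j
mask = np.abs(idx[:, None] - idx[None, :]) <= window

# Each token sees at most 2*window + 1 positions...
assert mask.sum(axis=1).max() == 2 * window + 1
# ...so the mask is far sparser than full seq_len**2 attention.
assert mask.sum() < seq_len ** 2
```

This change in attention pattern is also why the converted weights need further training before they are competitive: the pretrained model never saw windowed attention.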

Tags: Natural Language Processing, Large Document Analysis, Text Classification, Information Extraction, Machine Learning Engineering
No License · Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 8 / 25
Community: 12 / 25

Stars: 11
Forks: 2
Language: Python
License: none
Last pushed: Apr 05, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/LennartKeller/roberta2longformer"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.