Tixierae/OrangeSum

The French summarization dataset introduced in "BARThez: a Skilled Pretrained French Sequence-to-Sequence Model".

29 / 100
Experimental

This dataset provides a collection of French news articles from the "Orange Actu" website, spanning almost a decade, along with their professionally written titles and brief abstracts. It defines two distinct tasks: generating a single-sentence title, or generating a short abstract, from the full article. It is intended for researchers and developers working on natural language processing, specifically automatic text summarization for French.

No commits in the last 6 months.

Use this if you are developing or evaluating machine learning models for summarizing French news content, especially if you need a dataset with both single-sentence titles and short abstracts.

Not ideal if you are looking to summarize content in languages other than French, or if you require long, multi-sentence summaries.

French news, text summarization, natural language processing, AI research, abstractive summarization
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 7 / 25
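
The four category scores listed above add up to the overall 29 / 100 shown at the top of the page. The short sketch below checks that arithmetic; it assumes the overall score is a plain sum of the four categories, which matches the numbers here but is not stated on the page.

```python
# Category scores as listed on this page, each out of 25.
subscores = {
    "Maintenance": 0,
    "Adoption": 6,
    "Maturity": 16,
    "Community": 7,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(subscores.values())
maximum = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {maximum}")
```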


Stars: 23
Forks: 2
Language: Jupyter Notebook
License: CC-BY-SA-4.0
Last pushed: Apr 23, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Tixierae/OrangeSum"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
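
The same request can be made from Python with only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the endpoint path follows a category/owner/repository pattern (inferred from the single URL above), and that the response body is JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base of the endpoint shown in the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality URL (assumed pattern: /<category>/<owner>/<repo>)."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the server answers with a JSON document; the page does not
    describe the response format.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to perform the request.
    print(build_quality_url("nlp", "Tixierae", "OrangeSum"))
```

With the keyless limit of 100 requests/day, caching the result locally rather than calling `fetch_quality` on every run is a sensible precaution.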