Tixierae/OrangeSum
The French summarization dataset introduced in "BARThez: a Skilled Pretrained French Sequence-to-Sequence Model".
This dataset collects French news articles from the "Orange Actu" website, spanning almost a decade, along with their professionally written titles and brief abstracts. It defines two distinct tasks: generating a single-sentence title, or generating a short abstract, from a longer article. It is aimed at researchers and developers working on natural language processing, specifically automatic summarization of French text.
No commits in the last 6 months.
Use this if you are developing or evaluating machine learning models for summarizing French news content, especially if you need a dataset with both single-sentence titles and short abstracts.
Not ideal if you are looking to summarize content in languages other than French, or if you require long, multi-sentence summaries.
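For the evaluation use case mentioned above, a rough unigram-overlap score (in the spirit of ROUGE-1) can be sketched in a few lines. This is a simplified stand-in, not the official ROUGE package: it does no stemming or stopword handling, the function names `tokenize` and `rouge1` are illustrative, and the two French sentences are invented examples rather than records from OrangeSum.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens; \\w is Unicode-aware, so accented letters are kept."""
    return re.findall(r"\w+", text.lower())


def rouge1(reference, candidate):
    """Return unigram precision, recall and F1 between a reference and a candidate summary."""
    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))
    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / sum(cand_counts.values()) if cand_counts else 0.0
    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented example pair (hypothetical, not taken from the dataset).
reference_title = "Le gouvernement annonce un nouveau plan climat"
generated_title = "Le gouvernement présente un plan climat"
scores = rouge1(reference_title, generated_title)
```

The same function applies to either task, scoring generated titles against reference titles or generated abstracts against reference abstracts. For reported results, an established ROUGE implementation is the safer choice.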
Stars: 23
Forks: 2
Language: Jupyter Notebook
License: CC-BY-SA-4.0
Category: (not listed)
Last pushed: Apr 23, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Tixierae/OrangeSum"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
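The same request can be issued from Python with only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the three trailing path segments stand for a category, a repository owner, and a repository name, and that the endpoint returns a JSON body. The helper names `quality_url` and `fetch_quality` are illustrative.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the category/owner/repo reading of the path is an assumption."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """GET the endpoint and decode the body, assuming (unverified) that it is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "Tixierae", "OrangeSum")` would request the same URL as the curl command. Since the response schema is not documented here, inspect the returned object before relying on any particular field.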
Higher-rated alternatives
kenlimmj/rouge
A JavaScript implementation of the Recall-Oriented Understudy for Gisting Evaluation (ROUGE)...
uoneway/KoBertSum
KoBertSum is a Korean summarization model that adapts the BertSum model so it can be applied to Korean data.
udibr/headlines
Automatically generate headlines to short articles
bheinzerling/pyrouge
A Python wrapper for the ROUGE summarization evaluation package
xiongma/transformer-pointer-generator
An Abstractive Summarization Implementation with Transformer and Pointer-generator