Commit cb2bf2f

Author: The TensorFlow Datasets Authors
Fix the multi_news dataset.
PiperOrigin-RevId: 795632521
1 parent 5afdc02 commit cb2bf2f

15 files changed: +139 -91 lines changed

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+@misc{alex2019multinews,
+title={Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model},
+author={Alexander R. Fabbri and Irene Li and Tianwei She and Suyi Li and Dragomir R. Radev},
+year={2019},
+eprint={1906.01749},
+archivePrefix={arXiv},
+primaryClass={cs.CL}
+}
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+Multi-News consists of news articles and human-written summaries
+of these articles from the site newser.com.
+Each summary is professionally written by editors and
+includes links to the original articles cited.
+
+There are two features:
+- document: text of news articles separated by special token "|||||".
+- summary: news summary.
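
The `document` feature described above packs several source articles into one string. As a minimal sketch of how a consumer might recover the individual articles, the snippet below splits on the `|||||` token. The helper name `split_articles` and the sample string are hypothetical and are not part of this commit; trimming whitespace and dropping empty pieces are assumptions about what a reader would want, not documented dataset behavior.

```python
from typing import List

# Separator token named in the dataset description above.
SEPARATOR = "|||||"


def split_articles(document: str) -> List[str]:
    """Split a Multi-News `document` string into individual articles.

    Hypothetical helper: strips surrounding whitespace and drops empty pieces.
    """
    return [part.strip() for part in document.split(SEPARATOR) if part.strip()]


# Made-up sample input, for illustration only.
example = "First article text. ||||| Second article text."
articles = split_articles(example)
```

Here `articles` holds one string per source article, in the order they appear in `document`.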

tensorflow_datasets/datasets/multi_news/TAGS.txt

Whitespace-only changes.
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# coding=utf-8
+# Copyright 2025 The TensorFlow Datasets Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+https://huggingface.co/datasets/alexfabbri/multi_news/raw/main/data/test.src.cleaned 133 d04c4581d52321a30c246d2caa72853ee7f28c6b7a3985ee436f54c4bc264315 test.src.cleaned
+https://huggingface.co/datasets/alexfabbri/multi_news/raw/main/data/test.tgt 132 afba4aa26d95bb557c0eaa0cb8f7495af2104f1e43f4b5f9ef429b8752477abd test.tgt
+https://huggingface.co/datasets/alexfabbri/multi_news/raw/main/data/train.src.cleaned 134 75f87b786ff1982bf1bd5803c6a7377d1834b81956ac680a6955789ba047cc0b train.src.cleaned
+https://huggingface.co/datasets/alexfabbri/multi_news/raw/main/data/train.tgt 133 9f1e9b290a6aae1aa67bd5b361c934ee9db32486e5cd97d83184c097ef8b27e5 train.tgt
+https://huggingface.co/datasets/alexfabbri/multi_news/raw/main/data/val.src.cleaned 133 8df3ef6bd1882094de8120fa635c3abf758e10427f81f306aaa4786df7b57861 val.src.cleaned
+https://huggingface.co/datasets/alexfabbri/multi_news/raw/main/data/val.tgt 132 9c0377a443ea92b17449f7df17f1cdfa7c7ebbfe3a45f2f8cd7b3e0ffb47b1df val.tgt
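
Each checksum entry above appears to carry four fields: the download URL, a size in bytes, a SHA-256 hex digest, and a file name. The sketch below shows one way to parse such an entry and check downloaded bytes against it using only the standard library. The names `ChecksumEntry`, `parse_entry`, and `matches` are hypothetical, and the field interpretation is an assumption drawn from the entries themselves; this is not the verification code TensorFlow Datasets uses internally.

```python
import hashlib
from typing import NamedTuple


class ChecksumEntry(NamedTuple):
    """One checksum entry (field meanings are assumed, see the note above)."""
    url: str
    size: int
    sha256: str
    filename: str


def parse_entry(line: str) -> ChecksumEntry:
    # Assumes four whitespace-separated fields and no spaces inside the URL.
    url, size, sha256, filename = line.split()
    return ChecksumEntry(url, int(size), sha256, filename)


def matches(entry: ChecksumEntry, data: bytes) -> bool:
    # The payload must agree with both the recorded size and the digest.
    return (
        len(data) == entry.size
        and hashlib.sha256(data).hexdigest() == entry.sha256
    )


entry = parse_entry(
    "https://huggingface.co/datasets/alexfabbri/multi_news/raw/main/data/val.tgt "
    "132 9c0377a443ea92b17449f7df17f1cdfa7c7ebbfe3a45f2f8cd7b3e0ffb47b1df val.tgt"
)
```

Calling `matches(entry, data)` with the bytes fetched from `entry.url` returns `True` only when both the size and the digest agree with the recorded values.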
