Open Access
Multi-Document Summarization Using Cross-Language Texts
Jung-Min Lim, In-Su Kang, Jong-Hyeok Lee +2 more
- 01 Jan 2004
TL;DR: Lacking a summarizer for the source language, this work generates source-language summaries by machine-translating the documents and running a target-language summarization system on the translations, and shows that multi-document summarization using cross-language texts is possible.
Abstract: Lacking a summarization system for the source language, we generate a source-language summary by translating the documents with a machine translator and applying a summarization system in the target language. To summarize the machine-translated documents, we extract important sentences and remove redundant ones using an improved term-weighting method that assigns weights to words based on syntactic information. We select sentences according to their scores and map them back to the corresponding Japanese sentences in the original documents. Finally, we arrange the Japanese sentences in chronological order and output them as the result of our system. We submitted both short and long summaries, and the evaluation results were not good. However, our approach shows that multi-document summarization using cross-language texts is possible.
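The pipeline in the abstract (score translated sentences, drop redundant ones, map back to the original sentences, order chronologically) can be sketched roughly as follows. This is an illustrative sketch, not the authors' system: the `Sentence` record, the `summarize` function, the stopword list, and the Jaccard redundancy threshold are all assumptions, and plain sentence-frequency weights stand in for the paper's syntax-based term weighting, which the abstract does not specify. The `JA-n` strings are placeholders for original Japanese sentences.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from datetime import date

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "in", "on", "of", "and", "was", "to"}


@dataclass(frozen=True)
class Sentence:
    doc_date: date    # publication date of the source document
    position: int     # index of the sentence within its document
    original: str     # source-language sentence (placeholder text here)
    translated: str   # machine-translated sentence used for scoring


def tokens(text: str) -> set[str]:
    """Lowercased content words of a translated sentence."""
    words = (w.strip(".,;:!?\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def summarize(sentences: list[Sentence], k: int, max_overlap: float = 0.5) -> list[str]:
    bags = [tokens(s.translated) for s in sentences]

    # Stand-in term weighting: number of sentences containing the word.
    weights: Counter[str] = Counter()
    for bag in bags:
        weights.update(bag)
    scores = [sum(weights[w] for w in bag) for bag in bags]

    # Greedy selection by score, skipping sentences too similar to chosen ones.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen: list[int] = []
    for i in ranked:
        if len(chosen) == k:
            break
        if all(jaccard(bags[i], bags[j]) <= max_overlap for j in chosen):
            chosen.append(i)

    # Map back to the original sentences and arrange chronologically.
    chosen.sort(key=lambda i: (sentences[i].doc_date, sentences[i].position))
    return [sentences[i].original for i in chosen]


docs = [
    Sentence(date(2004, 1, 2), 0, "JA-1", "The earthquake damaged many buildings in the city"),
    Sentence(date(2004, 1, 1), 0, "JA-2", "An earthquake hit the city on Monday"),
    Sentence(date(2004, 1, 2), 1, "JA-3", "The earthquake damaged many buildings in the city centre"),
    Sentence(date(2004, 1, 3), 0, "JA-4", "Weather was sunny"),
]
summary = summarize(docs, k=2)
print(summary)  # ['JA-2', 'JA-3']
```

In this toy run the near-duplicate `JA-1` is dropped because it overlaps heavily with the higher-scoring `JA-3`, and the two selected sentences are returned in date order rather than score order.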
Citations
WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization
Faisal Ladhak, Esin Durmus, Claire Cardie, Kathleen R. McKeown +3 more
- 07 Oct 2020
TL;DR: WikiLingua is a large-scale, multilingual dataset for evaluating cross-lingual abstractive summarization systems; it contains how-to guides on a diverse set of topics written by human authors.
•Posted Content
WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization
TL;DR: A method for direct cross-lingual summarization that requires no translation at inference time is proposed, leveraging synthetic data and neural machine translation as a pre-training step; it significantly outperforms the baseline approaches while being more cost-efficient during inference.
Improving Neural Cross-Lingual Abstractive Summarization via Employing Optimal Transport Distance for Knowledge Distillation
Thong Nguyen, Anh Tuan Luu +1 more
TL;DR: This paper proposes a knowledge distillation loss based on Sinkhorn divergence, an optimal-transport distance, to estimate the discrepancy between teacher and student representations; distilling the knowledge of the summarization teacher into the student explicitly constructs cross-lingual correlation.
•Posted Content
NCLS: Neural Cross-Lingual Summarization
TL;DR: This paper proposes an end-to-end neural cross-lingual summarization (NCLS) framework with multi-task learning to improve the quality of generated summaries.
Neural Label Search for Zero-Shot Multi-Lingual Extractive Summarization
Ruipeng Jia, Xingxing Zhang, Yanan Cao, Shi Wang, Zheng Lin, Furu Wei +5 more
- 28 Apr 2022
TL;DR: NLSSum (Neural Label Search for Summarization) jointly learns hierarchical weights for different sets of labels together with the summarization model, and achieves state-of-the-art results under both human and automatic evaluation on two datasets.
References
•Journal Article
Accurate methods for the statistics of surprise and coincidence
TL;DR: A measure based on likelihood ratios that can be applied to the analysis of text is described; in cases where traditional contingency table methods work well, the likelihood ratio tests described here give nearly identical results.
Centroid-based summarization of multiple documents
TL;DR: A multi-document summarizer, MEAD, is presented, which generates summaries using cluster centroids produced by a topic detection and tracking system and an evaluation scheme based on sentence utility and subsumption is applied.
Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies
Dragomir R. Radev, Hongyan Jing, Malgorzata Budzikowska +2 more
- 30 Apr 2000
TL;DR: A multi-document summarizer, called MEAD, is presented, which generates summaries using cluster centroids produced by a topic detection and tracking system and two new techniques, based on sentence utility and subsumption, are described.
Experiments in multidocument summarization
Barry Schiffman, Ani Nenkova, Kathleen R. McKeown +2 more
- 24 Mar 2002
TL;DR: A multidocument summarizer built upon research into the detection of new information uses several new strategies to select interesting and informative sentences, including an innovative measure of importance derived from the analysis of a large corpus.
A Summarization System with Categorization of Document Sets.
Chikashi Nobata, Satoshi Sekine, Kiyotaka Uchimoto, Hitoshi Isahara +3 more
- 01 Oct 2002
TL;DR: Two modules are incorporated into the earlier summarization system, which is based on a sentence-extraction technique, so that the system can be applied to the multi-document summarization task.