Journal Article, DOI: 10.48550/arxiv.2309.04087
Unsupervised Multi-document Summarization with Holistic Inference
Haopeng Zhang, Sangwoo Cho, Kaiqiang Song, Xiaoyang Wang, Hongwei Wang, Jianwei Zhang, Dong-Han Yu
TL;DR: This paper proposes a new holistic framework for unsupervised multi-document extractive summarization that incorporates the holistic beam search inference method associated with the holistic measurements, named Subset Representative Index (SRI).
Abstract: Multi-document summarization aims to obtain core information from a collection of documents written on the same topic. This paper proposes a new holistic framework for unsupervised multi-document extractive summarization. Our method incorporates the holistic beam search inference method associated with the holistic measurements, named Subset Representative Index (SRI). SRI balances the importance and diversity of a subset of sentences from the source documents and can be calculated in unsupervised and adaptive manners. To demonstrate the effectiveness of our method, we conduct extensive experiments on both small and large-scale multi-document summarization datasets under both unsupervised and adaptive settings. The proposed method outperforms strong baselines by a significant margin, as indicated by the resulting ROUGE scores and diversity measures. Our findings also suggest that diversity is essential for improving multi-document summary performance.
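The abstract describes scoring a whole subset of sentences for importance and diversity and searching over subsets with beam search. The following is a minimal sketch of that idea, not the paper's SRI formula: the score used here (mean cosine similarity to the collection centroid minus a weighted mean pairwise similarity), the weight `lam`, and the function names `subset_score` and `beam_search_summary` are all assumptions made for illustration. Sentences are represented as plain embedding vectors.

```python
from itertools import combinations
from math import sqrt
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def subset_score(subset: Tuple[int, ...], vectors: List[Vector],
                 centroid: Vector, lam: float = 0.5) -> float:
    """Hypothetical stand-in for a holistic subset score (NOT the paper's SRI).

    importance: mean similarity of the chosen sentences to the centroid.
    redundancy: mean pairwise similarity inside the subset (0 for one sentence).
    """
    importance = sum(cosine(vectors[i], centroid) for i in subset) / len(subset)
    pairs = list(combinations(subset, 2))
    redundancy = (
        sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)
        if pairs else 0.0
    )
    return importance - lam * redundancy


def beam_search_summary(vectors: List[Vector], k: int,
                        beam_width: int = 3, lam: float = 0.5) -> Tuple[int, ...]:
    """Return indices of up to k sentences found by beam search over subsets."""
    if not vectors or k <= 0:
        return ()
    dim = len(vectors[0])
    centroid = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    beams: List[Tuple[int, ...]] = [()]
    for _ in range(min(k, len(vectors))):
        # Extend every kept subset by one unused sentence; dedupe via sorted tuples.
        candidates = {
            tuple(sorted(subset + (i,)))
            for subset in beams
            for i in range(len(vectors))
            if i not in subset
        }
        # Rank whole subsets by the score; ties are broken by index order.
        ranked = sorted(
            candidates,
            key=lambda s: (-subset_score(s, vectors, centroid, lam), s),
        )
        beams = ranked[:beam_width]
    return beams[0]


# Toy embeddings: sentences 0 and 1 are near-duplicates, sentence 2 is different.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
summary_indices = beam_search_summary(toy_vectors, k=2)
```

On the toy vectors, the default setting selects the two dissimilar sentences, while setting `lam=0.0` (no diversity term) selects the two near-duplicates. With `beam_width=1` the search reduces to greedy selection and settles on a different pair, which illustrates why scoring and searching over whole subsets can matter.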
Citations
A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models
Haopeng Zhang, Philip S. Yu, Jiawei Zhang
- 17 Jun 2024
TL;DR: A comprehensive survey of text summarization research covering traditional methods, deep learning approaches, PLM fine-tuning, and recent advancements in LLMs. It provides an overview of datasets, evaluation metrics, summarization methods, and future research directions.
Multi-Document Summarization Using LLAMA 2 Model with Transfer Learning
K. N. Sunilkumar, J Sheela
- 14 Mar 2024
TL;DR: This research introduces LLama2, a novel multi-document summarization approach leveraging advanced language models, natural language processing, and machine learning to efficiently condense complex narratives into concise summaries with superior performance.
References
•Posted Content
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
81.7K
•Proceedings Article
The PageRank Citation Ranking : Bringing Order to the Web
Lawrence Page, Sergey Brin, Rajeev Motwani, Terry Winograd
- 11 Nov 1999
TL;DR: This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.
16.4K
•Proceedings Article
ROUGE: A Package for Automatic Evaluation of Summaries
Chin-Yew Lin
- 25 Jul 2004
TL;DR: Four different ROUGE measures are introduced: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S, which are included in the ROUGE summarization evaluation package, along with their evaluations.
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers, Iryna Gurevych
- 14 Aug 2019
TL;DR: This paper presents Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity.
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Michael Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, Luke Zettlemoyer
- 01 Jul 2020
TL;DR: BART is presented, a denoising autoencoder for pretraining sequence-to-sequence models, which matches the performance of RoBERTa on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks.