Multi-document summarization

Papers published on a yearly basis

Papers

Proceedings Article

ROUGE: A Package for Automatic Evaluation of Summaries

Chin-Yew Lin¹•Institutions (1)

Information Sciences Institute¹

25 Jul 2004

TL;DR: Introduces four ROUGE measures (ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S), all included in the ROUGE summarization evaluation package, and reports their evaluations.

Abstract: ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation. It includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. The measures count the number of overlapping units such as n-gram, word sequences, and word pairs between the computer-generated summary to be evaluated and the ideal summaries created by humans. This paper introduces four different ROUGE measures: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S included in the ROUGE summarization evaluation package and their evaluations. Three of them have been used in the Document Understanding Conference (DUC) 2004, a large-scale summarization evaluation sponsored by NIST.

14,830 citations
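
The n-gram overlap counting described in the ROUGE abstract can be illustrated with a minimal sketch of a ROUGE-N style recall score. The function names `ngrams` and `rouge_n_recall` are hypothetical, and lowercased whitespace tokenization with no stemming or stopword removal is an assumption; this is not the ROUGE package itself.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=1):
    """Clipped n-gram matches divided by the total n-grams in the references."""
    cand_counts = ngrams(candidate.lower().split(), n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        total += sum(ref_counts.values())
        # A reference n-gram counts as matched at most as often as it
        # appears in the candidate summary.
        matched += sum(min(count, cand_counts[gram])
                       for gram, count in ref_counts.items())
    return matched / total if total else 0.0


candidate = "the cat sat on the mat"
references = ["the cat was on the mat"]
print(rouge_n_recall(candidate, references, n=1))  # 5 of 6 reference unigrams match
print(rouge_n_recall(candidate, references, n=2))  # 3 of 5 reference bigrams match
```

The score is recall-oriented because the denominator counts n-grams in the human references, not in the candidate.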

Journal Article • DOI: 10.1145/3130348.3130369

The use of MMR, diversity-based reranking for reordering documents and producing summaries

Jaime Carbonell¹, Jade Goldstein¹•Institutions (1)

Carnegie Mellon University¹

1 Aug 1998

TL;DR: Presents a method for combining query relevance with information novelty in text retrieval and summarization; preliminary results indicate some benefits for MMR diversity ranking in document retrieval and in single-document summarization.

Abstract: This paper presents a method for combining query-relevance with information-novelty in the context of text retrieval and summarization. The Maximal Marginal Relevance (MMR) criterion strives to reduce redundancy while maintaining query relevance in re-ranking retrieved documents and in selecting appropriate passages for text summarization. Preliminary results indicate some benefits for MMR diversity ranking in document retrieval and in single document summarization. The latter are borne out by the recent results of the SUMMAC conference in the evaluation of summarization systems. However, the clearest advantage is demonstrated in constructing non-redundant multi-document summaries, where MMR results are clearly superior to non-MMR passage selection.

2,398 citations
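
The trade-off between query relevance and novelty in the MMR abstract can be sketched as a greedy selection loop: each step picks the passage maximizing `lam * sim(passage, query) - (1 - lam) * max sim(passage, already selected)`. The names `cosine`, `mmr_select`, and `lam` are hypothetical, and the bag-of-words cosine similarity is an assumption standing in for whatever similarity measures a real system uses.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two strings as bag-of-words vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def mmr_select(query, passages, k=2, lam=0.7):
    """Greedily pick k passages, balancing relevance against redundancy."""
    selected = []
    remaining = list(passages)
    while remaining and len(selected) < k:
        def score(passage):
            # Redundancy is the similarity to the closest passage already chosen.
            redundancy = max((cosine(passage, s) for s in selected), default=0.0)
            return lam * cosine(passage, query) - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


passages = [
    "solar power cost fell",
    "solar power cost dropped",
    "wind turbines cost more",
]
print(mmr_select("solar power cost", passages, k=2, lam=0.3))
```

With `lam=1.0` the loop reduces to plain relevance ranking and returns the two near-duplicate solar passages; lowering `lam` pushes the second pick toward the less redundant wind passage.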

Journal Article • DOI: 10.1613/JAIR.1523

LexRank: graph-based lexical centrality as salience in text summarization

Gunes Erkan¹, Dragomir R. Radev¹•Institutions (1)

University of Michigan¹

01 Jul 2004-Journal of Artificial Intelligence Research

TL;DR: LexRank is a stochastic graph-based method for computing the relative importance of textual units in natural language processing, based on the concept of eigenvector centrality in a graph representation of sentences.

Abstract: We introduce a stochastic graph-based method for computing relative importance of textual units for Natural Language Processing. We test the technique on the problem of Text Summarization (TS). Extractive TS relies on the concept of sentence salience to identify the most important sentences in a document or set of documents. Salience is typically defined in terms of the presence of particular important words or in terms of similarity to a centroid pseudo-sentence. We consider a new approach, LexRank, for computing sentence importance based on the concept of eigenvector centrality in a graph representation of sentences. In this model, a connectivity matrix based on intra-sentence cosine similarity is used as the adjacency matrix of the graph representation of sentences. Our system, based on LexRank, ranked in first place in more than one task in the recent DUC 2004 evaluation. In this paper we present a detailed analysis of our approach and apply it to a larger data set including data from earlier DUC evaluations. We discuss several methods to compute centrality using the similarity graph. The results show that degree-based methods (including LexRank) outperform both centroid-based methods and other systems participating in DUC in most of the cases. Furthermore, the LexRank with threshold method outperforms the other degree-based techniques including continuous LexRank. We also show that our approach is quite insensitive to the noise in the data that may result from an imperfect topical clustering of documents.

2,367 citations
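
The thresholded similarity graph and eigenvector centrality described in the LexRank abstract can be sketched as a power iteration over a row-normalized adjacency matrix. This is a simplified illustration, not the authors' system: the name `lexrank`, the plain term-frequency cosine, and the `threshold`, `damping`, and `iters` values are all assumptions.

```python
import math
from collections import Counter


def cosine(va, vb):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def lexrank(sentences, threshold=0.2, damping=0.85, iters=50):
    """Score sentences by centrality in a thresholded cosine-similarity graph."""
    n = len(sentences)
    bags = [Counter(s.lower().split()) for s in sentences]
    # Binary adjacency: an edge exists when similarity exceeds the threshold.
    adj = [[1.0 if cosine(bags[i], bags[j]) > threshold else 0.0
            for j in range(n)] for i in range(n)]
    # Row-normalize so each row is a probability distribution over neighbors.
    trans = []
    for row in adj:
        degree = sum(row)
        trans.append([v / degree for v in row] if degree else [1.0 / n] * n)
    # Power iteration with a PageRank-style damping term.
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [(1 - damping) / n
                  + damping * sum(scores[j] * trans[j][i] for j in range(n))
                  for i in range(n)]
    return scores


sentences = [
    "storm floods coast road",
    "storm warning issued",
    "coast guard rescue",
    "road closed",
]
for sentence, score in zip(sentences, lexrank(sentences)):
    print(f"{score:.3f}  {sentence}")
```

In this toy input the first sentence shares a word with each of the others while they share nothing among themselves, so it sits at the center of the graph and receives the highest score.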

Journal Article • DOI: 10.1016/J.IPM.2003.10.006

Centroid-based summarization of multiple documents

Dragomir R. Radev¹, Hongyan Jing², Małgorzata Styś², Daniel Tam¹•Institutions (2)

University of Michigan¹, IBM²

01 Nov 2004-Information Processing and Management

TL;DR: A multi-document summarizer, MEAD, is presented, which generates summaries using cluster centroids produced by a topic detection and tracking system and an evaluation scheme based on sentence utility and subsumption is applied.

Abstract: We present a multi-document summarizer, MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We describe two new techniques, a centroid-based summarizer, and an evaluation scheme based on sentence utility and subsumption. We have applied this evaluation to both single and multiple document summaries. Finally, we describe two user studies that test our models of multi-document summarization.

1,248 citations
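
The idea of scoring sentences against a cluster centroid can be sketched as follows. This is a rough illustration rather than MEAD itself: the name `centroid_rank` is hypothetical, the centroid is assumed to be the summed term-frequency vector of all sentences in the cluster, and no IDF weighting or positional features are used.

```python
import math
from collections import Counter


def centroid_rank(sentences):
    """Return sentence indices ordered by cosine similarity to the centroid."""
    bags = [Counter(s.lower().split()) for s in sentences]
    # The centroid "pseudo-sentence" accumulates every term in the cluster.
    centroid = Counter()
    for bag in bags:
        centroid.update(bag)

    def cosine(va, vb):
        dot = sum(va[t] * vb[t] for t in va)
        norm = (math.sqrt(sum(v * v for v in va.values()))
                * math.sqrt(sum(v * v for v in vb.values())))
        return dot / norm if norm else 0.0

    scores = [cosine(bag, centroid) for bag in bags]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


sentences = [
    "storm floods coast road",
    "storm warning issued",
    "stock markets rallied",
]
print(centroid_rank(sentences))  # indices, most centroid-like sentence first
```

Sentences that contain terms frequent across the cluster (here "storm") rank above sentences built from terms that appear only once.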

Proceedings Article • DOI: 10.1145/383952.383955

Generic text summarization using relevance measure and latent semantic analysis

Yihong Gong¹, Xin Liu¹•Institutions (1)

NEC¹

1 Sep 2001

TL;DR: Proposes two generic text summarization methods that rank and extract sentences from the original documents: one uses standard IR methods to rank sentence relevance, and the other uses latent semantic analysis to identify semantically important sentences.

Abstract: In this paper, we propose two generic text summarization methods that create text summaries by ranking and extracting sentences from the original documents. The first method uses standard IR methods to rank sentence relevances, while the second method uses the latent semantic analysis technique to identify semantically important sentences, for summary creations. Both methods strive to select sentences that are highly ranked and different from each other. This is an attempt to create a summary with a wider coverage of the document's main content and less redundancy. Performance evaluations on the two summarization methods are conducted by comparing their summarization outputs with the manual summaries generated by three independent human evaluators. The evaluations also study the influence of different VSM weighting schemes on the text summarization performances. Finally, the causes of the large disparities in the evaluators' manual summarization results are investigated, and discussions on human text summarization patterns are presented.

991 citations

Year	Papers
2025	31
2024	53
2023	120
2022	178
2021	57
2020	63
