Experiments in multidocument summarization
Barry Schiffman, Ani Nenkova, Kathleen R. McKeown
- 24 Mar 2002
- pp 52-58
TL;DR: A multidocument summarizer built upon research into the detection of new information uses several new strategies to select interesting and informative sentences, including an innovative measure of importance derived from the analysis of a large corpus.
Abstract: This paper describes a multidocument summarizer built upon research into the detection of new information. The summarizer uses several new strategies to select interesting and informative sentences, including an innovative measure of importance derived from the analysis of a large corpus. The system also computes concept frequencies rather than word frequencies as an additional measure of importance. It merges these strategies with a number of familiar summarization heuristics to rank sentences. The initial version of the summarizer performed successfully in the evaluation reported at the Document Understanding Conference last year, although the system addressed only the content of the summary and not the presentation. We also discuss here the procedures we are developing to improve the presentation and readability of the summaries.
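The difference between counting concepts and counting words can be illustrated with a minimal sketch. This is not the authors' system: the abstract does not say how concepts are identified, so the hand-written `CONCEPTS` lexicon, the `STOPWORDS` set, and the function names below are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter
import re

# Hypothetical toy lexicon mapping words to a shared concept label.
# A real system would draw on a lexical resource; this is an assumption.
CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "crash": "collision", "collision": "collision", "accident": "collision",
}

# Small illustrative stopword set (also an assumption).
STOPWORDS = {"the", "a", "an", "was", "in", "of", "and", "to"}


def tokenize(text):
    """Lowercase the text and return its non-stopword alphabetic tokens."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def to_concept(word):
    """Map a word to its concept label, falling back to the word itself."""
    return CONCEPTS.get(word, word)


def concept_frequencies(sentences):
    """Count how often each concept occurs across all input sentences."""
    return Counter(to_concept(w) for s in sentences for w in tokenize(s))


def rank_sentences(sentences):
    """Order sentences by the summed frequency of their distinct concepts."""
    counts = concept_frequencies(sentences)

    def score(sentence):
        return sum(counts[c] for c in {to_concept(w) for w in tokenize(sentence)})

    return sorted(sentences, key=score, reverse=True)


sentences = [
    "The car was in a crash.",
    "An automobile accident closed the road.",
    "Officials spoke later.",
]
ranked = rank_sentences(sentences)
```

In this toy input, "car" and "automobile" never repeat as words, but both count toward the same concept, so the two sentences about the collision reinforce each other. A plain word-frequency count would treat every word as occurring once.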
Citations
•Book
Automatic Summarization
Ani Nenkova, Sameer Maskey, Yang Liu
- 27 Jun 2011
TL;DR: The challenges that remain open are discussed, in particular the need for language generation and for the deeper semantic understanding of language that future advances in the field would require.
889
A survey of text summarization techniques
Ani Nenkova, Kathleen R. McKeown
- 01 Jan 2012
TL;DR: This chapter gives a broad overview of existing approaches based on how representation, sentence scoring or summary selection strategies alter the overall performance of the summarizer, and points out some of the peculiarities of the task of summarization.
686
Sentence Fusion for Multidocument News Summarization
TL;DR: This article introduces sentence fusion, a novel text-to-text generation technique for synthesizing common information across documents that moves the summarization field from the use of purely extractive methods to the generation of abstracts that contain sentences not found in any of the input documents and can synthesize information across sources.
514
Inferring strategies for sentence ordering in multidocument news summarization
TL;DR: This article proposes a methodology for studying the properties of ordering information in the news genre and describes experiments done on a corpus of multiple acceptable orderings developed for the task. Based on these experiments, the authors implemented a strategy for ordering information that combines constraints from the chronological order of events and topical relatedness.
Newsjunkie: providing personalized newsfeeds via analysis of information novelty
Evgeniy Gabrilovich, Susan T. Dumais, Eric Horvitz
- 17 May 2004
TL;DR: This paper describes Newsjunkie, a system that personalizes news for users by identifying the novelty of stories in the context of stories they have already reviewed, and that employs novelty-analysis algorithms that represent articles as words and named entities.
272
References
Introduction to WordNet: An On-line Lexical Database
TL;DR: Standard alphabetical procedures for organizing lexical information put together words that are spelled alike and scatter words with similar or related meanings haphazardly through the list.
Centroid-based summarization of multiple documents
TL;DR: A multi-document summarizer, MEAD, is presented, which generates summaries using cluster centroids produced by a topic detection and tracking system and an evaluation scheme based on sentence utility and subsumption is applied.
The automated acquisition of topic signatures for text summarization
Chin-Yew Lin, Eduard Hovy
- 31 Jul 2000
TL;DR: A method for automatically training topic signatures (sets of related words, with associated weights, organized around head topics) is described and illustrated with signatures the authors created with 6,194 TREC collection texts over 4 selected topics.
Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies
Dragomir R. Radev, Hongyan Jing, Malgorzata Budzikowska
- 30 Apr 2000
TL;DR: A multi-document summarizer, called MEAD, is presented, which generates summaries using cluster centroids produced by a topic detection and tracking system and two new techniques, based on sentence utility and subsumption, are described.
Multi-document summarization by sentence extraction
Jade Goldstein, Vibhu Mittal, Jaime G. Carbonell, Mark Kantrowitz
- 30 Apr 2000
TL;DR: This paper discusses a text extraction approach to multi-document summarization that builds on single-document summarization methods by using additional, available information about the document set as a whole and the relationships between the documents.