Open Access
DOI: 10.7916/D8TB1G77
Summarization Evaluation Methods: Experiments and Analysis
Kathleen R. McKeown, Hongyan Jing, Regina Barzilay, Michael Elhadad +3 more
- 01 Jan 1998
TL;DR: The results show that different parameters of an experiment can affect how well a system scores, and describe how parameters can be controlled to produce a sound evaluation.
Abstract: Two methods are used for evaluation of summarization systems: an evaluation of generated summaries against an "ideal" summary and evaluation of how well summaries help a person perform in a task such as information retrieval. We carried out two large experiments to study the two evaluation methods. Our results show that different parameters of an experiment can dramatically affect how well a system scores. For example, summary length was found to affect both types of evaluations. For the "ideal" summary based evaluation, accuracy decreases as summary length increases, while for task based evaluations summary length and accuracy on an information retrieval task appear to correlate randomly. In this paper, we show how this parameter and others can affect evaluation results and describe how parameters can be controlled to produce a sound evaluation.
Citations
Centroid-based summarization of multiple documents
TL;DR: A multi-document summarizer, MEAD, is presented, which generates summaries using cluster centroids produced by a topic detection and tracking system and an evaluation scheme based on sentence utility and subsumption is applied.
What is user engagement? A conceptual framework for defining user engagement with technology
TL;DR: An extensive, critical multidisciplinary literature review and exploratory study of users of Web searching, online shopping, Webcasting, and gaming applications indicates that engagement is a process comprised of four distinct stages: point of engagement, period of sustained engagement, disengagement, and reengagement.
Dialogue act recognition using maximum entropy
TL;DR: A feature-based classification approach for DA recognition is applied, by using the maximum entropy (ME) method to build a classifier for labeling utterances with DA tags, which simplifies the implementation of the classifier and improves the efficiency of DA recognition, without sacrificing the classification accuracy.
•Book
Automatic Summarization
Ani Nenkova, Sameer Maskey, Yang Liu +2 more
- 27 Jun 2011
TL;DR: The challenges that remain open, in particular the need for language generation and deeper semantic understanding of language that would be necessary for future advances in the field are discussed.
Evaluating Content Selection in Summarization: The Pyramid Method
Ani Nenkova, Rebecca J. Passonneau +1 more
- 01 Jan 2004
TL;DR: It is argued that the method presented is reliable, predictive and diagnostic, thus improves considerably over the shortcomings of the human evaluation method currently used in the Document Understanding Conference.
References
New Methods in Automatic Extracting
TL;DR: New methods of automatically extracting documents for screening purposes, i.e. the computer selection of sentences having the greatest potential for conveying to the reader the substance of the document, indicate that the three newly proposed components dominate the frequency component in the production of better extracts.
A trainable document summarizer
Julian M. Kupiec, Jan O. Pedersen, Francine Chen +2 more
- 01 Jul 1995
TL;DR: The trends in the results are in agreement with those of Edmundson who used a subjectively weighted combination of features as opposed to training the feature weights using a corpus, which suggests that even shorter extracts may be useful indicative summaries.
Multi-paragraph segmentation of expository text
Marti A. Hearst
- 27 Jun 1994
TL;DR: TextTiling is an algorithm for partitioning expository texts into coherent multi-paragraph discourse units that reflect the subtopic structure of the texts; it uses domain-independent lexical frequency and distribution information to recognize the interactions of multiple simultaneous themes.
•Posted Content
Multi-Paragraph Segmentation of Expository Text
TL;DR: TextTiling, an algorithm for partitioning expository texts into coherent multi-paragraph discourse units which reflect the subtopic structure of the texts, is described and shown to produce segmentation that corresponds well to human judgments of the major subtopic boundaries of thirteen lengthy texts.
Recall of prose as a function of the structural importance of the linguistic units.
TL;DR: In this article, the structural importance of linguistic units was shown to be related to their recall, and the linguistic units were then objectively ordered according to their importance to the structure of the larger prose passage.