Top 96 papers published in the topic of Multi-document summarization in 2007

Showing papers on "Multi-document summarization" published in 2007

Proceedings Article•

Manifold-ranking based topic-focused multi-document summarization

Xiaojun Wan¹, Jianwu Yang¹, Jianguo Xiao¹•Institutions (1)

6 Jan 2007

TL;DR: A novel extractive approach based on manifold-ranking of sentences is proposed for topic-focused multi-document summarization; it significantly outperforms both the top-performing systems in the DUC tasks and baseline approaches.

Abstract: Topic-focused multi-document summarization aims to produce a summary biased to a given topic or user profile. This paper presents a novel extractive approach based on manifold-ranking of sentences to this summarization task. The manifold-ranking process can naturally make full use of both the relationships among all the sentences in the documents and the relationships between the given topic and the sentences. The ranking score is obtained for each sentence in the manifold-ranking process to denote the biased information richness of the sentence. Then the greedy algorithm is employed to impose diversity penalty on each sentence. The summary is produced by choosing the sentences with both high biased information richness and high information novelty. Experiments on DUC2003 and DUC2005 are performed and the ROUGE evaluation results show that the proposed approach can significantly outperform existing approaches of the top performing systems in DUC tasks and baseline approaches.
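The two stages described in this abstract (manifold-ranking, then a greedy diversity penalty) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the bag-of-words cosine similarity, the parameter values alpha and omega, the use of the symmetrically normalized matrix in the penalty step, and every function name here are choices made for the example.

```python
import math

def bow(text):
    """Bag-of-words counts for a whitespace-tokenized, lowercased text."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def cosine(a, b):
    """Cosine similarity between two bag-of-words dicts."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def manifold_rank(topic, sentences, alpha=0.6, iters=100):
    """Iterate f = alpha * S f + (1 - alpha) * y, where node 0 is the topic."""
    vecs = [bow(topic)] + [bow(s) for s in sentences]
    n = len(vecs)
    # Affinity matrix with a zero diagonal (no self-loops).
    w = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in w]
    # Symmetric normalization: S = D^(-1/2) W D^(-1/2).
    s = [[w[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] and deg[j] else 0.0
          for j in range(n)] for i in range(n)]
    y = [1.0] + [0.0] * (n - 1)  # only the topic node carries a prior
    f = y[:]
    for _ in range(iters):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    # Drop the topic node; return sentence scores and sentence-level similarities.
    return f[1:], [row[1:] for row in s[1:]]

def greedy_select(scores, sim, k, omega=8.0):
    """Pick k sentences, penalizing those similar to sentences already picked."""
    current = list(scores)
    base = list(scores)
    remaining = set(range(len(scores)))
    chosen = []
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda i: current[i])
        chosen.append(best)
        remaining.remove(best)
        for j in remaining:
            current[j] -= omega * sim[j][best] * base[best]
    return chosen
```

With a large omega, a near-duplicate of an already selected sentence is pushed below even weakly related sentences, which is the intended effect of the diversity penalty.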

254 citations

Proceedings Article•

Enhancing Single-Document Summarization by Combining RankNet and Third-Party Sources

Krysta M. Svore¹, Lucy Vanderwende, Christopher J. C. Burges•Institutions (1)

Microsoft¹

1 Jan 2007

TL;DR: A new approach to automatic summarization based on neural nets, called NetSum, that extracts a set of features from each sentence that helps identify its importance in the document, and applies novel features based on news search query logs and Wikipedia entities.

Abstract: We present a new approach to automatic summarization based on neural nets, called NetSum. We extract a set of features from each sentence that helps identify its importance in the document. We apply novel features based on news search query logs and Wikipedia entities. Using the RankNet learning algorithm, we train a pair-based sentence ranker to score every sentence in the document and identify the most important sentences. We apply our system to documents gathered from CNN.com, where each document includes highlights and an article. Our system significantly outperforms the standard baseline in the ROUGE-1 measure on over 70% of our document set.
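The pair-based training idea can be illustrated with a RankNet-style pairwise cross-entropy loss. The sketch below is only an assumption-laden illustration: it swaps the neural net for a linear scorer to stay short, and the feature vectors, learning rate, and function names are hypothetical rather than taken from NetSum.

```python
import math

def score(weights, features):
    """Linear stand-in for the neural scoring function."""
    return sum(w * x for w, x in zip(weights, features))

def pairwise_loss(weights, better, worse):
    """Cross-entropy loss for a pair in which `better` should outrank `worse`."""
    diff = score(weights, better) - score(weights, worse)
    return math.log1p(math.exp(-diff))

def train(pairs, n_features, lr=0.1, epochs=200):
    """Stochastic gradient descent over (better, worse) feature-vector pairs."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = score(weights, better) - score(weights, worse)
            # d(loss)/d(diff) = -1 / (1 + exp(diff))
            grad_scale = -1.0 / (1.0 + math.exp(diff))
            for k in range(n_features):
                weights[k] -= lr * grad_scale * (better[k] - worse[k])
    return weights
```

At prediction time every sentence in a document would be scored with the learned weights and the highest-scoring sentences kept.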

246 citations

Proceedings Article•10.1145/1242572.1242766•

Character-based automated media summarization

Frank Elmo Weber

23 Apr 2007

TL;DR: In this article, the authors present methods, devices, systems and tools that allow the summarization of text, audio, and audiovisual presentations, such as movies, into less lengthy forms.

Abstract: Methods, devices, systems and tools are presented that allow the summarization of text, audio, and audiovisual presentations, such as movies, into less lengthy forms. High-content media files are shortened in a manner that preserves important details, by splitting the files into segments, rating the segments, and reassembling preferred segments into a final abridged piece. Summarization of media can be customized by user selection of criteria, and opens new possibilities for delivering entertainment, news, and information in the form of dense, information-rich content that can be viewed by means of broadcast or cable distribution, “on-demand” distribution, internet and cell phone digital video streaming, or can be downloaded onto an iPod™ and other portable video playback devices.

229 citations

Proceedings Article•

Towards an Iterative Reinforcement Approach for Simultaneous Document Summarization and Keyword Extraction

Xiaojun Wan¹, Jianwu Yang¹, Jianguo Xiao¹•Institutions (1)

Peking University¹

1 Jun 2007

TL;DR: A novel iterative reinforcement approach to simultaneously extracting the summary and keywords from a single document, under the assumption that the summary and keywords of a document can be mutually boosted.

Abstract: Though both document summarization and keyword extraction aim to extract concise representations from documents, these two tasks have usually been investigated independently. This paper proposes a novel iterative reinforcement approach to simultaneously extracting summary and keywords from single document under the assumption that the summary and keywords of a document can be mutually boosted. The approach can naturally make full use of the reinforcement between sentences and keywords by fusing three kinds of relationships between sentences and words, either homogeneous or heterogeneous. Experimental results show the effectiveness of the proposed approach for both tasks. The corpus-based approach is validated to work almost as well as the knowledge-based approach for computing word semantics.
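The mutual-boosting assumption can be illustrated with a simplified iteration. The paper fuses three kinds of relationships; the sketch below keeps only the heterogeneous sentence-to-word relationship (a word is important if it appears in important sentences, and vice versa), so it is a reduced illustration rather than the proposed method. The function name and whitespace tokenization are assumptions.

```python
def mutual_reinforcement(sentences, iters=50):
    """Jointly score sentences and words using sentence-word links only."""
    tokenized = [set(s.lower().split()) for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    sent_score = [1.0] * len(sentences)
    word_score = {w: 1.0 for w in vocab}
    for _ in range(iters):
        # A word collects the scores of the sentences that contain it.
        new_word = {w: 0.0 for w in vocab}
        for toks, s in zip(tokenized, sent_score):
            for w in toks:
                new_word[w] += s
        # A sentence collects the scores of the words it contains.
        new_sent = [sum(new_word[w] for w in toks) for toks in tokenized]
        # Normalize so both score vectors sum to one.
        word_total = sum(new_word.values()) or 1.0
        sent_total = sum(new_sent) or 1.0
        word_score = {w: v / word_total for w, v in new_word.items()}
        sent_score = [v / sent_total for v in new_sent]
    return sent_score, word_score
```

Because this reduced version has no length normalization, it favors long sentences; the sentence-to-sentence and word-to-word relationships described in the abstract are what a fuller version would add.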

204 citations

Journal Article•10.1016/J.IPM.2007.01.026•

The use of domain-specific concepts in biomedical text summarization

Lawrence H. Reeve¹, Hyoil Han¹, Ari D. Brooks¹•Institutions (1)

Drexel University¹

01 Nov 2007-Information Processing and Management

TL;DR: Two independent methods for identifying salient sentences in biomedical texts using concepts derived from domain-specific resources are presented and it is shown that the best performance is achieved when the two methods are combined.

Abstract: Text summarization is a method for data reduction. The use of text summarization enables users to reduce the amount of text that must be read while still assimilating the core information. The data reduction offered by text summarization is particularly useful in the biomedical domain, where physicians must continuously find clinical trial study information to incorporate into their patient treatment efforts. Such efforts are often hampered by the high-volume of publications. This paper presents two independent methods (BioChain and FreqDist) for identifying salient sentences in biomedical texts using concepts derived from domain-specific resources. Our semantic-based method (BioChain) is effective at identifying thematic sentences, while our frequency-distribution method (FreqDist) removes information redundancy. The two methods are then combined to form a hybrid method (ChainFreq). An evaluation of each method is performed using the ROUGE system to compare system-generated summaries against a set of manually-generated summaries. The BioChain and FreqDist methods outperform some common summarization systems, while the ChainFreq method improves upon the base approaches. Our work shows that the best performance is achieved when the two methods are combined. The paper also presents a brief physician's evaluation of three randomly-selected papers from an evaluation corpus to show that the author's abstract does not always reflect the entire contents of the full-text.

139 citations

Proceedings Article•

Explorations in Automatic Book Summarization

Rada Mihalcea¹, Hakan Ceylan¹•Institutions (1)

University of North Texas¹

1 Jun 2007

TL;DR: A new data set specifically designed for the evaluation of systems for book summarization is introduced, and summarization techniques that explicitly account for the length of the documents are described.

Abstract: Most of the text summarization research carried out to date has been concerned with the summarization of short documents (e.g., news stories, technical reports), and very little work if any has been done on the summarization of very long documents. In this paper, we try to address this gap and explore the problem of book summarization. We introduce a new data set specifically designed for the evaluation of systems for book summarization, and describe summarization techniques that explicitly account for the length of the documents.

95 citations

Journal Article•10.1186/1471-2105-8-S9-S4•

A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method

Illhoi Yoo¹, Xiaohua Hu², Il-Yeol Song²•Institutions (2)

University of Missouri¹, Drexel University²

27 Nov 2007-BMC Bioinformatics

TL;DR: A coherent graph-based semantic clustering and summarization approach for biomedical literature that takes advantage of ontology-enriched graphical representations significantly improves the quality of document clusters and understandability of documents through summaries.

Abstract: Background: A huge amount of biomedical textual information has been produced and collected in MEDLINE for decades. In order to easily utilize biomedical information in the free text, document clustering and text summarization together are used as a solution for the text information overload problem. In this paper, we introduce a coherent graph-based semantic clustering and summarization approach for biomedical literature.

80 citations

Proceedings Article•10.3115/1557769.1557825•

Measuring Importance and Query Relevance in Topic-focused Multi-document Summarization

Surabhi Gupta¹, Ani Nenkova¹, Dan Jurafsky¹•Institutions (1)

Stanford University¹

25 Jun 2007

TL;DR: A principled comparison between the two most commonly used schemes for assigning importance to words in the context of query focused multi-document summarization finds that log-likelihood ratio is more suitable for query-focused summarization since, unlike raw frequency, it is more sensitive to the integration of the information need defined by the user.

Abstract: The increasing complexity of summarization systems makes it difficult to analyze exactly which modules make a difference in performance. We carried out a principled comparison between the two most commonly used schemes for assigning importance to words in the context of query focused multi-document summarization: raw frequency (word probability) and log-likelihood ratio. We demonstrate that the advantages of log-likelihood ratio come from its known distributional properties which allow for the identification of a set of words that in its entirety defines the aboutness of the input. We also find that LLR is more suitable for query-focused summarization since, unlike raw frequency, it is more sensitive to the integration of the information need defined by the user.
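The contrast between the two word-weighting schemes can be made concrete with a small computation. The sketch below uses the standard binomial log-likelihood ratio (Dunning's statistic) comparing a word's rate in the input with its rate in a background corpus; the function names and the example counts are assumptions for illustration and do not come from the paper.

```python
import math

def _log_likelihood(p, k, n):
    """Binomial log-likelihood of k successes in n trials (constant term omitted)."""
    total = 0.0
    if k > 0:
        total += k * math.log(p)
    if n - k > 0:
        total += (n - k) * math.log(1.0 - p)
    return total

def raw_frequency(k_input, n_input):
    """Word probability in the input documents."""
    return k_input / n_input

def llr(k_input, n_input, k_background, n_background):
    """-2 log(lambda): how strongly the input rate differs from the background rate."""
    p1 = k_input / n_input
    p2 = k_background / n_background
    p = (k_input + k_background) / (n_input + n_background)
    return 2.0 * (
        _log_likelihood(p1, k_input, n_input)
        + _log_likelihood(p2, k_background, n_background)
        - _log_likelihood(p, k_input, n_input)
        - _log_likelihood(p, k_background, n_background)
    )
```

A function word that is frequent everywhere gets a high raw frequency but an LLR near zero, while a word that is rare in the background and common in the input gets a large LLR. Because the statistic is asymptotically chi-square distributed, a cutoff such as 10.83 (p < 0.001 with one degree of freedom) can be used to select a set of topic words; the specific cutoff is an assumption here.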

66 citations

Journal Article•10.1016/J.IPM.2007.01.004•

Satisfying information needs with multi-document summaries

Sanda M. Harabagiu¹, Andrew Hickl², Finley Lacatusu²•Institutions (2)

University of Texas at Dallas¹, Language Computer Corporation²

01 Nov 2007-Information Processing and Management

TL;DR: This novel framework for summarization has the advantage of producing highly responsive summaries, as indicated by the evaluation results.

Abstract: Generating summaries that meet the information needs of a user relies on (1) several forms of question decomposition; (2) different summarization approaches; and (3) textual inference for combining the summarization strategies. This novel framework for summarization has the advantage of producing highly responsive summaries, as indicated by the evaluation results.

57 citations

Proceedings Article•10.21437/INTERSPEECH.2007-717•

A Comparative Study on Speech Summarization of Broadcast News and Lecture Speech

Jian Zhang, Ho Yin Chan, Pascale Fung, Lu Cao

27 Aug 2007

TL;DR: It is found that acoustic and structural features are more important for Broadcast News summarization due to the speaking styles of anchors and reporters, as well as typical news story flow.

Abstract: We carry out a comprehensive study of acoustic/prosodic, linguistic and structural features for speech summarization, contrasting two genres of speech, namely Broadcast News and Lecture Speech. We find that acoustic and structural features are more important for Broadcast News summarization due to the speaking styles of anchors and reporters, as well as typical news story flow. Due to the relatively small contribution of lexical features, Broadcast News summarization does not depend heavily on ASR accuracies. We use SVM based summarizer to select the best features for extractive summarization, and obtain state-of-the-art performances: ROUGE-L F-measure of 0.64 for Mandarin Broadcast News, and 0.65 for Mandarin Lecture Speech. In the case of Lecture Speech summarization where lexical features are more important, we make the surprising discovery that summarization performance is very high (0.63 ROUGE-L F-measure) even when the ASR accuracy is low (21% CER). Index Terms: speech summarization

Proceedings Article•10.1145/1277741.1277768•

CollabSum: exploiting multiple document clustering for collaborative single document summarizations

Xiaojun Wan, Jianwu Yang

23 Jul 2007

TL;DR: This paper proposes CollabSum, a novel framework for collaborative single-document summarization that exploits the mutual influences of multiple documents within a cluster context: a clustering algorithm first obtains appropriate document clusters, and a graph-ranking based algorithm then summarizes the documents collaboratively within each cluster.

Abstract: Almost all existing methods conduct the summarization tasks for single documents separately without interactions for each document under the assumption that the documents are considered independent of each other. This paper proposes a novel framework called CollabSum for collaborative single document summarizations by making use of mutual influences of multiple documents within a cluster context. In this study, CollabSum is implemented by first employing the clustering algorithm to obtain appropriate document clusters and then exploiting the graph-ranking based algorithm for collaborative document summarizations within each cluster. Both the within-document and cross-document relationships between sentences are incorporated in the algorithm. Experiments on the DUC2001 and DUC2002 datasets demonstrate the encouraging performance of the proposed approach. Different clustering algorithms have been investigated and we find that the summarization performance relies positively on the quality of the document clusters.

Resource Lean and Portable Automatic Text Summarization

Martin Hassel

1 Jan 2007

TL;DR: Today, with digitally stored information available in abundance, even for many minor languages, this information must by some means be filtered and extracted in order to avoid drowning in it.

Abstract: Today, with digitally stored information available in abundance, even for many minor languages, this information must by some means be filtered and extracted in order to avoid drowning in it. Autom ...

DUC 2005: Evaluation of Question-Focused Summarization Systems

Hoa Trang Dang

22 Jan 2007

TL;DR: The Document Understanding Conference (DUC) 2005 evaluation had a single user-oriented, question-focused summarization task, which was to synthesize from a set of 25-50 documents a well-organized, fluent answer to a complex question.

Abstract: The Document Understanding Conference (DUC) 2005 evaluation had a single user-oriented, question-focused summarization task, which was to synthesize from a set of 25-50 documents a well-organized, fluent answer to a complex question. The evaluation shows that the best summarization systems have difficulty extracting relevant sentences in response to complex questions (as opposed to representative sentences that might be appropriate to a generic summary). The relatively generous allowance of 250 words for each answer also reveals how difficult it is for current summarization systems to produce fluent text from multiple documents.

A Semantic Free-text Summarization System Using Ontology Knowledge

Rakesh M. Verma, Ping Chen, Wei Lu

1 Jan 2007

TL;DR: A new user-query-based text summarization technique is proposed that makes use of WordNet, a general knowledge source from Princeton University; the system is specially tuned to summarize medical documents by integrating the Unified Medical Language System, a medical ontology knowledge source from the National Library of Medicine.

Abstract: As huge amounts of knowledge are created rapidly, effective information access becomes an important issue. Especially for critical domains, such as medical and financial areas, efficient retrieval of concise and relevant information is highly desired. In this paper we propose a new user query based text summarization technique that makes use of WordNet, a general knowledge source from Princeton University. Our summarization system is specially tuned to summarize medical documents by integrating Unified Medical Language System, a medical ontology knowledge source from National Library of Medicine. We participated in the Document Understanding Conference 2007 Main Task and ranked in the middle tier of 32 systems.

Journal Article•10.1016/J.IPM.2007.01.014•

Older versions of the ROUGEeval summarization evaluation system were easier to fool

Jonas Sjöbergh

01 Nov 2007-Information Processing and Management

TL;DR: A method for automatic summarization based on a Markov model of the source text, by a simple greedy word selection strategy, is presented, and summaries with high ROUGE-scores are generated.

Abstract: We show some limitations of the ROUGE evaluation method for automatic summarization. We present a method for automatic summarization based on a Markov model of the source text. By a simple greedy word selection strategy, summaries with high ROUGE-scores are generated. These summaries would however not be considered good by human readers. The method can be adapted to trick different settings of the ROUGEeval package.
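The kind of weakness described here is easy to see with a simplified n-gram recall score. The sketch below is not the ROUGEeval package and does not reproduce the paper's Markov-model method: it is a single-reference ROUGE-N recall with no stemming or stop-word handling, and the function names are assumptions.

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also appear in the candidate (clipped counts)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

A candidate that contains the right words in a scrambled order still gets a perfect unigram recall, even though no human reader would accept it as a summary; higher-order n-grams penalize the scrambling but can in turn be targeted by selecting frequent word sequences.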

Proceedings Article•10.1109/ICSC.2007.16•

Automated summarization of narrative video on a semantic level

Tsvetomira Tsoneva¹, Mauro Barbieri¹, Hans Weda¹•Institutions (1)

Philips¹

17 Sep 2007

TL;DR: An automated content analysis and summarization framework for creating moving-image summaries for narrative videos aimed at preserving the story line to the level that users can watch the summary instead of the original content.

Abstract: The movie industry produces thousands of feature films and TV series annually. Such massive data volumes would take consumers more than a lifetime to watch. Therefore, summarization of narrative media, which engages in providing concise and informative video summaries, has become a popular topic of research. However, most of the summarization solutions so far aim to represent just the overall atmosphere of the video at the expense of the story line. In this paper we describe a novel approach for automated creation of summaries for narrative videos. We propose an automated content analysis and summarization framework for creating moving-image summaries. We aim at preserving the story line to the level that users can watch the summary instead of the original content. Our solution is based on textual cues available in subtitles and movie scripts. We extract features like keywords, main characters names and presence, and combine them in an importance function to identify the moments most relevant for preserving the story line. We develop several summarization methods and evaluate the quality of the resulting summaries in terms of user understanding and user satisfaction through a user test.

Multiple Alternative Sentence Compressions for Automatic Text Summarization

Nitin Madnani, David Zajic, Bonnie J. Dorr, Necip Fazil Ayan, Jimmy Lin

1 Jan 2007

TL;DR: A parse-and-trim approach with a novel technique for producing multiple alternative compressions for source sentences and using weighted features of these candidates to construct summaries for multi-document summarization.

Abstract: We perform multi-document summarization by generating compressed versions of source sentences as summary candidates and using weighted features of these candidates to construct summaries. We combine a parse-and-trim approach with a novel technique for producing multiple alternative compressions for source sentences. In addition, we use a novel method for tuning the feature weights that maximizes the change in the ROUGE-2 score (ΔROUGE) between the already existing summary state and the new state that results from the addition of the candidate under consideration. We also describe experiments using a new paraphrase-based feature for redundancy checking. Finally, we present the results of our DUC2007 submissions and some ideas for future work.

10.5555/1931390.1931403•

A language independent approach to multilingual text summarization

Alkesh Patel¹, Tanveer J. Siddiqui¹, Uma Shanker Tiwary¹•Institutions (1)

Indian Institute of Information Technology, Allahabad¹

30 May 2007

TL;DR: An efficient algorithm for language independent generic extractive summarization for single document based on structural and statistical factors is described, which shows that the method performs equally well regardless of the language.

Abstract: This paper describes an efficient algorithm for language independent generic extractive summarization for single document. The algorithm is based on structural and statistical (rather than semantic) factors. Through evaluations performed on a single-document summarization for English, Hindi, Gujarati and Urdu documents, we show that the method performs equally well regardless of the language. The algorithm has been applied on DUC data for English documents and various newspaper articles for other languages with corresponding stop words list and modified stemmer. The results of summarization have been compared with DUC 2002 data using degree of representativeness. For other languages, the degree of representativeness we get is highly encouraging.

Proceedings Article•10.1145/1277741.1277949•

TimedTextRank: adding the temporal dimension to multi-document summarization

Xiaojun Wan¹•Institutions (1)

Peking University¹

23 Jul 2007

TL;DR: This work proposes the TimedTextRank algorithm to make use of the temporal information of documents based on the graph-ranking based algorithm for dynamic multi-document summarization.

Abstract: Graph-ranking based algorithms (e.g. TextRank) have been proposed for multi-document summarization in recent years. However, these algorithms miss an important dimension, the temporal dimension, for summarizing evolving topics. For an evolving topic, recent documents are usually more important than earlier documents because recent documents contain much more novel information than earlier documents and a novelty-oriented summary should be more appropriate to reflect the changing topic. We propose the TimedTextRank algorithm to make use of the temporal information of documents based on the graph-ranking based algorithm. A preliminary study is performed to demonstrate the effectiveness of the proposed TimedTextRank algorithm for dynamic multi-document summarization.
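One plausible way to add a temporal dimension to a graph-ranking algorithm is sketched below. The abstract does not give the formula, so everything specific here is an assumption: the exponential decay with a half-life, the choice to discount each sentence's incoming votes by the age of its own document, and the names timed_textrank, ages, and half_life. The paper's exact weighting may differ.

```python
def timed_textrank(sim, ages, half_life=1.0, d=0.85, iters=50):
    """TextRank-style scores in which sentences from older documents are discounted.

    sim[i][j] is the similarity between sentences i and j; ages[i] is the age of
    the document containing sentence i, in the same unit as half_life.
    """
    n = len(sim)
    # Total outgoing edge weight of each sentence, ignoring self-links.
    out = [sum(sim[j][k] for k in range(n) if k != j) for j in range(n)]
    # Assumed decay: the weight halves every half_life units of age.
    decay = [0.5 ** (age / half_life) for age in ages]
    scores = [1.0] * n
    for _ in range(iters):
        new_scores = []
        for i in range(n):
            votes = sum(
                sim[j][i] / out[j] * scores[j]
                for j in range(n)
                if j != i and out[j] > 0
            )
            new_scores.append((1 - d) + d * decay[i] * votes)
        scores = new_scores
    return scores
```

With all ages set to zero the decay factors are all one and the iteration reduces to an ordinary weighted TextRank update.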

Book Chapter•10.1007/978-3-540-69507-3_66•

Multi-document Summarization Based on Cluster Using Non-negative Matrix Factorization

Sun Park¹, Ju-Hong Lee¹, Deok-Hwan Kim¹, Chan-Min Ahn¹•Institutions (1)

Inha University¹

20 Jan 2007

TL;DR: A new summarization method that uses non-negative matrix factorization (NMF) and K-means clustering to extract meaningful sentences from multiple documents; it performs better than other methods using LSA, K-means, and NMF.

Abstract: In this paper, a new summarization method, which uses non-negative matrix factorization (NMF) and K-means clustering, is introduced to extract meaningful sentences from multiple documents. The proposed method can improve the quality of document summaries because the inherent semantics of the documents are well reflected by using the semantic features calculated by NMF and the sentences most relevant to the given topic are extracted efficiently by using the semantic variables derived by NMF. In addition, it uses K-means clustering to remove noise so that the biased inherent semantics of the documents are not reflected in the summaries. We perform detailed experiments with the well-known DUC test dataset. The experimental results demonstrate that the proposed method has better performance than other methods using LSA, K-means, and NMF.

Proceedings Article•10.1145/1216295.1216311•

From social bookmarking to social summarization: an experiment in community-based summary generation

Oisín Boydell¹, Barry Smyth¹•Institutions (1)

University College Dublin¹

28 Jan 2007

TL;DR: A comprehensive evaluation demonstrates how the social summarization technique can generate summaries that are of significantly higher quality than those produced by a number of leading alternatives.

Abstract: We describe a novel document summarization technique that uses informational cues, such as social bookmarks or search queries, as the basis for summary construction by leveraging the snippet-generation capabilities of standard search engines. A comprehensive evaluation demonstrates how the social summarization technique can generate summaries that are of significantly higher quality than those produced by a number of leading alternatives.

Journal Article•10.3103/S0005105507030041•

A method for evaluating modern systems of automatic text summarization

V. A. Yatsko, T. N. Vishnyakov

01 Jun 2007-Automatic Documentation and Mathematical Linguistics

TL;DR: Four modern systems of automatic text summarization are tested on the basis of a model vocabulary composed by subjects and principles for evaluation of the efficiency of the current systems are described.

Abstract: Four modern systems of automatic text summarization are tested on the basis of a model vocabulary composed by subjects. Distribution of terms of the vocabulary in the source text is compared with their distribution in summaries of different length generated by the systems. Principles for evaluation of the efficiency of the current systems of automatic text summarization are described.

Proceedings Article•10.7916/D8W384S4•

Question Answering Using Integrated Information Retrieval and Information Extraction

Barry Schiffman¹, Kathleen R. McKeown¹, Ralph Grishman, James Allan²•Institutions (2)

Columbia University¹, University of Massachusetts Amherst²

1 Apr 2007

TL;DR: An approach that draws on methods from information retrieval, topical summarization, and Information Extraction is presented, and its effectiveness is compared with that of a query-focused summarization approach.

Abstract: This paper addresses the task of providing extended responses to questions regarding specialized topics. This task is an amalgam of information retrieval, topical summarization, and Information Extraction (IE). We present an approach which draws on methods from each of these areas, and compare the effectiveness of this approach with a query-focused summarization approach. The two systems are evaluated in the context of the prosecution queries like those in the DARPA GALE distillation evaluation.

Book Chapter•10.1007/978-3-540-72665-4_41•

Fuzzy Clustering for Topic Analysis and Summarization of Document Collections

René Witte¹, Sabine Bergler²•Institutions (2)

Karlsruhe Institute of Technology¹, Concordia University²

28 May 2007

TL;DR: Topic analysis and multi-document summarization are achieved with a clustering algorithm based on fuzzy set theory, which is easy to implement and integrate into a personal information system, generates a highly flexible data structure for topic analysis and summarization, and delivers excellent performance.

Abstract: Large document collections, such as those delivered by Internet search engines, are difficult and time-consuming for users to read and analyse. The detection of common and distinctive topics within a document set, together with the generation of multi-document summaries, can greatly ease the burden of information management. We show how this can be achieved with a clustering algorithm based on fuzzy set theory, which (i) is easy to implement and integrate into a personal information system, (ii) generates a highly flexible data structure for topic analysis and summarization, and (iii) also delivers excellent performance.

Proceedings Article•10.1109/ICCITECHN.2007.4579374•

A study on text summarization techniques and implement few of them for Bangla language

M.N. Uddin¹, S.A. Khan•Institutions (1)

BRAC University¹

1 Dec 2007

TL;DR: This paper presents a text summarizer for Bangla, which uses some extraction methods for text summarization.

Abstract: Text summarization is the technique which automatically creates an abstract or summary of a text. The technique has been developed for many years. So a survey has been done on different summarization techniques. No work in this area has been done for Bangla language. This paper presents a text summarizer for Bangla, which uses some extraction methods for text summarization.

Proceedings Article•

Single document summarization with document expansion

Xiaojun Wan¹, Jianwu Yang¹•Institutions (1)

Peking University¹

22 Jul 2007

TL;DR: The experimental results on the DUC2002 dataset demonstrate the effectiveness of the proposed approach based on document expansion, and the cross-document relationships between sentences in the expanded document set are validated to be very important for single document summarization.

Abstract: Existing methods for single document summarization usually make use of only the information contained in the specified document. This paper proposes the technique of document expansion to provide more knowledge to help single document summarization. A specified document is expanded to a small document set by adding a few neighbor documents close to the document, and then the graph-ranking based algorithm is applied on the expanded document set for extracting sentences from the single document, by making use of both the within-document relationships between sentences of the specified document and the cross-document relationships between sentences of all documents in the document set. The experimental results on the DUC2002 dataset demonstrate the effectiveness of the proposed approach based on document expansion. The cross-document relationships between sentences in the expanded document set are validated to be very important for single document summarization.

Journal Article•10.1016/J.DSS.2005.05.012•

An information delivery system with automatic summarization for mobile commerce

[...]

Christopher C. Yang¹, Fu Lee Wang²•Institutions (2)

The Chinese University of Hong Kong¹, City University of Hong Kong²

1 Feb 2007

TL;DR: The fractal summarization model for document summarization on handheld devices, developed based on the fractal theory, is introduced and the three-tier architecture with the middle-tier conducting the major computation is discussed.


Abstract: Wireless access with handheld devices is a promising addition to the WWW and traditional electronic business. Handheld devices provide convenient and portable access to the huge information space on the Internet without requiring users to be stationary with a network connection. Many customer-centered m-services applications have been developed. Mobile computing, however, should be extended to decision support in an organization. There is a desire to access the most up-to-date and accurate information on handheld devices for fast decision making in an organization. Unfortunately, loading and visualizing large documents on handheld devices is impossible due to their shortcomings. In this paper, we introduce the fractal summarization model for document summarization on handheld devices. Fractal summarization is developed based on fractal theory. It generates a brief skeleton of the summary at the first stage, and the details of the summary at different levels of the document are generated on demand by users. Such interactive summarization reduces the computation load compared with generating the entire summary in one batch, as traditional automatic summarization does, which is ideal for wireless access. The three-tier architecture, with the middle tier conducting the major computation, is also discussed. Visualization of summaries on handheld devices is also investigated. Automatic summarization, the three-tier architecture, and information visualization are potential solutions to the existing problems in information delivery to handheld devices for mobile commerce.


LCC's GISTexter at DUC 2007: Machine Reading for Update Summarization

[...]

Andrew Hickl, Kirk Roberts, Finley Lacatusu

1 Jan 2007

TL;DR: By using a machine reading (MR) framework in order to construct representations of the knowledge inferable from a text collection, Language Computer Corporation’s GISTEXTER systems were able to create coherent sets of update summaries that were likely to contain new information that could not be inferred from any previously considered document.


Abstract: In this paper, we describe Language Computer Corporation’s GISTEXTER question-focused and update-based multidocument summarization (MDS) systems. We show that by using a machine reading (MR) framework in order to construct representations of the knowledge inferable from a text collection, we were able to create coherent sets of update summaries that were likely to contain new information that could not be inferred from any previously considered document. Details of our DUC 2007 Main Task submission are provided as well.


Journal Article•10.1075/TERM.13.2.07CUN•

Summarization of specialized discourse: the case of medical articles in Spanish

[...]

Iria da Cunha, Leo Wanner, Teresa Cabré

1 Jan 2007 - Terminology

TL;DR: A linguistically-motivated model for automatic summarization of medical articles in Spanish that takes into account the textual, lexical, discursive, syntactic and communicative dimensions and is suitable to provide high quality summarizations.


Abstract: In this article, we present the current state of our work on a linguistically-motivated model for automatic summarization of medical articles in Spanish. The model takes into account the results of an empirical study which reveals that, on the one hand, domain-specific summarization criteria can often be derived from the summaries of domain specialists, and, on the other hand, adequate summarization strategies must be multidimensional, i.e., cover various types of linguistic clues. We take into account the textual, lexical, discursive, syntactic and communicative dimensions. This is novel in the field of summarization. The experiments carried out so far indicate that our model is suitable to provide high quality summarizations.

