Top 46 papers published in the topic of Multi-document summarization in 2003

Showing papers on "Multi-document summarization" published in 2003

Intrinsic Evaluation of Generic News Text Summarization Systems

Paul Over

1 Jan 2003

140 citations

Proceedings Article•10.3115/1075178.1075197•

iNeATS: Interactive Multi-Document Summarization

Anton Leuski¹, Chin-Yew Lin¹, Eduard Hovy¹•Institutions (1)

University of Southern California¹

7 Jul 2003

TL;DR: iNeATS is an interactive multi-document summarization system that integrates a state-of-the-art summarization engine with an advanced user interface that combines text summaries with alternative presentations such as a map-based visualization of documents.

Abstract: We describe iNeATS -- an interactive multi-document summarization system that integrates a state-of-the-art summarization engine with an advanced user interface. Three main goals of the system are: (1) provide a user with control over the summarization process, (2) support exploration of the document set with the summary as the starting point, and (3) combine text summaries with alternative presentations such as a map-based visualization of documents.

62 citations

Proceedings Article•10.7916/D8348TR5•

Automatic Summarization of Broadcast News using Structural Features

Julia Hirschberg, Sameer Maskey

1 Jan 2003

TL;DR: A directed graphical model is constructed to represent the probability distribution and dependencies among the structural features of broadcast news, which is trained by finding the values of parameters of the conditional probability tables.

Abstract: We present a method for summarizing broadcast news that is not affected by word errors in an automatic speech recognition transcription, using information about the structure of the news program. We construct a directed graphical model to represent the probability distribution and dependencies among the structural features which we train by finding the values of parameters of the conditional probability tables. We then rank segments of the test set and extract the highest ranked ones as a summary. We present the procedure and preliminary test results.

58 citations

Proceedings Article•10.3115/1119467.1119474•

Text summarization challenge 2: text summarization evaluation at NTCIR workshop 3

Manabu Okumura¹, Takahiro Fukusima², Hidetsugu Nanba³•Institutions (3)

Tokyo Institute of Technology¹, Otemon Gakuin University², Hiroshima City University³

31 May 2003

TL;DR: The outline of Text Summarization Challenge 2 (TSC2 hereafter), a sequel text summarization evaluation conducted as one of the tasks at the NTCIR Workshop 3, is described.

Abstract: We describe the outline of Text Summarization Challenge 2 (TSC2 hereafter), a sequel text summarization evaluation conducted as one of the tasks at the NTCIR Workshop 3. First, we briefly describe the previous evaluation, Text Summarization Challenge (TSC1), as an introduction to TSC2. Then we explain TSC2, including the participants, the two tasks in TSC2, the data used, the evaluation methods for each task, and a brief report on the results.

55 citations

Proceedings Article•10.3115/1073445.1073482•

A web-trained extraction summarization system

Liang Zhou¹, Eduard Hovy¹•Institutions (1)

Information Sciences Institute¹

27 May 2003

TL;DR: This paper presents a summarization system that uses the web as the source of training data and automatically learns to perform extraction-based summarization at a level comparable to the best DUC systems.

Abstract: A serious bottleneck in the development of trainable text summarization systems is the shortage of training data. Constructing such data is a very tedious task, especially because there are in general many different correct ways to summarize a text. Fortunately we can utilize the Internet as a source of suitable training data. In this paper, we present a summarization system that uses the web as the source of training data. The procedure involves structuring the articles downloaded from various websites, building adequate corpora of (summary, text) and (extract, text) pairs, training on positive and negative data, and automatically learning to perform the task of extraction-based summarization at a level comparable to the best DUC systems.

42 citations

Proceedings Article•10.1117/12.515733•

Video summarization: methods and landscape

Mauro Barbieri¹, Lalitha Agnihotri¹, Nevenka Dimitrova¹•Institutions (1)

Philips¹

26 Nov 2003

TL;DR: The goal of the paper is to provide the basic definitions of widely used terms such as skimming, summarization, and highlighting and distinguish among the dimensions of task, content, and method and provide an extensive classification model for the same.

Abstract: The ability to summarize and abstract information will be an essential part of intelligent behavior in consumer devices. Various summarization methods have been the topic of intensive research in the content-based video analysis community. Summarization in traditional information retrieval is a well understood problem. While there has been a lot of research in the multimedia community there is no agreed upon terminology and classification of the problems in this domain. Although the problem has been researched from different aspects there is usually no distinction between the various dimensions of summarization. The goal of the paper is to provide the basic definitions of widely used terms such as skimming, summarization, and highlighting. The different levels of summarization: local, global, and meta-level are made explicit. We distinguish among the dimensions of task, content, and method and provide an extensive classification model for the same. We map the existing summary extraction approaches in the literature into this model and we classify the aspects of proposed systems in the literature. In addition, we outline the evaluation methods and provide a brief survey. Finally we propose future research directions based on the white spots that we identified by analysis of existing systems in the literature.

32 citations

Proceedings Article•10.3115/1119467.1119476•

A survey for multi-document summarization

Satoshi Sekine¹, Chikashi Nobata•Institutions (1)

New York University¹

31 May 2003

TL;DR: This work prepared 100 document sets similar to the ones used in the DUC multi-document summarization task, and conducted a survey to observe how humans are doing the same task and look around for different strategies.

Abstract: Automatic multi-document summarization is still hard to realize. Under such circumstances, we believe, it is important to observe how humans are doing the same task, and look around for different strategies. We prepared 100 document sets similar to the ones used in the DUC multi-document summarization task. For each document set, several people prepared the following data and we conducted a survey.
A) Free style summarization
B) Sentence extraction type summarization
C) Axis (type of main topic)
D) Table style summary
In particular, we will describe the last two in detail, as these could lead to a new direction for multi-summarization research.

30 citations

Journal Article•10.1080/713827177•

Summarization and categorization of text data in high-level data cleaning for information retrieval

M. Saravanan, P. C. Reghu Raj, S. Raman

01 May 2003-Applied Artificial Intelligence

TL;DR: A text-mining framework is proposed in which subsystems of a classification system are treated as constituents of a knowledge discovery process for text corpora, and whether there exists a synergic relation between systems for classification and those for summarization by way of composing those subsystems is explored.

Abstract: In view of the exponential growth of online document corpora, even perfect retrieval will fetch too much material for a user to cope with. One way to reduce this problem is automatic domain-specific summarization tailored to the user's needs, which is a kind of high-level data cleaning. This requires some method of discovering classes of similar items that may be grouped into predetermined domains. We explore whether there exists a synergic relation between systems for classification and those for summarization by way of composing those subsystems. In other words, we examine whether prior summarization will increase the performance of the classifier system and vice versa. In both cases, the answer is affirmative, as we show in this paper. We propose a text-mining framework in which these subsystems are treated as constituents of a knowledge discovery process for text corpora.

27 citations

Book Chapter•10.1007/3-540-36618-0_19•

Clustering and visualization in a multi-lingual multi-document summarization system

Hsin-Hsi Chen¹, June-Jei Kuo¹, Tsei-Chun Su¹•Institutions (1)

National Taiwan University¹

14 Apr 2003

TL;DR: Five strategies to compute the multilingual sentence similarity are presented and the experimental results show that sentence alignment without considering the word position or order in a sentence obtains the best performance.

Abstract: Measuring the similarity of words, sentences, and documents is one of the major issues in multi-lingual multi-document summarization. This paper presents five strategies to compute the multilingual sentence similarity. The experimental results show that sentence alignment without considering the word position or order in a sentence obtains the best performance. Besides, two strategies are proposed for multilingual document clustering. The two-phase strategy (translation after clustering) is better than the one-phase strategy (translation before clustering). Translation deferred to sentence clustering, which reduces the propagation of translation errors, is most promising. Moreover, three strategies are proposed to tackle the sentence clustering. Complete link within a cluster has the best performance; however, the subsumption-based clustering has the advantage of lower computational complexity and similar performance. Finally, two visualization models (i.e., focusing and browsing), which consider the users' language preference, are proposed.

27 citations

Performance of a Three-Stage System for Multi-Document Summarization

Daniel M. Dunlavy¹, J. Conroy, J. Schlesinger, H. van Halteren•Institutions (1)

University of Maryland, College Park¹

1 Jan 2003

TL;DR: The preprocessing of the data for the group's needs consisted of term identification, part-of-speech (POS) tagging, sentence boundary detection and SGML DTD processing, and the post-processing consisted of removing lead adverbs such as “And” or “But” to make the summaries flow more easily.

Abstract: Our participation in DUC 2003 was limited to Tasks 2, 3, and 4. Although the tasks differed slightly in their goals, we applied the same approach in each case: preprocess the data for input to our system, apply our single-document and multi-document summarization algorithms, and post-process the data for DUC evaluation. We did not use the topic descriptions for Task 2 or the viewpoint descriptions for Task 3, and used only the novel sentences for Task 4. The preprocessing of the data for our needs consisted of term identification, part-of-speech (POS) tagging, sentence boundary detection and SGML DTD processing. With the exception of sentence boundary detection for Task 4 (the test data was sentence-delimited using SGML tags), each of these preprocessing tasks was performed on all of the documents. Details of each of these tasks are presented in Section 2. The summarization algorithms were enhanced versions of those presented by members of our group in the past DUC evaluations (Conroy et al., 2001; Schlesinger et al., 2002). The enhancements to the previous system are detailed in Section 3. Previous post-processing consisted of removing lead adverbs such as “And” or “But” to make our summaries flow more easily. For DUC 2003, we added more extensive editing, eliminating part or all of selected sentences. This post-processing is described in Section 4.

25 citations

Proceedings Article•10.1145/951676.951679•

The CPR model for summarizing video

Marat Fayzullin¹, V. S. Subrahmanian¹, Antonio Picariello, Maria Luisa Sapino²•Institutions (2)

University of Maryland, College Park¹, University of Turin²

7 Nov 2003

TL;DR: This work proposes a model of video summarization based on three important parameters: Priority, Continuity, and non-Repetition, and develops formal definitions of all these concepts and provides algorithms to find optimal summaries.

Abstract: Most past work on video summarization has been based on selecting key frames from videos. We propose a model of video summarization based on three important parameters: Priority (of frames), Continuity (of the summary), and non-Repetition (of the summary). In short, a summary must include high priority frames, must be continuous and non-repetitive. An optimal summary is one that maximizes an objective function based on these three parameters. We develop formal definitions of all these concepts and provide algorithms to find optimal summaries. We briefly report on the performance of these algorithms.

Automatic Summarization for Financial News Delivery on Mobile Devices.

Christopher C. Yang¹, Fu Lee Wang•Institutions (1)

The Chinese University of Hong Kong¹

1 Jan 2003

TL;DR: This paper presents a financial news delivery system on mobile devices based on the fractal summarization model, which reduces the computation load compared with generating the entire summary in one batch as in traditional summarization, and is therefore well suited to wireless access.

Abstract: Wireless access with mobile devices is a promising addition to the WWW and traditional electronic business. Mobile devices provide convenient and portable access to the huge information space on the Internet. Most investors want access to the most up-to-date financial information through mobile devices in order to make critical and urgent decisions. In this paper, we present a financial news delivery system on mobile devices based on the fractal summarization model. Fractal summarization is developed based on fractal theory. It generates a brief skeleton of the summary at the first stage, and the details of the summary on different levels of the document are generated on demand. Such interactive summarization reduces the computation load compared with generating the entire summary in one batch as in traditional summarization, which makes it ideal for wireless access. (Authors' affiliation: Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China.)

Proceedings Article•10.1109/ICIP.2003.1246738•

Automatically generating summaries for musical video

Xi Shao¹, Changsheng Xu¹, Mohan S. Kankanhalli•Institutions (1)

National University of Singapore¹

24 Nov 2003

TL;DR: Experiments on different genres of musical video, and comparisons with summaries based only on the music track or the video track, indicate that the proposed summarization method is effective in helping meet users' expectations.

Abstract: In this paper, we propose a novel approach to automatically summarize musical videos. The proposed summarization scheme is different from the current methods used for video summarization. The musical video is separated into the musical and visual tracks. A music summary is created by analyzing the music content based on music features, an adaptive clustering algorithm and musical domain knowledge. Then, shots are detected and clustered in the visual track. Finally, the music video summary is created by aligning the music summary and clustered video shots. Subjective studies by experienced users have been conducted to evaluate the quality of summarization. Experiments on different genres of musical video, and comparisons with summaries based only on the music track or the video track, indicate that the proposed summarization method is effective in helping meet users' expectations.

Automatic text summarization as applied to information retrieval: using indicative and informative summaries

Min-Yen Kan, Kathleen R. McKeown, Judith L. Klavans

1 Jan 2003

10.7916/D8S189BR•

A Platform for Multilingual News Summarization

David Evans, Judith L. Klavans

1 Jan 2003

TL;DR: A multilingual version of Columbia Newsblaster is developed as a testbed for multilingual multi-document summarization, providing a platform for testing different strategies for multilingual document clustering, and approaches for multilingual multi-document summarization.

Abstract: We have developed a multilingual version of Columbia Newsblaster as a testbed for multilingual multi-document summarization. The system collects, clusters, and summarizes news documents from sources all over the world daily. It crawls news sites in many different countries, written in different languages, extracts the news text from the HTML pages, uses a variety of methods to translate the documents for clustering and summarization, and produces an English summary for each cluster. The system is robust, running daily over real-world data. The multilingual version of Columbia Newsblaster provides a platform for testing different strategies for multilingual document clustering, and approaches for multilingual multi-document summarization.

Proceedings Article•10.1109/ASRU.2003.1318490•

Exploring the style-technique interaction in extractive summarization of broadcast news

BalaKrishna Kolluru¹, Heidi Christensen¹, Yoshihiko Gotoh¹, Steve Renals¹•Institutions (1)

University of Sheffield¹

30 Nov 2003

TL;DR: The initial results indicate that some summarization techniques work better for the documents with spontaneous speech than for those with planned speech.

Abstract: In this paper we seek to explore the interaction between the style of a broadcast news story and its summarization technique. We report the performance of three different summarization techniques on broadcast news stories, which are split into planned speech and spontaneous speech. The initial results indicate that some summarization techniques work better for the documents with spontaneous speech than for those with planned speech. Even for human beings some documents are inherently difficult to summarize. We observe this correlation between degree of difficulty in summarizing and performance of the three automatic summarizers. Given the high frequency of named entities in broadcast news and even greater number of references to these named entities, we also gauge the effect of named entity and co-reference resolution in a news story, on the performance of these summarizers.

Proceedings Article•

Spoken Language Condensation in the 21st Century

Klaus Zechner

1 Jan 2003

TL;DR: This paper contrasts speech summarization with text summarization, gives an overview of the history of speech summarization and its current state, and sketches possible avenues as well as remaining challenges in future research.

Abstract: While the field of Information Retrieval originally had the search for the most relevant documents in mind, it has become increasingly clear that in many instances, what the user wants is a piece of coherent information, derived from a set of relevant documents and possibly other sources. Reducing relevant documents, passages, and sentences to their core is the task of text summarization or information condensation. Applying text-based technologies to speech is not always workable and often not enough to capture speech specific phenomena. In this paper, we will contrast speech summarization with text summarization, give an overview of the history of speech summarization, its current state, and, finally, sketch possible avenues as well as remaining challenges in future research.

Text Summarization Using XML-Tagged Documents

Kenneth C. Litkowski

1 Jan 2003

TL;DR: The CL Research system performed at a higher than expected level, finishing first in mean length-adjusted coverage for summaries against a provided viewpoint, and demonstrates the basic viability of using XML-tagged documents for summarization.

Abstract: CL Research’s participation in the Document Understanding Conference extended the framework used in the TREC 2003 question-answering track, in which texts are parsed and processed into XML-tagged documents where sentence elements are marked with discourse, syntactic, and semantic attributes. This extension was made primarily to test the viability of using XML-tagged documents for summarization. The extension of the Knowledge Management System was able to take advantage of these attributes in implementing various text summarization capabilities. While implementation of these capabilities made little use of current summarization technologies, the CL Research system performed at a higher than expected level, finishing first in mean length-adjusted coverage for summaries against a provided viewpoint. The system performed less well on this measure in event summarization (tenth), novelty summarization (fifth), and headline generation (eleventh), but performed well on quality measures (finishing first among teams participating in all tasks) and relevance (finishing first on each summarization task, with all sentences in these tasks judged relevant to the topic). The system’s performance arises primarily from the use of an “antecedent” tag attached to referring expressions (such as pronouns) within a document. In particular, when accumulating word frequencies, the antecedent was used instead of the referring expression; thus, instead of treating a pronoun as a word in the frequency count, its antecedent was used. The system’s performance demonstrates the basic viability of using XML-tagged documents. Many options were explored in setting up the summarization capability, indicating considerable flexibility in examining documents from many perspectives and considerable potential in possible further improvements in the system. The system can indicate not only that a concept appears frequently within a document, but also how it is used (e.g., as subject, verb object, or prepositional object). More specifically, the availability of considerable structural information within documents permits a relatively simple examination of phenomena that have been used in text summarization, as well as the creation of a document’s semantic network.
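
The antecedent substitution described in this abstract can be illustrated with a minimal sketch. It is not the CL Research implementation: the token representation (a list of `(word, antecedent)` pairs, with `None` where no antecedent tag exists) and the function name `antecedent_frequencies` are assumptions made only for this example.

```python
from collections import Counter
from typing import Optional


def antecedent_frequencies(tokens: list[tuple[str, Optional[str]]]) -> Counter:
    """Count words, using the antecedent in place of each referring expression.

    Each token is a hypothetical (word, antecedent) pair; antecedent is None
    when the word carries no antecedent tag.
    """
    counts: Counter = Counter()
    for word, antecedent in tokens:
        # A tagged pronoun contributes to its antecedent's count, not its own.
        counts[(antecedent or word).lower()] += 1
    return counts


tokens = [("Clinton", None), ("spoke", None), ("He", "Clinton"), ("left", None)]
counts = antecedent_frequencies(tokens)
```

With this input, the pronoun "He" is counted under "clinton", so the entity's frequency reflects both mentions.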

10.7916/D8MS4237•

Columbia at the Document Understanding Conference 2003

Ani Nenkova, Barry Schiffman, Andrew Schlaiker, Sasha Blair-Goldensohn, Regina Barzilay, Sergey Sigelman, Vasileios Hatzivassiloglou, Kathleen R. McKeown

1 Jan 2003

TL;DR: The Columbia Summarizer for DUC 2003, Task 2, is based on the multi-document summarization system that was developed for DUC 2002 and uses different summarization strategies depending on the type of documents in the input set.

Abstract: The Columbia Summarizer for DUC 2003, Task 2, is based on the multi-document summarization system that we developed for DUC 2002 (McKeown et al., 2002). It uses different summarization strategies depending on the type of documents in the input set. Four different strategies are used, one for single events, one for multiple related events, one for biographies and one for discussion of an issue with related events. The summarization strategy encoded in M

Proceedings Article•10.1109/NLPKE.2003.1276003•

An intelligent algorithm for automatic document summarization

Y. Guo¹, George K Stylios¹•Institutions (1)

Heriot-Watt University¹

26 Oct 2003

TL;DR: An intelligent algorithm, the event indexing and summarization (EIS) algorithm, for automatic document summarization, which is based on taking into account a cognitive psychological model, the event-indexing model, and the roles and importance of sentences and their syntax in document understanding, is introduced.

Abstract: Automatic document summarization is a highly interdisciplinary research area related to computer science as well as cognitive psychology. Here, we introduce an intelligent algorithm, the event indexing and summarization (EIS) algorithm, for automatic document summarization, which is based on taking into account a cognitive psychological model, the event-indexing model, and the roles and importance of sentences and their syntax in document understanding. The EIS algorithm involves syntactic analysis of sentences, clustering and indexing sentences with the five indices from the event-indexing model, and extracting the most prominent content by lexical analysis at phrase and clause levels. After thorough implementation and objective evaluations, the algorithm has now shown good performance in multidocument summarization.

Proceedings Article•10.3115/1119467.1119470•

Multi-document summarization using off the shelf compression software

Amardeep Grewal¹, Timothy Allison¹, Stanko Dimitrov¹, Dragomir R. Radev¹•Institutions (1)

University of Michigan¹

31 May 2003

TL;DR: This study examines the usefulness of common off the shelf compression software such as gzip in enhancing already existing summaries and producing summaries from scratch, hypothesizing that the sentence that increases the gzipped size of the summary the most is the one that adds the most new information.

Abstract: This study examines the usefulness of common off the shelf compression software such as gzip in enhancing already existing summaries and producing summaries from scratch. Since the gzip algorithm works by removing repetitive data from a file in order to compress it, we should be able to determine which sentences in a summary contain the least repetitive data by judging the gzipped size of the summary with the sentence compared to the gzipped size of the summary without the sentence. By picking the sentence that increased the size of the summary the most, we hypothesized that the summary will gain the sentence with the most new information. This hypothesis was found to be true in many cases and to varying degrees in this study.
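
The selection step described in this abstract can be sketched with Python's standard `gzip` module. This is only an illustration of the idea, not the authors' code: the helper names `gzipped_size` and `pick_next_sentence`, and the choice to join sentences with a single space, are assumptions.

```python
import gzip


def gzipped_size(text: str) -> int:
    """Return the size in bytes of the gzip-compressed UTF-8 text."""
    return len(gzip.compress(text.encode("utf-8")))


def pick_next_sentence(summary: list[str], candidates: list[str]) -> str:
    """Pick the candidate whose addition grows the gzipped summary the most."""
    base = gzipped_size(" ".join(summary))

    def gain(sentence: str) -> int:
        # Size increase caused by appending this sentence to the summary.
        return gzipped_size(" ".join(summary + [sentence])) - base

    return max(candidates, key=gain)


summary = ["The cat sat on the mat."]
candidates = [
    "The cat sat on the mat.",
    "Stock markets fell sharply in Tokyo after the announcement.",
]
chosen = pick_next_sentence(summary, candidates)
```

A sentence that repeats the summary compresses to almost nothing, so the unrelated sentence is chosen here. On very short texts the fixed gzip header and coarse byte granularity make the gains noisy, which is one reason to treat this as a heuristic.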

Web Document Summarization by Context.

Jean-Yves Delort, Bernadette Bouchon-Meunier, Maria Rifqi

1 Jan 2003

TL;DR: This paper puts forward two new summarization-by-context algorithms: the first uses both the content and the context, and the second relies only on the elements of the context.

Abstract: This paper addresses the issue of Web document summarization. We take the context of a Web document to be the set of pieces of information extracted from the content of all the documents linked to it. We put forward two new summarization-by-context algorithms. The first one uses both the content and the context and the second one relies only on the elements of the context. It is shown that summaries based on the context are usually much more relevant than those only made from the content of the target. Optimal conditions on the size of the content and the context of the document to yield the best summaries are studied.

Book Chapter•10.1007/978-3-540-24594-0_5•

Automatic summarization of Chinese and English parallel documents

Fu Lee Wang¹, Christopher C. Yang²•Institutions (2)

City University of Hong Kong¹, The Chinese University of Hong Kong²

8 Dec 2003

TL;DR: This paper compares the results of the fractal summarization technique on parallel documents in Chinese and English and finds that grammatical and lexical differences between Chinese and English have a significant effect on the summarization processes.

Abstract: As a result of the rapid growth in Internet access, significantly more information has become available online in real time. However, there is not sufficient time for users to read large volumes of information and make decisions accordingly. The problem of information overload can be resolved through the application of automatic summarization. Many summarization systems for documents in different languages have been implemented. However, the performance of summarization systems on documents in different languages has not yet been investigated. In this paper, we compare the results of the fractal summarization technique on parallel documents in Chinese and English. The grammatical and lexical differences between Chinese and English have a significant effect on the summarization processes. Their impact on the performance of the summarization for the Chinese and English parallel documents is compared.

Proceedings Article•

Sentence Extraction by Spreading Activation with Refined Similarity Measure

Naoaki Okazaki, Yutaka Matsuo, Naohiro Matsumura, Mitsuru Ishizuka

1 Jan 2003

TL;DR: A novel method to extract a set of comprehensible sentences that centers on several key points is proposed, which generates a similarity network from documents with a lexical dictionary and applies spreading activation to rank sentences.

Abstract: Although there has been a great deal of research on automatic summarization, most methods are based on a statistical approach, disregarding relationships between extracted textual segments. To ensure sentence connectivity, we propose a novel method to extract a set of comprehensible sentences that centers on several key points. This method generates a similarity network from documents with a lexical dictionary and applies spreading activation to rank sentences. Also, we show evaluation results of a multi-document summarization system based on the method, participating in a competition of summarization, TSC (Text Summarization Challenge) task organized by the third NTCIR (NII-NACSIS Test Collection for IR Systems) project.
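
As a rough illustration of ranking sentences by spreading activation over a similarity network, the sketch below uses a generic update rule. The abstract does not give the authors' similarity measure, dictionary, or update equations, so the proportional spreading rule, the `decay` and `steps` parameters, and the function names are all assumptions.

```python
def spread_activation(similarity, initial, decay=0.5, steps=10):
    """Propagate activation along weighted edges of a similarity network.

    similarity is a square matrix of non-negative edge weights and initial
    holds each node's starting activation. Generic rule, assumed for
    illustration only.
    """
    n = len(initial)
    # Total outgoing weight per node, used to split activation proportionally.
    out_sum = [sum(similarity[i][j] for j in range(n) if j != i) for i in range(n)]
    activation = list(initial)
    for _ in range(steps):
        incoming = [0.0] * n
        for i in range(n):
            if out_sum[i] == 0:
                continue  # isolated node: nothing to spread
            for j in range(n):
                if j != i:
                    incoming[j] += activation[i] * similarity[i][j] / out_sum[i]
        # Each node keeps its initial activation plus a damped share of inflow.
        activation = [initial[j] + decay * incoming[j] for j in range(n)]
    return activation


def rank_sentences(similarity, initial, **kwargs):
    """Return sentence indices ordered from highest to lowest activation."""
    scores = spread_activation(similarity, initial, **kwargs)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Sentence 0 is similar to both others; sentences 1 and 2 are unrelated.
sim = [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
ranking = rank_sentences(sim, [1.0, 1.0, 1.0])
```

In this toy network the central sentence collects activation from both neighbours and ranks first, which matches the intuition that well-connected sentences sit near the key points of a document set.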

Proceedings Article•10.1117/12.510951•

MPEG content summarization based on compressed domain feature analysis

Masaru Sugano, Yasuyuki Nakajima, Hiromasa Yanagihara

26 Nov 2003

TL;DR: By analyzing semantically important low-level and mid-level audiovisual features, this method universally summarizes the MPEG-1/-2 contents in the form of digest or highlight, and shows that news highlights and sports highlights in TV baseball games can be successfully extracted according to simple shot transition models.

Abstract: This paper addresses automatic summarization of MPEG audiovisual content on compressed domain. By analyzing semantically important low-level and mid-level audiovisual features, our method universally summarizes the MPEG-1/-2 contents in the form of digest or highlight. The former is a shortened version of an original, while the latter is an aggregation of important or interesting events. In our proposal, first, the incoming MPEG stream is segmented into shots and the above features are derived from each shot. Then the features are adaptively evaluated in an integrated manner, and finally the qualified shots are aggregated into a summary. Since all the processes are performed completely on compressed domain, summarization is achieved at very low computational cost. The experimental results show that news highlights and sports highlights in TV baseball games can be successfully extracted according to simple shot transition models. As for digest extraction, subjective evaluation proves that meaningful shots are extracted from content without a priori knowledge, even if it contains multiple genres of programs. Our method also has the advantage of generating an MPEG-7 based description such as summary and audiovisual segments in the course of summarization.

Journal Article•

SVM-Based Multi-Document Summarization Integrating Sentence Extraction with Bunsetsu Elimination

T. Hirao

01 Sep 2003-IEICE Transactions on Information and Systems

Proceedings Article•10.1109/ICMLC.2003.1264442•

Sentences clustering based automatic summarization

Jian-Hui Wang¹, Shuigeng Zhou¹, Yunfa Hu¹•Institutions (1)

Fudan University¹

2 Nov 2003

TL;DR: An algorithm that summarizes a document by extracting subtopics from its sentences, combining statistics with partial message understanding in order to improve summarization and remove the dependence on domain.

Abstract: Research on automatic summarization is carried out in two ways. One is based on statistics, and the other is based on message understanding. The former is independent of domain, but its accuracy is lower. The latter, by contrast, depends on domain, but its accuracy is higher. In this paper, an algorithm that summarizes a document by extracting subtopics from the sentences is presented. It is based on statistics and partial message understanding, in order to get better summarization and get rid of the dependence on domain. Besides, since it is difficult to determine the length of a summary manually, the algorithm also strives to obtain a better summary with proper length. To this end, a new module of mutual dependence is put forward and applied to segmentation, which can select accurate features for the summarizing algorithm. New rules are then brought forward to evaluate sentences for the summarizing algorithm. Furthermore, a new task-based algorithm for evaluating summarization is offered.

Journal Article•

Research and Implementation of Automatic Multi-Document Summarization System

Huang Xuan

01 Jan 2003-Journal of Computer Research and Development

TL;DR: A statistical approach to multi-document summarization is presented, and experimental results on a real Chinese corpus show the system's effectiveness and suitability.

Abstract: Automatic multi-document summarization is an outgrowth of single document summarization. A statistical approach to multi-document summarization is presented. It utilizes the semantic relevance between segments of documents. A text-tiling algorithm is implemented to break documents into semantically relevant segments. These segments are merged into topic classes according to their semantic similarity by using a clustering algorithm. The representative segments are extracted from the topic classes to form the summarization result. Experimental results on a real Chinese corpus show the system's effectiveness and suitability.

Proceedings Article•10.1145/765891.766075•

TelMeA2003: social summarization in online communities

Toru Takahashi, Yasuhiro Katagiri

5 Apr 2003

TL;DR: The concept of social summarization is proposed as an alternative to content-based summarization technology, and its use in the community system TelMeA2003 is reported.

Abstract: We propose the concept of social summarization as an alternative to the content-based technology for summarization, report on its use in the community system TelMeA2003 implemented and employed to investigate techniques for social summarization, and discuss its effectiveness in supporting collaborative activities in online communities.

Book Chapter•10.1007/978-3-540-71009-7_13•

Social summarization for semantic society

Yasuhiro Katagiri, Toru Takahashi, Noriko H. Arai¹•Institutions (1)

National Institute of Informatics¹

23 Jun 2003

TL;DR: The idea of social summarization and its implementation in the community system TelMeA2003, which is being developed to investigate its effectiveness in supporting collaborative activities in online communities, is described.

Abstract: We propose the concept of social summarization as a new technology for semantic computing. Social summarization focuses on human evaluative acts toward information, and provides an alternative to the content-based methods employed in the conventional information summarization technologies. We describe the idea of social summarization and its implementation in the community system TelMeA2003, which is being developed to investigate its effectiveness in supporting collaborative activities in online communities. We also report on the preliminary analysis of TelMeA2003 based on our experience obtained in a distance learning community e-Kyositu.
