Topic-based vector space model

Papers

Topic-based Vector Space Model

Jörg Becker, Dominik Kuropka

1 Jan 2003

TL;DR: This paper shows further how the Topic-based Vector Space Model can be fully implemented within the context of relational databases and facilitates the use of this approach by generic applications.

Abstract: This paper motivates and presents the Topic-based Vector Space Model (TVSM), a new vector-based approach for document comparison. The approach does not assume independence between terms, and it is flexible regarding the specification of term similarities. A stopword list, stemming, and a thesaurus can be fully integrated into the model. The paper further shows how the TVSM can be fully implemented within the context of relational databases, which facilitates the use of this approach by generic applications. Finally, short comparisons with other vector-based approaches, namely the Vector Space Model (VSM) and the Generalized Vector Space Model (GVSM), are presented.
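
As a rough illustration of the idea in this abstract, the sketch below places terms as vectors in a small topic space, so that related terms need not be orthogonal, and compares documents by the cosine of their summed term vectors. The two topic axes, the term vectors, and the function names are illustrative assumptions, not values or an API taken from the paper.

```python
import math

# Hypothetical term vectors over two assumed topic axes: (sports, finance).
# Terms that share a topic point in similar directions, so they are not
# treated as independent the way they would be in a plain term space.
TERM_VECTORS = {
    "football": (1.0, 0.0),
    "soccer": (1.0, 0.0),
    "bank": (0.0, 1.0),
    "sponsor": (0.6, 0.8),
}


def document_vector(tokens):
    """Sum the term vectors of known tokens and scale the result to unit length."""
    dims = len(next(iter(TERM_VECTORS.values())))
    total = [0.0] * dims
    for tok in tokens:
        vec = TERM_VECTORS.get(tok)  # tokens without a vector are skipped, like stopwords
        if vec is not None:
            total = [a + b for a, b in zip(total, vec)]
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total] if norm else total


def similarity(doc_a, doc_b):
    """Dot product of the two unit-length document vectors."""
    va, vb = document_vector(doc_a), document_vector(doc_b)
    return sum(a * b for a, b in zip(va, vb))
```

With these made-up vectors, a document containing only "football" and one containing only "soccer" are maximally similar even though they share no term, while "football" and "bank" remain unrelated.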

97 citations

Journal Article • DOI: 10.1016/J.IPM.2021.102536

Text summarization using topic-based vector space model and semantic measure

Ramesh Chandra Belwal¹, Sawan Rai¹, Atul Gupta¹

Indian Institute of Information Technology, Design and Manufacturing, Jabalpur¹

1 May 2021, Information Processing and Management

TL;DR: This article proposes a text summarization technique that incorporates topic modeling and a semantic measure within the vector space model to find the extractive summary of a given text. The main objective is to address the redundancy problem associated with summarization methods by including in the summary only those sentences that represent the maximum of the topics embedded in the text document.

Abstract: The primary shortcoming associated with extractive text summarization is redundancy, where more than one sentence representing a similar type of information is incorporated in the summary. In the last two decades, many extractive text summarization methods have been proposed, but less attention has been paid to the redundancy issue. In this paper, we propose a text summarization technique that incorporates topic modeling and a semantic measure within the vector space model to find the extractive summary of the given text. Our main objective is to address the redundancy problem associated with summarization methods and to include in the summary only those sentences that represent the maximum of the topics embedded in the given text document. We generate the topic vector of the given document by representing the sentences in an intermediate form using a vector space model and topic modeling. Moreover, to make the proposed method efficient, we incorporate the semantic similarity measure to find the relevance of the sentence. We introduce two different ways to create the topic vector from the given document, i.e., the Combined topic vector and Individual topic vector approaches. Evaluation results on two datasets show that the summaries generated by both variants (Combined and Individual topic vector techniques) of the proposed method are closer to the human-generated summaries than those of existing text summarization methods.
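
The abstract does not spell out the selection procedure, so the following is only a simplified stand-in for the stated goal of avoiding redundant sentences while covering as many topics as possible. It assumes each sentence has already been mapped to a set of topic labels by some topic model; the function name `select_sentences` and the greedy coverage rule are hypothetical and are not the authors' method.

```python
def select_sentences(sentence_topics, max_sentences):
    """Greedily pick sentence indices that cover the most not-yet-covered topics.

    sentence_topics: list of sets, one set of topic labels per sentence
    (assumed to come from a topic model that is not shown here).
    """
    covered = set()
    chosen = []
    remaining = set(range(len(sentence_topics)))
    while remaining and len(chosen) < max_sentences:
        # Ties are broken in favor of the earliest sentence.
        best = max(sorted(remaining), key=lambda i: len(sentence_topics[i] - covered))
        if not sentence_topics[best] - covered:
            break  # every remaining sentence would only repeat covered topics
        chosen.append(best)
        covered |= sentence_topics[best]
        remaining.remove(best)
    return sorted(chosen)  # keep the original sentence order in the summary
```

A sentence whose topics are all covered already is never selected, which is one simple way to express the redundancy concern described above.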

56 citations

A Quantitative Evaluation of the Enhanced Topic-based Vector Space Model

Artem Polyvyanyy, Dominik Kuropka

1 Jan 2007

TL;DR: A quantitative evaluation procedure for Information Retrieval models is presented, together with the results of applying it to the enhanced Topic-based Vector Space Model (eTVSM). The results show that a manually created and optimized ontology can raise the effectiveness of the eTVSM to a level clearly above the best effectiveness levels the authors found in the literature for the Latent Semantic Index model.

Abstract: This contribution presents a quantitative evaluation procedure for Information Retrieval models and the results of this procedure applied to the enhanced Topic-based Vector Space Model (eTVSM). Since the eTVSM is an ontology-based model, its effectiveness heavily depends on the quality of the underlying ontology. Therefore, the model has been tested with different ontologies to evaluate the impact of those ontologies on the effectiveness of the eTVSM. On the highest level of abstraction, the following results have been observed during our evaluation. First, the theoretically deduced statement that the eTVSM has an effectiveness similar to that of the classic Vector Space Model if a trivial ontology is used (every term is a concept and is independent of all other concepts) has been confirmed. Second, we were able to show that the effectiveness of the eTVSM rises if an ontology is used which is only able to resolve synonyms. We were able to derive this kind of ontology automatically from the WordNet ontology. Third, we observed that more powerful ontologies automatically derived from WordNet dramatically reduced the effectiveness of the eTVSM, even clearly below the effectiveness level of the Vector Space Model. Fourth, we were able to show that a manually created and optimized ontology is able to raise the effectiveness of the eTVSM to a level which is clearly above the best effectiveness levels we have found in the literature for the Latent Semantic Index model with comparable document sets.
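
The first two findings can be made concrete with a toy calculation. The sketch below is a simplified concept-vector stand-in, not the eTVSM itself: each term maps to weighted concepts, and documents are compared by cosine in concept space. The ontologies, example documents, and function names are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def vsm_similarity(doc_a, doc_b):
    """Classic VSM with raw term frequencies as weights."""
    return cosine(Counter(doc_a), Counter(doc_b))


def concept_similarity(doc_a, doc_b, ontology):
    """Cosine in concept space; ontology maps term -> {concept: weight}."""
    def embed(doc):
        vec = Counter()
        for term in doc:
            for concept, weight in ontology[term].items():
                vec[concept] += weight
        return vec
    return cosine(embed(doc_a), embed(doc_b))


doc_a = ["car", "fast", "car"]
doc_b = ["automobile", "fast"]
vocab = set(doc_a) | set(doc_b)

# Trivial ontology: every term is its own concept, independent of all others.
trivial = {t: {t: 1.0} for t in vocab}

# Synonym-only ontology (hypothetical): "car" and "automobile" share one concept.
synonyms = dict(trivial, car={"vehicle": 1.0}, automobile={"vehicle": 1.0})
```

Under the trivial ontology, `concept_similarity(doc_a, doc_b, trivial)` coincides with `vsm_similarity(doc_a, doc_b)`, and resolving the synonym pair raises the score for these two documents.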

25 citations

Proceedings Article • DOI: 10.1109/URKE.2011.6007864

Application of Topic Based Vector Space Model with WordNet

Adi Wibowo¹, Andreas Handojo¹, Albert Halim¹

Petra Christian University¹

1 Sep 2011

TL;DR: This study proposes to develop relations between terms using WordNet and a thesaurus to help TVSM calculate similarity between documents, and proposes a way to find the optimal relation score for a set of documents.

Abstract: The Topic Based Vector Space Model (TVSM) proposes a new vector space whose dimensions are topics. Every term and document is represented by a vector inside this vector space. By using topics as dimensions, TVSM tries to overcome word mismatch between terms with similar topics when finding documents relevant to a query. This study proposes to develop relations between terms using WordNet and a thesaurus to help TVSM calculate similarity between documents. Relations between terms are represented by a relation score. This study proposes a way to find the optimal relation score for a set of documents. To help index documents with multi-language terms, this study also proposes to use a dictionary to expand query terms.
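
To give a feel for how a relation score could enter a document comparison, here is a minimal sketch under stated assumptions. The related-term pairs stand in for a WordNet or thesaurus lookup, a single relation score is shared by all related pairs, and similarity is a normalized bilinear form over term frequencies. None of these choices is taken from the study, and all names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical related-term pairs, standing in for a WordNet or thesaurus lookup.
RELATED = {frozenset(("car", "automobile")), frozenset(("buy", "purchase"))}


def term_similarity(t1, t2, relation_score):
    """1.0 for identical terms, relation_score for related terms, otherwise 0.0."""
    if t1 == t2:
        return 1.0
    return relation_score if frozenset((t1, t2)) in RELATED else 0.0


def bilinear(a, b, relation_score):
    """Sum of weight products over all term pairs, scaled by term similarity."""
    return sum(
        wa * wb * term_similarity(ta, tb, relation_score)
        for ta, wa in a.items()
        for tb, wb in b.items()
    )


def document_similarity(doc_a, doc_b, relation_score):
    """Normalized similarity of two token lists for a given relation score."""
    a, b = Counter(doc_a), Counter(doc_b)
    denom = math.sqrt(bilinear(a, a, relation_score) * bilinear(b, b, relation_score))
    return bilinear(a, b, relation_score) / denom if denom else 0.0
```

With a relation score of 0.0, the documents `["buy", "car"]` and `["purchase", "automobile"]` do not match at all; raising the score makes them increasingly similar, which is why the choice of score matters for a given document set.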

13 citations

Proceedings Article

Golchin: A Distributed News Classifier for Persian News Archive using Enhanced Topic-based Vector Space Model.

Sayed Nasir Khalifehsoltani, Ali Vahdani, Reza Moallemi

1 Jan 2009

1 citation
