About: Topic-based vector space model is a research topic. Over the lifetime, 6 publications have been published within this topic receiving 179 citations.
TL;DR: This paper shows further how the Topic-based Vector Space Model can be fully implemented within the context of relational databases and facilitates the use of this approach by generic applications.
Abstract: This paper motivates and presents the Topic-based Vector Space Model (TVSM), a new vector-based approach for document comparison. The approach does not assume independence between terms and it is flexible regarding the specification of term similarities. A stopword list, stemming, and a thesaurus can be fully integrated into the model. This paper further shows how the TVSM can be fully implemented within the context of relational databases, which facilitates the use of this approach by generic applications. Finally, short comparisons with other vector-based approaches, namely the Vector Space Model (VSM) and the Generalized Vector Space Model (GVSM), are presented.
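The idea of comparing documents without assuming term independence can be illustrated with a minimal sketch. It assumes that each term is a vector in a small topic space, that a document vector is the frequency-weighted sum of its term vectors normalized to length 1, and that similarity is the scalar product of two document vectors. The topic space, the term vectors, and the function names below are hypothetical illustrations, not values or code from the paper.

```python
import math

# Hypothetical term vectors in a 2-topic space (topic 0: vehicles, topic 1: food).
# All values are illustrative assumptions. A zero-length vector is one way a
# stopword could be modeled, so that it contributes nothing to a document.
TERM_VECTORS = {
    "car": (1.0, 0.0),
    "automobile": (1.0, 0.0),
    "pizza": (0.0, 1.0),
    "the": (0.0, 0.0),
}
NUM_TOPICS = 2


def document_vector(term_counts):
    """Sum term vectors weighted by term frequency, then normalize to length 1."""
    vec = [0.0] * NUM_TOPICS
    for term, count in term_counts.items():
        # Terms without a known vector are treated as zero vectors (an assumption).
        term_vec = TERM_VECTORS.get(term, (0.0,) * NUM_TOPICS)
        for i, component in enumerate(term_vec):
            vec[i] += count * component
    norm = math.sqrt(sum(c * c for c in vec))
    return [c / norm for c in vec] if norm else vec


def similarity(doc_a, doc_b):
    """Scalar product of the two normalized document vectors."""
    vec_a = document_vector(doc_a)
    vec_b = document_vector(doc_b)
    return sum(a * b for a, b in zip(vec_a, vec_b))


# The two documents share no term, yet they point in the same topic direction.
print(similarity({"car": 2}, {"automobile": 1}))
print(similarity({"car": 1}, {"pizza": 1}))
```

In this sketch, term similarity is encoded entirely by where the term vectors point, so changing the entries of `TERM_VECTORS` changes which documents count as similar without touching the comparison logic.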
TL;DR: In this article, a text summarization technique that incorporates topic modeling and semantic measure within the vector space model to find the extractive summary of the given text is proposed, where the main objective is to address the redundancy problem associated with summarization methods and include only those sentences in summary, which represent the maximum of the topics embedded in the text document.
Abstract: The primary shortcoming associated with extractive text summarization is redundancy, where more than one sentence representing a similar type of information are incorporated in summary. In the last two decades, a lot of extractive text summarization methods have been proposed, but less attention was paid to the redundancy issue. In this paper, we propose a text summarization technique that incorporates topic modeling and semantic measure within the vector space model to find the extractive summary of the given text. Our main objective is to address the redundancy problem associated with summarization methods and include only those sentences in summary, which represent the maximum of the topics embedded in the given text document. We generate the topic vector of the given document by representing the sentences in an intermediate form using a vector space model and topic modeling. Moreover, to make the proposed method efficient, we incorporate the semantic similarity measure to find the relevance of the sentence. We introduce two different ways to create the topic vector from the given document, i.e., Combined topic vector and Individual topic vector approach. Evaluation results on two datasets show that the summaries generated by both variants (Combined and Individual topic vector techniques) of the proposed method are found to be closer to the human-generated summaries when compared with the existing text summarization methods.
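The goal of covering as many topics as possible while avoiding redundant sentences can be illustrated with a simple greedy coverage heuristic. This is a swapped-in illustration, not the authors' method: it assumes each sentence has already been mapped to a set of topic labels (for example by a topic model), and the function name `select_summary` and the sample data are hypothetical.

```python
def select_summary(sentence_topics, max_sentences):
    """Greedily pick sentences that add the most not-yet-covered topics.

    sentence_topics: list of sets, one set of topic labels per sentence.
    Returns the indices of the chosen sentences in document order.
    """
    covered = set()
    chosen = []
    remaining = dict(enumerate(sentence_topics))
    while remaining and len(chosen) < max_sentences:
        # Prefer the largest gain in new topics; break ties by earliest sentence.
        best = max(remaining, key=lambda i: (len(remaining[i] - covered), -i))
        if not remaining[best] - covered:
            break  # every remaining sentence would be redundant
        chosen.append(best)
        covered |= remaining.pop(best)
    return sorted(chosen)


# Hypothetical topic assignments: sentences 0 and 1 carry the same topics.
topics_per_sentence = [
    {"economy", "trade"},
    {"economy", "trade"},
    {"climate"},
    {"economy"},
]
print(select_summary(topics_per_sentence, max_sentences=3))
```

Because a sentence is only selected when it contributes at least one uncovered topic, the duplicate sentence 1 and the subsumed sentence 3 are skipped even though the sentence budget would allow them.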
TL;DR: A quantitative evaluation procedure for Information Retrieval models is presented, together with the results of applying it to the enhanced Topic-based Vector Space Model (eTVSM), showing that a manually created and optimized ontology is able to raise the effectiveness of the eTVSM to a level which is clearly above the best effectiveness levels the authors have found in the literature for the Latent Semantic Index model.
Abstract: This contribution presents a quantitative evaluation procedure for Information Retrieval models and the results of applying this procedure to the enhanced Topic-based Vector Space Model (eTVSM). Since the eTVSM is an ontology-based model, its effectiveness depends heavily on the quality of the underlying ontology. Therefore, the model has been tested with different ontologies to evaluate the impact of those ontologies on the effectiveness of the eTVSM. At the highest level of abstraction, the following results were observed during our evaluation. First, the theoretically deduced statement that the eTVSM has an effectiveness similar to that of the classic Vector Space Model if a trivial ontology is used (every term is a concept and is independent of all other concepts) has been confirmed. Second, we were able to show that the effectiveness of the eTVSM rises if an ontology is used which is only able to resolve synonyms. We were able to derive such an ontology automatically from the WordNet ontology. Third, we observed that more powerful ontologies automatically derived from WordNet dramatically reduced the effectiveness of the eTVSM, even clearly below the effectiveness level of the Vector Space Model. Fourth, we were able to show that a manually created and optimized ontology is able to raise the effectiveness of the eTVSM to a level which is clearly above the best effectiveness levels we have found in the literature for the Latent Semantic Index model with comparable document sets.
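The first two findings can be made concrete with a small sketch. It is a strong simplification of the eTVSM, assuming only that an ontology maps each term to a vector over concepts and that documents are compared by cosine similarity in concept space. The vocabulary, the two toy ontologies, and all names below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def to_concept_space(term_counts, term_to_concepts, num_concepts):
    """Project term frequencies onto concepts using the ontology's term vectors."""
    vec = [0.0] * num_concepts
    for term, count in term_counts.items():
        for i, weight in enumerate(term_to_concepts[term]):
            vec[i] += count * weight
    return vec


VOCAB = ["car", "automobile", "pizza"]

# Trivial ontology: every term is its own concept, independent of all others.
TRIVIAL = {
    term: [1.0 if i == j else 0.0 for j in range(len(VOCAB))]
    for i, term in enumerate(VOCAB)
}

# Synonym-only ontology (hypothetical): "car" and "automobile" share one concept.
SYNONYMS = {"car": [1.0, 0.0], "automobile": [1.0, 0.0], "pizza": [0.0, 1.0]}

doc_a = {"car": 2, "pizza": 1}
doc_b = {"automobile": 1, "pizza": 1}

# Classic VSM: cosine over raw term-frequency vectors.
vsm_sim = cosine([doc_a.get(t, 0) for t in VOCAB], [doc_b.get(t, 0) for t in VOCAB])
trivial_sim = cosine(
    to_concept_space(doc_a, TRIVIAL, len(VOCAB)),
    to_concept_space(doc_b, TRIVIAL, len(VOCAB)),
)
synonym_sim = cosine(
    to_concept_space(doc_a, SYNONYMS, 2),
    to_concept_space(doc_b, SYNONYMS, 2),
)

print(vsm_sim, trivial_sim, synonym_sim)
```

With the trivial ontology the concept-space vectors are identical to the term-frequency vectors, so the similarity matches the classic VSM; merging the synonyms into one concept raises the similarity of the two documents in this toy example.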
TL;DR: This study proposes to develop relations between terms using WordNet and a thesaurus to help TVSM calculate the similarity between documents, and proposes a way to find the optimal relation score for a set of documents.
Abstract: The Topic-based Vector Space Model (TVSM) proposes a new vector space whose dimensions are topics. Every term and document is represented by a vector inside this vector space. By using topics as dimensions, TVSM tries to overcome word mismatch between terms with similar topics when finding documents relevant to a query. This study proposes to develop relations between terms using WordNet and a thesaurus to help TVSM calculate the similarity between documents. Relations between terms are represented by a relation score. This study proposes a way to find the optimal relation score for a set of documents. To help index documents containing terms in multiple languages, this study also proposes using a dictionary to expand query terms.
TL;DR: An optical module with which an optical connector is brought into abutment by inserting guide pins into pin holes of the optical module, thereby connecting the optical connector and the optical module.