Open Access | DOI: 10.13053/CYS-17-2-1523
A Knowledge-Base Oriented Approach for Automatic Keyword Extraction
Ludovic Jean-Louis,Michel Gagnon,Eric Charton +2 more
- 29 Jun 2013
- Vol. 17, Iss: 2, pp 187-196
TL;DR: A generic approach to extracting keywords from documents using encyclopedic knowledge: a classification step identifies candidate keywords, and a learning-to-rank method then orders the candidates according to a user-defined keyword profile.
Abstract: Automatic keyword extraction is an important subfield of information extraction. It is a difficult task for which many techniques and resources have been proposed. In this paper, we propose a generic approach to extracting keywords from documents using encyclopedic knowledge. Our two-step approach first uses a classification step to identify candidate keywords, then applies a learning-to-rank method that orders the candidates according to a user-defined keyword profile. The novelty of our approach lies in (i) the use of the keyword profile and (ii) generic features derived from Wikipedia categories that are not necessarily related to the document content. We evaluate our system on keyword datasets and corpora from standard evaluation campaigns and show that it improves the overall keyword extraction process.
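The two-step pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature names (`tf`, `in_title`, `category_match`) are hypothetical, a hand-written rule stands in for the trained candidate classifier, the ranker is a simple pairwise perceptron swapped in for whatever learning-to-rank method the paper uses, and the keyword profile is modelled here as a list of preference pairs.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]

# Hypothetical feature names; the abstract does not list the real feature set.
FEATURES = ("tf", "in_title", "category_match")


def is_candidate(f: Features, min_tf: float = 2.0) -> bool:
    """Step 1 stand-in: a simple rule in place of a trained classifier."""
    return f["tf"] >= min_tf or f["in_title"] > 0


def train_ranker(pairs: List[Tuple[Features, Features]], epochs: int = 10) -> Features:
    """Step 2: pairwise perceptron. Each pair is (preferred, less preferred)."""
    w = {name: 0.0 for name in FEATURES}
    for _ in range(epochs):
        for better, worse in pairs:
            diff = {n: better[n] - worse[n] for n in FEATURES}
            margin = sum(w[n] * diff[n] for n in FEATURES)
            if margin <= 0:  # pair is mis-ordered (or tied): move w toward diff
                for n in FEATURES:
                    w[n] += diff[n]
    return w


def extract_keywords(terms: Dict[str, Features], w: Features, top_k: int = 3) -> List[str]:
    """Filter candidates, then order them by the learned linear score."""
    candidates = {t: f for t, f in terms.items() if is_candidate(f)}

    def score(f: Features) -> float:
        return sum(w[n] * f[n] for n in FEATURES)

    return sorted(candidates, key=lambda t: score(candidates[t]), reverse=True)[:top_k]


# Assumed "keyword profile": preference pairs supplied by the user (toy data).
PROFILE_PAIRS = [
    ({"tf": 3, "in_title": 0, "category_match": 1}, {"tf": 3, "in_title": 0, "category_match": 0}),
    ({"tf": 4, "in_title": 1, "category_match": 0}, {"tf": 2, "in_title": 0, "category_match": 0}),
]

# Toy per-term features for one document.
DOC_TERMS = {
    "keyword extraction": {"tf": 5, "in_title": 1, "category_match": 1},
    "wikipedia": {"tf": 3, "in_title": 0, "category_match": 1},
    "paper": {"tf": 2, "in_title": 0, "category_match": 0},
    "approach": {"tf": 1, "in_title": 0, "category_match": 0},
}

weights = train_ranker(PROFILE_PAIRS)
print(extract_keywords(DOC_TERMS, weights, top_k=2))
```

Separating the filter from the ranker keeps the two concerns independent: swapping in a different profile changes only the learned weights, while the candidate set stays the same.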
Citations
How Preprocessing Affects Unsupervised Keyphrase Extraction
Rui Wang,Wei Liu,Chris McDonald +2 more
- 06 Apr 2014
TL;DR: It is shown that candidate selection approaches with better coverage and accuracy can boost the performance of the ranking algorithms.
25
Automatic Keyphrase Extraction Using SVM
Ankit Guleria,Radhika Sood,Pardeep Singh +2 more
- 01 Jan 2021
TL;DR: A supervised machine learning method based on statistical and linguistic features is proposed for keyword extraction using SVM; experimental comparison with well-known methods shows considerable improvement over previously achieved results.
7
SIMT: A Semantic Interest Modeling Toolkit
Mohamed Amine Chatti,Fangzheng Ji,Mouadh Guesmi,Arham Muslim,Ravi Kumar Singh,Shoeb Ahmed Joarder +5 more
- 21 Jun 2021
TL;DR: SIMT is a toolkit that harnesses semantic information to generate user interest models and compute their similarities using unsupervised keyword extraction algorithms, knowledge bases, and word embedding techniques.
7
Approach to Extract Keywords and Keyphrases of Text Resources and Documents in the Kazakh Language
Diana Rakhimova,Aliya Turganbayeva +1 more
- 30 Nov 2020
TL;DR: In this paper, a hybrid approach for extracting keywords and keyphrases of text resources and documents in Kazakh is proposed, which takes into account the syntactic feature of the words or phrases using the morphological analyzer of the Kazakh language.
3
Semantic Interest Modeling and Content-Based Scientific Publication Recommendation Using Word Embeddings and Sentence Encoders
Mouadh Guesmi,Mohamed Amine Chatti,Lamees Kadhim,Shoeb Joarder,Qurat Ul Ain +4 more
TL;DR: RIMA, a transparent Recommendation and Interest Modeling Application, is a content-based scientific publication recommender system that implicitly derives user interest models from users' authored papers; it addresses semantic issues and leverages pretrained transformer sentence encoders to represent user models and papers and to compute their similarities.
2
References
Bagging predictors
Leo Breiman
- 01 Aug 1996
TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.
Automatic Keyword Extraction from Individual Documents
Stuart J. Rose,Dave Engel,Nick Cramer,Wendy E. Cowley +3 more
- 04 Mar 2010
TL;DR: Keywords, which the authors define as a sequence of one or more words, provide a compact representation of a document’s content.
Improved automatic keyword extraction given more linguistic knowledge
Anette Hulth
- 11 Jul 2003
TL;DR: By adding linguistic knowledge to the representation, rather than relying only on statistics, a better result is obtained as measured by keywords previously assigned by professional indexers; extracting NP-chunks gives better precision than n-grams.
KEA: Practical Automatic Keyphrase Extraction
TL;DR: This paper uses a large test corpus to evaluate Kea’s effectiveness in terms of how many author-assigned keyphrases are correctly identified, and describes the system, which is simple, robust, and publicly available.
Keyword extraction from a single document using word co-occurrence statistical information
Yutaka Matsuo,Mitsuru Ishizuka +1 more
TL;DR: This article presents a keyword extraction algorithm that applies to a single document without using a corpus; the degree of bias of a co-occurrence distribution is measured by the χ² measure, and the algorithm shows performance comparable to tf-idf.