A review of feature selection techniques in bioinformatics
5.3K
TL;DR: This review provides a basic taxonomy of feature selection techniques and discusses their use, variety and potential in a number of common and upcoming bioinformatics applications.
Abstract: Feature selection techniques have become an apparent need in many bioinformatics applications. In addition to the large pool of techniques that have already been developed in the machine learning and data mining fields, specific applications in bioinformatics have led to a wealth of newly proposed techniques.
In this article, we make the interested reader aware of the possibilities of feature selection, providing a basic taxonomy of feature selection techniques, and discussing their use, variety and potential in a number of both common as well as upcoming bioinformatics applications.
Contact: yvan.saeys@psb.ugent.be
Supplementary information: http://bioinformatics.psb.ugent.be/supplementary_data/yvsae/fsreview
Citations
•Book
Applied Predictive Modeling
Max Kuhn, Kjell Johnson
- 17 May 2013
TL;DR: This research presents a novel and scalable approach called “Smartfitting” that automates the labor-intensive, time-consuming and therefore expensive process of designing and implementing statistical regression models.
5.9K
Random forest in remote sensing: A review of applications and future directions
Mariana Belgiu, Lucian Drăguţ
TL;DR: This review has revealed that the RF classifier can successfully handle high data dimensionality and multicollinearity, being both fast and insensitive to overfitting.
5.2K
Feature Selection: A Data Perspective
TL;DR: Feature selection, as a data preprocessing strategy, has proven to be effective and efficient in preparing data (especially high-dimensional data) for various data mining and machine learning problems.
2.2K
Feature Selection: A Data Perspective
TL;DR: This survey revisits feature selection research from a data perspective and reviews representative feature selection algorithms for conventional data, structured data, heterogeneous data and streaming data, and categorizes them into four main groups: similarity-based, information-theoretical-based, sparse-learning-based and statistical-based.
Inferring Regulatory Networks from Expression Data Using Tree-Based Methods
TL;DR: This article presents GENIE3, a new algorithm for the inference of GRNs that was the best performer in the DREAM4 In Silico Multifactorial challenge and compares favorably with existing algorithms in deciphering the genetic regulatory network of Escherichia coli.
References
Improved microbial gene identification with GLIMMER
TL;DR: Significant technical improvements to GLIMMER are reported that improve its accuracy still further, and a comprehensive evaluation demonstrates that the accuracy of the system is likely to be higher than previously recognized.
Systematic variation in gene expression patterns in human cancer cell lines.
Douglas T. Ross, Uwe Scherf, Michael B. Eisen, Charles M. Perou, Christian A. Rees, Paul T. Spellman, Vishwanath R. Iyer, Stefanie S. Jeffrey, Matt van de Rijn, Mark Waltham, Alexander Pergamenschikov, Jeffrey C. Lee, Deval A. Lashkari, Dari Shalon, Timothy G. Myers, John N. Weinstein, David Botstein, Patrick O. Brown
TL;DR: Using cDNA microarrays to explore the variation in expression of approximately 8,000 unique genes among the 60 cell lines used in the National Cancer Institute's screen for anti-cancer drugs provided a novel molecular characterization of this important group of human cell lines and their relationships to tumours in vivo.
•Book
Feature Selection for Knowledge Discovery and Data Mining
Huan Liu, Hiroshi Motoda
- 31 Jul 1998
TL;DR: Feature Selection for Knowledge Discovery and Data Mining offers an overview of the methods developed since the 1970s, provides a general framework for examining and categorizing these methods, and suggests guidelines for how to use different methods under various circumstances.
2.2K
•Journal Article
Efficient Feature Selection via Analysis of Relevance and Redundancy
TL;DR: It is shown that feature relevance alone is insufficient for efficient feature selection of high-dimensional data, and a new framework is introduced that decouples relevance analysis and redundancy analysis.
Empirical Bayes analysis of a microarray experiment
TL;DR: A simple nonparametric empirical Bayes model is introduced, which is used to guide the efficient reduction of the data to a single summary statistic per gene, and also to make simultaneous inferences concerning which genes were affected by the radiation.