Journal Article, DOI: 10.2174/15680266113139990119
Kernel-Based Feature Selection Techniques for Transport Proteins Based on Star Graph Topological Indices
Carlos Fernandez-Lozano, Marcos Gestal, Nieves Pedreira-Souto, Lucian Postelnicu, Julián Dorado, Cristian R. Munteanu
TL;DR: Among several feature selection techniques, Support Vector Machine Recursive Feature Elimination yields a classification model based on 20 attributes with a true positive rate of 83% and a false positive rate of 16.7%.
Abstract: The transport of molecules inside cells is central to drug metabolism. Experimentally
testing every new protein for transporter molecular function is expensive and inefficient because of the large
number of new peptides. There is therefore a need for cheap, fast theoretical models that predict transporter proteins.
In the current work, the primary structure of a protein is represented as a molecular Star graph, characterized by a series of
topological indices. The dataset was made up of 2,503 protein chains, out of which 413 have transporter molecular function
and 2,090 have no transporter function. These indices were used as input to several classification techniques to find
the best Quantitative Structure Activity Relationship (QSAR) model that can evaluate the transporter function of a new
protein chain. Among several feature selection techniques, the Support Vector Machine Recursive Feature Elimination allows
us to obtain a classification model based on 20 attributes with a true positive rate of 83% and a false positive rate of
16.7%.
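The Star graph representation described in the abstract can be illustrated with a minimal sketch. It assumes the non-embedded variant, in which a dummy centre node carries one branch per amino acid type and each residue is appended to the branch of its type in sequence order. The function names `build_star_graph` and `wiener_index` are hypothetical, and the Wiener index stands in here as one familiar topological index; the abstract does not state which indices or which software the authors used.

```python
from collections import defaultdict, deque


def build_star_graph(sequence):
    """Build adjacency sets for a non-embedded star graph (assumed variant).

    Node 0 is the dummy centre. Nodes 1..n are the residues, numbered in
    sequence order. Each residue attaches to the previous residue of the
    same type, or to the centre if it is the first of its type.
    """
    adjacency = defaultdict(set)
    adjacency[0]  # touch the key so the centre exists even for an empty sequence
    branch_tip = {}  # residue type -> outermost node on that branch so far
    for position, residue in enumerate(sequence, start=1):
        parent = branch_tip.get(residue, 0)
        adjacency[parent].add(position)
        adjacency[position].add(parent)
        branch_tip[residue] = position
    return adjacency


def wiener_index(adjacency):
    """Sum of shortest-path lengths over all unordered node pairs."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest paths in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    return total // 2  # every pair was counted once from each endpoint


if __name__ == "__main__":
    graph = build_star_graph("AAB")  # toy sequence, not from the dataset
    print(wiener_index(graph))
```

For the toy sequence "AAB" the graph is a four-node path (second A, first A, centre, B), so the index is 10. In a QSAR workflow of the kind the abstract describes, a vector of such indices per protein chain would form the input to the classifiers.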