Journal Article. DOI: 10.1093/BIOINFORMATICS/BTM312
Visualization-based cancer microarray data classification analysis
TL;DR: The proposed approach to classification through visualization achieves performance comparable to state-of-the-art supervised data mining techniques and is well suited for cancer microarray analysis.
Abstract: Motivation: Methods for analyzing cancer microarray data often face two distinct challenges: the models they infer need to perform well when classifying new tissue samples while at the same time providing an insight into the patterns and gene interactions hidden in the data. State-of-the-art supervised data mining methods often cover well only one of these aspects, motivating the development of methods where predictive models with a solid classification performance would be easily communicated to the domain expert.
Results: Data visualization may provide an excellent approach to knowledge discovery and analysis of class-labeled data. We have previously developed an approach called VizRank that can score and rank point-based visualizations according to the degree of separation of data instances of different classes. Here we extend VizRank with techniques to uncover outliers, score features (genes) and perform classification, and we demonstrate that the proposed approach is well suited for cancer microarray analysis. Using VizRank and radviz visualization on a set of previously published cancer microarray data sets, we were able to find simple, interpretable data projections that include only a small subset of genes yet clearly differentiate among cancer types. We also report that our approach to classification through visualization achieves performance that is comparable to state-of-the-art supervised data mining techniques.
Availability: VizRank and radviz are implemented as part of the Orange data mining suite (http://www.ailab.si/orange).
Contact: blaz.zupan@fri.uni-lj.si
Supplementary information: Supplementary data are available from http://www.ailab.si/supp/bi-cancer.
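The idea of scoring and ranking radviz projections by class separation, as described in the Results, can be illustrated with a minimal sketch. This is not the Orange implementation: the function names (`radviz_project`, `separation_score`, `rank_projections`) are hypothetical, the scoring rule (leave-one-out k-nearest-neighbour accuracy in the projected plane) is an assumption made for illustration, and the exhaustive search over gene subsets is a simplification that would only be practical for a small number of pre-selected genes.

```python
import itertools
import numpy as np

def radviz_project(X):
    """Project rows of X (n_samples, n_features) to 2D radviz coordinates."""
    X = np.asarray(X, dtype=float)
    # Scale each feature to [0, 1]; constant features are mapped to 0.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xn = (X - X.min(axis=0)) / span
    # Place one anchor per feature, evenly spaced on the unit circle.
    angles = 2 * np.pi * np.arange(X.shape[1]) / X.shape[1]
    anchors = np.column_stack([np.cos(angles), np.sin(angles)])
    # Each sample is the weighted mean of the anchors (weights = scaled values).
    weights = Xn.sum(axis=1, keepdims=True)
    weights[weights == 0] = 1.0  # all-zero rows stay at the origin
    return (Xn @ anchors) / weights

def separation_score(points, labels, k=3):
    """Assumed scoring rule: leave-one-out k-NN accuracy in the projection."""
    labels = np.asarray(labels)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample may not vote for itself
    nearest = np.argsort(dist, axis=1)[:, :k]
    correct = 0
    for i, idx in enumerate(nearest):
        values, counts = np.unique(labels[idx], return_counts=True)
        correct += int(values[np.argmax(counts)] == labels[i])
    return correct / len(labels)

def rank_projections(X, labels, n_features=3, k=3):
    """Score every feature subset of the given size, best projection first."""
    X = np.asarray(X, dtype=float)
    scored = []
    for subset in itertools.combinations(range(X.shape[1]), n_features):
        pts = radviz_project(X[:, list(subset)])
        scored.append((separation_score(pts, labels, k), subset))
    return sorted(scored, key=lambda t: t[0], reverse=True)

# Toy data: feature 0 is high in class 0, feature 1 is high in class 1,
# and feature 2 is constant (uninformative).
X_toy = np.array([[1.0, 0.0, 0.5],
                  [0.9, 0.1, 0.5],
                  [0.0, 1.0, 0.5],
                  [0.1, 0.9, 0.5]])
y_toy = np.array([0, 0, 1, 1])
ranking = rank_projections(X_toy, y_toy, n_features=2, k=1)
```

In this sketch the top entries of `ranking` are the feature (gene) subsets whose radviz projection separates the classes best under the assumed score. Radviz layouts with more than three features also depend on the order of the anchors around the circle, which this sketch does not explore.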