Journal Article
DOI: 10.3233/IDA-2002-6605
Evolutionary model selection in unsupervised learning
Yong Seog Kim, W. Nick Street, Filippo Menczer +2 more
- 01 Dec 2002
- Vol. 6, Iss: 6, pp 531-556
TL;DR: The paper applies ELSA, an evolutionary local selection algorithm that maintains a diverse population of solutions approximating the Pareto front in a multi-dimensional objective space, to feature selection for clustering; the resulting models have better and clearer semantic relevance.
Abstract: Feature subset selection is important not only for the insight gained from determining relevant modeling variables but also for the improved understandability, scalability, and possibly, accuracy of the resulting models. Feature selection has traditionally been studied in supervised learning situations, with some estimate of accuracy used to evaluate candidate subsets. However, we often cannot apply supervised learning for lack of a training signal. For these cases, we propose a new feature selection approach based on clustering. A number of heuristic criteria can be used to estimate the quality of clusters built from a given feature subset. Rather than combining such criteria, we use ELSA, an evolutionary local selection algorithm that maintains a diverse population of solutions that approximate the Pareto front in a multi-dimensional objective space. Each evolved solution represents a feature subset and a number of clusters; two representative clustering algorithms, K-means and EM, are applied to form the given number of clusters based on the selected features. Experimental results on both real and synthetic data show that the method can consistently find approximate Pareto-optimal solutions through which we can identify the significant features and an appropriate number of clusters. This results in models with better and clearer semantic relevance.
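The idea in the abstract, scoring each candidate (feature subset, number of clusters) on several criteria and keeping the non-dominated ones, can be illustrated with a minimal Python sketch. This is not ELSA and not the authors' implementation: exhaustive enumeration stands in for the evolutionary search, only K-means is shown, and the three objectives (per-feature compactness, fewer features, fewer clusters), the synthetic data, and all function names are assumptions made for illustration.

```python
import random
from itertools import product


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on lists of floats; returns the final centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def evaluate(data, mask, k):
    """Score one candidate; every objective is to be maximized (assumed criteria)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    projected = [[row[j] for j in cols] for row in data]
    centroids = kmeans(projected, k)
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in projected)
    compactness = -sse / (len(projected) * len(cols))  # normalized per feature
    return (compactness, -len(cols), -k)


def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scored):
    """Keep the (candidate, objectives) pairs that no other pair dominates."""
    return [(c, s) for c, s in scored
            if not any(dominates(t, s) for _, t in scored)]


# Synthetic data (assumption): feature 0 separates two groups, features 1-2 are noise.
rng = random.Random(1)
data = [[rng.gauss(center, 0.3), rng.uniform(-5, 5), rng.uniform(-5, 5)]
        for center in (0.0, 10.0) for _ in range(30)]

# Enumerate every non-empty feature mask with K in {2, 3, 4}.
candidates = [(mask, k) for mask in product([0, 1], repeat=3) if any(mask)
              for k in (2, 3, 4)]
scored = [(cand, evaluate(data, *cand)) for cand in candidates]
front = pareto_front(scored)

for (mask, k), objectives in front:
    print(mask, k, [round(v, 3) for v in objectives])
```

Keeping the objectives separate, rather than summing them into one score, avoids having to choose weights in advance; the trade-offs can then be inspected along the front.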
Citations
A review of feature selection methods with applications
Alan Jovic, Karla Brkić, Nikola Bogunović +2 more
- 25 May 2015
TL;DR: This review considers most of the commonly used FS techniques, including standard filter, wrapper, and embedded methods, and provides insight into FS for recent hybrid approaches and other advanced topics.
Feature Selection for Unsupervised Learning
Jennifer G. Dy, Carla E. Brodley +1 more
TL;DR: This paper explores the feature selection problem and issues through FSSEM (Feature Subset Selection using Expectation-Maximization (EM) clustering) and through two different performance criteria for evaluating candidate feature subsets: scatter separability and maximum likelihood.
An Evolutionary Approach to Multiobjective Clustering
Julia Handl, Joshua Knowles +1 more
TL;DR: The framework of multiobjective optimization is used to tackle the unsupervised learning problem of data clustering, following a formulation first proposed in the statistics literature, and an evolutionary approach to the problem is developed.
737
A review of unsupervised feature selection methods
TL;DR: A comprehensive and structured review of the most relevant and recent unsupervised feature selection methods reported in the literature is provided and a taxonomy of these methods is presented.
567
Pareto-Based Multiobjective Machine Learning: An Overview and Case Studies
Yaochu Jin, Bernhard Sendhoff +1 more
- 01 May 2008
TL;DR: An overview of the existing research on multiobjective machine learning, focusing on supervised learning, is provided, and a number of case studies illustrate the major benefits of the Pareto-based approach to machine learning.