Open Access
Multi-layer Perceptron Error Surfaces: Visualization, Structure and Modelling
Marcus Gallagher
- 01 Jan 2000
68
TL;DR: Principal Component Analysis (PCA) is proposed as a method for visualizing the learning trajectory followed by an algorithm on the error surface. It is found to be effective for such visualization, and it also indicates the significance of individual weights to the training process.
Abstract: The Multi-Layer Perceptron (MLP) is one of the most widely applied and researched Artificial Neural Network models. MLP networks are normally applied to performing supervised learning tasks, which involve iterative training methods to adjust the connection weights within the network. This is commonly formulated as a multivariate non-linear optimization problem over a very high-dimensional space of possible weight configurations. Analogous to the field of mathematical optimization, training an MLP is often described as the search of an error surface for a weight vector which gives the smallest possible error value. Although this presents a useful notion of the training process, there are many problems associated with using the error surface to understand the behaviour of learning algorithms and the properties of MLP mappings themselves. Because of the high dimensionality of the system, many existing methods of analysis are not well-suited to this problem. Visualizing and describing the error surface are also nontrivial and problematic. These problems are specific to complex systems such as neural networks, which contain large numbers of adjustable parameters, and the investigation of such systems in this way is largely a developing area of research. In this thesis, the concept of the error surface is explored using three related methods. Firstly, Principal Component Analysis (PCA) is proposed as a method for visualizing the learning trajectory followed by an algorithm on the error surface. It is found that PCA provides an effective method for performing such a visualization, as well as providing an indication of the significance of individual weights to the training process. Secondly, sampling methods are used to explore the error surface and to measure certain properties of the error surface, providing the necessary data for an intuitive description of the error surface. A number of practical MLP error surfaces are found to contain a high degree of ultrametric structure, in common with other known configuration spaces of complex systems. Thirdly, a class of global optimization algorithms is also developed, which is focused on the construction and evolution of a model of the error surface (or search space) as an integral part of the optimization process. The relationships between this algorithm class, the Population-Based Incremental Learning algorithm, evolutionary algorithms and cooperative search are discussed. The work provides important practical techniques for exploration of the error surfaces of MLP networks. These techniques can be used to examine the dynamics of different training algorithms, the complexity of MLP mappings and an intuitive description of the nature of the error surface. The configuration spaces of other complex systems are also amenable to many of these techniques. Finally, the algorithmic framework provides a powerful paradigm for visualization of the optimization process and the development of parallel coupled optimization algorithms which apply knowledge of the error surface to solving the optimization problem.
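The first method in the abstract, PCA applied to a learning trajectory, can be illustrated with a minimal sketch. This is not the thesis's experimental setup: the XOR task, the 2-2-1 network, the learning rate, and the function names `train_mlp_xor` and `pca_project` are all assumptions chosen to keep the example small. The sketch records the full weight vector after every gradient step, then projects the recorded vectors onto their leading principal components. The component loadings give a rough indication of which weights move most during training.

```python
import numpy as np


def train_mlp_xor(steps=500, lr=0.5, hidden=2, seed=0):
    """Train a tiny MLP on XOR by batch gradient descent (illustrative setup).

    Returns the weight trajectory (one flattened weight vector per step)
    and the mean squared error measured at each step.
    """
    rng = np.random.default_rng(seed)
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = np.array([[0.0], [1.0], [1.0], [0.0]])

    W1 = rng.normal(scale=0.5, size=(2, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)

    trajectory, errors = [], []
    for _ in range(steps):
        # Forward pass: tanh hidden layer, sigmoid output.
        h = np.tanh(X @ W1 + b1)
        out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
        err = out - y

        # Record the current point on the error surface.
        trajectory.append(np.concatenate([W1.ravel(), b1, W2.ravel(), b2]))
        errors.append(float(np.mean(err ** 2)))

        # Backward pass for the mean squared error.
        d_out = (2.0 * err / len(X)) * out * (1.0 - out)
        d_h = (d_out @ W2.T) * (1.0 - h ** 2)

        W2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)

    return np.array(trajectory), np.array(errors)


def pca_project(trajectory, n_components=2):
    """Project a weight trajectory onto its leading principal components.

    Returns the projected points, the fraction of variance explained by
    each kept component, and the component loadings (one row per component,
    one column per weight).
    """
    centered = trajectory - trajectory.mean(axis=0)
    # Rows of vt are the principal directions in weight space.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    components = vt[:n_components]
    return centered @ components.T, explained[:n_components], components


if __name__ == "__main__":
    traj, errs = train_mlp_xor()
    projected, explained, components = pca_project(traj)
    print("trajectory shape:", traj.shape)
    print("variance explained by first two PCs:", explained)
    print("largest-magnitude weight index in PC1:", int(np.argmax(np.abs(components[0]))))
```

Plotting the two columns of `projected` against each other (optionally with `errs` as a third axis) gives a low-dimensional view of the path taken through weight space. The singular value decomposition is used here in place of an explicit covariance eigendecomposition purely for brevity; both yield the same principal directions.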
Citations
An introduction and survey of estimation of distribution algorithms
Mark W. Hauschild, Martin Pelikan +1 more
TL;DR: Estimation of distribution algorithms (EDAs) are stochastic optimization techniques that explore the space of potential solutions by building and sampling explicit probabilistic models of promising candidate solutions. Many of the different types of EDAs are outlined.
506
A Review on Estimation of Distribution Algorithms
Pedro Larrañaga
- 01 Jan 2002
TL;DR: This chapter reviews the Estimation of Distribution Algorithms proposed for the solution of combinatorial optimization problems and optimization in continuous domains using one unified notation.
116
•Journal Article
Analyzing the PBIL Algorithm by Means of Discrete Dynamical Systems
TL;DR: It can be deduced that the PBIL algorithm converges to the global optimum in unimodal functions.
76
Visualising basins of attraction for the cross-entropy and the squared error neural network loss functions
TL;DR: The proposed visualisation technique successfully captures the local minima properties exhibited by the neural network loss surfaces, and can be used for the purpose of fitness landscape analysis of neural networks.
64
References
Genetic algorithms in search, optimization and machine learning
David E. Goldberg
- 01 Jan 1989
TL;DR: This book brings together the computer techniques, mathematical tools, and research results that will enable both students and practitioners to apply genetic algorithms to problems in many fields.
58.6K
•Book
Genetic algorithms in search, optimization, and machine learning
David E. Goldberg
- 01 Sep 1988
TL;DR: In this article, the authors present the computer techniques, mathematical tools, and research results that will enable both students and practitioners to apply genetic algorithms to problems in many fields, including computer programming and mathematics.
Particle swarm optimization
James Kennedy, Russell C. Eberhart +1 more
- 06 Aug 2002
TL;DR: A concept for the optimization of nonlinear functions using particle swarm methodology is introduced. The evolution of several paradigms is outlined, and an implementation of one of the paradigms is discussed.
44.1K
•Book
Adaptation in natural and artificial systems
John H. Holland
- 01 Jan 1975
TL;DR: Names of founding work in the area of Adaptation and modification, which aims to mimic biological optimization, and some (Non-GA) branches of AI.
•Book
Neural Networks: A Comprehensive Foundation
Simon Haykin
- 16 Jul 1998
TL;DR: Thorough, well-organized, and completely up to date, this book examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks.