Multi-layer Perceptron Error Surfaces: Visualization, Structure and Modelling

Open Access

- 01 Jan 2000

68

TL;DR: Principal Component Analysis (PCA) is proposed as a method for visualizing the learning trajectory followed by an algorithm on the error surface. PCA is found to be effective for this visualization and also indicates the significance of individual weights to the training process.

Abstract: The Multi-Layer Perceptron (MLP) is one of the most widely applied and researched Artificial Neural Network models. MLP networks are normally applied to performing supervised learning tasks, which involve iterative training methods to adjust the connection weights within the network. This is commonly formulated as a multivariate non-linear optimization problem over a very high-dimensional space of possible weight configurations. Analogous to the field of mathematical optimization, training an MLP is often described as the search of an error surface for a weight vector which gives the smallest possible error value. Although this presents a useful notion of the training process, there are many problems associated with using the error surface to understand the behaviour of learning algorithms and the properties of MLP mappings themselves. Because of the high dimensionality of the system, many existing methods of analysis are not well-suited to this problem. Visualizing and describing the error surface are also nontrivial and problematic. These problems are specific to complex systems such as neural networks, which contain large numbers of adjustable parameters, and the investigation of such systems in this way is largely a developing area of research. In this thesis, the concept of the error surface is explored using three related methods. Firstly, Principal Component Analysis (PCA) is proposed as a method for visualizing the learning trajectory followed by an algorithm on the error surface. It is found that PCA provides an effective method for performing such a visualization, as well as providing an indication of the significance of individual weights to the training process. Secondly, sampling methods are used to explore the error surface and to measure certain properties of it, providing the necessary data for an intuitive description of the error surface. A number of practical MLP error surfaces are found to contain a high degree of ultrametric structure, in common with other known configuration spaces of complex systems. Thirdly, a class of global optimization algorithms is developed, which is focused on the construction and evolution of a model of the error surface (or search space) as an integral part of the optimization process. The relationships between this algorithm class, the Population-Based Incremental Learning algorithm, evolutionary algorithms and cooperative search are discussed. The work provides important practical techniques for exploration of the error surfaces of MLP networks. These techniques can be used to examine the dynamics of different training algorithms and the complexity of MLP mappings, and to build an intuitive description of the nature of the error surface. The configuration spaces of other complex systems are also amenable to many of these techniques. Finally, the algorithmic framework provides a powerful paradigm for visualization of the optimization process and for the development of parallel coupled optimization algorithms which apply knowledge of the error surface to solving the optimization problem.
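The PCA-based visualization described in the abstract can be illustrated with a small sketch. The code below is not the thesis's implementation; it is a minimal example under stated assumptions: a hypothetical 2-2-1 MLP with tanh hidden units and a linear output, trained on XOR by plain batch gradient descent, with the weight vector recorded at every step. The function names `train_xor_mlp` and `pca_trajectory`, the learning rate, and the step count are all illustrative choices. PCA is computed with an SVD of the mean-centred trajectory.

```python
import numpy as np


def train_xor_mlp(steps=200, lr=0.5, seed=0):
    """Train a tiny 2-2-1 MLP on XOR and return the weight trajectory.

    Hypothetical setup for illustration only: tanh hidden layer, linear
    output, batch gradient descent on half the mean squared error.
    """
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)
    W1 = rng.normal(scale=0.5, size=(2, 2))
    b1 = np.zeros(2)
    W2 = rng.normal(scale=0.5, size=(2, 1))
    b2 = np.zeros(1)
    history = []
    for _ in range(steps):
        # Record the full weight vector (9 parameters) before each update.
        history.append(np.concatenate([W1.ravel(), b1, W2.ravel(), b2]))
        h = np.tanh(X @ W1 + b1)             # hidden activations, shape (4, 2)
        out = h @ W2 + b2                    # network output, shape (4, 1)
        g_out = (out - y) / len(X)           # dE/d(out) for E = 0.5 * mean(err^2)
        g_h = (g_out @ W2.T) * (1.0 - h**2)  # backpropagate through tanh
        W2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return np.array(history)


def pca_trajectory(weights, n_components=2):
    """Project a (steps, n_weights) trajectory onto its leading principal components."""
    W = np.asarray(weights, dtype=float)
    centered = W - W.mean(axis=0)
    # Rows of vt are the principal directions in weight space.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    projected = centered @ components.T
    # Fraction of trajectory variance per component (assumes the weights moved at all).
    explained = (s**2 / np.sum(s**2))[:n_components]
    return projected, components, explained


trajectory = train_xor_mlp()
projected, components, explained = pca_trajectory(trajectory)
print("variance explained by the first two components:", explained)
print("weight loadings on the first component:", np.abs(components[0]))
```

Plotting the two columns of `projected` against each other gives a 2-D view of the path taken through weight space. The absolute loadings in `components[0]` give one rough reading of which weights change most along the dominant direction, which is one possible interpretation of the "significance of individual weights" mentioned above.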

Citations

Journal Article•10.1016/J.SWEVO.2011.08.003

An introduction and survey of estimation of distribution algorithms

Mark W. Hauschild, +1 more

- 01 Sep 2011

- Swarm and evolutionary computation

TL;DR: Estimation of distribution algorithms are stochastic optimization techniques that explore the space of potential solutions by building and sampling explicit probabilistic models of promising candidate solutions and many of the different types of EDAs are outlined.

506

Journal Article•10.2307/3151402

The Elements of Graphing Data

Rajendra K. Srivastava, +1 more

- 01 Nov 1987

- Journal of Marketing Research

306

Book Chapter•10.1007/978-1-4615-1539-5_3

A Review on Estimation of Distribution Algorithms

Pedro Larrañaga

- 01 Jan 2002

TL;DR: This chapter reviews the Estimation of Distribution Algorithms proposed for the solution of combinatorial optimization problems and optimization in continuous domains using one unified notation.

116

•Journal Article

Analyzing the PBIL Algorithm by Means of Discrete Dynamical Systems

Cristina González, +2 more

- 01 Jan 2000

- Complex Systems

TL;DR: It can be deduced that the PBIL algorithm converges to the global optimum in unimodal functions.

76

•Journal Article•10.1016/J.NEUCOM.2020.02.113

Visualising basins of attraction for the cross-entropy and the squared error neural network loss functions

Anna Sergeevna Bosman, +2 more

- 04 Aug 2020

- Neurocomputing

TL;DR: The proposed visualisation technique successfully captures the local minima properties exhibited by the neural network loss surfaces, and can be used for the purpose of fitness landscape analysis of neural networks.

64

References

Genetic algorithms in search, optimization and machine learning

David E. Goldberg

- 01 Jan 1989

TL;DR: This book brings together the computer techniques, mathematical tools, and research results that will enable both students and practitioners to apply genetic algorithms to problems in many fields.

58.6K

•Book

Genetic algorithms in search, optimization, and machine learning

David E. Goldberg

- 01 Sep 1988

TL;DR: In this article, the authors present the computer techniques, mathematical tools, and research results that will enable both students and practitioners to apply genetic algorithms to problems in many fields, including computer programming and mathematics.

52.8K

Proceedings Article•10.1109/ICNN.1995.488968

Particle swarm optimization

James Kennedy, +1 more

- 06 Aug 2002

TL;DR: A concept for the optimization of nonlinear functions using particle swarm methodology is introduced, the evolution of several paradigms is outlined, and an implementation of one of the paradigms is discussed.

44.1K

•Book

Adaptation in natural and artificial systems

John H. Holland

- 01 Jan 1975

TL;DR: Names of founding work in the area of Adaptation and modification, which aims to mimic biological optimization, and some (Non-GA) branches of AI.

40.3K

•Book

Neural Networks: A Comprehensive Foundation

Simon Haykin

- 16 Jul 1998

TL;DR: Thorough, well-organized, and completely up to date, this book examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks.

32.8K