Second-order Learning Algorithm with Squared Penalty Term

doi:10.1162/089976600300015763

Open Access•Journal Article•10.1162/089976600300015763

Kazumi Saito, +1 more

- 03 Dec 1996

- Vol. 12, Iss: 3, pp 627-633

58

TL;DR: The experiments showed that for a reasonably adequate penalty factor, the combination of the squared penalty term and the second-order learning algorithm drastically improves the convergence performance in comparison to the other combinations, at the same time bringing about excellent generalization performance.

Citations

Journal Article•10.1162/NECO.1995.7.1.117

Bayesian regularization and pruning using a Laplace prior

Peter M. Williams

- 02 Jan 1995

- Neural Computation

TL;DR: Standard techniques for improving the generalization of neural networks include weight decay and pruning; a comparison is made with MacKay's results using the evidence framework and a Gaussian regularizer.

436

Journal Article•10.1785/0120110088

Adapting the Neural Network Approach to PGA Prediction: An Example Based on the KiK-net Data

Boumédiène Derras, +3 more

- 01 Aug 2012

- Bulletin of the Seismological Society of...

TL;DR: In this article, the authors investigated the artificial neural network method for the derivation of physically sound, easy-to-handle, predictive ground-motion models and applied it to a large subset of the KiK-net seismic database, which includes 3891 records from 398 sites and 335 earthquakes.

117

Journal Article•10.1109/TNN.2009.2020848

Boundedness and Convergence of Online Gradient Method With Penalty for Feedforward Neural Networks

Huisheng Zhang, +3 more

- 01 Jun 2009

- IEEE Transactions on Neural Networks

TL;DR: By proving that the weights are automatically bounded when the network is trained with a penalty, this work simplifies the conditions required in the literature for convergence of the online gradient method.

76

Journal Article•10.1162/NECO.1997.9.1.123

Partial BFGS update and efficient step-length calculation for three-layer neural networks

Kazumi Saito, +1 more

- 01 Jan 1997

- Neural Computation

TL;DR: It turned out that an efficient and accurate step-length calculation plays an important role for the convergence of quasi-Newton algorithms, and a partial BFGS update greatly saves storage space without losing the convergence performance.

73

Journal Article•10.1016/J.NEUCOM.2013.10.023

Convergence of online gradient method for feedforward neural networks with smoothing L1/2 regularization penalty

Qinwei Fan, +3 more

- 01 May 2014

- Neurocomputing

TL;DR: Strong convergence results for the smoothing L1/2 regularization method are shown, and the boundedness of the weights during network training is proved, so that assuming bounded weights is no longer needed for the proof of convergence.

66

References

Book

Neural networks for pattern recognition

Christopher M. Bishop

- 01 Jan 1995

TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.

19.9K

Book Chapter•10.1016/S0065-2458(08)60404-0

Neural Networks for Pattern Recognition

Suresh Kothari, +1 more

- 01 Jan 1993

- Advances in Computers

TL;DR: The chapter discusses two important directions of research to improve learning algorithms: the dynamic node generation, which is used by the cascade correlation algorithm; and designing learning algorithms where the choice of parameters is not an issue.

14.5K

Book

Pattern recognition and neural networks

Brian D. Ripley, +1 more

- 01 Jan 1996

TL;DR: In this self-contained account, Professor Ripley brings together two crucial ideas in pattern recognition: statistical methods and machine learning via neural networks.

6.4K

Journal Article•10.1162/NECO.1992.4.3.415

Bayesian interpolation

David J. C. MacKay

- 01 May 1992

TL;DR: The Bayesian approach to regularization and model-comparison is demonstrated by studying the inference problem of interpolating noisy data by examining the posterior probability distribution of regularizing constants and noise levels.

4.7K

Journal Article•10.1016/S0893-6080(05)80056-5