Open Access · Proceedings Article
Counting function theorem for multi-layer networks
Adam Kowalczyk
- 29 Nov 1993
- Vol. 6, pp 375-382
TL;DR: It is shown that a randomly selected N-tuple $\vec{x}$ of points of $\mathbb{R}^n$ with probability > 0 is such that any multi-layer perceptron with the first hidden layer composed of $h_1$ threshold logic units can implement exactly $2\sum_{i=0}^{h_1 n}\binom{N-1}{i}$ different dichotomies of $\vec{x}$.
Abstract: We show that a randomly selected N-tuple $\vec{x}$ of points of $\mathbb{R}^n$ with probability > 0 is such that any multi-layer perceptron with the first hidden layer composed of $h_1$ threshold logic units can implement exactly $2\sum_{i=0}^{h_1 n}\binom{N-1}{i}$ different dichotomies of $\vec{x}$. If $N > h_1 n$ then such a perceptron must have all units of the first hidden layer fully connected to inputs. This implies the maximal capacities (in the sense of Cover) of $2n$ input patterns per hidden unit and 2 input patterns per synaptic weight of such networks (both capacities are achieved by networks with a single hidden layer and are the same as for a single neuron). Comparing these results with recent estimates of VC-dimension, we find that, in contrast to the single neuron case, for sufficiently large $n$ and $h_1$ the VC-dimension exceeds Cover's capacity.
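The counting function stated in the abstract, $2\sum_{i=0}^{h_1 n}\binom{N-1}{i}$, can be evaluated directly. The following is a minimal sketch, not code from the paper: the function names `dichotomy_count` and `implementable_fraction` are hypothetical, and reading capacity "in the sense of Cover" as the N at which half of all $2^N$ dichotomies are implementable is an assumption made here for illustration.

```python
from math import comb


def dichotomy_count(N: int, n: int, h1: int) -> int:
    """Evaluate 2 * sum_{i=0}^{h1*n} C(N-1, i), the count from the abstract.

    N  : number of input points
    n  : input dimension
    h1 : number of threshold units in the first hidden layer
    """
    # math.comb returns 0 when i > N - 1, so the sum saturates at 2**(N-1).
    return 2 * sum(comb(N - 1, i) for i in range(h1 * n + 1))


def implementable_fraction(N: int, n: int, h1: int) -> float:
    """Fraction of all 2**N dichotomies covered by the count above."""
    return dichotomy_count(N, n, h1) / 2 ** N


if __name__ == "__main__":
    n, h1 = 3, 2
    # Assumed reading of Cover's capacity: the N where the fraction is 1/2.
    for N in (h1 * n + 1, 2 * (h1 * n + 1), 4 * (h1 * n + 1)):
        print(N, dichotomy_count(N, n, h1), implementable_fraction(N, n, h1))
```

With $h_1 = 1$ and $n = 2$, four points give 14 of the 16 possible dichotomies, which matches the classical single-neuron count. Under the assumed reading, the fraction is exactly 1/2 at $N = 2(h_1 n + 1)$, consistent with the roughly $2n$ patterns per hidden unit quoted in the abstract.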
Citations
Circular backpropagation networks for classification
TL;DR: The proposed model unifies the two main representation paradigms found in the class of mapping networks for classification, namely, the surface-based and the prototype-based schemes, while retaining the advantage of being trainable by backpropagation.
Estimates of storage capacity of multilayer perceptron with threshold logic hidden units
TL;DR: The storage capacity of a multilayer perceptron with $n$ inputs, $h_1$ threshold logic units in the first hidden layer and 1 output is estimated, and it is shown that such a network has memory capacity between $nh_1 + 1$ and $2(nh_1 + 1)$ input patterns and, for the most efficient networks in this class, between 1 and 2 input patterns per connection.
Memorization and Optimization in Deep Neural Networks with Minimum Over-parameterization
Simone Bombari, Mohammad Hossein Amani, Marco Mondelli +2 more
- 20 May 2022
TL;DR: A key technical contribution is a lower bound on the smallest NTK eigenvalue for deep networks with the minimum possible over-parameterization: the number of parameters is roughly $\Omega(N)$ and, hence, the number of neurons is as little as $\Omega(\sqrt{N})$.
•Posted Content
Tractability from overparametrization: The example of the negative perceptron
TL;DR: In this article, the authors consider the problem of finding a linear classifier with the largest possible negative margin in the negative perceptron problem, which is equivalent to finding a maximum norm vector in a polytope.
Dense shattering and teaching dimensions for differentiable families (extended abstract)
A. Kowalczyk
- 01 Jul 1997
TL;DR: These results extend some recently proven properties of analytic (Property 1) and analytic definable families (Property 2) to the $C^1$ case and illustrate an alternative technical approach to solving some issues in computational learning theory.
References
•Book
Estimation of Dependences Based on Empirical Data
Vladimir Vapnik
- 19 Nov 2010
TL;DR: In this article, the Big Picture of Inference: Direct Inference Instead of Generalization (INFI) instead of generalization (2000-2010) is presented. But this is not the case in this paper.
Learnability and the Vapnik-Chervonenkis dimension
TL;DR: This paper shows that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned.
On the Betti numbers of real varieties
John Milnor
- 01 Feb 1964
TL;DR: In this article, it was shown that the number of points in $V_c$ is equal to $(\deg f_1)(\deg f_2)\cdots(\deg f_m)$, since each point of $V_0$ lies close to some real point on $V_c$.
On the capabilities of multilayer perceptrons
TL;DR: A construction is presented here for implementing an arbitrary dichotomy with one hidden layer containing $\lceil N/d \rceil$ units, for any set of $N$ points in general position in $d$ dimensions. This is in fact the smallest such net, as there are dichotomies which cannot be implemented by any net with fewer units.