TL;DR: In this article, the authors present a set of principles of cortical organization, covering the locales, varieties and uses of maps; sense and movement; categorization, generalization and memory; speech and language; and developmental and theoretical principles.
Abstract: Principles of Cortical Organization; The Locales, Varieties and Uses of Maps; Sense and Movement; Categorization, Generalization and Memory; Speech and Language; Developmental and Theoretical Principles.
TL;DR: The essence of the general fundamentals of formal logic is that it takes the forms of thought as if ready-made, developed, beyond formation and development.
Abstract: ion is quite different. We have presented only one approach to modern formal logic, but, in our opinion, it is the most acceptable one. There are, however, other positions. B. M. Kedrov has formulated one of them in detail [158]. From his point of view, the general fundamentals of formal logic (elementary formal logic) retain an independent significance to this day. It is a philosophical discipline rather than a specialized one. Some of its principles have been borrowed from mathematical logic, which is concerned with its own problems that verge on mathematics (which is a specialized discipline). The essence of the general fundamentals of formal logic, which, together with dialectical logic, also studies the forms of thought, is that it takes these forms as if ready-made, developed, beyond formation and development. It is the logic of the first, the initial level of cognition, at which there is a primary sifting of the real content of thought from fictions and fantasy. This level is necessary and inevitable – hence the study of its principles both in scientific cognition and in the ontogenetic development of the child’s thought retains significance. “In order to reason and to think dialectically, there must be elementary training in proper thinking, as a precondition...” [158, p. 70]. Formal logic also teaches this elementary thinking in its general fundamentals, which depends on four well-known laws (of identity, compatibility, etc.).
TL;DR: In this paper, the authors define PSq as the class of functions f that are analytic on the open unit disc and satisfy f(0) = 0, f′(0) = 1 and |f(qz)| ⩽ |f(z)| on that disc.
TL;DR: Numerical results on training in layered neural networks indicate that the generalization error improves gradually in some cases, and sharply in others, and statistical mechanics is used to study generalization curves in large layered networks.
Abstract: A statistical-mechanical theory of learning from examples in layered networks at finite temperature is studied. When the training error is a smooth function of continuously varying weights, the generalization error falls off asymptotically as the inverse number of examples. By analytical and numerical studies of single-layer perceptrons, we show that when the weights are discrete, the generalization error can exhibit a discontinuous transition to perfect generalization. For intermediate sizes of the example set, the state of perfect generalization coexists with a metastable spin-glass state.
TL;DR: A concept of conditional fuzzy measure is presented, which is a generalization of conditional probability measure, and its properties are studied in the general case and for some particular types of fuzzy measures, such as representable measures, capacities of order two, and belief‐plausibility measures.
Abstract: In this article a concept of conditional fuzzy measure is presented, which is a generalization of conditional probability measure. Its properties are studied in the general case and for some particular types of fuzzy measures, such as representable measures, capacities of order two, and belief-plausibility measures. In the case of capacities of order two it coincides with the concept given by Dempster for representable measures. However, it differs from Dempster's rule for conditioning belief-plausibility measures. As is shown, Dempster's rule of conditioning is based on the idea of combining information, whereas our definition is based on a restriction in the set of possible worlds.
TL;DR: A general adaptive model for pattern learning is proposed that unifies many existing models (including that of neural nets) and suggests how various propositional object (class) descriptions might be generated based on the outputs of the learning processes.
TL;DR: Prompted by Blais's article "A pragmatic analysis of mathematical realism and intuitionism", the author argues that most comparisons of these two approaches to mathematics miss the essential point: intuitionism, in its simplest form, is a generalization of classical mathematics that accommodates both classical and computational models.
Abstract: I was inspired, not to say provoked, to write this note by Michel J. Blais's article A pragmatic analysis of mathematical realism and intuitionism [2]. Having spent the greater part of my career doing intuitionistic mathematics, while continuing to do classical mathematics, I have come to feel that most comparisons of these two approaches to mathematics miss the essential point: intuitionism, in its simplest form, is a generalization of classical mathematics that accommodates both classical and computational models. By intuitionism I mean the approach to mathematics based on intuitionistic logic, a well-defined body of axioms and rules of inference [6] [3]. So, for example, my idea of intuitionism does not include the notion of a choice sequence [8], or the various continuity principles associated with intuitionism [9], and it does not refer to the more bizarre consequences that have been drawn from Brouwer's idea of a creating subject [3]. This lean version of intuitionistic mathematics is usually called constructive mathematics. Blais directs his comments in [2] at constructive mathematics rather than at the more esoteric varieties of intuitionistic mathematics. Most analyses dwell on the relative merits of classical and computational models, despite the fact that constructive mathematics, viewed as a theory rather than as a description of a universe, can be interpreted as speaking about either model. Other flavors of intuitionistic mathematics incorporate axioms that hold only in computational models. For example, Brouwer proved that every totally defined function on the real line is continuous. A theory where this is provable cannot refer to the classical universe, so we have to consider whether we like its models better than the classical model.
But any theorem in constructive mathematics is a theorem in classical mathematics, so in this case it is not a question of choosing between models, but of deciding whether it is worthwhile to talk about computational models in addition to the classical model. By comparing mathematical realism with intuitionism from an informal axiomatic point of view, we can steer clear of most of the metaphysical problems involved in analyzing these notions from the ground up, and concentrate on what may be termed the purely mathematical aspects. In particular we won't have to consider whether intuitionists hold that a theorem "isn't true until it's known to be true", as Blais
TL;DR: In this paper, a generalization methodology is given that is independent of the reduction one wants to generalize, and based on that methodology, the authors define extensions of the implicit place transformation and the pre- and post-agglomeration of transitions.
Abstract: This paper presents the generalization to coloured nets of the most efficient reductions defined by Berthelot for Petri nets. First, a generalization methodology is given that is independent of the reduction one wants to generalize. Then, based on that methodology, we define extensions of the implicit place transformation and the pre- and post-agglomeration of transitions. For each reduction we prove that the reduced net has exactly the same properties as the original net. Finally we completely reduce an improved model of the data base management with multiple copies, thus showing its correctness.
TL;DR: This paper gives a natural generalization, to systems of first-order equations, of Poincaré's classical theorem on ratio asymptotics of solutions of higher-order recurrence equations.
TL;DR: A complex-valued generalization of neural networks is presented and an activation function with more desirable characteristics in the complex plane is proposed, including the possibility of self oscillation.
Abstract: A complex-valued generalization of neural networks is presented. The dynamics of complex neural networks have parallels in discrete complex dynamics which give rise to the Mandelbrot set and other fractals. The continuation to the complex plane of common activation functions and the resulting neural dynamics are discussed. An activation function with more desirable characteristics in the complex plane is proposed. The dynamics of this activation function include the possibility of self oscillation. Possible applications in signal processing and neurobiological modeling are discussed.
TL;DR: This thesis presents a feature construction framework based on the aspects of need detection, constructor selection, constructor generalization, and feature evaluation, which served as the basis for the design of CITRE, an inductive system that constructs new features using decision trees.
Abstract: While similarity-based learning (SBL) methods can be effective for acquiring concept descriptions from labeled examples, their success largely depends upon the quality of the features used to describe the examples. When a learning problem uses low-level features, the complexity of the concept-membership function can make SBL inaccurate, expensive, or simply impossible. One way to overcome this limitation is through feature construction: the construction of new features by the application of constructive operators to existing features. Feature construction can result in an improved instance space in which the concept-membership function is better behaved relative to the inductive biases of SBL algorithms. Feature construction, however, is computationally difficult, primarily because of the intractably large space of potential new features. To assist in the study and advancement of feature construction methods, this thesis presents a feature construction framework based on the aspects of (1) need detection, (2) constructor selection, (3) constructor generalization, and (4) feature evaluation. This framework was used to analyze eight existing systems (BACON, BOGART, DUCE, FRINGE, MIRO, PLSO, STAGGER, and STABB) and to identify promising approaches to feature construction. The framework also served as the basis for the design of CITRE, an inductive system that constructs new features using decision trees. CITRE was tested on five learning problems: l-term kDNF Boolean functions, tic-tac-toe classification, mushroom classification, voting-record classification, and chess-end-game classification. The results demonstrate CITRE's potential for significantly improving hypothesis accuracy and conciseness. The results also reveal substantial benefits obtainable by using simple domain-knowledge constraints and constructor generalization during feature construction.
TL;DR: The set of distances obtained by combining the cityblock and the chessboard motions is studied as a generalization of the octagonal distance for digital pictures and the corresponding digital disks are shown to be digital octagons.
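To make the idea of combining cityblock and chessboard motions concrete, here is a minimal sketch. It assumes that a combined distance is the fewest steps between two pixels when the allowed step type follows a repeating pattern of 4-neighbour (cityblock) and 8-neighbour (chessboard) moves. The function name `sequence_distance` and the pattern encoding are hypothetical and are not taken from the paper.

```python
# Axis-aligned moves (cityblock) and axis-aligned plus diagonal moves (chessboard).
N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def cityblock(p, q):
    """Cityblock distance: only horizontal and vertical unit steps."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def chessboard(p, q):
    """Chessboard distance: diagonal unit steps are also allowed."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def sequence_distance(p, q, pattern="48"):
    """Fewest steps from p to q when step i uses the neighbourhood
    named by pattern[i % len(pattern)] ('4' or '8').

    Assumption: this fewest-steps definition is an illustration only.
    The loop terminates because q is always reachable in exactly
    cityblock(p, q) steps using axis-aligned moves alone.
    """
    frontier = {p}  # all pixels reachable in exactly `steps` steps
    steps = 0
    while q not in frontier:
        moves = N4 if pattern[steps % len(pattern)] == "4" else N8
        frontier = {(x + dx, y + dy) for (x, y) in frontier for (dx, dy) in moves}
        steps += 1
    return steps


origin, corner = (0, 0), (3, 3)
alternating = sequence_distance(origin, corner, "48")
```

Under this assumption the alternating distance always lies between the chessboard and cityblock distances, and the patterns `"4"` and `"8"` reproduce those two distances exactly.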
TL;DR: In this paper, the authors developed an argument that a neural network with asynchronous updating dynamics is capable of sequential retrieval of properly embedded spatial patterns and that it is not necessary that the patterns be random but they have to satisfy certain restrictions.
Abstract: We develop an argument that a neural network with asynchronous updating dynamics is capable of sequential retrieval of properly embedded spatial patterns. It is not necessary that the patterns be random, but they have to satisfy certain restrictions. The network is a generalization of the Hopfield model with a specific type of asymmetric synaptic connections. No time delay is introduced in signal transmission, in contrast to some of the recently proposed networks.
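For readers unfamiliar with asynchronous updating, the sketch below shows it in the standard symmetric Hopfield model, which is the baseline that the abstract generalizes. It does not implement the asymmetric connections or the sequential retrieval described above. The Hebbian weight rule and the function names are assumptions made for illustration.

```python
def train_hebbian(patterns):
    """Build symmetric Hebbian weights with zero self-connections."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def asynchronous_sweep(weights, state):
    """Update units one at a time, each seeing the latest state of the others."""
    state = list(state)
    for i in range(len(state)):
        field = sum(w * s for w, s in zip(weights[i], state))
        if field != 0:  # leave the unit unchanged when the field is exactly zero
            state[i] = 1 if field > 0 else -1
    return state


stored = [1, -1, 1, 1, -1, -1, 1, -1]
weights = train_hebbian([stored])

noisy = list(stored)
noisy[2] = -noisy[2]  # corrupt a single unit

recovered = asynchronous_sweep(weights, noisy)
```

With a single stored pattern and one corrupted unit, one asynchronous sweep restores the stored pattern.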
TL;DR: A nontrivial computation task that attempts to recognize two-or-more-clumps in a 5-bit string is used to illustrate the important influence of training set selection on generalization properties of back-propagation networks.
Abstract: A nontrivial computation task that attempts to recognize two-or-more-clumps in a 5-bit string is used to illustrate the important influence of training set selection on generalization properties of back-propagation networks. For this problem, the input patterns can be clustered into four groups indexed by their distances from the class boundary. With various combinations of these groups, the authors constructed training sets ranging from those containing only typical patterns of each class to those of border patterns. A series of simulation experiments were carried out to study the generalization capability of networks trained with these sets. The results are consistent with the following conclusions: (1) larger sizes of training examples do not guarantee better generalization performance; (2) there exists a proper subset of border patterns, which constitutes a critical training set for perfect generalization; and (3) a network trained with an arbitrary subset of border sets is not necessarily a better performer compared with one trained with a typical or other collection of input patterns.
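The target function in this task can be sketched as follows, assuming that a "clump" means a maximal run of consecutive 1s (the abstract does not define the term). The helper names are hypothetical. The grouping of patterns by distance from the class boundary is not reproduced here because the abstract does not specify it.

```python
from itertools import product


def count_clumps(bits):
    """Count maximal runs of consecutive 1s in a bit sequence."""
    clumps = 0
    previous = 0
    for bit in bits:
        if bit == 1 and previous == 0:  # a new run of 1s starts here
            clumps += 1
        previous = bit
    return clumps


def two_or_more_clumps(bits):
    """Class label for the task: True when the string has at least two clumps."""
    return count_clumps(bits) >= 2


# Enumerate the full input space of 5-bit strings and split it by class.
patterns = list(product((0, 1), repeat=5))
positives = [p for p in patterns if two_or_more_clumps(p)]
negatives = [p for p in patterns if not two_or_more_clumps(p)]
```

Under this definition the 32 possible inputs split evenly: the all-zero string plus the 15 single-run strings are negative, and the remaining 16 are positive.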
TL;DR: In this article, a generalization is given of a recent theory of attainable set evolution that is based on the concept of approximate localization instead of differentiation, with applications to control under uncertainty, optimal control, and differential games.
Abstract: In control theory, there is growing interest in the evolution of sets, especially attainable sets at time t. This is due to their applications to control under uncertainty, optimal control, and differential games. Recently, a new mathematical theory for attainable set evolution was developed. It is based on the concept of approximate localization, instead of differentiation. Here, we give a generalization of this theory.
TL;DR: One-way group actions provides a unified theory for all the known bit commitment schemes that offer unconditional protection for the originator of the commitments, and for many of those that offer her statistical protection.
Abstract: Bit commitment schemes are central to all zero-knowledge protocols [GMR89] for NP-complete problems [GMW86, BC86a, BC86b, BCC88, BCY89, FS89, etc]. One-way group actions is a natural and powerful primitive for the implementation of bit commitment schemes. It is a generalization of the one-way group homomorphism [IY88], which was not powerful enough to capture the bit commitment scheme based on graph isomorphism [BC86b]. It provides a unified theory for all the known bit commitment schemes that offer unconditional protection for the originator of the commitments, and for many of those that offer her statistical protection. (Unconditional protection means that the value of the bit committed to is always perfectly concealed. Statistical protection either means that this is almost always the case, or that only an arbitrarily small probabilistic bias about this bit can leak; in either case, statistical protection must hold even against unlimited computing power.) Bit commitment schemes based on one-way group actions automatically have the chameleon property [BCC88] (also called trap-door [FS89]), which is useful for the parallelization of zero-knowledge protocols [BCY89, FS89]. Moreover, these bit commitment schemes allow the originator of two commitments to convince the receiver that they are commitments to the same bit, provided that this is so, without disclosing any information about which bit this is. In addition, one-way group actions are also a natural primitive for the implementation of claw-free pairs of functions [GMRi88].
TL;DR: The Incremental Non-Backtracking Focusing (INBF) algorithm, which learns strictly tree-structured concepts in polynomial space and time, is described, and it is formally proved that the convergence assumption it shares with Avoidance Focusing does in fact hold for tree-structured concepts.
Abstract: The candidate elimination algorithm for inductive learning with version spaces can require both exponential time and space. This article describes the Incremental Non-Backtracking Focusing (INBF) algorithm which learns strictly tree-structured concepts in polynomial space and time. Specifically, it learns in time O(pnk) and space O(nk) where p is the number of positives, n the number of negatives, and k the number of features. INBF is an extension of an existing batch algorithm, Avoidance Focusing (AF). Although AF also learns in polynomial time, it assumes a convergent set of positive examples, and handles additional examples inefficiently; INBF has neither of these restrictions. Both the AF and INBF algorithms assume that the positive examples plus the near misses will be sufficient for convergence if the initial set of examples is convergent. This article formally proves that for tree-structured concepts this assumption does in fact hold.
TL;DR: In this article, the exact solitary wave solutions of the extended Burgers-Fisher equation are obtained by using a simple and effective nonlinear transformation, and the results generalize earlier work.
Abstract: The exact solitary wave solutions of the extended Burgers-Fisher equation are obtained by using a simple and effective nonlinear transformation. The results generalize earlier work.
TL;DR: Experimental tests of generalization versus number of examples are presented for random target networks and examples drawn from a uniform distribution, and seem to indicate that networks with two hidden layers have Vapnik-Chervonenkis dimension roughly equal to their total number of weights.
Abstract: We first review in pedagogical fashion previous results which gave lower and upper bounds on the number of examples needed for training feedforward neural networks when valid generalization is desired. Experimental tests of generalization versus number of examples are then presented for random target networks and examples drawn from a uniform distribution. The experimental results are roughly consistent with the following heuristic: if a database of M examples is loaded onto a W weight net (for M≫W), one expects to make a fraction ɛ=W/M errors in classifying future examples drawn from the same distribution. This is consistent with our previous bounds but, if reliable, strengthens them in that: (1) the bounds had large numerical constants and log factors, all of which are set equal to one in the heuristic, (2) previous lower bounds on number of examples needed were valid only in a distribution independent context, whereas the experiments were conducted for a uniform distribution, and (3) the previous lower bound was valid for nets with one hidden layer only. These experiments also seem to indicate that networks with two hidden layers have Vapnik-Chervonenkis dimension roughly equal to their total number of weights.
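The heuristic ɛ = W/M from the abstract can be turned into a small calculator. This is only a sketch of the stated rule of thumb, valid in the regime M ≫ W. The function names and the inverted form M = W/ɛ are illustrative assumptions rather than results from the paper.

```python
import math


def expected_error_fraction(num_weights, num_examples):
    """Heuristic error fraction W / M for a W-weight net loaded with M examples.

    Assumption: only meaningful when num_examples is much larger than num_weights.
    """
    if num_weights < 0 or num_examples <= 0:
        raise ValueError("need num_weights >= 0 and num_examples > 0")
    return num_weights / num_examples


def examples_needed(num_weights, target_error):
    """Invert the heuristic: smallest M with W / M <= target_error."""
    if not 0 < target_error <= 1:
        raise ValueError("target_error must lie in (0, 1]")
    return math.ceil(num_weights / target_error)


# Example: a net with 500 weights trained on 10,000 examples.
fraction = expected_error_fraction(500, 10_000)

# Example: how many examples the heuristic suggests for a 12.5% error rate.
required = examples_needed(500, 0.125)
```

Read this way, the heuristic says that halving the target error roughly doubles the number of training examples required for a fixed network size.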
TL;DR: This paper concerns inductive definitions over finite structures and their relevance to Database Theory: Chandra and Harel's question of whether their hierarchy of fixpoint queries remains strict above the first major step was answered in the negative by Immerman.
Abstract: Inductive definitions have played a central role in the foundations of mathematics for over a century. They were used in the 1970s as the backbone of one major generalization of Recursive Function Theory (Moschovakis, 1974; Aczel, 1977). In recent years the relevance of inductive definitions (in particular over finite structures) to Database Theory, to Descriptive Computational Complexity, and to Logics of programs has been recognized. A seminal paper on inductive definitions in Database Theory is Chandra and Harel (1982), where they define a hierarchy of queries over finite structures, within which minor steps (successor ordinals) correspond to first-order quantifier alternations, and major steps (limit ordinals) correspond to uses of fixpoints. They left open the question of whether the hierarchy remains strict above the first major step (level ω). This problem was answered in the negative by Immerman (1986). Since the collection of first-order fixpoint queries over finite structures is closed under composition and first-order operations other than negation (Moschovakis, 1974), the main component of Immerman’s solution was
TL;DR: A temporal-difference method is used to bootstrap a collection of useful concepts by backing up evaluations from recognized states to their predecessors, and this procedure is combined with explanation-based generalization and goal regression to use knowledge of the problem domain to help generalize the new concept definitions.
Abstract: We describe a technique for improving problem-solving performance by creating concepts that allow problem states to be evaluated through an efficient recognition process. A temporal-difference (TD) method is used to bootstrap a collection of useful concepts by backing up evaluations from recognized states to their predecessors. This procedure is combined with explanation-based generalization (EBG) and goal regression to use knowledge of the problem domain to help generalize the new concept definitions. This maintains the efficiency of using the concepts and accelerates the learning process in comparison to knowledge-free approaches. Also, because the learned definitions may describe negative conditions, it becomes possible to use EBG to explain why some instance is not an example of a concept. The learning technique has been elaborated for minimax game playing and tested on a Tic-Tac-Toe system, T2. Given only concepts defining the end-game states and constrained to a two-ply search bound, experiments show that T2 learns concepts for achieving near-perfect play. T2's total searching time, including concept recognition, is within acceptable performance limits while perfect play without the concepts requires searches taking well over 100 times longer than T2's.
TL;DR: A connectionist learning algorithm, the bounded, randomized, distributed (BRD) algorithm, is presented and formally analyzed within the framework of computational learning theory, and a new class of connectionist concepts is shown to be polynomially learnable using the BRD algorithm.
TL;DR: The initial examination of validity generalization in the Army Selection and Classification Project used data from a concurrent validation sample of 4,039 job incumbents drawn from a representative sample of nine jobs.
Abstract: The initial examination of validity generalization in the Army Selection and Classification Project used data from a concurrent validation sample of 4,039 job incumbents drawn from a representative sample of nine jobs. The available data consisted of 24 predictor scores and five job performance factor scores on each individual. The major objectives were to determine (a) the degree of validity generalization across the major components of performance, with the job held constant, and (b) the degree of validity generalization across jobs within each major performance factor. After reducing the predictor set by eliminating variables that added no information, a modified confirmatory analysis was used to test the hypotheses that one equation would fit the data from all performance components and that one equation would fit the data from all jobs, given a particular performance component. The major findings were that different predictor equations were needed for each of the five criterion factors. For generalization across jobs, within each criterion factor, one equation fit the data for four of the five performance components. Different prediction equations were required for the component that reflects proficiency on the technical tasks specific to each job.
TL;DR: A method for deriving from any graph a family of strongly equivalent LOTOS expressions that describe the intended process composition is introduced and proved correct and legitimizes the adoption of a graphical (or an equivalent textual) shorthand for such multiple compositions.
Abstract: An overview of some LOTOS (Language of Temporal Ordering Specification) constructs for expressing the behavior of distributed and concurrent systems is given. Some existing equivalence laws for LOTOS behavior expressions that involve the parallel composition operator and that are based on the notion of bisimulation equivalence are recalled. The graphical representation of the parallel composition of several LOTOS processes as a network of interconnected boxes is ambiguous. However, under sufficiently general conditions such graphical representation is sound; a method for deriving from any such graph a family of strongly equivalent LOTOS expressions that describe the intended process composition is introduced and proved correct. The method legitimizes the adoption of a graphical (or an equivalent textual) shorthand for such multiple compositions. It can be used for transforming the structure of parallel LOTOS expressions, and it is a generalization of previously known algebraic laws.
TL;DR: This paper considers how the techniques of case-based reasoning in an adversarial, precedent-based domain can be used to aid a decision-tree based classification algorithm for training set selection, branching feature choice, and induction policy preference and deliberate exploitation of inductive bias.
Abstract: In a precedent-based domain one appeals to previous cases to support a solution, decision, explanation, or an argument. Experts typically use care in choosing cases in precedent-based domains, and apply such criteria as case relevance, prototypicality, and importance. In domains where both cases and rules are used, experts use an additional case selection criterion: the generalizations that a particular group of cases support. Domain experts use their knowledge of cases to forge the rules learned from those cases.
In this paper, we explore inductive learning in a "mixed paradigm" setting, where both rule-based and case-based reasoning methods are used. In particular, we consider how the techniques of case-based reasoning in an adversarial, precedent-based domain can be used to aid a decision-tree based classification algorithm for (1) training set selection, (2) branching feature choice, and (3) induction policy preference and deliberate exploitation of inductive bias. We focus on how precedent-based argumentation may inform the selection of training examples used to build classification trees. The resulting decision trees may then be reexpressed as rules and incorporated into the mixed paradigm system. We discuss the heuristic control problems involved in incorporating an inductive learner into CABARET, a mixed paradigm reasoner. Finally, we present an empirical study in a legal domain of the classification trees generated by various training sets constructed by a case-based reasoning module.
TL;DR: This article analyzed replicated networks, in which a number of identical networks are independently trained on the same data and their results averaged, and concluded that replication almost always results in a decrease in the expected complexity of the network, and that replication therefore increases expected generalization.
Abstract: We present a unified framework for a number of different ways of failing to generalize properly. During learning, sources of random information contaminate the network, effectively augmenting the training data with random information. The complexity of the function computed is therefore increased, and generalization is degraded. We analyze replicated networks, in which a number of identical networks are independently trained on the same data and their results averaged. We conclude that replication almost always results in a decrease in the expected complexity of the network, and that replication therefore increases expected generalization. Simulations confirming the effect are also presented.
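The benefit of averaging replicated networks can be illustrated with a toy simulation. This sketch does not train any networks. It assumes each replica's output equals the true target plus independent Gaussian noise, which stands in for the random information that contaminates each network during learning. All names and parameter values are hypothetical.

```python
import random


def noisy_prediction(rng, target, noise_std):
    """One replica's output: the target plus independent random contamination."""
    return target + rng.gauss(0.0, noise_std)


def replicated_prediction(rng, replicas, target, noise_std):
    """Average the outputs of several independently contaminated replicas."""
    outputs = [noisy_prediction(rng, target, noise_std) for _ in range(replicas)]
    return sum(outputs) / len(outputs)


def mean_squared_error(rng, replicas, trials=2000, target=1.0, noise_std=0.5):
    """Estimate the expected squared error of the averaged prediction."""
    errors = [
        (replicated_prediction(rng, replicas, target, noise_std) - target) ** 2
        for _ in range(trials)
    ]
    return sum(errors) / len(errors)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
single_error = mean_squared_error(rng, replicas=1)
averaged_error = mean_squared_error(rng, replicas=10)
```

Under the independence assumption, the averaged prediction has a markedly smaller squared error than a single replica, which is the variance-reduction view of the replication effect. The abstract's own argument is phrased in terms of expected network complexity, which this toy does not model.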