TL;DR: A nearest neighbor algorithm for learning in domains with symbolic features, which produces excellent classification accuracy on three problems that have been studied by machine learning researchers: predicting protein secondary structure, identifying DNA promoter sequences, and pronouncing English text.
Abstract: In the past, nearest neighbor algorithms for learning from examples have worked best in domains in which all features had numeric values. In such domains, the examples can be treated as points and distance metrics can use standard definitions. In symbolic domains, a more sophisticated treatment of the feature space is required. We introduce a nearest neighbor algorithm for learning in domains with symbolic features. Our algorithm calculates distance tables that allow it to produce real-valued distances between instances, and attaches weights to the instances to further modify the structure of feature space. We show that this technique produces excellent classification accuracy on three problems that have been studied by machine learning researchers: predicting protein secondary structure, identifying DNA promoter sequences, and pronouncing English text. Direct experimental comparisons with the other learning algorithms show that our nearest neighbor algorithm is comparable or superior in all three domains. In addition, our algorithm has advantages in training speed, simplicity, and perspicuity. We conclude that experimental evidence favors the use and continued development of nearest neighbor algorithms for domains such as the ones studied here.
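The abstract does not give the formula behind its distance tables. The sketch below assumes a value-difference style table, where two symbolic values are close if they predict similar class distributions. That choice, the function names `build_distance_tables`, `distance`, and `classify`, and the fallback for unseen values are illustrative assumptions, and the instance weights mentioned in the abstract are omitted.

```python
from collections import Counter, defaultdict

def build_distance_tables(examples, labels):
    """Return one table per feature mapping (value_a, value_b) -> distance.

    Assumed metric: sum over classes of |P(class | a) - P(class | b)|.
    """
    classes = sorted(set(labels))
    tables = []
    for f in range(len(examples[0])):
        counts = defaultdict(Counter)  # feature value -> class counts
        for x, y in zip(examples, labels):
            counts[x[f]][y] += 1
        table = {}
        for a, ca in counts.items():
            na = sum(ca.values())
            for b, cb in counts.items():
                nb = sum(cb.values())
                table[(a, b)] = sum(abs(ca[c] / na - cb[c] / nb) for c in classes)
        tables.append(table)
    return tables

def distance(tables, x, y):
    """Real-valued distance between two symbolic instances."""
    total = 0.0
    for table, a, b in zip(tables, x, y):
        # Assumption: a pair involving an unseen value gets the maximum
        # distance (2.0) unless the two values are identical.
        total += table.get((a, b), 0.0 if a == b else 2.0)
    return total

def classify(tables, examples, labels, query):
    """1-nearest-neighbor prediction using the learned tables."""
    best = min(range(len(examples)), key=lambda i: distance(tables, examples[i], query))
    return labels[best]
```

In this toy metric a feature whose values carry no class information contributes zero distance, which is one way a table can reshape the feature space.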
TL;DR: This paper analyzes the complexity of on-line reinforcement learning algorithms, namely asynchronous realtime versions of Q-learning and value-iteration, applied to the problem of reaching a goal state in deterministic domains and shows that the algorithms are tractable with only a simple change in the task representation or initialization.
Abstract: This paper analyzes the complexity of on-line reinforcement learning algorithms, namely asynchronous realtime versions of Q-learning and value-iteration, applied to the problem of reaching a goal state in deterministic domains. Previous work had concluded that, in many cases, tabula rasa reinforcement learning was exponential for such problems, or was tractable only if the learning algorithm was augmented. We show that, to the contrary, the algorithms are tractable with only a simple change in the task representation or initialization. We provide tight bounds on the worst-case complexity, and show how the complexity is even smaller if the reinforcement learning algorithms have initial knowledge of the topology of the state space or the domain has certain special properties. We also present a novel bidirectional Q-learning algorithm to find optimal paths from all states to a goal state and show that it is no more complex than the other algorithms.
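The abstract does not say which representation or initialization it changes. As an illustration only, the sketch below assumes a cost of -1 per action with all Q-values initialized to zero, on a hypothetical one-dimensional chain whose rightmost state is the goal. Because the domain is deterministic, the update uses a learning rate of 1.

```python
def q_learning_chain(n_states=6, episodes=20, max_steps=1000):
    """Greedy on-line Q-learning on a deterministic chain (illustrative setup)."""
    goal = n_states - 1
    actions = (-1, +1)  # move left, move right
    # Assumed initialization: every Q-value starts at zero.
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}

    def step(s, a):
        # Deterministic transition; walls at both ends of the chain.
        return min(max(s + a, 0), goal)

    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if s == goal:
                break
            # Act greedily on the current estimates (ties go to the first action).
            a = max(actions, key=lambda act: q[(s, act)])
            s2 = step(s, a)
            future = 0.0 if s2 == goal else max(q[(s2, b)] for b in actions)
            # Assumed reward: -1 for every action taken.
            q[(s, a)] = -1.0 + future
            s = s2
    return q
```

With these assumptions the estimates only ever decrease and never drop below the true negated distance to the goal, so the agent cannot circle indefinitely among non-goal states.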
TL;DR: The UR-ID3 algorithm described combines uncertain reasoning with the rule set produced by ID3 to create a machine learning algorithm which is robust in the presence of uncertain training and testing data.
Abstract: Quinlan's ID3 is a symbolic machine learning algorithm which uses training examples as input and constructs a decision tree as output. One problem with the standard decision tree approach to machine learning is that uncertain data, in training, testing, or both, often produces poor classification accuracy. The UR-ID3 algorithm described here combines uncertain reasoning with the rule set produced by ID3 to create a machine learning algorithm which is robust in the presence of uncertain training and testing data. Experimental results are presented which compare the new algorithm's performance with that of ID3 and backpropagation neural networks.
TL;DR: An improved implementation of the Real-Time Recurrent Learning algorithm is described, which makes it possible to increase the performance of the learning algorithm during the training phase by using some a priori knowledge about the temporal necessities of the problem.
TL;DR: It is demonstrated how weighted majority voting with multiplicative weight updating can be applied to obtain robust algorithms for learning binary relations and an algorithm that obtains a nearly optimal mistake bound is presented.
Abstract: In this paper we demonstrate how weighted majority voting with multiplicative weight updating can be applied to obtain robust algorithms for learning binary relations. We first present an algorithm that obtains a nearly optimal mistake bound but at the expense of using exponential computation to make each prediction. However, the time complexity of our algorithm is significantly reduced from that of previously known algorithms that have comparable mistake bounds. The second algorithm we present is a polynomial time algorithm with a non-optimal mistake bound. Again the mistake bound of our second algorithm is significantly better than previous bounds proven for polynomial time algorithms.
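For readers unfamiliar with the underlying technique, here is a minimal sketch of generic weighted majority voting with multiplicative weight updates over a fixed pool of binary predictors. It is not either of the paper's binary-relation algorithms; the function name, the `beta` penalty factor, and the tie-breaking rule are assumptions made for illustration.

```python
def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """Run weighted majority voting over a sequence of rounds.

    expert_predictions: one tuple of 0/1 predictions per round (one per expert).
    outcomes: the true 0/1 label for each round.
    Returns the final weights and the number of mistakes the voter made.
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    mistakes = 0
    for preds, truth in zip(expert_predictions, outcomes):
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_one >= vote_zero else 0  # assumed tie-break: predict 1
        if guess != truth:
            mistakes += 1
        # Multiplicative update: shrink the weight of every expert that erred.
        weights = [w * beta if p != truth else w for w, p in zip(weights, preds)]
    return weights, mistakes
```

An expert that is always right keeps weight 1 while the others decay geometrically, which is what bounds the voter's mistakes in terms of the best expert's.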
TL;DR: The authors present a new learning and synthesis algorithm for training multilayer feedforward neural networks that can classify both linearly separable and linearly nonseparable families, whereas the backpropagation algorithm will fail sometimes.
Abstract: The authors present a new learning and synthesis algorithm for training multilayer feedforward neural networks. Its principle is to synthesize a neural network by growing layers based on training results until the required results are achieved. Each layer is trained with the pocket algorithm and hidden neurons are added only when needed. The proposed algorithm has the following properties. 1) The architecture of the network is generated dynamically by the learning process, and it is unnecessary to estimate the number of layers and the number of hidden neurons before training. The neuron activation function is hard limiting instead of sigmoidal. 2) The learning speed is faster than that of other algorithms, especially the backpropagation algorithm. After the neural network is fully trained, the system error is exactly zero. 3) This algorithm can classify both linearly separable and linearly nonseparable families, whereas the backpropagation algorithm will fail sometimes. Extensive numerical simulation studies of this algorithm have confirmed these properties, and thus the proposed training strategy looks promising.
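The per-layer training step can be sketched as follows for a single hard-limiting neuron. This is a simplified pocket variant that keeps the weights with the fewest training errors seen so far; the layer-growing procedure is not shown, and the function names, the cyclic presentation order, and the error-count criterion are assumptions rather than details taken from the paper.

```python
def predict(w, x):
    """Hard-limiting neuron: w[0] is the bias, w[1:] are the input weights."""
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if s > 0 else 0

def count_errors(w, X, y):
    return sum(predict(w, x) != t for x, t in zip(X, y))

def pocket_train(X, y, epochs=100):
    """Perceptron updates, with the best weights so far kept 'in the pocket'."""
    w = [0.0] * (len(X[0]) + 1)
    pocket, best = list(w), count_errors(w, X, y)
    for _ in range(epochs):
        for x, t in zip(X, y):
            err = t - predict(w, x)
            if err != 0:
                # Standard perceptron correction on a misclassified example.
                w[0] += err
                for i, xi in enumerate(x):
                    w[i + 1] += err * xi
                # Assumed criterion: replace the pocket only on fewer errors.
                e = count_errors(w, X, y)
                if e < best:
                    pocket, best = list(w), e
        if best == 0:
            break
    return pocket, best
```

On a linearly separable set this reduces to the ordinary perceptron and reaches zero errors; on a nonseparable set it returns the best single-neuron weights it encountered, which is where added hidden neurons would take over.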
TL;DR: This dissertation addresses the problem of designing algorithms for learning in embedded systems, presenting several reinforcement-learning algorithms, including an interval-estimation algorithm that uses the statistical notion of confidence intervals to guide its generation of actions and a k-DNF learner that uses Sutton's techniques for linear association and reinforcement comparison.
Abstract: This dissertation addresses the problem of designing algorithms for learning in embedded systems. This problem differs from the traditional supervised learning problem. An agent, finding itself in a particular input situation, must generate an action. It then receives a reinforcement value from the environment, indicating how valuable the current state of the environment is for the agent. The agent cannot, however, deduce the reinforcement value that would have resulted from executing any of its other actions. A number of algorithms for learning action strategies from reinforcement values are presented and compared empirically with existing reinforcement-learning algorithms.
The interval-estimation algorithm uses the statistical notion of confidence intervals to guide its generation of actions in the world, trading off acting to gain information against acting to gain reinforcement. It performs well in simple domains but does not exhibit any generalization and is computationally complex.
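A minimal sketch of the idea, assuming binary (0/1) reinforcement and one success counter per action: pick the action whose confidence interval has the highest upper bound. The Wilson score bound, the `z` value, and the function names are assumptions chosen for illustration; the dissertation's exact statistics may differ.

```python
import math

def upper_bound(successes, trials, z=1.96):
    """Upper end of a Wilson score interval on an action's success rate."""
    if trials == 0:
        return 1.0  # assumption: untried actions are treated as maximally promising
    p = successes / trials
    z2 = z * z
    centre = p + z2 / (2 * trials)
    spread = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials))
    return (centre + spread) / (1 + z2 / trials)

def choose_action(stats, z=1.96):
    """stats[a] = (successes, trials); return the index with the highest bound."""
    return max(range(len(stats)), key=lambda a: upper_bound(stats[a][0], stats[a][1], z))

def record(stats, action, reward):
    """Update the counters for `action` after observing a 0/1 reward."""
    s, n = stats[action]
    stats[action] = (s + reward, n + 1)
```

A rarely tried action has a wide interval and hence a high upper bound, so it is selected for the information it yields; as trials accumulate the interval narrows and selection is driven by the observed reinforcement.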
The cascade algorithm is a structural credit-assignment method that allows an action strategy with many output bits to be learned by a collection of reinforcement-learning modules that learn Boolean functions. This method represents an improvement in computational complexity and often in learning rate.
Two algorithms for learning Boolean functions in k-DNF are described. Both are based on Valiant's algorithm for learning such functions from input-output instances. The first uses Sutton's techniques for linear association and reinforcement comparison, while the second uses techniques from the interval estimation algorithm. They both perform well and have tractable complexity.
A generate-and-test reinforcement-learning algorithm is presented. It allows symbolic representations of Boolean functions to be constructed incrementally and tested in the environment. It is highly parametrized and can be tuned to learn a broad range of function classes. Low-complexity functions can be learned very efficiently even in the presence of large numbers of irrelevant input bits. This algorithm is extended to construct simple sequential networks using a set-reset operator, which allows the agent to learn action strategies with state.
These algorithms, in addition to being studied in simulation, were implemented and tested on a physical mobile robot.