TL;DR: The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, and applications of locally weighted learning.
Abstract: This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
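The core idea of locally weighted linear regression described above can be illustrated with a minimal sketch. It assumes one-dimensional inputs, a Gaussian weighting function, and a fixed bandwidth as the smoothing parameter; the function name `lwr_predict` and these choices are illustrative assumptions, not taken from the survey.

```python
import math

def lwr_predict(xs, ys, query, bandwidth):
    """Predict y at `query` with a locally weighted straight-line fit (1-D inputs).

    Assumptions (not from the survey): Gaussian kernel, fixed bandwidth.
    """
    # Distance of every stored point from the query; the fit is centred on the query.
    dxs = [x - query for x in xs]
    # Gaussian weighting function: nearby points count more than distant ones.
    ws = [math.exp(-0.5 * (dx / bandwidth) ** 2) for dx in dxs]

    # Weighted sums that make up the 2x2 normal equations for y = a + b * dx.
    s_w = sum(ws)
    s_x = sum(w * dx for w, dx in zip(ws, dxs))
    s_xx = sum(w * dx * dx for w, dx in zip(ws, dxs))
    s_y = sum(w * y for w, y in zip(ws, ys))
    s_xy = sum(w * dx * y for w, dx, y in zip(ws, dxs, ys))

    if s_w == 0.0:
        raise ValueError("all weights underflowed; query is too far from the data")

    det = s_w * s_xx - s_x * s_x
    if det <= 1e-12:
        # Degenerate local design (e.g. one effective point): fall back to the
        # weighted mean, i.e. a locally constant model.
        return s_y / s_w

    # The intercept `a` is the prediction, because dx = 0 at the query itself.
    return (s_xx * s_y - s_x * s_xy) / det
```

Because the model is refit for every query, nothing is learned ahead of time, which is what makes the method "lazy": for example, `lwr_predict(xs, ys, 4.5, 1.0)` fits a line using mostly the points within about one bandwidth of 4.5.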
TL;DR: A new machine learning algorithm for the diagnosis of cardiac arrhythmia from standard 12 lead ECG recordings is presented, and it is indicated that it outperforms other standard algorithms such as Naive Bayesian and Nearest Neighbor classifiers.
Abstract: A new machine learning algorithm for the diagnosis of cardiac arrhythmia from standard 12 lead ECG recordings is presented. The algorithm is called VFI5, for Voting Feature Intervals. VFI5 is a supervised and inductive learning algorithm for inducing classification knowledge from examples. The input to VFI5 is a training set of records. Each record contains clinical measurements from ECG signals and some other information such as sex, age, and weight, along with the decision of an expert cardiologist. The knowledge representation is based on a recent technique called Feature Intervals, where a concept is represented by the projections of the training cases on each feature separately. Classification in VFI5 is based on a majority voting among the class predictions made by each feature separately. A comparison indicates that VFI5 outperforms other standard algorithms such as Naive Bayesian and Nearest Neighbor classifiers.
TL;DR: Among many findings, this study concludes that the preflow-push algorithms are substantially faster than other classes of algorithms, and the highest-label preflow-push algorithm is the fastest maximum flow algorithm, for which the growth rate in the computational time is O(n^1.5) on four out of five of the authors' problem classes.
TL;DR: This dissertation presents a collection of papers that seek to overcome each of these disadvantages of the nearest neighbor algorithm by creating a comprehensive system called the Integrated Decremental Instance-Based Learning algorithm, which in experiments on 44 applications achieves higher generalization accuracy than other instance-based learning algorithms.
Abstract: The nearest neighbor algorithm and its derivatives, which are often referred to collectively as instance-based learning algorithms, have been successful on a variety of real-world applications. However, in its basic form, the nearest neighbor algorithm suffers from inadequate distance functions, large storage requirements, slow execution speed, a sensitivity to noise and irrelevant attributes, and an inability to adjust its decision surfaces after storing the data. This dissertation presents a collection of papers that seek to overcome each of these disadvantages. The most successful enhancements are combined into a comprehensive system called the Integrated Decremental Instance-Based Learning algorithm, which in experiments on 44 applications achieves higher generalization accuracy than other instance-based learning algorithms. It also yields higher generalization accuracy than that reported for 16 major machine learning and neural network models.
TL;DR: A new learning algorithm for locally recurrent neural networks, called truncated recursive backpropagation, is proposed; it can be easily implemented on-line with good performance and generalises the algorithm proposed by Waibel et al. (1989) for TDNN.
Abstract: We propose a new learning algorithm for locally recurrent neural networks, called truncated recursive backpropagation, which can be easily implemented on-line with good performance. Moreover, it generalises the algorithm proposed by Waibel et al. (1989) for TDNN, and includes the Back and Tsoi (1991) algorithm as well as BPS and standard on-line backpropagation as particular cases. The proposed algorithm has a memory and computational complexity that can be adjusted by a careful choice of two parameters h and h', and so it is more flexible than a previous algorithm proposed by us. Although for the sake of brevity we present the new algorithm only for IIR-MLP networks, it can also be applied to any locally recurrent neural network. Some computer simulations of dynamical system identification tests, reported in the literature, are also presented to assess the performance of the proposed algorithm applied to the IIR-MLP.
TL;DR: The expectation and maximization algorithm (EM algorithm) is generalized so that the learning proceeds according to adjustable weights in terms of probability measures, and it is found that this learning structure can work systolically.
Abstract: The expectation and maximization algorithm (EM algorithm) is generalized so that the learning proceeds according to adjustable weights in terms of probability measures. The method presented, the weighted EM algorithm (or the α-EM algorithm), includes the existing EM algorithm as a special case. It is further found that this learning structure can work systolically. It is also possible to add monitors to interact with lower systolic subsystems. This is made possible by attaching building blocks of the weighted (or plain) EM learning. Derivation of the whole algorithm is based on generalized divergences. In addition to the discussions on the learning, extensions of basic statistical properties such as Fisher's efficient score, his measure of information and Cramer-Rao's inequality, are given. These appear in update equations of the generalized expectation learning. Experiments show that the presented generalized version contains cases that outperform traditional learning methods.
TL;DR: The proposed pruning algorithm consists of two already known algorithms, the structural learning algorithm with forgetting and the optimal brain damage algorithm using the second derivatives of the assessment.
Abstract: We propose a new structural learning algorithm for organizing the structure of the multi-layered neural networks. The proposed pruning algorithm consists of two already known algorithms, the structural learning algorithm with forgetting and the optimal brain damage algorithm using the second derivatives of the assessment. After the network is slimmed by the structural learning algorithm with forgetting, unimportant weights are pruned from the network using the second derivatives. The simulations are performed for the Boolean function and the acoustic diagnosis of compressors. The results show that the proposed algorithm is effective for eliminating the unimportant weights.
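The two-stage pruning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the forgetting step is written as gradient descent plus a constant decay toward zero, the saliency is the standard optimal brain damage estimate (half the diagonal second derivative times the squared weight), and the function names, the learning rate `lr`, the decay `eps`, and the externally supplied Hessian diagonal are all hypothetical.

```python
def forgetting_step(weights, grads, lr, eps):
    """One weight update with forgetting: gradient step plus constant decay.

    Assumption: the decay subtracts eps * sign(w), pulling weights toward zero.
    """
    def sign(v):
        return (v > 0) - (v < 0)
    return [w - lr * g - eps * sign(w) for w, g in zip(weights, grads)]


def obd_saliencies(weights, hess_diag):
    """Optimal brain damage saliency per weight: 0.5 * h_kk * w_k ** 2."""
    return [0.5 * h * w * w for w, h in zip(weights, hess_diag)]


def obd_prune(weights, hess_diag, n_prune):
    """Zero out the n_prune weights with the smallest saliency."""
    sal = obd_saliencies(weights, hess_diag)
    # Indices sorted from least to most important weight.
    order = sorted(range(len(weights)), key=lambda i: sal[i])
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]
```

In this sketch a network would first be trained with repeated calls to `forgetting_step`, and `obd_prune` would then remove the weights whose removal is estimated to change the error least. Note that a large weight can still be pruned if its second derivative is small enough.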
TL;DR: Two improvements to the Kernighan-Lin algorithm are identified which, without clustering or an improved heuristic function, bring the performance of the algorithm near that of more sophisticated algorithms.
Abstract: The Kernighan-Lin algorithm and a variant, the Fiduccia-Mattheyses (FM) algorithm, remain in use at the core of many partitioning systems. To understand the FM algorithm we applied principles of data engineering where visualization and statistical analysis are used to analyze the run-time behavior. We identified two improvements to the algorithm which, without clustering or an improved heuristic function, bring the performance of the algorithm near that of more sophisticated algorithms. One improvement is based on the observation, explored empirically, that the full passes in the FM algorithm appear comparable to a stochastic local restart in the search. We motivate this observation with a discussion of recent improvements in Markov chain Monte Carlo methods in statistics. The other improvement is based on the observation that when an FM-like algorithm is run 20 times and the best run chosen, the performance trace of the algorithm on earlier runs is useful data for learning when to abort a later run. These improvements, implemented with a simple adaptive scheme, are orthogonal to techniques used in state-of-the-art implementations, and should therefore be applicable to other VLSI optimization algorithms.
TL;DR: It is observed that there exists a universal learning algorithm that PAC-learns every concept class within complexity that is linearly related to the complexity of the best learning algorithm for this class.
TL;DR: This work demonstrates worst-case upper bounds on the absolute loss for the perceptron algorithm and an exponentiated update algorithm related to the Weighted Majority algorithm.
Abstract: The absolute loss is the absolute difference between the desired and predicted outcome. I demonstrate worst-case upper bounds on the absolute loss for the perceptron algorithm and an exponentiated update algorithm related to the Weighted Majority algorithm. The bounds characterize the behavior of the algorithms over any sequence of trials, where each trial consists of an example and a desired outcome interval (any value in the interval is an acceptable outcome). The worst-case absolute loss of both algorithms is bounded by: the absolute loss of the best linear function in the comparison class, plus a constant dependent on the initial weight vector, plus a per-trial loss. The per-trial loss can be eliminated if the learning algorithm is allowed a tolerance from the desired outcome. For concept learning, the worst-case bounds lead to mistake bounds that are comparable to previous results.
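The trial structure described above (an example plus a desired outcome interval) can be made concrete with a small sketch of the two update styles. It is an illustration under assumptions rather than the paper's exact algorithms: the learning rate `eta`, the normalisation of the exponentiated weights to sum to one, and all function names are hypothetical.

```python
import math

def dot(w, x):
    """Linear prediction w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def interval_sign(pred, lo, hi):
    """Direction that reduces the absolute loss for the interval [lo, hi].

    Returns +1 if the prediction is too low, -1 if too high, and 0 if the
    prediction already lies inside the acceptable interval (zero loss).
    """
    if pred < lo:
        return 1.0
    if pred > hi:
        return -1.0
    return 0.0


def additive_update(w, x, lo, hi, eta):
    """Perceptron-style update: move the weights additively along x."""
    s = interval_sign(dot(w, x), lo, hi)
    return [wi + eta * s * xi for wi, xi in zip(w, x)]


def exponentiated_update(w, x, lo, hi, eta):
    """Multiplicative update: scale each weight by exp(eta * s * x_i).

    Assumption: weights are positive and renormalised to sum to one.
    """
    s = interval_sign(dot(w, x), lo, hi)
    raw = [wi * math.exp(eta * s * xi) for wi, xi in zip(w, x)]
    total = sum(raw)
    return [r / total for r in raw]
```

Neither update changes the weights when the prediction falls inside the interval, which mirrors the abstract's point that any value in the interval is an acceptable outcome.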
TL;DR: A new kth-order quadratic learning algorithm is developed, and when the order k is appropriately chosen, the algorithm can improve the learning efficiency.
Abstract: The authors develop a new kth-order quadratic learning algorithm, and compare it with the conventional least squares algorithm. It appears that when the order k is appropriately chosen, the algorithm can improve the learning efficiency.
TL;DR: A new learning theory (a set of principles for brain-like learning) and a corresponding algorithm for the neural-network field are presented and computational results are provided for the well known Mackey-Glass chaotic time series problem, the logistic map prediction problem, various neuro-control problems, and several time series forecasting problems.
Abstract: This paper presents a new learning theory (a set of principles for brain-like learning) and a corresponding algorithm for the neural-network field. The learning theory defines computational characteristics that are much more brain-like than those of classical connectionist learning. Robust and reliable learning algorithms would result if these learning principles were followed rigorously when developing neural-network algorithms. This paper also presents a new algorithm for generating radial basis function (RBF) nets for function approximation. The design of the algorithm is based on the proposed set of learning principles. The net generated by this algorithm is not a typical RBF net, but a combination of "truncated" RBF and other types of hidden units. The algorithm uses random clustering and linear programming (LP) to design and train this "mixed" RBF net. Polynomial time complexity of the algorithm is proven and computational results are provided for the well known Mackey-Glass chaotic time series problem, the logistic map prediction problem, various neuro-control problems, and several time series forecasting problems. The algorithm can also be implemented as an online adaptive algorithm.
TL;DR: The generalization performance of two learning algorithms, Bayes algorithm and the "optimal learning" algorithm, on two classification tasks is studied theoretically, and both algorithms perform better than the conventional stochastic Gibbs algorithm.
Abstract: The generalization performance of two learning algorithms, Bayes algorithm and the "optimal learning" algorithm, on two classification tasks is studied theoretically. In the first example the task is defined by a restricted two-layer network, a committee machine, and in the second the task is defined by the so-called prototype problem. The architecture of the learning machine, in both cases, is defined to be a committee machine. For both tasks the optimal learning algorithm, which is optimal when the solution is restricted to a specific architecture, performs worse than the overall optimal Bayes algorithm. However, both algorithms perform better than the conventional stochastic Gibbs algorithm, especially for the prototype problem in which the task and the learning machine are very different.
TL;DR: Presents an inductive machine learning algorithm called CLIP3 (Cover learning using integer programming), an extension of the CLILP2 algorithm that combines the best concepts of tree-based and rule-based algorithms to produce a highly reliable machine-learning algorithm.
Abstract: Presents an inductive machine learning algorithm called CLIP3 (Cover learning using integer programming). CLIP3 is an extension of the CLILP2 algorithm. CLIP3 generates multiple rules for a given concept from two sets of discrete attribute data. It combines the best concepts of tree-based and rule-based algorithms to produce a highly reliable machine-learning algorithm. The algorithm is run on the benchmark "MONK's data sets". The results are compared with those of standard machine learning algorithms such as the ID and AQ families of algorithms. The algorithm is also run on the breast cancer data set and the results are compared with C4.5 algorithm results.