TL;DR: The sum-product algorithm is a generic message-passing algorithm that operates in a factor graph and computes, either exactly or approximately, various marginal functions derived from the global function.
Abstract: Algorithms that must deal with complicated global functions of many variables often exploit the manner in which the given functions factor as a product of "local" functions, each of which depends on a subset of the variables. Such a factorization can be visualized with a bipartite graph that we call a factor graph. In this tutorial paper, we present a generic message-passing algorithm, the sum-product algorithm, that operates in a factor graph. Following a single, simple computational rule, the sum-product algorithm computes, either exactly or approximately, various marginal functions derived from the global function. A wide variety of algorithms developed in artificial intelligence, signal processing, and digital communications can be derived as specific instances of the sum-product algorithm, including the forward/backward algorithm, the Viterbi algorithm, the iterative "turbo" decoding algorithm, Pearl's (1988) belief propagation algorithm for Bayesian networks, the Kalman filter, and certain fast Fourier transform (FFT) algorithms.
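Example (illustrative sketch, not taken from the paper): the message-passing rule can be tried on a tiny chain-shaped factor graph. The global function g(x1, x2, x3) = fA(x1) fB(x1, x2) fC(x2, x3), the factor names, and all table values below are made-up assumptions. Because the graph has no cycles, the sum-product marginal for x2 should agree with the brute-force sum over all configurations.

```python
from itertools import product

# Hypothetical toy model with three binary variables:
#   g(x1, x2, x3) = fA(x1) * fB(x1, x2) * fC(x2, x3)
# The numbers in these tables are invented for illustration.
fA = [0.6, 0.4]                  # fA[x1]
fB = [[0.9, 0.1], [0.2, 0.8]]    # fB[x1][x2]
fC = [[0.7, 0.3], [0.5, 0.5]]    # fC[x2][x3]


def marginal_x2_sum_product():
    """Marginal of g at x2, computed by passing messages toward x2."""
    # A leaf factor sends its own table as the message to its variable.
    msg_fA_to_x1 = fA
    # A variable forwards the product of its other incoming messages.
    msg_x1_to_fB = msg_fA_to_x1
    # A factor multiplies in the incoming message and sums out x1.
    msg_fB_to_x2 = [
        sum(fB[x1][x2] * msg_x1_to_fB[x1] for x1 in range(2))
        for x2 in range(2)
    ]
    # The leaf variable x3 sends the unit message; fC sums out x3.
    msg_fC_to_x2 = [
        sum(fC[x2][x3] * 1.0 for x3 in range(2))
        for x2 in range(2)
    ]
    # The marginal at x2 is the product of all messages arriving at x2.
    return [msg_fB_to_x2[x2] * msg_fC_to_x2[x2] for x2 in range(2)]


def marginal_x2_brute_force():
    """Same marginal, obtained by summing g over every configuration."""
    out = [0.0, 0.0]
    for x1, x2, x3 in product(range(2), repeat=3):
        out[x2] += fA[x1] * fB[x1][x2] * fC[x2][x3]
    return out


print(marginal_x2_sum_product())
print(marginal_x2_brute_force())
```

On a chain this small the saving is negligible, but the message-passing version touches each factor once, whereas the brute-force sum grows exponentially with the number of variables.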
TL;DR: In this paper, a genetic algorithm is used to abstract a data stream associated with each object and a pattern recognition algorithm is applied to classify the objects and measure the fitness of the chromosomes of the genetic algorithm.
Abstract: The invention concerns heuristic algorithms for the classification of Objects. A first learning algorithm comprises a genetic algorithm that is used to abstract a data stream associated with each Object and a pattern recognition algorithm that is used to classify the Objects and measure the fitness of the chromosomes of the genetic algorithm. The learning algorithm is applied to a training data set. The learning algorithm generates a classifying algorithm, which is used to classify or categorize unknown Objects. The invention is useful in the areas of classifying texts and medical samples, predicting the behavior of one financial market based on price changes in others and in monitoring the state of complex process facilities to detect impending failures.
TL;DR: A new learning algorithm is proposed for multilayer feedforward neural networks, which converges faster and achieves a better classification accuracy than the conventional backpropagation learning algorithm for pattern classification.
Abstract: The authors propose a new learning algorithm for multilayer feedforward neural networks, which converges faster and achieves a better classification accuracy than the conventional backpropagation learning algorithm for pattern classification. In the conventional backpropagation learning algorithm, weights are adjusted to reduce the error or cost function that reflects the differences between the computed and the desired outputs. In the proposed learning algorithm, the authors view each term of the output layer as a function of weights and adjust the weights directly so that the output neurons produce the desired outputs. Experiments with remotely sensed data show the proposed algorithm consistently performs better than the conventional backpropagation learning algorithm in terms of classification accuracy and convergence speed.
TL;DR: A new improved incremental SVM learning algorithm based on a sifting factor is proposed; it accumulates distribution knowledge of the training samples as incremental training proceeds, which makes it possible to discard samples optimally.
Abstract: The classification algorithm based on SVM (support vector machine) has attracted growing attention from researchers because of its strong theoretical properties and good empirical results. In this paper, the properties of the support vector (SV) set are analyzed thoroughly, and a new learning method is introduced to extend the SVM classification algorithm to the incremental learning setting. After that, a new improved incremental SVM learning algorithm is proposed, which is based on a sifting factor. This algorithm accumulates distribution knowledge of the training samples as incremental training proceeds, and thus makes it possible to discard samples optimally. The theoretical analysis and experimental results show that this algorithm not only improves the training speed but also reduces storage cost.
TL;DR: Haussler, Littlestone and Warmuth (1994) described a general-purpose algorithm for learning according to the prediction model, and proved an upper bound on the probability that their algorithm makes a mistake in terms of the number of examples seen and the Vapnik-Chervonenkis dimension of the concept class being learned.
Abstract: Haussler, Littlestone and Warmuth (1994) described a general-purpose algorithm for learning according to the prediction model, and proved an upper bound on the probability that their algorithm makes a mistake in terms of the number of examples seen and the Vapnik-Chervonenkis (VC) dimension of the concept class being learned. We show that their bound is within a factor of 1+o(1) of the best possible such bound for any algorithm.
TL;DR: A theoretical analysis for the mistake bound of the DNA-based majority algorithm via amplification is shown, and it is implied that the amplification to "double the volumes" of the correct DNA strands in the test tube works well.
Abstract: We consider a probabilistic interpretation of the test tube which contains a large number of DNA strands, and propose a population computation using a number of DNA strands in the test tube and a probabilistic logical inference based on the probabilistic interpretation. Second, in order for the DNA-based learning algorithm [4] to be robust to errors in the data, we implement the weighted majority algorithm [3] on DNA computers, called the DNA-based majority algorithm via amplification (DNAMA), which takes a strategy of "amplifying" the consistent (correct) DNA strands, whereas the usual weighted majority algorithm decreases the weights of inconsistent ones. We give a theoretical analysis of the mistake bound of the DNA-based majority algorithm via amplification, and the analysis indicates that the amplification to "double the volumes" of the correct DNA strands in the test tube works well.
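Example (a software analogy only; the DNA protocol itself is not modeled): the contrast between amplifying correct predictors and penalizing incorrect ones can be sketched with numeric weights standing in for strand volumes. The function name, the `mode` argument, the toy trial data, and the tie-break toward label 1 are all assumptions made for illustration. Doubling the correct experts and halving the incorrect ones differ only by a common scale factor per trial, so both variants should make the same predictions.

```python
def weighted_majority(expert_predictions, labels, mode="amplify"):
    """Run weighted-majority prediction over a sequence of binary trials.

    expert_predictions[t][i] is expert i's 0/1 guess at trial t.
    mode="amplify" doubles the weight of every correct expert
    (loosely mirroring "double the volumes" of correct strands);
    mode="penalize" halves the weight of every incorrect expert.
    Returns (list of predictions, number of mistakes).
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    predictions, mistakes = [], 0
    for guesses, y in zip(expert_predictions, labels):
        vote_1 = sum(w for w, g in zip(weights, guesses) if g == 1)
        vote_0 = sum(w for w, g in zip(weights, guesses) if g == 0)
        y_hat = 1 if vote_1 >= vote_0 else 0  # ties go to 1 (assumption)
        predictions.append(y_hat)
        mistakes += int(y_hat != y)
        # Update the weights once the true label is revealed.
        for i, g in enumerate(guesses):
            if mode == "amplify" and g == y:
                weights[i] *= 2.0
            elif mode == "penalize" and g != y:
                weights[i] *= 0.5
    return predictions, mistakes


# Made-up data: three experts, five trials; expert 0 is always right.
trials = [[1, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0], [0, 1, 1]]
labels = [1, 0, 1, 1, 0]
print(weighted_majority(trials, labels, mode="amplify"))
print(weighted_majority(trials, labels, mode="penalize"))
```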
TL;DR: An incremental learning algorithm, Learn++, which allows supervised classification algorithms to learn from new data without forgetting previously acquired knowledge, is introduced.
Abstract: An incremental learning algorithm, Learn++, which allows supervised classification algorithms to learn from new data without forgetting previously acquired knowledge, is introduced. Learn++ is based on generating multiple classifiers using strategically chosen distributions of the training data and combining these classifiers through weighted majority voting. Learn++ shares various notions with psycho-physiological models of learning. The Learn++ algorithm, simulation results, and how the algorithm is related to various concepts in psycho-physiological learning models are discussed. The algorithm was tested on a variety of real world and synthetic datasets. Two sets of results are presented for optical handwritten digit recognition and gas sensing.
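Example (a minimal sketch of the voting step only): combining several classifiers through weighted majority voting can be illustrated as below. The abstract does not state how the voting weights are set, so the AdaBoost-style weight log((1 - e) / e), the function name, and the sample labels and error rates are assumptions, not the Learn++ specification.

```python
import math
from collections import defaultdict


def weighted_majority_vote(predictions, errors):
    """Pick the class label with the largest total voting weight.

    predictions: one class label per classifier.
    errors: each classifier's error rate, assumed to lie in (0, 0.5).
    The weight log((1 - e) / e) is an assumed AdaBoost-style choice:
    more accurate classifiers get a larger say in the vote.
    """
    totals = defaultdict(float)
    for label, e in zip(predictions, errors):
        totals[label] += math.log((1.0 - e) / e)
    return max(totals, key=totals.get)


# Hypothetical digit-recognition votes from three classifiers.
# One accurate classifier can outvote two weak ones.
print(weighted_majority_vote(["3", "8", "8"], [0.05, 0.40, 0.40]))
# With equal error rates the rule reduces to a plain majority vote.
print(weighted_majority_vote(["3", "8", "8"], [0.20, 0.20, 0.20]))
```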
TL;DR: The algorithm enjoys better rate-distortion performance than that of other existing fuzzy clustering and competitive learning algorithms and can be an effective alternative to the existing variable-rate VQ algorithms for signal compression.
TL;DR: An efficient hybrid optimization algorithm named adaptive random signal-based learning, which is similar to the reinforcement learning of neural networks, is proposed, and its validity is confirmed by applying it to two different examples.
Abstract: Genetic algorithms are becoming more popular because of their relative simplicity and robustness. Genetic algorithms are global search techniques for nonlinear optimization. However, traditional genetic algorithms, though robust, are generally not the most successful optimization algorithm on any particular domain because they are poor at hill-climbing, whereas simulated annealing has the ability of probabilistic hill-climbing. Therefore, hybridizing a genetic algorithm with other algorithms can produce better performance than using the genetic algorithm or other algorithms independently. In this paper, we propose an efficient hybrid optimization algorithm named adaptive random signal-based learning. Random signal-based learning is similar to the reinforcement learning of neural networks. This paper describes the application of genetic algorithms and simulated annealing to random signal-based learning in order to generate its parameters and reinforcement signal, respectively. The validity of the proposed algorithm is confirmed by applying it to two different examples.
TL;DR: It is usually true that programmers and/or users come across a plethora of different algorithms when looking to solve a particular problem efficiently, but it is unlikely that a single one of them is the best (fastest) in all possible cases.
Abstract: Computer scientists always strive to find better and faster algorithms for any computational problem. It is usually true that programmers and/or users come across a plethora of different algorithms when looking to solve a particular problem efficiently. Each one of these algorithms might offer different guarantees and properties, but it is unlikely that a single one of them is the best (fastest) in all possible cases. So, the question that the programmer/user typically faces is: “Which algorithm should I select?” This question is largely due to the uncertainty in the input space, the inner workings of the algorithm (especially true for randomized algorithms), and the hardware characteristics. It’s hard to know in advance what kind of inputs will be provided, how exactly the computation will proceed, or even how efficiently the underlying hardware will support the needs of the different algorithms. Sometimes, a careful study can reveal that committing to a particular algorithm is better than committing to any of the other algorithms, but is this the best we can do? What if uncertainty is explicitly taken into account and the right decision is made dynamically on an instance-by-instance basis? To make the discussion more concrete, consider, for example, the problem of sorting. Why would you ever choose MergeSort or InsertionSort when you know that QuickSort is in general the fastest algorithm for sorting? That might be true to some extent, but it seems that if you allow for collaboration of these three “competitors”, the outcome can be beneficial. Think of this algorithm selection problem (Rice 1976) as a decision problem: “Which algorithm should I run whenever a new instance is presented?” The fact that two of these algorithms are recursive makes the problem even more interesting. Every time a recursive call is made, you can ask the same question: “Which algorithm should I choose for the current subproblem?”
Nothing really dictates that you have to use the same algorithm again and again throughout the recursion. How, then, can you optimize this sequence (or better, tree) of decisions? On what ground is each decision based?
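Example (illustrative sketch under stated assumptions): the idea of choosing an algorithm afresh at every recursive call can be written as a sort that consults a policy for each subproblem. The size thresholds in `choose` are a hand-written, hypothetical stand-in for whatever decision rule is actually learned or optimized; the abstract does not specify one. The function names are likewise invented here.

```python
def insertion_sort(a):
    """Return a sorted copy of a using insertion sort."""
    a = list(a)
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a


def choose(a):
    """Hypothetical policy: pick an algorithm from the subproblem size."""
    if len(a) <= 8:
        return "insertion"
    if len(a) <= 64:
        return "merge"
    return "quick"


def hybrid_sort(a, policy=choose):
    """Sort a, asking the policy which algorithm to run at every call."""
    if len(a) <= 1:
        return list(a)
    algo = policy(a)
    if algo == "insertion":
        return insertion_sort(a)
    if algo == "merge":
        mid = len(a) // 2
        left = hybrid_sort(a[:mid], policy)
        right = hybrid_sort(a[mid:], policy)
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        return merged + left[i:] + right[j:]
    # Otherwise run one quicksort partition step, then recurse; each
    # recursive call is free to switch to a different algorithm.
    pivot = a[len(a) // 2]
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return hybrid_sort(less, policy) + equal + hybrid_sort(greater, policy)


# Deterministic made-up input, so the example needs no random seed.
data = [(i * 7919) % 1009 for i in range(500)]
print(hybrid_sort(data) == sorted(data))
```

Passing a different `policy` callable changes the tree of decisions without touching the sorting code, which is the degree of freedom the abstract asks how to optimize.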
TL;DR: This work presents a supervised learning approach to DNA shotgun sequencing that begins with a parameterized form of a sequencing algorithm and learns the optimal parameter values, numerical and combinatorial, for the given domain of interest.
Abstract: We present a supervised learning approach to DNA shotgun sequencing. The oracle (supervisor) is a set of already-sequenced DNA strands; the output of the learning process is a domain-specific algorithm for sequence assembly. Our goal is to learn a fast algorithm for a given problem domain. Our approach is to begin with a parameterized form of a sequencing algorithm and to then learn the optimal parameter values, numerical and combinatorial, for the given domain of interest. We present experimental results using DNA strings from H. pylori and humans, and compare the results from real DNA with results obtained from random strings. Our experiments demonstrate that considerable gains in speed can be achieved without significant loss in accuracy. Further, the resulting algorithms learned on different input domains are distinct, illustrating the value of developing domain specific algorithms. We also present experimental results on the applicability of the algorithms learned on one domain to the other domains. These results have implications for the sequencing of new DNA-strings given already sequenced DNA-strings.
TL;DR: This work proposes several methods to execute quantitative inferences using large quantities of DNA strands in a test tube and extends the previous algorithms to make them robust to noise and errors in the data.
Abstract: We give an overview of our series of studies on DNA-based supervised learning of Boolean formulae and its application to gene expression analyses. In our previous work, we have presented methods for encoding and evaluating Boolean formulae on DNA strands and for supervised learning of Boolean formulae on DNA computers, which is known to be an NP-hard problem in computational learning theory. We have also applied those methods to executing logical operations on gene expression profiles in a test tube. These proposed methods are discrete (qualitative) algorithms; they do not deal with quantitative analysis and are not robust to noise and errors. Recently, we have proposed several methods to execute quantitative inferences using large quantities of DNA strands in a test tube and to extend the previous algorithms to make them robust to noise and errors in the data. These methods include probabilistic inference and randomized prediction, and weighted majority prediction and learning by amplification in the test tube based on the weighted majority algorithm.