Top 11 papers published in the topic of Weighted Majority Algorithm in 2000

Showing papers on "Weighted Majority Algorithm" published in 2000

Proceedings Article•

A New Approximate Maximal Margin Classification Algorithm

1 Jan 2000

TL;DR: A new incremental learning algorithm is described which approximates the maximal margin hyperplane w.r.t. norm p ≥ 2 for a set of linearly separable data and is as fast as on-line algorithms, such as Rosenblatt's Perceptron algorithm.

Abstract: A new incremental learning algorithm is described which approximates the maximal margin hyperplane w.r.t. norm p ≥ 2 for a set of linearly separable data. Our algorithm, called ALMAp (Approximate Large Margin algorithm w.r.t. norm p), takes O((p−1)X²/(α²γ²)) corrections to separate the data with p-norm margin larger than (1 − α)γ, where γ is the p-norm margin of the data and X is a bound on the p-norm of the instances. ALMAp avoids quadratic (or higher-order) programming methods. It is very easy to implement and is as fast as on-line algorithms, such as Rosenblatt's Perceptron. We report on some experiments comparing ALMAp to two incremental algorithms: Perceptron and Li and Long's ROMMA. Our algorithm seems to perform considerably better than both. The accuracy levels achieved by ALMAp are slightly inferior to those obtained by Support Vector Machines (SVMs). On the other hand, ALMAp is considerably faster and easier to implement than standard SVM training algorithms.
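
For orientation, the on-line baseline named in this abstract, Rosenblatt's Perceptron, can be sketched as below. This is not ALMAp itself, whose update rule is not given in the abstract. The function name `perceptron`, the toy data, and the choice of a hyperplane through the origin (no bias term) are assumptions made for illustration only. The sketch counts "corrections", the quantity the abstract bounds for ALMAp.

```python
def perceptron(samples, labels, max_epochs=100):
    """Mistake-driven Perceptron for labels in {-1, +1}, no bias term."""
    w = [0.0] * len(samples[0])
    corrections = 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misclassified or on the hyperplane
                # Correction step: move w toward the correct side.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                corrections += 1
                mistakes += 1
        if mistakes == 0:  # a full pass without corrections: data separated
            break
    return w, corrections


# Hypothetical linearly separable toy data.
samples = [(2.0, 1.0), (1.0, 3.0), (-1.0, -2.0), (-2.0, -1.0)]
labels = [1, 1, -1, -1]
w, corrections = perceptron(samples, labels)
```

The Perceptron stops as soon as the data are separated, with no guarantee on the margin achieved; the abstract's claim is that ALMAp additionally guarantees a margin larger than (1 − α)γ.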

236 citations

Proceedings Article•10.1109/ICASSP.2000.860134•

LEARN++: an incremental learning algorithm for multilayer perceptron networks

Robi Polikar¹, Lalita Udpa¹, Satish S. Udpa¹, Vasant Honavar•Institutions (1)

Iowa State University¹

5 Jun 2000

TL;DR: A supervised learning algorithm is introduced that gives neural network classification algorithms the capability of learning incrementally from new data without forgetting what has been learned in earlier training sessions.

Abstract: We introduce a supervised learning algorithm that gives neural network classification algorithms the capability of learning incrementally from new data without forgetting what has been learned in earlier training sessions. Schapire's (1990) boosting algorithm, originally intended for improving the accuracy of weak learners, has been modified to be used in an incremental learning setting. The algorithm is based on generating a number of hypotheses using different distributions of the training data and combining these hypotheses using weighted majority voting. This scheme allows a classifier previously trained with a training database to learn from new data when the original data is no longer available, even when new classes are introduced. Initial results on incremental training of multilayer perceptron networks on synthetic as well as real-world data are presented in this paper.
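
The combination step mentioned in this abstract, weighted majority voting over several hypotheses, can be sketched as follows. This is a minimal illustration and not the LEARN++ procedure: the function names, the example error rates, and the AdaBoost-style weight log((1 − e)/e) are assumptions, since the abstract does not state how the voting weights are computed.

```python
import math
from collections import defaultdict


def vote_weight(error):
    """Assumed AdaBoost-style weight: lower error gives a larger vote."""
    return math.log((1.0 - error) / error)


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting hypotheses carry the most weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical training errors of three hypotheses and their votes.
errors = [0.10, 0.30, 0.40]
weights = [vote_weight(e) for e in errors]
predictions = ["A", "B", "B"]
winner = weighted_majority_vote(predictions, weights)
```

In this toy case the single accurate hypothesis outvotes the two weaker ones, which a plain (unweighted) majority vote would not allow.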

50 citations

Book Chapter•10.1007/3-540-44418-1_13•

On-Line Estimation of Hidden Markov Model Parameters

Jun Mizuno¹, Tasuya Watanabe¹, Kazuya Ueki¹, Kazuyuki Amano¹, Eiji Takimoto¹, Akira Maruoka¹•Institutions (1)

Tohoku University¹

4 Dec 2000

TL;DR: The experiments show that the on-line Baum-Welch algorithm performs well compared to the Gradient Descent algorithm and adapts to changes of speakers very well.

Abstract: In modeling various signals such as the speech signal by using the Hidden Markov Model (HMM), it is often required to adapt not only to the inherent nonstationarity of the signal, but also to changes of the sources (speakers) that yield the signal. The well-known Baum-Welch algorithm tries to adjust the HMM so as to optimize the fit between the model and the observed signal. In this paper we develop an algorithm, which we call the on-line Baum-Welch algorithm, by incorporating a learning rate into the off-line Baum-Welch algorithm. The algorithm proceeds in a series of trials. In each trial t the algorithm produces an HMM M_t, then receives a symbol sequence w_t, incurring loss −ln Pr(w_t | M_t), which is the negative log-likelihood of the HMM M_t evaluated at w_t. The performance of the algorithm is measured by its regret: the additional total loss of the algorithm over the total loss of a standard algorithm, which serves as the reference for measuring relative loss. We take the off-line Baum-Welch algorithm as this standard algorithm. To evaluate the performance of the algorithm, we compare it with the Gradient Descent algorithm. Our experiments show that the on-line Baum-Welch algorithm performs well compared to the Gradient Descent algorithm. We carry out the experiments not only on artificial data, but also on reasonably realistic data obtained by transforming acoustic waveforms into symbol sequences through vector quantization. The results show that the on-line Baum-Welch algorithm adapts to changes of speakers very well.
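
The per-trial loss and the regret described in this abstract can be made concrete with a small sketch. It computes the negative log-likelihood of a symbol sequence under a discrete HMM with the standard scaled forward algorithm, and the regret as a difference of total losses. It does not implement the on-line Baum-Welch update, which the abstract does not specify; the function names, the list-of-lists model layout, and the example numbers are assumptions for illustration, and the sequence is assumed non-empty.

```python
import math


def neg_log_likelihood(init, trans, emit, symbols):
    """Return -ln Pr(symbols | HMM) via the scaled forward algorithm.

    init[s]     : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][k]  : probability that state s emits symbol k
    """
    n = len(init)
    alpha = [init[s] * emit[s][symbols[0]] for s in range(n)]
    log_prob = 0.0
    for sym in symbols[1:]:
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][sym]
            for s in range(n)
        ]
    log_prob += math.log(sum(alpha))
    return -log_prob


def regret(online_losses, reference_losses):
    """Total loss of the on-line algorithm minus that of the reference."""
    return sum(online_losses) - sum(reference_losses)


# Hypothetical two-state model over a binary alphabet.
init = [0.5, 0.5]
trans = [[0.5, 0.5], [0.5, 0.5]]
emit = [[0.9, 0.1], [0.1, 0.9]]
loss = neg_log_likelihood(init, trans, emit, [0, 1])
```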

34 citations

A Generalized Let-Polymorphic Type Inference Algorithm

Ropas Memo, Oukseh Lee¹, Kwangkeun Yi¹•Institutions (1)

KAIST¹

1 Jan 2000

TL;DR: A generalized let-polymorphic type inference algorithm is presented; any of its instances is proved sound and complete with respect to the Hindley/Milner let-polymorphic type system, and a condition is found on two instance algorithms so that one algorithm finds type errors earlier than the other.

Abstract: We present a generalized let-polymorphic type inference algorithm, prove that any of its instances is sound and complete with respect to the Hindley/Milner let-polymorphic type system, and find a condition on two instance algorithms so that one algorithm should find type errors earlier than the other. By instantiating the generalized algorithm with different parameters, we can achieve not only the two opposite algorithms (the bottom-up standard Algorithm W and the top-down folklore algorithm M) but also other various hybrid algorithms that avoid their extremities in type-checking (W fails too late, while M fails too early). Such hybrid algorithms’ soundness, completeness, and their relative earliness in detecting type-errors follow automatically. The set of hybrid algorithms that come from the generalized algorithm is a superset of those used in the two most popular ML compilers, SML/NJ and OCaml.

27 citations

Journal Article•10.1002/1099-1425(200009/10)3:5<273::AID-JOS48>3.0.CO;2-0•

Applying extra‐resource analysis to load balancing

Mark Brehob¹, Eric Torng¹, Patchrawat Uthaisombut¹•Institutions (1)

Michigan State University¹

01 Sep 2000-Journal of Scheduling

TL;DR: This work derives a qualitative divergence between off-line and on-line algorithms for the load-balancing problem, the problem of assigning a list of jobs on m identical machines to minimize the makespan, the maximum load on any machine.

Abstract: Previously, extra-resource analysis has been used to argue that certain on-line algorithms are good choices for solving specific problems because these algorithms perform well with respect to the optimal off-line algorithm when given extra resources. We now introduce a new application for extra-resource analysis: deriving a qualitative divergence between off-line and on-line algorithms. We do this for the load-balancing problem, the problem of assigning a list of jobs on m identical machines to minimize the makespan, the maximum load on any machine. We analyze the worst-case performance of on-line and off-line approximation algorithms relative to the performance of the optimal off-line algorithm when the approximation algorithms have k extra machines. Our main results are the following: the Longest-Processing-Time (LPT) algorithm will produce a schedule with makespan no larger than that of the optimal off-line algorithm if LPT has at least (4m−1)/3 machines while the optimal off-line algorithm has m machines. In contrast, no on-line algorithm can guarantee the same with any number of extra machines. Copyright © 2000 John Wiley & Sons, Ltd.
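
A minimal sketch of the Longest-Processing-Time rule discussed in this abstract: sort the jobs by decreasing length and always give the next job to the currently least-loaded machine. The function name `lpt_makespan` and the job list are assumptions chosen for illustration. In this instance the optimal makespan on m = 2 machines is 6 (3+3 on one machine, 2+2+2 on the other).

```python
import heapq


def lpt_makespan(jobs, machines):
    """Schedule jobs by LPT and return the resulting makespan."""
    loads = [0] * machines  # min-heap of current machine loads
    heapq.heapify(loads)
    for job in sorted(jobs, reverse=True):  # longest job first
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + job)
    return max(loads)


jobs = [3, 3, 2, 2, 2]
with_two = lpt_makespan(jobs, 2)    # same number of machines as the optimum
with_three = lpt_makespan(jobs, 3)  # one extra machine
```

With two machines LPT yields a makespan of 7, worse than the optimum of 6. With three machines, which satisfies the abstract's condition of at least (4m−1)/3 machines for m = 2, LPT yields 5, no larger than the optimum on two machines.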

25 citations

Proceedings Article•

A Normative Examination of Ensemble Learning Algorithms

David M. Pennock¹, Pedrito Maynard-Reid, C. Lee Giles, Eric Horvitz•Institutions (1)

NEC¹

29 Jun 2000

TL;DR: A normative evaluation of combination methods, applying and extending existing axiomatizations from social choice theory and statistics, shows that several seemingly innocuous and desirable properties are mutually satisfied only by a dictatorship and gives axiomatic justifications for majority vote and for weighted majority.

Abstract: Ensemble learning algorithms combine the results of several classifiers to yield an aggregate classification. We present a normative evaluation of combination methods, applying and extending existing axiomatizations from social choice theory and statistics. For the case of multiple classes, we show that several seemingly innocuous and desirable properties are mutually satisfied only by a dictatorship. A weaker set of properties admits only the weighted average combination rule. For the case of binary classification, we give axiomatic justifications for majority vote and for weighted majority. We also show that, even when all component algorithms report that an attribute is probabilistically independent of the classification, common ensemble algorithms often destroy this independence information. We exemplify these theoretical results with experiments on stock market data, demonstrating how ensembles of classifiers can exhibit canonical voting paradoxes.

16 citations

Any Two Learning Algorithms Are (Almost) Exactly Identical

David H. Wolpert¹•Institutions (1)

Ames Research Center¹

8 Jan 2000

TL;DR: This paper shows that if one is provided with a loss function, it can be used in a natural way to specify a distance measure quantifying the similarity of any two supervised learning algorithms, even non-parametric algorithms, indicating that any two learning algorithms are almost exactly identical for such scenarios.

Abstract: This paper shows that if one is provided with a loss function, it can be used in a natural way to specify a distance measure quantifying the similarity of any two supervised learning algorithms, even non-parametric algorithms. Intuitively, this measure gives the fraction of targets and training sets for which the expected performance of the two algorithms differs significantly. Bounds on the value of this distance are calculated for the case of binary outputs and 0-1 loss, indicating that any two learning algorithms are almost exactly identical for such scenarios. As an example, for any two algorithms A and B, even for small input spaces and training sets, for less than 2e(-50) of all targets will the difference between A's and B's generalization performance exceed 1%. In particular, this is true if B is bagging applied to A, or boosting applied to A. These bounds can be viewed alternatively as telling us, for example, that the simple English phrase 'I expect that algorithm A will generalize from the training set with an accuracy of at least 75% on the rest of the target' conveys 20,000 bytes of information concerning the target. The paper ends by discussing some of the subtleties of extending the distance measure to give a full (non-parametric) differential geometry of the manifold of learning algorithms.

10 citations

Journal Article•10.1023/A:1019164403899•

A new optimal algorithm for weighted approximation and integration over ℝ

Lei Han¹, Grzegorz W. Wasilkowski¹•Institutions (1)

University of Kentucky¹

01 Jul 2000-Numerical Algorithms

TL;DR: This paper proposes another class of (almost) optimal algorithms that, for a number of instances, are easier to implement and have a cost smaller than the original algorithms from [7].

Abstract: The complexities of weighted approximation and weighted integration problems for univariate functions defined over ℝ have recently been found in [7]. Complexity (almost) optimal algorithms have also been provided therein. In this paper, we propose another class of (almost) optimal algorithms that, for a number of instances, are easier to implement. More importantly, these new algorithms have a cost smaller than the original algorithms from [7]. Since both classes of algorithms are (almost) optimal, their costs differ by a multiplicative constant that depends on the specific weight functions and the error demand. In one of our tests we observed this constant to be as large as four, which means a cost reduction by a factor of four.

4 citations

On the learning behavior of the dr-ls algorithm

Markus Rupp

1 Jan 2000

TL;DR: A data-reuse algorithm is analyzed that converges to the LS solution instead of the Wiener solution, is similar to Kaczmarz’s row projection method, and results in a misadjustment proportional to its step-size.

Abstract: Adaptive algorithms with data-reuse, like the UNDRLMS algorithm, have recently received more attention due to their simple structure and their capability to improve the estimates by repeating the same operation. Since the basic operation is that of an LMS algorithm, it is a common belief that these algorithms converge towards the Wiener solution. In this paper, a data-reuse algorithm is analyzed that converges to the LS solution instead. The algorithm is similar to Kaczmarz’s row projection method but allows un-normalized regression vectors and a wide range of (time-variant) step-sizes. Like the LMS algorithm relative to the Wiener solution, this algorithm results in a misadjustment proportional to its step-size. Various step-size control strategies are investigated to reduce this misadjustment.

Journal Article•10.1023/A:1007621832648•

On-line Learning and the Metrical Task System Problem

Avrim Blum¹, Carl Burch¹•Institutions (1)

Carnegie Mellon University¹

01 Apr 2000-Machine Learning

TL;DR: An experimental comparison of how these algorithms perform on a process migration problem, a problem that combines aspects of both the experts-tracking and MTS formalisms, is presented.

Abstract: The problem of combining expert advice, studied extensively in the Computational Learning Theory literature, and the Metrical Task System (MTS) problem, studied extensively in the area of On-line Algorithms, contain a number of interesting similarities. In this paper we explore the relationship between these problems and show how algorithms designed for each can be used to achieve good bounds and new approaches for solving the other. Specific contributions of this paper include:

• An analysis of how two recent algorithms for the MTS problem can be applied to the problem of tracking the best expert in the “decision-theoretic” setting, providing good bounds and an approach of a much different flavor from the well-known multiplicative-update algorithms.
• An analysis showing how the standard randomized Weighted Majority (or Hedge) algorithm can be used for the problem of “combining on-line algorithms on-line”, giving much stronger guarantees than the results of Azar, Y., Broder, A., & Manasse, M. (1993). Proc ACM-SIAM Symposium on Discrete Algorithms (pp. 432–440) when the algorithms being combined occupy a state space of bounded diameter.
• A generalization of the above, showing how (a simplified version of) Herbster and Warmuth's weight-sharing algorithm can be applied to give a “finely competitive” bound for the uniform-space Metrical Task System problem.

We also give a new, simpler algorithm for tracking experts, which unfortunately does not carry over to the MTS problem. Finally, we present an experimental comparison of how these algorithms perform on a process migration problem, a problem that combines aspects of both the experts-tracking and MTS formalisms.
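
The randomized Weighted Majority (Hedge) algorithm referred to in this abstract can be sketched in its textbook multiplicative-update form. This is not the paper's construction for combining on-line algorithms or for the MTS problem. The function name `hedge`, the learning rate `eta`, the fixed random seed, and the toy loss sequence are assumptions for illustration; losses are assumed to lie in [0, 1].

```python
import math
import random


def hedge(loss_rounds, eta=0.5, rng=None):
    """Run Hedge over rounds of per-expert losses in [0, 1].

    Returns the final weights, the expected cumulative loss of the
    randomized learner, and the expert index sampled in each round.
    """
    rng = rng or random.Random(0)  # fixed seed, an arbitrary choice
    n = len(loss_rounds[0])
    weights = [1.0] * n
    expected_loss = 0.0
    picks = []
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Follow one expert, sampled in proportion to its weight.
        picks.append(rng.choices(range(n), weights=probs)[0])
        expected_loss += sum(p * l for p, l in zip(probs, losses))
        # Multiplicative update: penalize each expert by its loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return weights, expected_loss, picks


# Hypothetical data: expert 0 is always right, expert 1 always wrong.
rounds = [[0.0, 1.0]] * 10
weights, expected_loss, picks = hedge(rounds, eta=0.5)
```

Because the weight of the poor expert decays geometrically, the learner's expected loss stays within a small additive term of the best expert's loss, which is zero in this toy run.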

Journal Article•10.1016/S0925-2312(99)00123-X•

Training neural networks by stochastic optimisation

Antanas Verikas¹, Antanas Verikas², Adas Gelzinis²•Institutions (2)

Halmstad University¹, Kaunas University of Technology²

01 Jan 2000-Neurocomputing

TL;DR: A stochastic learning algorithm for neural networks that does not make any assumptions about transfer functions of individual neurons and does not depend on a functional form of reinforcement learning is presented.
