Scispace (Formerly Typeset)
Showing papers on "Weighted Majority Algorithm" published in 2016
Journal Article•10.1109/TKDE.2016.2545658•
Label Distribution Learning

Xin Geng1•
Southeast University1
01 Jul 2016-IEEE Transactions on Knowledge and Data Engineering
TL;DR: This paper proposes six working LDL algorithms in three ways: problem transformation, algorithm adaptation, and specialized algorithm design, and results show clear advantages of the specialized algorithms, which indicates the importance of special design for the characteristics of the LDL problem.
Abstract: Although multi-label learning can deal with many problems with label ambiguity, it does not fit some real applications well where the overall distribution of the importance of the labels matters. This paper proposes a novel learning paradigm named label distribution learning (LDL) for this kind of application. The label distribution covers a certain number of labels, representing the degree to which each label describes the instance. LDL is a more general learning framework which includes both single-label and multi-label learning as its special cases. This paper proposes six working LDL algorithms in three ways: problem transformation, algorithm adaptation, and specialized algorithm design. In order to compare the performance of the LDL algorithms, six representative and diverse evaluation measures are selected via a clustering analysis, and the first batch of label distribution datasets is collected and made publicly available. Experimental results on one artificial and 15 real-world datasets show clear advantages of the specialized algorithms, which indicates the importance of special design for the characteristics of the LDL problem.
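
The claim that single-label and multi-label learning are special cases of a label distribution can be illustrated with a small sketch. The helper names, the toy labels, and the convention of spreading the mass equally over the relevant labels in the multi-label case are assumptions made for illustration, not details taken from the abstract. Kullback-Leibler divergence is used only as one common way to compare two distributions; the abstract does not name the six evaluation measures.

```python
import math

def single_label(labels, y):
    """Single-label case: the one true label gets description degree 1."""
    return {l: 1.0 if l == y else 0.0 for l in labels}

def multi_label(labels, relevant):
    """Multi-label case (assumed convention): relevant labels share the mass equally."""
    return {l: 1.0 / len(relevant) if l in relevant else 0.0 for l in labels}

def kl_divergence(true, pred, eps=1e-12):
    """Kullback-Leibler divergence between two label distributions."""
    return sum(
        p * math.log((p + eps) / (pred[l] + eps))
        for l, p in true.items()
        if p > 0
    )

labels = ["sky", "water", "cloud"]          # hypothetical label set
d_single = single_label(labels, "sky")      # one label describes the instance fully
d_multi = multi_label(labels, {"sky", "water"})
d_general = {"sky": 0.6, "water": 0.3, "cloud": 0.1}  # a general label distribution

# All three are valid distributions: non-negative degrees that sum to 1.
for d in (d_single, d_multi, d_general):
    assert abs(sum(d.values()) - 1.0) < 1e-9
```

In this view, an LDL model outputs a full distribution such as `d_general`, and a measure like `kl_divergence` scores how far a prediction is from the annotated distribution.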

595 citations

Journal Article•10.1016/J.ASOC.2015.10.011•
A novel SVM-kNN-PSO ensemble method for intrusion detection system

Abdulla Amin Aburomman1, Mamun Bin Ibne Reaz1•
National University of Malaysia1
1 Jan 2016
TL;DR: A novel ensemble construction method that uses PSO-generated weights to create an ensemble of classifiers with better accuracy for intrusion detection; results suggest that the new approach can generate ensembles that outperform WMA in terms of classification accuracy.
Abstract: Graphical abstract. The objective of this paper is to develop ensemble-based classifiers that will improve the accuracy of intrusion detection. For this purpose, we trained and tested 12 experts and then combined them into an ensemble. We used the PSO algorithm to weight the opinion of each expert. Because the quality of the behavioral parameters inserted by the user into PSO strongly affects its effectiveness, we used the LUS method as a meta-optimizer for finding high-quality parameters. We then used the improved PSO to create new weights for each expert. For comparison, we also developed an ensemble classifier with weights generated using WMA [12]. Fig. 1 depicts the entire process. For simplicity, the system framework was divided into the following seven stages:
  1. KDD99 data pre-processing.
  2. Data classification with six different SVM experts.
  3. Data classification with six different k-NN experts.
  4. Data classification with ensemble classifier based on PSO.
  5. Data classification with ensemble classifier based on LUS improvement of PSO.
  6. Data classification with ensemble classifier based on WMA.
  7. Comparison of results for each approach.
Highlights: An IDS is implemented using an ensemble of six SVM and six k-NN classifiers. Ensembles are created with weights generated by PSO and meta-PSO algorithms. These two ensembles outperform a third ensemble system that is created with WMA.
In machine learning, a combination of classifiers, known as an ensemble classifier, often outperforms individual ones. While many ensemble approaches exist, it remains a difficult task to find a suitable ensemble configuration for a particular dataset. This paper proposes a novel ensemble construction method that uses PSO-generated weights to create an ensemble of classifiers with better accuracy for intrusion detection. The local unimodal sampling (LUS) method is used as a meta-optimizer to find better behavioral parameters for PSO. For our empirical study, we took five random subsets from the well-known KDD99 dataset. Ensemble classifiers are created using the new approaches as well as the weighted majority algorithm (WMA) approach. Our experimental results suggest that the new approach can generate ensembles that outperform WMA in terms of classification accuracy.
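
For readers unfamiliar with the WMA baseline, here is a minimal sketch of the classic deterministic weighted majority algorithm on binary labels. The abstract does not say which WMA variant or penalty factor the authors used, so the binary setting, the function name, the toy data, and `beta=0.5` are all assumptions for illustration.

```python
def weighted_majority(expert_predictions, true_labels, beta=0.5):
    """Run a deterministic weighted majority vote over binary (0/1) predictions.

    expert_predictions: list of rounds, each a list with one 0/1 prediction per expert.
    true_labels: the correct 0/1 label for each round.
    Returns the final expert weights and the number of mistakes made by the vote.
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts  # every expert starts with the same weight
    mistakes = 0
    for preds, y in zip(expert_predictions, true_labels):
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        y_hat = 1 if vote_one >= vote_zero else 0  # ties go to label 1
        if y_hat != y:
            mistakes += 1
        # Shrink the weight of every expert that was wrong in this round.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return weights, mistakes

# Toy data: three hypothetical experts over four rounds.
rounds = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 1, 0]]
truth = [1, 0, 1, 0]
final_weights, n_mistakes = weighted_majority(rounds, truth)
print(final_weights, n_mistakes)
```

The first expert is never wrong in this toy run and keeps its full weight, while the other two are each halved twice. The PSO-based approach described in the abstract replaces this multiplicative update with weights found by an optimizer.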

488 citations

Journal Article•10.1016/J.ARTINT.2016.04.003•
ASlib: A Benchmark Library for Algorithm Selection

Bernd Bischl1, Pascal Kerschke2, Lars Kotthoff3, Marius Lindauer4, Yuri Malitsky5, Alexandre Fréchette3, Holger H. Hoos3, Frank Hutter4, Kevin Leyton-Brown3, Kevin Tierney6, Joaquin Vanschoren •
Ludwig Maximilian University of Munich1, University of Münster2, University of British Columbia3, University of Freiburg4, IBM5, University of Paderborn6
01 Aug 2016-Artificial Intelligence
TL;DR: In this article, the authors introduce a standardized format for representing algorithm selection scenarios and a repository that contains a growing number of data sets from the literature, and demonstrate the potential of algorithm selection to achieve significant performance improvements across a broad range of problems and algorithms.

258 citations

Journal Article•10.1007/S10115-015-0870-3•
Transfer learning for class imbalance problems with inadequate data

Samir Al-Stouhi1, Chandan K. Reddy2•
Honda1, Wayne State University2
01 Jul 2016-Knowledge and Information Systems
TL;DR: This paper proposes a novel boosting-based instance transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification.
Abstract: A fundamental problem in data mining is to effectively build robust classifiers in the presence of skewed data distributions. Class imbalance classifiers are trained specifically for skewed distribution datasets. Existing methods assume an ample supply of training examples as a fundamental prerequisite for constructing an effective classifier. However, when sufficient data are not readily available, the development of a representative classification algorithm becomes even more difficult due to the unequal distribution between classes. We provide a unified framework that will potentially take advantage of auxiliary data using a transfer learning mechanism and simultaneously build a robust classifier to tackle this imbalance issue in the presence of few training samples in a particular target domain of interest. Transfer learning methods use auxiliary data to augment learning when training examples are not sufficient, and in this paper we develop a method that is optimized to simultaneously augment the training data and induce balance into skewed datasets. We propose a novel boosting-based instance transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification. We provide theoretical and empirical validation of our method and apply it to healthcare and text classification applications.

122 citations

Proceedings Article•10.1109/ITW.2016.7606789•
Information-theoretic analysis of stability and bias of learning algorithms

Maxim Raginsky1, Alexander Rakhlin2, Matthew Tsao1, Yihong Wu3, Aolin Xu1 •
University of Illinois at Urbana–Champaign1, University of Pennsylvania2, Yale University3
1 Sep 2016
TL;DR: This work proposes several information-theoretic measures of algorithmic stability and uses them to upper-bound the generalization bias of learning algorithms.
Abstract: Machine learning algorithms can be viewed as stochastic transformations that map training data to hypotheses. Following Bousquet and Elisseeff, we say that such an algorithm is stable if its output does not depend too much on any individual training example. Since stability is closely connected to generalization capabilities of learning algorithms, it is of theoretical and practical interest to obtain sharp quantitative estimates on the generalization bias of machine learning algorithms in terms of their stability properties. We propose several information-theoretic measures of algorithmic stability and use them to upper-bound the generalization bias of learning algorithms. Our framework is complementary to the information-theoretic methodology developed recently by Russo and Zou.

112 citations

Proceedings Article•10.1145/2840728.2840766•
A PAC Approach to Application-Specific Algorithm Selection

Rishi Gupta1, Tim Roughgarden1•
Stanford University1
14 Jan 2016
TL;DR: Concepts from statistical and online learning theory are adapted to reason about application-specific algorithm selection, and dimension notions from statistical learning theory, historically used to measure the complexity of classes of binary- and real-valued functions, are relevant in a much broader algorithmic context.
Abstract: The best algorithm for a computational problem generally depends on the "relevant inputs," a concept that depends on the application domain and often defies formal articulation. While there is a large literature on empirical approaches to selecting the best algorithm for a given application domain, there has been surprisingly little theoretical analysis of the problem. This paper adapts concepts from statistical and online learning theory to reason about application-specific algorithm selection. Our models capture several state-of-the-art empirical and theoretical approaches to the problem, ranging from self-improving algorithms to empirical performance models, and our results identify conditions under which these approaches are guaranteed to perform well. We present one framework that models algorithm selection as a statistical learning problem, and our work here shows that dimension notions from statistical learning theory, historically used to measure the complexity of classes of binary- and real-valued functions, are relevant in a much broader algorithmic context. We also study the online version of the algorithm selection problem, and give possibility and impossibility results for the existence of no-regret learning algorithms.

96 citations

Proceedings Article•10.1109/SAI.2016.7555983•
Algorithm selection for classification problems

Nitin Pise1, Parag Kulkarni1•
College of Engineering, Pune1
13 Jul 2016
TL;DR: The number of attributes, the number of instances, the number of classes, the maximum class probability, and the class entropy play a major role in classifier accuracy and algorithm selection for the thirty-eight datasets used for experimentation.
Abstract: A number of algorithms are available in the areas of data mining, machine learning and pattern recognition for solving the same kind of problem, but there is little guidance on which algorithm gives the best results for the problem at hand. This paper presents an approach to this problem using meta-learning. The paper uses three types of data characteristics: simple, information-theoretic, and statistical. Results are generated using nine different algorithms on thirty-eight benchmark datasets from the UCI repository. The proposed approach uses the K-nearest neighbor algorithm to suggest a suitable algorithm. Classifier accuracy is taken as the basis for recommending the algorithm. By using meta-learning, an accurate method can be recommended for the given data, reducing the cognitive overload of applying each method, comparing it with other methods, and then selecting a suitable one. It thus supports adaptive learning methods. The experiments show that the predicted accuracies match the actual accuracies for more than 90% of the benchmark datasets used. It is concluded that the number of attributes, the number of instances, the number of classes, the maximum class probability, and the class entropy play a major role in classifier accuracy and algorithm selection for the thirty-eight datasets used for experimentation.

39 citations

Proceedings Article•10.1109/INVENTIVE.2016.7830090•
Comparison of machine learning algorithms for breast cancer

Palli Suryachandra, P. Venkata Subba Reddy1•
Sri Venkateswara University1
1 Aug 2016
TL;DR: Random Forest and Naïve Bayes algorithms are used as feature selection methods; Random Forest is used to rank feature importance and is applied for relevant feedback.
Abstract: Machine learning algorithms are computer programs that try to predict cancer type based on past data. The eventual goal of machine learning in cancer diagnosis is to have a trained algorithm that, given the gene expression levels of a cancer patient, can accurately predict the type and severity of the cancer, aiding the doctor in treating it. Existing work compares three machine learning algorithms: Decision Tree, Support Vector Machine, and Bayesian Belief Network. The main difficulty for these algorithms is that the problem is unusual: the number of features (gene expressions) far exceeds the number of cases (samples taken from patients). Performance can be improved by also comparing two more algorithms, Random Forest and Naive Bayes. Random Forest and Naive Bayes are used as feature selection methods; Random Forest is used to rank feature importance and is applied for relevant feedback. The requirements are the Weka tool, Java, and a relational database.

31 citations

Journal Article•10.3233/IDA-150796•
Multi-objective association rule mining with binary bat algorithm

Anping Song1, Xuehai Ding1, Jianjiao Chen2, Mingbo Li1, Wei Cao1, Ke Pu1 •
Shanghai University1, Yale University2
1 Jan 2016
TL;DR: This paper proposes a Pareto-based multi-objective binary bat algorithm (MBBA) for association rule mining, and a new method to discover interesting association rules without favoring or excluding any measure.
Abstract: Association rule mining that must meet a variety of measures is regarded as a multi-objective optimization problem rather than a single-objective optimization problem. Traditional multi-objective algorithms such as the genetic algorithm converge slowly and have low efficiency. Furthermore, the rule sets generated by traditional multi-objective algorithms are too large to be efficiently analyzed and explored in any further process. The bat algorithm is a new, efficient global optimization algorithm whose convergence is superior to that of binary particle swarm optimization (BPSO) and the genetic algorithm. This paper discusses the application of the multi-objective bat algorithm to association rule mining. We propose a Pareto-based multi-objective binary bat algorithm (MBBA) for association rule mining. This algorithm is independent of minimum support and minimum confidence. To evaluate the association rules mined by the MBBA algorithm, we propose a new method to discover interesting association rules without favoring or excluding any measure. Compared with the single-objective BPSO, the binary bat algorithm (BBA), and the Apriori algorithm, the experimental results on six datasets show that the new algorithm is feasible and highly effective. It can make up for the shortcomings of single-objective algorithms and traditional association rule mining algorithms.

28 citations

Proceedings Article•10.1109/ICTAI.2016.0051•
An Investigation of Transfer Learning and Traditional Machine Learning Algorithms

Karl R. Weiss1, Taghi M. Khoshgoftaar1•
Florida Atlantic University1
1 Nov 2016
TL;DR: The unique contributions of this paper are defining a test framework, defining multiple distortion profiles, defining a stress test suite, and the evaluation and comparison of different transfer learning and traditional machine learning algorithms over a wide range of distributions.
Abstract: Previous research focusing on the evaluation of transfer learning algorithms has predominantly used real-world datasets to measure an algorithm's performance. A test with a real-world dataset exposes an algorithm to a single instance of distribution difference between the training (source) and test (target) datasets. These previous works have not measured performance over a wide range of source and target distribution differences. We propose to use a test framework that creates many source and target datasets from a single base dataset, representing a diverse range of distribution differences. These datasets will be used as a stress test to measure an algorithm's performance. The stress test process will measure and compare different transfer learning algorithms and traditional learning algorithms. The unique contributions of this paper, with respect to transfer learning, are defining a test framework, defining multiple distortion profiles, defining a stress test suite, and the evaluation and comparison of different transfer learning and traditional machine learning algorithms over a wide range of distributions.

24 citations

Journal Article•10.1016/J.EJOR.2015.05.078•
A pool-based pattern generation algorithm for logical analysis of data with automatic fine-tuning

Marco Caserta1, Torsten Reiners2•
IE University1, Curtin University2
16 Jan 2016-European Journal of Operational Research
TL;DR: A novel technique, the biased Random-Key Genetic Algorithm, is employed to calibrate all the parameters of the algorithm automatically, reducing the fine-tuning effort required and enhancing the performance of the algorithm itself.
Journal Article•10.1049/EL.2016.2299•
Traffic sign recognition based on weighted ELM and AdaBoost

Xu Yan, Wang Quanwei, Wei Zhenyu, Ma Shuo
08 Nov 2016-Electronics Letters
TL;DR: A novel multiclass AdaBoost-based extreme learning machine (ELM) ensemble algorithm is proposed, in which the weighted ELM is selected as the basic weak classifier because of its much faster learning speed and much better generalisation performance than traditional support vector machines.
Abstract: A novel multiclass AdaBoost-based extreme learning machine (ELM) ensemble algorithm is proposed, in which the weighted ELM is selected as the basic weak classifier because of its much faster learning speed and much better generalisation performance than traditional support vector machines. AdaBoost acts as an ensemble learning method of a number of weighted ELMs. Then, an ensemble strong classifier is constructed by the weighted majority vote of all the weighted ELMs. Compared with the existing ELM methods, the proposed algorithm solves the problem of how to train the weighted samples by ELM in multiclass classification directly. Experiments on the German Traffic Sign Recognition Benchmark database demonstrate that the proposed algorithm can achieve a high recognition accuracy of 99.12% with a relatively lower computational complexity than many state-of-the-art algorithms.
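
The weighted majority vote that combines the weak classifiers can be sketched as follows. The abstract does not give the formula for the vote weights, so the SAMME multiclass rule used in `samme_alpha`, the error rates, and the class names are assumptions for illustration only; the weak learners are stand-ins, not weighted ELMs.

```python
import math
from collections import defaultdict

def samme_alpha(error, n_classes):
    """Vote weight of a weak classifier under the SAMME rule (an assumption here)."""
    return math.log((1.0 - error) / error) + math.log(n_classes - 1)

def weighted_vote(predictions, alphas):
    """Return the class whose supporters have the largest total vote weight."""
    totals = defaultdict(float)
    for label, alpha in zip(predictions, alphas):
        totals[label] += alpha
    return max(totals, key=totals.get)

# Three hypothetical weak classifiers with assumed weighted training errors.
errors = [0.05, 0.30, 0.35]
alphas = [samme_alpha(e, n_classes=3) for e in errors]

# Their predictions for one test image (hypothetical class names).
predictions = ["stop", "yield", "yield"]
winner = weighted_vote(predictions, alphas)
print(winner)
```

Here the single accurate classifier outvotes the two weaker ones, which is the behaviour that distinguishes a weighted vote from a simple majority.
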
Proceedings Article•10.1109/IICIP.2016.7975367•
Disease prediction using hybrid K-means and support vector machine

Sandeep Kaur1, Sheetal Kalra1•
Guru Nanak Dev University1
1 Aug 2016
TL;DR: The proposed hybrid K-means and Support Vector Machine algorithm for disease prediction helps in choosing the initial centroids and the number of clusters, and improves the efficiency of the K-means algorithm.
Abstract: Medical data mining is a significant research field, as medical organizations produce large volumes of data on a daily basis. Handling this vast amount of data in the medical field is challenging, so there is a need to mine this data in order to extract useful patterns for disease prediction. A hybrid K-means and Support Vector Machine algorithm for disease prediction is proposed in this paper. The proposed hybrid K-means algorithm helps in choosing the initial centroids and the number of clusters, and improves the efficiency of the K-means algorithm. The hybrid K-means algorithm is used for dimensionality reduction of the dataset, which is then given as input to the Support Vector Machine classifier. The simulation is performed in MATLAB, and the results show that the classification accuracy is improved and the processing time to obtain the final output is reduced.
Posted Content•
Human-Algorithm Interaction Biases in the Big Data Cycle: A Markov Chain Iterated Learning Framework.

Olfa Nasraoui, Patrick Shafto
29 Aug 2016-arXiv: Learning
TL;DR: This paper presents a preliminary theoretical model and analysis of the mutual interaction between humans and algorithms, based on an iterated learning framework that is inspired by the study of human language evolution, and defines the concepts of human and algorithm blind spots.
Abstract: Early supervised machine learning algorithms have relied on reliable expert labels to build predictive models. However, the gates of data generation have recently been opened to a wider base of users who started participating increasingly with casual labeling, rating, annotating, etc. The increased online presence and participation of humans has led not only to a democratization of unchecked inputs to algorithms, but also to a wide democratization of the "consumption" of machine learning algorithms' outputs by general users. Hence, these algorithms, many of which are becoming essential building blocks of recommender systems and other information filters, started interacting with users at unprecedented rates. The result is machine learning algorithms that consume more and more data that is unchecked, or at the very least, not fitting conventional assumptions made by various machine learning algorithms. These include biased samples, biased labels, diverging training and testing sets, and cyclical interaction between algorithms, humans, information consumed by humans, and data consumed by algorithms. Yet, the continuous interaction between humans and algorithms is rarely taken into account in machine learning algorithm design and analysis. In this paper, we present a preliminary theoretical model and analysis of the mutual interaction between humans and algorithms, based on an iterated learning framework that is inspired by the study of human language evolution. We also define the concepts of human and algorithm blind spots and outline machine learning approaches to mend iterated bias through two novel notions: antidotes and reactive learning.
Proceedings Article•10.1109/IICIP.2016.7975380•
Improving classification in data mining using hybrid algorithm

Akanksha Ahlawat1, Bharti Suri1•
Guru Gobind Singh Indraprastha University1
1 Aug 2016
TL;DR: The hybrid algorithm proposed in this paper uses the concept of clustering and decision tree induction to classify the data samples and shows improved accuracy in most cases.
Abstract: Data mining is a powerful concept with great potential to predict future trends and behavior. It refers to the extraction of hidden knowledge from large datasets using techniques like statistical analysis, machine learning, clustering, neural networks and genetic algorithms. Hybrid algorithms for data mining are a logical combination of multiple pre-existing techniques to enhance performance and provide better results[11]. The hybrid algorithm proposed in this paper uses the concept of clustering and decision tree induction to classify the data samples. When the proposed approach is tested on real life datasets, the results obtained show improved accuracy in most cases.
Journal Article•10.1016/J.INS.2016.07.023•
BFPART: Best-First PART

Igor Ibarguren1, Aritz Lasarguren1, Jesús M. Pérez1, Javier Muguerza1, Ibai Gurrutxaga1, Olatz Arbelaitz1 •
University of the Basque Country1
01 Nov 2016-Information Sciences
TL;DR: This work proposes and compares 16 variants of the PART algorithm from the perspectives of discriminating capacity, complexity of the models, and the computational cost, for 36 real-world problems obtained from the UCI repository and finds the best-performing variant ranks first when compared to the well-established C4.5 algorithm.
Proceedings Article•10.1109/ICMLA.2016.0064•
Automatic Algorithm Selection in Computational Software Using Machine Learning

Matthew C. Simpson, Qing Yi1, Jugal Kalita2•
University of Colorado Boulder1, University of Colorado Colorado Springs2
1 Dec 2016
TL;DR: When used as a replacement for pre-existing meta-algorithms, the neural network brings about a 68% runtime improvement in Maple and a 49% improvement in Mathematica. Random forests, k-nearest neighbors, and both linear and RBF kernel SVMs are compared to the neural network model, which offers the best performance out of the tested machine learning methods.
Abstract: Computational software programs, such as Maple and Mathematica, heavily rely on superfunctions and meta-algorithms to select the optimal algorithm for a given task. These meta-algorithms may require intensive mathematical proof to formulate, incur large computational overhead, or fail to consistently select the best algorithm. Machine learning demonstrates a promising alternative for automatic algorithm selection by easing the design process and overhead while also attaining high accuracy in selection. In a case study on the resultant superfunction, a trained neural network is able to select the best algorithm out of the four available 86% of the time in Maple and 78% of the time in Mathematica. When used as a replacement for pre-existing meta-algorithms, the neural network brings about a 68% runtime improvement in Maple and 49% improvement in Mathematica. Random forests, k-nearest neighbors, and both linear and RBF kernel SVMs are also compared to the neural network model, the latter of which offers the best performance out of the tested machine learning methods.
Posted Content•
Interactive algorithms: from pool to stream

Sivan Sabato1, Tom Hess1•
Ben-Gurion University of the Negev1
02 Feb 2016-arXiv: Machine Learning
TL;DR: This work asks what stream size is required to emulate a pool algorithm with a given pool size, and shows that a maximal gap between the two settings also exists in the special case of active learning for binary classification.
Abstract: We consider interactive algorithms in the pool-based setting, and in the stream-based setting. Interactive algorithms observe suggested elements (representing actions or queries), and interactively select some of them and receive responses. Pool-based algorithms can select elements in any order, while stream-based algorithms observe elements in sequence, and can only select elements immediately after observing them. We assume that the suggested elements are generated independently from some source distribution, and ask what stream size is required for emulating a pool algorithm with a given pool size. We provide algorithms and matching lower bounds for general pool algorithms, and for utility-based pool algorithms. We further show that a maximal gap between the two settings also exists in the special case of active learning for binary classification.
Proceedings Article•10.1109/KST.2016.7440488•
Improved EM method for internet traffic classification

Songyin Liu1, Jing Hu1, Shengnan Hao1, Tiecheng Song1•
Southeast University1
24 Mar 2016
TL;DR: The improved algorithm applies the EM algorithm to generate a constrained matrix, then combines the constrained matrix with the q-DAEM algorithm to reduce the search range, so that a better Gaussian mixture model can be derived from this algorithm.
Abstract: Network traffic classification based on machine learning has attracted more and more attention. The traditional EM algorithm has the disadvantage that it is sensitive to its initial values and easily converges to a local optimum. This paper proposes a new improved EM algorithm based on q-DAEM. The improved algorithm applies the EM algorithm to generate a constrained matrix, then combines the constrained matrix with the q-DAEM algorithm to reduce the search range, so that a better Gaussian mixture model can be derived. The algorithm was evaluated on the Moore datasets, and the experimental results show that, applied to network traffic classification, it achieves higher precision and overall accuracy.
Proceedings Article•10.1109/IKT.2016.7777760•
Optimization of the Ho-Kashyap classification algorithm using appropriate learning samples

Mir Hossein Dezfoulian1, S. Younes MiriNezhad1, Seyed Muhammad Hossein Mousavi1, Mehrdad Shafaei Mosleh1, Muhammad Mehdi Shalchi1 •
Bu-Ali Sina University1
1 Sep 2016
TL;DR: Combining the methods of Multi Class Instance Selection and Ho-Kashyap not only reduces the starting time of the algorithm but also improves its accuracy, using proper parameters.
Abstract: This article focuses on optimization of the Ho-Kashyap classification algorithm. Choosing proper learning samples plays a significant role in the runtime and accuracy of supervised classification algorithms, especially the Ho-Kashyap classification algorithm. By combining the methods of Multi Class Instance Selection (MCIS) and Ho-Kashyap, this article not only reduces the starting time of the algorithm but also improves its accuracy, using proper parameters. The results of the suggested method are evaluated in terms of accuracy and time, and simulations show that the MCIS method can choose the data that are more effective for classification, using proper measures. If the Ho-Kashyap algorithm classifies using the more important data, it can save time in the classification process and even increase the accuracy of classification.
Proceedings Article•10.1109/TRUSTCOM.2016.0211•
Differential Evolution Based Selective Ensemble of Extreme Learning Machine

Yong Zhang, Bo Liu, Fan Yang
1 Aug 2016
TL;DR: Experimental results demonstrate that the proposed selective ensemble learning algorithm based on differential evolution (DE) for classification problems can effectively improve classification accuracy and generalization ability.
Abstract: Extreme learning machine (ELM) for single-hidden-layer feedforward neural networks has been widely used in classification and regression for its fast learning speed. However, a single ELM suffers from problems of stability and overfitting. An ensemble approach can effectively resolve these problems. This paper proposes a selective ensemble learning algorithm based on differential evolution (DE) for classification problems. In the proposed algorithm, ELM is selected as the base classifier, and the DE algorithm is then employed as the optimization technique to construct an ensemble learning model by combining base classifiers. The weight of each base classifier in the ensemble is optimized by the DE algorithm. Finally, several base classifiers with larger weights are selected to form the ensemble for making decisions. Experimental results on 14 benchmark datasets demonstrate that the proposed algorithm can effectively improve classification accuracy and generalization ability.
Proceedings Article•
Aggregation Algorithm Vs. Average for Time Series Prediction

Waqas Jamil, Yuri Kalnishkan, Hamid Bouchachia
24 Sep 2016
TL;DR: The Aggregation Algorithm, a generalisation of the famous weighted majority algorithm, is investigated for time series prediction and performs very well in comparison to the average.
Abstract: Learning with expert advice as a scheme of on-line learning has been very successfully applied to various learning problems due to its strong theoretical basis. In this paper, for the purpose of time series prediction, we investigate the application of the Aggregation Algorithm, which is a generalisation of the famous weighted majority algorithm. The results of the experiments show that the Aggregation Algorithm performs very well in comparison to the average.
Proceedings Article•10.1109/IRI.2016.27•
Designing a Testing Framework for Transfer Learning Algorithms (Application Paper)

[...]

Karl R. Weiss1, Taghi M. Khoshgoftaar1, Oneeb Rehman•
Florida Atlantic University1
1 Jul 2016
TL;DR: The unique contribution of this paper is the definition of a test framework that measures a more complete profile of a transfer learning algorithm's capability, facilitating the identification of relatively poor and good performance areas.
Abstract: Most works covering the topic of transfer learning propose an algorithm to solve a given domain adaptation problem, then test the algorithm using real-world datasets. A test with a real-world dataset represents a single transfer learning test condition, which only partially measures an algorithm's performance. Previous research has placed little emphasis on developing a comprehensive and uniform test for transfer learning algorithms. With this in mind, a test framework is proposed, comprising distortion profiles which define a comprehensive test suite. The unique contribution of this paper is the definition of a test framework that measures a more complete profile of a transfer learning algorithm's capability, facilitating the identification of relatively poor and good performance areas. As a proof of concept, the test framework is used to test a homogeneous transfer learning algorithm. The test framework will be the basis for a number of future applications.
Journal Article•10.1007/S00138-017-0828-4•
Weighted extreme learning machine for balance and optimization learning

[...]

Xiaojuan Ban1, Ruoyi Liu1, Qing Shen1, Yu Wang1•
University of Science and Technology Beijing1
1 Aug 2016
TL;DR: The experimental results show that the decay-weighted ELM performs better on imbalanced classification tasks, particularly multiclass tasks.
Abstract: Extreme learning machine (ELM) is a simple and effective method for single-hidden-layer feedforward neural networks. On this basis, many other methods have been proposed to improve ELM. Weighted extreme learning machine is one of those methods. Weighted ELM is simple in theory and convenient to implement, and it can be applied directly to multiclass classification tasks. This paper improves the previous weighted ELM for balance and optimization learning. The experimental results show that the improved weighted ELM performs better on multiclass classification tasks.
Journal Article•10.3233/IDA-160836•
Cascading randomized weighted majority: A new online ensemble learning algorithm

[...]

Mohammadzaman Zamani1, Hamid Beigy2, Amirreza Shaban3•
Stony Brook University1, Sharif University of Technology2, Georgia Institute of Technology3
1 Jan 2016
TL;DR: This paper proposes a cascading version of RWM to achieve not only better experimental results but also a better error bound for sufficiently large datasets.
Abstract: With the increasing volume of data in the world, the best approach for learning from this data is to exploit an online learning algorithm. Online ensemble methods are online algorithms which take advantage of an ensemble of classifiers to predict labels of data. Prediction with expert advice is a well-studied problem in the online ensemble learning literature. The Weighted Majority algorithm and the randomized weighted majority (RWM) algorithm are the most well-known solutions to this problem, aiming to converge to the best expert. Since, among a set of experts, the best one does not necessarily have the minimum error in all regions of the data space, defining specific regions and converging to the best expert in each of these regions will lead to a better result. In this paper, we aim to resolve this defect of RWM algorithms by proposing a novel online ensemble algorithm for the problem of prediction with expert advice. We propose a cascading version of RWM to achieve not only better experimental results but also a better error bound for sufficiently large datasets.
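For readers unfamiliar with the two baseline algorithms named in this abstract, the following is a minimal Python sketch of the textbook Weighted Majority and randomized weighted majority update rules. It is not the cascading method proposed in the paper. The function names, the binary 0/1 label setting, and the penalty factor `beta=0.5` are assumptions chosen for illustration.

```python
import random


def weighted_majority(expert_preds, labels, beta=0.5):
    """Deterministic Weighted Majority for binary labels (0/1).

    expert_preds: one list of expert predictions per round.
    labels: the true label for each round.
    beta: multiplicative penalty in (0, 1) for experts that err (assumed value).
    Returns (number of mistakes made by the learner, final expert weights).
    """
    weights = [1.0] * len(expert_preds[0])
    mistakes = 0
    for preds, y in zip(expert_preds, labels):
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_one >= vote_zero else 0  # follow the heavier side
        if guess != y:
            mistakes += 1
        # Shrink the weight of every expert that was wrong this round.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return mistakes, weights


def randomized_weighted_majority(expert_preds, labels, beta=0.5, seed=0):
    """Randomized variant: follow one expert sampled in proportion to its weight."""
    rng = random.Random(seed)
    n = len(expert_preds[0])
    weights = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_preds, labels):
        chosen = rng.choices(range(n), weights=weights, k=1)[0]
        if preds[chosen] != y:
            mistakes += 1
        # Same multiplicative update as the deterministic version.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return mistakes, weights
```

In both functions the weight update depends only on which experts erred, so the weights concentrate on the expert with the fewest mistakes over the whole sequence. That global behaviour is what the abstract identifies as a limitation when different experts are best in different regions of the data space.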
Proceedings Article•10.1109/CEC.2016.7744311•
A genetic algorithm for improving the induction of attribute-based decision graph classifiers

[...]

João Roberto Bertini1, Maria do Carmo Nicoletti1•
Federal University of São Carlos1
1 Jul 2016
TL;DR: A genetic algorithm (GA-AbDG) is proposed which explores the possible benefits of inducing part of the structure of AbDGs, by evolving suitable edge sets for them, and shows that the results obtained with the GA-AbDG algorithm outperform a prior proposal.
Abstract: Attribute-based Decision Graphs (AbDG) have been recently proposed as a novel and effective way to represent data as weighted labeled graphs. However, for some domains, the definition of a graph structure that best fits the data can be a hard task. In machine learning it is very common to rely on evolutionary algorithms to guide the model selection phase of learning processes. Particularly, as far as classification tasks are concerned, evolutionary algorithms can be of great help when searching for characteristics which would promote the induction of a suitable classifier, without the need to exhaustively test all possibilities. This paper proposes a genetic algorithm (GA-AbDG) which explores the possible benefits of inducing part of the structure of AbDGs, by evolving suitable edge sets for them. In addition, the paper shows that the results obtained with the GA-AbDG algorithm outperform a prior proposal, with a fixed p-partite structure of AbDGs, as well as the C4.5.
Journal Article•10.1007/S10586-016-0586-5•
An examination of on-line machine learning approaches for pseudo-random generated data

[...]

Jia Zhu1, Chuanhua Xu1, Zhixu Li2, Gabriel Pui Cheong Fung3, Xueqin Lin1, Jin Huang1, Changqin Huang1 •
South China Normal University1, Soochow University (Suzhou)2, The Chinese University of Hong Kong3
01 Sep 2016-Cluster Computing
TL;DR: This article examines current popular machine learning algorithms with various on-line algorithms for pseudo-random generated data in order to find out which machine learning approach is more suitable for this kind of data for prediction based on on-line algorithms.
Abstract: A pseudo-random generator is an algorithm that generates a sequence of objects which is determined by a truly random seed but is not itself truly random. It has been widely used in many applications, such as cryptography and simulations. In this article, we examine current popular machine learning algorithms with various on-line algorithms for pseudo-random generated data in order to find out which machine learning approach is more suitable for this kind of data for prediction based on on-line algorithms. To further improve the prediction performance, we propose a novel sample weighted algorithm that takes generalization errors in each iteration into account. We perform intensive evaluation on real Baccarat data generated by casino machines and random numbers generated by a popular Java program, which are two typical examples of pseudo-random generated data. The experimental results show that support vector machine and k-nearest neighbors have better performance than others, with and without the sample weighted algorithm, in the evaluation data set.
Proceedings Article•10.1109/CSCI.2016.0208•
A Multi-Objective Genetic Local Search Algorithm for Optimal Feature Subset Selection

[...]

David Tian1•
Leeds Beckett University1
1 Dec 2016
TL;DR: A feature selection algorithm called MultiObjective genetic local search (MOGLS) which integrates a 3-objective genetic algorithm with a local search heuristic to find feature subsets with the maximum prediction accuracy, the smallest sizes and the minimum redundancy is proposed.
Abstract: Feature selection algorithms select the most relevant features of a data set to improve the classification performance of the machine learning classifiers trained using the data set. This paper proposes a feature selection algorithm called multiobjective genetic local search (MOGLS) which integrates a 3-objective genetic algorithm with a local search heuristic to find feature subsets with the maximum prediction accuracy, the smallest sizes and the minimum redundancy. The performance of MOGLS is compared with 4 algorithms: a wrapper genetic algorithm, correlation-based feature selection, mutual information ranking and C4.5 on 8 datasets from the UCI machine learning repository. MOGLS performs better than or as good as the 4 algorithms on the 8 datasets.
Proceedings Article•10.1109/CAMAD.2016.7790325•
Efficient algorithm selection for packet classification using machine learning

[...]

Mohammed Elmahgiubi1, Omar Ahmed1, Shawki Areibi1, Gary Grewal1•
University of Guelph1
1 Oct 2016
TL;DR: By utilizing Meta-Learning and Artificial Neural Networks (ANNs) this work is able to achieve an average accuracy of 90% when automatically choosing the most appropriate algorithm when applied to over a hundred different rulesets ranging in size from 1K to 5K.
Abstract: Many packet classification algorithms with variable performances and capabilities are available. However, no single algorithm is guaranteed to outperform every other one in every case. Meta-Learning is a subfield in Machine Learning that aims to apply statistical techniques to automate the algorithm selection process. In this work, we propose a novel framework for efficient, automatic packet classification algorithm selection. By utilizing Meta-Learning and Artificial Neural Networks (ANNs) we are able to achieve an average accuracy of 90% when automatically choosing the most appropriate algorithm when applied to over a hundred different rulesets ranging in size from 1K to 5K.
Proceedings Article•
Weighted A* algorithms for unsupervised feature selection with provable bounds on suboptimality

[...]

Hiromasa Arai1, Ke Xu1, Crystal Maung1, Haim Schweitzer1•
University of Texas at Dallas1
12 Feb 2016
TL;DR: This work proposes an algorithm based on ideas similar to the Weighted A* algorithm in heuristic search that is more accurate than the current state of the art in identifying a small number of features in data.
Abstract: Identifying a small number of features that can represent the data is believed to be NP-hard. Previous approaches exploit algebraic structure and use randomization. We propose an algorithm based on ideas similar to the Weighted A* algorithm in heuristic search. Our experiments show this new algorithm to be more accurate than the current state of the art.
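As background for the heuristic-search idea this abstract refers to, here is a minimal Python sketch of generic Weighted A* on an explicit graph. It is not the paper's feature-selection algorithm. The function names, the grid example, and the weight `w=1.5` are assumptions for illustration. Nodes are ordered by f = g + w·h, and with an admissible heuristic and w ≥ 1 the returned cost is known to be at most w times the optimal cost.

```python
import heapq


def weighted_a_star(start, goal, neighbors, heuristic, w=1.5):
    """Search with priority f = g + w * h; w = 1 reduces to plain A*.

    neighbors(node) yields (next_node, step_cost) pairs.
    Returns (cost, path) or None if the goal is unreachable.
    """
    frontier = [(w * heuristic(start), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return g, path[::-1]
        if g > best_g[node]:
            continue  # stale queue entry; a cheaper route was found later
        for nxt, step in neighbors(node):
            new_g = g + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(frontier, (new_g + w * heuristic(nxt), new_g, nxt))
    return None


def grid_neighbors(size):
    """Hypothetical helper: 4-connected moves on a size x size grid, unit cost."""
    def neighbors(node):
        x, y = node
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size:
                yield (nx, ny), 1
    return neighbors


def manhattan(goal):
    """Hypothetical helper: Manhattan distance to goal (admissible on this grid)."""
    return lambda node: abs(node[0] - goal[0]) + abs(node[1] - goal[1])
```

A call such as `weighted_a_star((0, 0), (3, 3), grid_neighbors(4), manhattan((3, 3)))` returns a cost and a path. Larger values of `w` typically expand fewer nodes in exchange for a looser bound on solution quality.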
