TL;DR: This paper's models capture several state-of-the-art empirical and theoretical approaches to the problem, ranging from self-improving algorithms to empirical performance models, and the results identify conditions under which these approaches are guaranteed to perform well.
Abstract: The best algorithm for a computational problem generally depends on the “relevant inputs,” a concept that depends on the application domain and often defies formal articulation. While there is a large body of literature on empirical approaches to selecting the best algorithm for a given application domain, there has been surprisingly little theoretical analysis of the problem. This paper adapts concepts from statistical and online learning theory to reason about application-specific algorithm selection. Our models capture several state-of-the-art empirical and theoretical approaches to the problem, ranging from self-improving algorithms to empirical performance models, and our results identify conditions under which these approaches are guaranteed to perform well. We present one framework that models algorithm selection as a statistical learning problem, and our work here shows that dimension notions from statistical learning theory, historically used to measure the complexity of classes of binary- and re...
TL;DR: In this paper, the authors present a thorough analysis of 13 state-of-the-art, commonly used machine learning algorithms on a set of 165 publicly available classification problems in order to provide data-driven algorithm recommendations to current researchers.
Abstract: As the bioinformatics field grows, it must keep pace not only with new data but with new algorithms. Here we contribute a thorough analysis of 13 state-of-the-art, commonly used machine learning algorithms on a set of 165 publicly available classification problems in order to provide data-driven algorithm recommendations to current researchers. We present a number of statistical and visual comparisons of algorithm performance and quantify the effect of model selection and algorithm tuning for each algorithm and dataset. The analysis culminates in the recommendation of five algorithms with hyperparameters that maximize classifier performance across the tested problems, as well as general guidelines for applying machine learning to supervised classification problems.
TL;DR: In this paper, a novel swarm intelligence algorithm, the dandelion algorithm (DA), is proposed for global optimization of complex functions, and simulations show that it seems much superior to the other algorithms compared.
Abstract: Inspired by the behavior of dandelion sowing, a novel swarm intelligence algorithm, the dandelion algorithm (DA), is proposed in this paper for global optimization of complex functions. In DA, the dandelion population is divided into two subpopulations, and the subpopulations undergo different sowing behaviors. Moreover, another sowing method is designed to escape local optima. To validate DA, we compare the proposed algorithm with existing algorithms, including the bat algorithm, particle swarm optimization, and the enhanced fireworks algorithm. Simulations show that the proposed algorithm seems much superior to the other algorithms. The proposed algorithm can also be applied to optimize the extreme learning machine (ELM) for biomedical classification problems, and the effect is considerable. Finally, we use different fusion methods to form different fusion classifiers, which achieve higher accuracy and better stability to some extent.
TL;DR: A Diverse Human Learning Optimization algorithm (DHLO) is proposed, into which the Gaussian distribution and a dynamic adjusting strategy are introduced, and its performance is compared with the standard HLO as well as eight other meta-heuristics.
Abstract: Human Learning Optimization is a simple but efficient meta-heuristic algorithm in which three learning operators, i.e. the random learning operator, the individual learning operator, and the social learning operator, are developed to efficiently search the optimal solution by imitating the learning mechanisms of human beings. However, HLO assumes that all the individuals possess the same learning ability, which is not true in a real human population as the IQ scores of humans, one of the most important indices of the learning ability of humans, follow Gaussian distribution and increase with the development of society and technology. Inspired by this fact, this paper proposes a Diverse Human Learning Optimization algorithm (DHLO), into which the Gaussian distribution and dynamic adjusting strategy are introduced. By adopting a set of Gaussian distributed parameter values instead of a constant to diversify the learning abilities of DHLO, the robustness of the algorithm is strengthened. In addition, by cooperating with the dynamic updating operation, DHLO can adjust to better parameter values and consequently enhances the global search ability of the algorithm. Finally, DHLO is applied to tackle the CEC05 benchmark functions as well as knapsack problems, and its performance is compared with the standard HLO as well as the other eight meta-heuristics, i.e. the Binary Differential Evolution, Simplified Binary Artificial Fish Swarm Algorithm, Adaptive Binary Harmony Search, Binary Gravitational Search Algorithms, Binary Bat Algorithms, Binary Artificial Bee Colony, Bi-Velocity Discrete Particle Swarm Optimization, and Modified Binary Particle Swarm Optimization. The experimental results show that the presented DHLO outperforms the other algorithms in terms of search accuracy and scalability.
TL;DR: This paper makes the case for leveraging the potentially big data of an algorithm's past executions to improve and speed up future, similar solutions by reducing the algorithm's search space.
Abstract: At the heart of many computer network planning, deployment, and operational tasks lie hard algorithmic problems. Accordingly, over the last decades, we have witnessed a continuous pursuit for ever more accurate and faster algorithms. We propose an approach to design network algorithms which is radically different from most existing algorithms. Our approach is motivated by the observation that most existing algorithms to solve a given hard computer networking problem overlook a simple yet very powerful optimization opportunity in practice: many network algorithms are executed repeatedly (e.g., for each virtual network request or in reaction to user mobility), and hence with each execution, generate interesting data: (problem,solution)-pairs. We make the case for leveraging the potentially big data of an algorithm's past executions to improve and speed up future, similar solutions, by reducing the algorithm's search space. We study the applicability of machine learning to network algorithm design, identify challenges and discuss limitations. We empirically demonstrate the potential of machine learning network algorithms in two case studies, namely the embedding of virtual networks (a packing optimization problem) and k-center facility location (a covering optimization problem), using a prototype implementation.
TL;DR: An improved ID3 algorithm is proposed that combines the simplified information entropy based on different weights with coordination degree in rough set theory; on small sample sets it is shown to outperform ID3 in running time and tree structure, but not in accuracy.
Abstract: The decision tree algorithm is a core technology in data classification mining, and ID3 (Iterative Dichotomiser 3) is a well-known example that has achieved good results in the field. Nevertheless, ID3 has some disadvantages, such as a bias toward multi-valued attributes, high complexity, and large scales. In this paper, an improved ID3 algorithm is proposed that combines the simplified information entropy based on different weights with coordination degree in rough set theory. The traditional ID3 algorithm and the proposed one are compared fairly using three common data samples as well as the decision tree classifiers. For the first two sample sets, which are small, the proposed algorithm performs better than ID3 in running time and tree structure, but not in accuracy. For the third sample set, which is large, the proposed algorithm improves on ID3 in running time, tree structure, and accuracy. The experimental results show that the proposed algorithm is effective and viable.
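The bias toward multi-valued attributes mentioned in the abstract above can be seen in the standard ID3 splitting criterion. The following is a minimal sketch of plain information gain, not the improved weighted-entropy method the paper proposes; the toy rows, the attribute names, and the function names are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Standard ID3 gain: label entropy minus the weighted entropy after splitting on attr."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: "id" is unique per row, so it carries no general rule.
rows = [
    {"id": 1, "outlook": "sunny", "windy": "yes"},
    {"id": 2, "outlook": "sunny", "windy": "no"},
    {"id": 3, "outlook": "rain", "windy": "yes"},
    {"id": 4, "outlook": "rain", "windy": "no"},
]
labels = ["play", "play", "stay", "stay"]

gain_outlook = information_gain(rows, labels, "outlook")  # informative attribute
gain_windy = information_gain(rows, labels, "windy")      # uninformative attribute
gain_id = information_gain(rows, labels, "id")            # many-valued attribute
```

In this toy data the unique-valued `id` attribute scores as highly as the genuinely informative `outlook`, which is the kind of bias that motivates modified criteria.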
TL;DR: This study addresses the IL problem in BD applications by proposing the Distributed and Weighted ELM (DW-ELM) algorithm, which is based on the MapReduce framework, and confirms the feasibility of parallel computation.
TL;DR: Results show that even if tested on noisy data, sampled at a broad range of sampling rates, 5 out of 10 machine learning based universal models perform better than state-of-the-art algorithms.
Abstract: This paper presents a comparison of 10 machine learning algorithms in eye‑movement event detection task. The goal was to build a universal algorithm, which could work with any type of the eye-tracking data. Results show that even if tested on noisy data, sampled at a broad range of sampling rates, 5 out of 10 machine learning based universal models perform better than state-of-the-art algorithms. Even more, 7 machine learning based specialist classifiers, trained to work with high quality data, outperform expert coders as reported by Larsson et al. (2015).
TL;DR: Simulation results show that the proposed RGFACL algorithm outperforms the fuzzy actor–critic learning and the Q-learning fuzzy inference system algorithms in terms of convergence and speed of learning.
Abstract: In this work, we propose a new fuzzy reinforcement learning algorithm for differential games that have continuous state and action spaces. The proposed algorithm uses function approximation systems whose parameters are updated differently from the updating mechanisms used in the algorithms proposed in the literature. Unlike the algorithms presented in the literature which use the direct algorithms to update the parameters of their function approximation systems, the proposed algorithm uses the residual gradient value iteration algorithm to tune the input and output parameters of its function approximation systems. It has been shown in the literature that the direct algorithms may not converge to an answer in some cases, while the residual gradient algorithms are always guaranteed to converge to a local minimum. The proposed algorithm is called the residual gradient fuzzy actor–critic learning (RGFACL) algorithm. The proposed algorithm is used to learn three different pursuit–evasion differential games. Simulation results show that the performance of the proposed RGFACL algorithm outperforms the performance of the fuzzy actor–critic learning and the Q-learning fuzzy inference system algorithms in terms of convergence and speed of learning.
TL;DR: An improved weighted LeaderRank algorithm is proposed that takes the clustering coefficient into account in addition to node degree; it not only converges more quickly but also selects nodes that play a more significant role in the spreading process.
Abstract: Identifying influential spreaders in social network is of great theoretical and practical significance. In this paper, we propose an improved weighted LeaderRank algorithm. Instead of considering degree of node only, we also take clustering coefficient into account to depict the weight. Moreover, we change the way of score assignment, which can give more scores to those high-influence nodes. Compared with PageRank and weighted LeaderRank, simulations show that not only can our algorithm make a quicker convergence but also select nodes which play a more significant role in the spreading process.
TL;DR: A parallel approximate SS-ELM algorithm on MapReduce, built on the approximate adjacent similarity matrix (AASM) algorithm, which leverages the Locality-Sensitive Hashing (LSH) scheme to calculate the approximate neighboring similarity matrix, thus greatly reducing complexity and memory use.
TL;DR: Under some mild conditions on the reliability of the sensors, it is proved that one can filter out the unreliable ones, and the approach leverages the power of the theory of learning automata (LA) so as to gradually learn the identity of the reliable and unreliable sensors.
Abstract: The purpose of this paper is to propose a solution to an extremely pertinent problem, namely, that of identifying unreliable sensors (in a domain of reliable and unreliable ones) without any knowledge of the ground truth. This fascinating paradox can be formulated in simple terms as trying to identify stochastic liars without any additional information about the truth. Though apparently impossible, we will show that it is feasible to solve the problem, a claim that is counter-intuitive in and of itself. One aspect of our contribution is to show how redundancy can be introduced, and how it can be effectively utilized in resolving this paradox. Legacy work and the reported literature (for example, in the so-called weighted majority algorithm) have merely addressed assessing the reliability of a sensor by comparing its reading to the ground truth either in an online or an offline manner. Unfortunately, the fundamental assumption of revealing the ground truth cannot be always guaranteed (or even expected) in many real life scenarios. While some extensions of the Condorcet jury theorem [9] can lead to a probabilistic guarantee on the quality of the fused process, they do not provide a solution to the unreliable sensor identification problem. The essence of our approach involves studying the agreement of each sensor with the rest of the sensors, and not comparing the reading of the individual sensors with the ground truth—as advocated in the literature. Under some mild conditions on the reliability of the sensors, we can prove that we can, indeed, filter out the unreliable ones. Our approach leverages the power of the theory of learning automata (LA) so as to gradually learn the identity of the reliable and unreliable sensors. To achieve this, we resort to a team of LA, where a distinct automaton is associated with each sensor. 
The solution provided here has been subjected to rigorous experimental tests, and the results presented are, in our opinion, both novel and conclusive.
TL;DR: The article presents a novel meta-algorithm, called Epochal Stochastic Bandit Algorithm Selection (ESBAS), to freeze the policy updates at each epoch, and to leave a rebooted stochastic bandit in charge of the algorithm selection.
Abstract: Dialogue systems rely on a careful reinforcement learning design: the learning algorithm and its state space representation. Lacking more rigorous knowledge, the designer resorts to practical experience to choose the best option. In order to automate this process and improve its performance, this article formalises the problem of online off-policy reinforcement learning algorithm selection. A meta-algorithm is given as input a portfolio of several off-policy reinforcement learning algorithms. At the beginning of each new trajectory, it determines which algorithm in the portfolio controls the behaviour during the full next trajectory, in order to maximise the return. The article presents a novel meta-algorithm, called Epochal Stochastic Bandit Algorithm Selection (ESBAS). Its principle is to freeze the policy updates at each epoch, and to leave a rebooted stochastic bandit in charge of the algorithm selection. Under some assumptions, a thorough theoretical analysis demonstrates its near-optimality considering the structural sampling budget limitations. Then, ESBAS is put to the test in a set of experiments with various portfolios, on a negotiation dialogue game. The results show the practical benefits of algorithm selection for dialogue systems, in most cases outperforming the best algorithm in the portfolio, even when the aforementioned assumptions are transgressed.
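The epochal principle described above (freeze the policies for an epoch, and let a freshly reset bandit choose among them) might be sketched as follows. This is only an illustration under stated assumptions, not the ESBAS algorithm as published: the doubling epoch lengths, the use of UCB1 as the bandit, the Gaussian noise, and the simulated "learners" (functions mapping the amount of collected data to an expected return) are all hypothetical choices made for this example.

```python
import math
import random


def ucb1_pick(counts, sums, t):
    """UCB1: try every arm once, then maximise mean plus exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: sums[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a]),
    )


def epochal_selection(learners, n_epochs, rng):
    """Return how often each learner was put in control of a trajectory."""
    data_size = 0
    picks = [0] * len(learners)
    for epoch in range(n_epochs):
        # Policies are frozen for the whole epoch (assumption: quality depends
        # only on the amount of data gathered before the epoch started).
        frozen_means = [learner(data_size) for learner in learners]
        # The bandit is rebooted: statistics from earlier epochs are discarded.
        counts = [0] * len(learners)
        sums = [0.0] * len(learners)
        for t in range(1, 2 ** epoch + 1):  # assumed doubling epoch length
            arm = ucb1_pick(counts, sums, t)
            ret = frozen_means[arm] + rng.gauss(0, 0.1)  # simulated noisy return
            counts[arm] += 1
            sums[arm] += ret
            picks[arm] += 1
            data_size += 1
    return picks


# Hypothetical portfolio: a fast learner with a low plateau and a slow learner
# with a high plateau.
learners = [
    lambda n: 0.6 * (1 - math.exp(-n / 10)),
    lambda n: 0.9 * (1 - math.exp(-n / 100)),
]
picks = epochal_selection(learners, n_epochs=8, rng=random.Random(0))
```

Resetting the bandit at each epoch boundary lets the selector track learners whose relative quality changes as more data arrives, at the price of re-exploring every arm.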
TL;DR: A novel modified version of the K-means algorithm is proposed that results in lower clustering errors; it targets the algorithm's main drawback, its strong dependence on the selection of initial centroids.
Abstract: Load profiling refers to a procedure that leads to the formulation of daily load curves and consumer classes according to the similarity of the curve shapes. This procedure incorporates a set of unsupervised machine learning algorithms. Various studies propose clustering algorithms for grouping together load curves with a high degree of similarity. K-means is the most common algorithm in the load profiling literature. The main drawback of the algorithm lies in its strong dependence on the selection of initial centroids. The present paper proposes a novel modified version of the algorithm that results in lower clustering errors.
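The dependence on initial centroids noted in the abstract above can be reproduced on a tiny example. The sketch below is standard Lloyd-style K-means in one dimension, not the modified version the paper proposes; the data points and both initialisations are illustrative assumptions.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Plain K-means on scalars; returns final centroids and the sum of squared errors."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to its cluster mean (keep it if empty).
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    sse = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, sse


points = [0, 1, 10, 11, 20, 21]  # three well-separated pairs

# One centroid per pair: converges to the natural clustering.
good_centroids, good_sse = kmeans_1d(points, [0, 10, 20])

# Two centroids start inside the same pair: converges to a poor local optimum.
bad_centroids, bad_sse = kmeans_1d(points, [0, 1, 15])
```

Both runs converge, but the second ends with a clustering error of 101.0 against 1.5 for the first, even though the data and the number of clusters are identical.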
TL;DR: In this article, the authors provide relative performance analysis of transfer learning algorithms and traditional machine learning algorithms, addressing the correlation between AUC and classification accuracy under domain class imbalance conditions with statistical analysis provided.
Abstract: In machine learning applications, there are scenarios of having no labeled training data, due to the data being rare or too expensive to obtain. In these cases, it is desirable to use readily available labeled data, that is similar to, but not the same as, the domain application of interest. Transfer learning algorithms are used to build high-performance classifiers, when the training data has different distribution characteristics from the testing data. For a transfer learning environment, it is not possible to use validation techniques (such as cross validation or data splitting) to set the desired performance of a classifier, due to the lack of labeled training data from the test domain. As a result, the area under the receiver operating characteristic curve (AUC) performance measure may not be predictive of the actual classifier performance. In an environment where validation techniques are not possible, the relationship between AUC and classification accuracy is needed to better characterize transfer learning algorithm performance. This paper provides relative performance analysis of state-of-the-art transfer learning algorithms and traditional machine learning algorithms, addressing the correlation between AUC and classification accuracy under domain class imbalance conditions with statistical analysis provided.
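The gap between AUC and classification accuracy described above can be illustrated with a small hand-made example. This is a sketch under stated assumptions, not data or code from the paper: the labels, the scores, and the 0.5 default threshold are hypothetical, chosen to mimic an imbalanced test domain in which the classifier's scores are shifted and no labeled validation data exists to recalibrate the threshold.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Accuracy after thresholding scores into hard 0/1 predictions."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical imbalanced test set: 8 negatives, 2 positives, all scores above 0.5.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.55, 0.60, 0.62, 0.65, 0.70, 0.71, 0.72, 0.75, 0.80, 0.90]

ranking_quality = auc(labels, scores)               # perfect ranking
default_accuracy = accuracy(labels, scores)         # every example predicted positive
tuned_accuracy = accuracy(labels, scores, 0.78)     # threshold only knowable with labels
```

Here the ranking is perfect (AUC of 1.0) while accuracy at the default threshold is only 0.2; the threshold that recovers full accuracy could not be found without labeled data from the test domain.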
TL;DR: The results show that warnings that lead to failures, dubbed abnormal events, can be predicted using supervised machine learning algorithms, in particular the Random Forest algorithm, whose relatively satisfactory Recall and Precision are visibly higher than those of the other classifiers.
Abstract: In this study, we apply machine learning algorithms to predict technical failures that can be encountered in Oracle databases and related services. In order to train the machine learning algorithms, data from log files are collected hourly from Oracle database systems and labeled with one of two classes: normal or abnormal. We use several data science approaches to preprocess and transform the input data from its raw format into a format that can be fed to the algorithms. After the preprocessing, several different machine learning classifiers are trained and evaluated on our datasets. Our results show that warnings that lead to failures, dubbed abnormal events, can be predicted using supervised machine learning algorithms, in particular the Random Forest algorithm, whose relatively satisfactory Recall (75.7%) and Precision (84.9%) are visibly higher than those of the other classifiers.
TL;DR: A new learning algorithm for learning the synaptic weights of the single-hidden-layer feedforward neural networks is proposed by combining the upgraded bat algorithm with the extreme learning machine, which can efficiently search for the optimal input weights as well as the hidden biases, leading to the reduced number of evaluations needed to train a neural network.
Abstract: The learning time of the synaptic weights for feedforward neural networks tends to be very long. In order to reduce the learning time, in this paper we propose a new learning algorithm for learning the synaptic weights of single-hidden-layer feedforward neural networks by combining the upgraded bat algorithm with the extreme learning machine. The proposed approach can efficiently search for the optimal input weights as well as the hidden biases, reducing the number of evaluations needed to train a neural network. The experimental results, based on classification problems and comparison with other approaches from the literature, show that the proposed algorithm produces satisfactory performance in almost all cases and that it can learn the weight factors much faster than traditional learning algorithms.
TL;DR: A meta-algorithm for approximating the Pareto optimal set of costly black-box multiobjective optimization problems given a limited number of objective function evaluations, which switches among different algorithms during the search based on the predicted performance of each algorithm at the time.
Abstract: This paper presents a meta-algorithm for approximating the Pareto optimal set of costly black-box multiobjective optimization problems given a limited number of objective function evaluations. The key idea is to switch among different algorithms during the optimization search based on the predicted performance of each algorithm at the time. Algorithm performance is modeled using a machine learning technique based on the available information. The predicted best algorithm is then selected to run for a limited number of evaluations. The proposed approach is tested on several benchmark problems and the results are compared against those obtained using any one of the candidate algorithms alone.
TL;DR: The first algorithm is Elementwise Probing Algorithm (EPA) which is very fast under a score which utilizes Frobenius Distance and the second algorithm is Additive Reinforcement Learning Algorithm which combines ideas from perceptron algorithm and reinforcement learning algorithm.
Abstract: We introduce the Binary Matrix Guessing Problem and provide two algorithms to solve it. The first algorithm we introduce is the Elementwise Probing Algorithm (EPA), which is very fast under a score that utilizes the Frobenius distance. The second is the Additive Reinforcement Learning Algorithm, which combines ideas from the perceptron algorithm and reinforcement learning. This algorithm is significantly slower than the first one, but it is less restrictive and generalizes better. We compare the computational performance of both algorithms and provide numerical results.
TL;DR: The novel algorithm defines a new distance measurement based on scalable spatial density similarity in data sets and proposes a cluster-center iterative model; compared with Euclidean-distance-based k-Means, it generally performs more accurately on several synthetic and real-world datasets.
Abstract: The k-Means clustering algorithm is widely used in many machine learning tasks. However, the classic k-Means clustering algorithm performs poorly on non-convex data sets. We find that the effectiveness of k-Means depends heavily on the measurement of similarity between instances of the datasets. In our novel algorithm, we define a new distance measurement based on scalable spatial density similarity in data sets and propose a cluster-center iterative model. Experimental results show that, compared with Euclidean-distance-based k-Means, our proposed algorithm with the spatial density similarity measurement generally performs more accurately on several synthetic and real-world datasets.
TL;DR: A novel parallelized parameter selection using Flower Pollination Algorithm (FPA) to quickly find the optimal parameters of SVM, which forms a fully distributed algorithm to support a large dataset.
Abstract: Support Vector Machine (SVM) is one of the most popular machine learning algorithms for classification tasks and helps organizations improve their efficiency in different ways. Many studies have sought to improve SVM in terms of speed, accuracy, and/or scalability. The algorithm has parameters that need precise tuning to perform well. This work proposes a novel parallelized parameter selection using the Flower Pollination Algorithm (FPA) to quickly find the optimal parameters of SVM. In particular, the MapReduce algorithm introduced in the big data framework is applied to both FPA and SVM, which forms a fully distributed algorithm to support a large dataset. The experimental results of Parallelized FPA-SVM on real datasets show its outstanding speed in generating optimal models while maintaining high accuracy.
TL;DR: Experiments show that the proposed hybrid Adaboost algorithm is superior to Adaboost ensembles whose weak classifiers are built from only a single algorithm.
Abstract: This paper presents a hybrid Adaboost algorithm. Decision groups, which consist of the K nearest neighbor algorithm, Naive Bayes, and a decision tree, are chosen as the weak classifiers. When the weak classifiers are promoted to a strong classifier, a genetic algorithm is used to optimize the discourse right of each weak classifier. Experiments show that the proposed algorithm is superior to Adaboost ensembles whose weak classifiers are built from only a single algorithm.
TL;DR: An instance-specific algorithm selection method based on multi-output learning, which can manage the potential relations between different candidate algorithms more directly and can obtain better performance than state-of-the-art algorithm selection methods.
TL;DR: A voting algorithm that can be used to find the optimal solution to clustering problems in machine learning and that showed excellent performance due to the use of linear-time computations.
Abstract: This paper describes a voting algorithm that can be used to find the optimal solution to clustering problems in machine learning. As part of the family of algorithms known as Condorcet methods, the voting algorithm is used to choose a particular candidate, even in the absence of a definitive majority. The algorithm proceeds in two steps: renormalization and reconciliation. In the renormalization step, all probability measures are reset so that the ensemble probability is always unity. In the reconciliation step, a best choice is made based on the renormalized data. The results showed excellent performance due to the use of linear-time computations.
TL;DR: A greedy algorithm for sparse learning over a doubly stochastic network that provides a restricted isometry property (RIP)-based theoretical guarantee both on the performance of the algorithm and the number of iterations required for convergence.
Abstract: In this paper, we develop a greedy algorithm for sparse learning over a doubly stochastic network. In the proposed algorithm, nodes of the network perform sparse learning by exchanging their individual intermediate variables. The algorithm is iterative in nature. We provide a restricted isometry property (RIP)-based theoretical guarantee both on the performance of the algorithm and the number of iterations required for convergence. Using simulations, we show that the proposed algorithm provides good performance.
TL;DR: The proposed algorithm is verified on disease data available in UCI Online Machine Learning Repository and proves the robustness, effectiveness and versatility in terms of performance and low computational cost of the proposed system in the field of medical diagnostics.
Abstract: In this paper, a novel online scheme for classification, based on the contextual variant of the Weighted Average Forecaster Algorithm, is proposed. The proposed method adaptively partitions the data space based on contexts and trades off exploration and exploitation when fusing the predictions of the experts. The proposed algorithm is verified on disease data available in the UCI Online Machine Learning Repository. These results prove the robustness, effectiveness, and versatility of the proposed system, in terms of performance and low computational cost, in the field of medical diagnostics.
TL;DR: A new error function is developed to improve the neural network learning algorithm; results show that the improved algorithm reaches the target error precision faster, while the standard algorithm reaches it only slowly.
Abstract: Neural network algorithms can be used in many areas due to their advantages, but a standard BP neural network algorithm cannot achieve the desired results because it is limited by its learning speed. The slow learning speed is mainly caused by defects of the neural network algorithm itself. This paper develops a new error function to improve the neural network learning algorithm. The results show that the improved algorithm reaches the target error precision faster, while the standard algorithm reaches it only slowly.
TL;DR: The discriminative learning algorithm not only effectively restrains the noise but also avoids overfitting of the classifier, improving on current ensemble parameter learning algorithms for Bayesian networks.
Abstract: The paper first analyzes the properties of the sample confidence measure function used by the noise reduction algorithm and explains why this function is not suitable for multi-class problems. A more targeted confidence measure function is then designed, and based on this function, an enhanced de-noising algorithm for ensemble parameter learning is proposed. The resulting discriminative learning algorithm not only effectively restrains the noise but also avoids overfitting of the classifier. Finally, experimental results and statistical hypothesis testing verify that the performance of current ensemble parameter learning algorithms for Bayesian networks is clearly improved.
TL;DR: A stabilized learning algorithm based on iteration correction is proposed that can finish the learning process in fewer steps than the number of neurons; three theorems with proofs establish that the proposed algorithm is stable.
Abstract: Single-hidden-layer feedforward neural network (SLFN) is an effective model for data classification and regression. However, it has a very important defect that it is rather time-consuming to explore the training algorithm of SLFN. In order to shorten the learning time, a special non-iterative learning algorithm was proposed, named as extreme learning machine (ELM). The main idea is that the input weights and bias are chosen randomly and the output weights are calculated by a pseudo-inverse matrix. However, ELM also has a very important drawback that it cannot achieve stable solution for different runs because of randomness. In this paper, we propose a stabilized learning algorithm based on iteration correction. The convergence analysis shows that the proposed algorithm can finish the learning process in fewer steps than the number of neurons. Three theorems and their proofs can prove that the proposed algorithm is stable. Several data sets are selected from UCI databases, and the experimental results demonstrate that the proposed algorithm is effective.
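The basic ELM idea summarized above (random input weights and biases, output weights obtained in closed form) might look like the following sketch. It illustrates the plain ELM only, not the stabilized iteration-correction algorithm the paper proposes. As assumptions for this example, the pseudo-inverse is replaced by a lightly ridge-regularized least-squares solve so that the code needs only the standard library, the activation is `tanh`, and the inputs are scalars; all function names are hypothetical.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def train_elm(xs, ys, n_hidden, rng, ridge=1e-8):
    """Random hidden layer; output weights from (H^T H + ridge I) beta = H^T y."""
    w = [rng.uniform(-1, 1) for _ in range(n_hidden)]  # random input weights
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]  # random hidden biases
    h = [[math.tanh(w[j] * x + b[j]) for j in range(n_hidden)] for x in xs]
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(n_hidden)]
    beta = solve_linear(hth, hty)  # the only "training" step: no iterations
    return w, b, beta


def predict_elm(model, x):
    w, b, beta = model
    return sum(bt * math.tanh(wj * x + bj) for wj, bj, bt in zip(w, b, beta))


# Toy regression task: fit y = x^2 on [0, 1].
xs = [i / 20 for i in range(21)]
ys = [x * x for x in xs]
model = train_elm(xs, ys, n_hidden=10, rng=random.Random(0))
mse = sum((predict_elm(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

Changing the seed passed to `random.Random` changes the hidden layer and therefore the fitted model, which is the run-to-run instability the abstract describes.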
TL;DR: A new algorithm is proposed that combines the advantages of random search and gradient descent; it is shown that, given an accuracy level for the estimated expected risk, the algorithm generates a hypothesis that guarantees this accuracy with probability 1 and converges in finitely many steps.
Abstract: The generalization ability of learning algorithms is the focus of machine learning research, where the empirical risk minimization (ERM) plays an important role when the population distribution of observations is unknown. Most of the previous results are mainly based on computational learning theory, which is interested in how many samples are needed to make sure the estimated expected risk satisfies a given accuracy with high probability. In this paper, we will propose a new algorithm by combining the advantages of both random search and gradient descent algorithms, and show that given an accuracy level of the estimated expected risk, we can generate a hypothesis by our algorithm to guarantee the accuracy with probability 1, and our algorithm will converge in finite steps. In addition, we will relax the conventional independently and identically distributed(i.i.d.) assumption on the observations to a kind of weakly dependent condition. We will also provide some simulations to demonstrate our algorithm's advantages over either random search or gradient descent algorithms.