TL;DR: This paper proposes six working LDL algorithms in three ways: problem transformation, algorithm adaptation, and specialized algorithm design, and results show clear advantages of the specialized algorithms, which indicates the importance of special design for the characteristics of the LDL problem.
Abstract: Although multi-label learning can deal with many problems with label ambiguity, it does not fit some real applications well where the overall distribution of the importance of the labels matters. This paper proposes a novel learning paradigm named label distribution learning (LDL) for such kind of applications. The label distribution covers a certain number of labels, representing the degree to which each label describes the instance. LDL is a more general learning framework which includes both single-label and multi-label learning as its special cases. This paper proposes six working LDL algorithms in three ways: problem transformation, algorithm adaptation, and specialized algorithm design. In order to compare the performance of the LDL algorithms, six representative and diverse evaluation measures are selected via a clustering analysis, and the first batch of label distribution datasets are collected and made publicly available. Experimental results on one artificial and 15 real-world datasets show clear advantages of the specialized algorithms, which indicates the importance of special design for the characteristics of the LDL problem.
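To make the "special cases" claim concrete, here is a minimal sketch of how single-label and multi-label annotations can be written as label distributions, and how a predicted distribution might be scored. The function names are hypothetical. Spreading the description degree uniformly over the relevant labels and scoring with Kullback-Leibler divergence are assumptions made for illustration; this is not a reproduction of the paper's algorithms or of its six selected measures.

```python
import math


def single_label_to_distribution(label, num_labels):
    """Single-label case: one label fully describes the instance (one-hot)."""
    return [1.0 if j == label else 0.0 for j in range(num_labels)]


def multi_label_to_distribution(relevant, num_labels):
    """Multi-label case, assuming each relevant label describes the
    instance to the same degree (uniform share; an illustrative assumption)."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("at least one relevant label is required")
    share = 1.0 / len(relevant)
    return [share if j in relevant else 0.0 for j in range(num_labels)]


def kl_divergence(true_dist, pred_dist, eps=1e-12):
    """KL divergence from the true to the predicted label distribution.
    Terms with zero true probability contribute nothing."""
    return sum(
        p * math.log(p / max(q, eps))
        for p, q in zip(true_dist, pred_dist)
        if p > 0
    )


if __name__ == "__main__":
    truth = multi_label_to_distribution([0, 2], num_labels=4)
    guess = [0.4, 0.1, 0.4, 0.1]
    print(truth, kl_divergence(truth, guess))
```

In both conversions the description degrees sum to 1, which is the property a label distribution needs in this sketch.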
TL;DR: A novel ensemble construction method uses PSO-generated weights to create an ensemble of classifiers with better accuracy for intrusion detection; results suggest that the new approach can generate ensembles that outperform WMA in terms of classification accuracy.
Abstract: Graphical abstract: The objective of this paper is to develop ensemble-based classifiers that will improve the accuracy of intrusion detection. For this purpose, we trained and tested 12 experts and then combined them into an ensemble. We used the PSO algorithm to weight the opinion of each expert. Because the quality of the behavioral parameters that the user supplies to PSO strongly affects its effectiveness, we used the LUS method as a meta-optimizer for finding high-quality parameters. We then used the improved PSO to create new weights for each expert. For comparison, we also developed an ensemble classifier with weights generated using WMA [12]. Fig. 1 depicts the entire process. For simplicity, the system framework was divided into the following seven stages: (1) KDD99 data pre-processing; (2) data classification with six different SVM experts; (3) data classification with six different k-NN experts; (4) data classification with an ensemble classifier based on PSO; (5) data classification with an ensemble classifier based on the LUS improvement of PSO; (6) data classification with an ensemble classifier based on WMA; (7) comparison of results for each approach. Highlights: An IDS is implemented using an ensemble of six SVM and six k-NN classifiers. Ensembles are created with weights generated by PSO and meta-PSO algorithms. These two ensembles outperform a third ensemble system created with WMA. In machine learning, a combination of classifiers, known as an ensemble classifier, often outperforms individual ones. While many ensemble approaches exist, it remains a difficult task to find a suitable ensemble configuration for a particular dataset. This paper proposes a novel ensemble construction method that uses PSO-generated weights to create an ensemble of classifiers with better accuracy for intrusion detection. The local unimodal sampling (LUS) method is used as a meta-optimizer to find better behavioral parameters for PSO.
For our empirical study, we took five random subsets from the well-known KDD99 dataset. Ensemble classifiers are created using the new approaches as well as the weighted majority algorithm (WMA) approach. Our experimental results suggest that the new approach can generate ensembles that outperform WMA in terms of classification accuracy.
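As a rough illustration of how per-expert weights enter the final decision, the sketch below combines the class predictions of several experts by a weighted vote. The function name and the example weights are hypothetical. In the paper the weights come from PSO (optionally tuned with LUS) or from WMA; that optimization step is not reproduced here.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the class with the largest total weight.

    predictions: one predicted class label per expert.
    weights: one non-negative weight per expert (assumed to be supplied
             by some external procedure such as an optimizer).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per expert")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # Ties are broken in favour of the class seen first.
    return max(scores, key=scores.get)


if __name__ == "__main__":
    expert_outputs = ["attack", "normal", "normal"]
    print(weighted_vote(expert_outputs, [0.7, 0.2, 0.2]))
    print(weighted_vote(expert_outputs, [1.0, 1.0, 1.0]))
```

With equal weights this reduces to a plain majority vote, which shows that the learned weights are what let one reliable expert overrule several weaker ones.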
TL;DR: In this article, the authors introduce a standardized format for representing algorithm selection scenarios and a repository that contains a growing number of data sets from the literature, and demonstrate the potential of algorithm selection to achieve significant performance improvements across a broad range of problems and algorithms.
TL;DR: This paper proposes a novel boosting-based instance transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification.
Abstract: A fundamental problem in data mining is to effectively build robust classifiers in the presence of skewed data distributions. Class imbalance classifiers are trained specifically for skewed distribution datasets. Existing methods assume an ample supply of training examples as a fundamental prerequisite for constructing an effective classifier. However, when sufficient data are not readily available, the development of a representative classification algorithm becomes even more difficult due to the unequal distribution between classes. We provide a unified framework that will potentially take advantage of auxiliary data using a transfer learning mechanism and simultaneously build a robust classifier to tackle this imbalance issue in the presence of few training samples in a particular target domain of interest. Transfer learning methods use auxiliary data to augment learning when training examples are not sufficient and in this paper we will develop a method that is optimized to simultaneously augment the training data and induce balance into skewed datasets. We propose a novel boosting-based instance transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification. We provide theoretical and empirical validation of our method and apply to healthcare and text classification applications.
TL;DR: This work proposes several information-theoretic measures of algorithmic stability and uses them to upper-bound the generalization bias of learning algorithms.
Abstract: Machine learning algorithms can be viewed as stochastic transformations that map training data to hypotheses. Following Bousquet and Elisseeff, we say that such an algorithm is stable if its output does not depend too much on any individual training example. Since stability is closely connected to generalization capabilities of learning algorithms, it is of theoretical and practical interest to obtain sharp quantitative estimates on the generalization bias of machine learning algorithms in terms of their stability properties. We propose several information-theoretic measures of algorithmic stability and use them to upper-bound the generalization bias of learning algorithms. Our framework is complementary to the information-theoretic methodology developed recently by Russo and Zou.
TL;DR: Concepts from statistical and online learning theory are adapted to reason about application-specific algorithm selection, and dimension notions from statistical learning theory, historically used to measure the complexity of classes of binary- and real-valued functions, are relevant in a much broader algorithmic context.
Abstract: The best algorithm for a computational problem generally depends on the "relevant inputs," a concept that depends on the application domain and often defies formal articulation. While there is a large literature on empirical approaches to selecting the best algorithm for a given application domain, there has been surprisingly little theoretical analysis of the problem. This paper adapts concepts from statistical and online learning theory to reason about application-specific algorithm selection. Our models capture several state-of-the-art empirical and theoretical approaches to the problem, ranging from self-improving algorithms to empirical performance models, and our results identify conditions under which these approaches are guaranteed to perform well. We present one framework that models algorithm selection as a statistical learning problem, and our work here shows that dimension notions from statistical learning theory, historically used to measure the complexity of classes of binary- and real-valued functions, are relevant in a much broader algorithmic context. We also study the online version of the algorithm selection problem, and give possibility and impossibility results for the existence of no-regret learning algorithms.
TL;DR: The number of attributes, the number of instances, the number of classes, the maximum class probability and the class entropy play a major role in classifier accuracy and algorithm selection for the thirty-eight datasets used for experimentation.
Abstract: A number of algorithms are available in the areas of data mining, machine learning and pattern recognition for solving the same kind of problem, but there is little guidance on which algorithm gives the best results for the problem at hand. This paper presents an approach to this problem using meta-learning. The paper uses three types of data characteristics: simple, information-theoretic, and statistical. Results are generated using nine different algorithms on thirty-eight benchmark datasets from the UCI repository. The proposed approach uses the K-nearest neighbor algorithm to suggest a suitable algorithm. Classifier accuracy is taken as the basis for recommending the algorithm. By using meta-learning, an accurate method can be recommended for the given data, and the cognitive overload of applying each method, comparing it with other methods and then selecting a suitable method can be reduced. Thus it helps in adaptive learning methods. The experimentation shows that predicted accuracies match the actual accuracies for more than 90% of the benchmark datasets used. It is thus concluded that the number of attributes, the number of instances, the number of classes, the maximum class probability and the class entropy play a major role in classifier accuracy and algorithm selection for the thirty-eight datasets used for experimentation.
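The five data characteristics named in the conclusion are straightforward to compute. The sketch below is one possible way to do so; the function name, the dictionary keys and the choice of base-2 entropy are assumptions, and the K-nearest-neighbor recommendation step that would consume these values is not shown.

```python
import math
from collections import Counter


def simple_meta_features(rows, labels):
    """Compute the five dataset characteristics highlighted in the abstract.

    rows: list of feature vectors, one per instance.
    labels: list of class labels, one per instance.
    """
    if not rows or len(rows) != len(labels):
        raise ValueError("rows and labels must be non-empty and aligned")
    counts = Counter(labels)
    n = len(labels)
    probs = [c / n for c in counts.values()]
    return {
        "n_attributes": len(rows[0]),
        "n_instances": n,
        "n_classes": len(counts),
        "max_class_prob": max(probs),
        # Shannon entropy of the class distribution, in bits (assumed base).
        "class_entropy": -sum(p * math.log2(p) for p in probs),
    }


if __name__ == "__main__":
    X = [[1, 2], [3, 4], [5, 6], [7, 8]]
    y = ["a", "a", "b", "b"]
    print(simple_meta_features(X, y))
```

A meta-learner of the kind described would store one such vector per benchmark dataset, together with the accuracy each algorithm achieved on it, and look up the nearest stored vectors for a new dataset.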
TL;DR: Random Forest and Naïve Bayes algorithms are used as feature selection methods; Random Forest is used to rank feature importance, which is applied for relevant feedback.
Abstract: Machine learning algorithms are computer programs that try to predict cancer type based on past data. The eventual goal of machine learning in cancer diagnosis is to have a trained algorithm that, given the gene expression levels from a cancer patient, can accurately predict what type and severity of cancer they have, aiding the doctor in treating it. The existing technology compares three different machine learning algorithms: Decision Tree, Support Vector Machine, and Bayesian Belief Network. The main drawback for these algorithms is that the problem is unusual: the number of features (gene expressions) far exceeds the number of cases (samples taken from patients). Performance efficiency can be achieved by comparing two more algorithms, Random Forest and Naive Bayes. Random Forest and Naive Bayes are used as feature selection methods, and Random Forest is used to rank feature importance, which is applied for relevant feedback. The requirements are the Weka tool, Java and a relational database.
TL;DR: This paper proposes multi-objective binary bat algorithm (MBBA) based on Pareto for association rule mining, and proposes a new method to discover interesting association rules without favoring or excluding any measure.
Abstract: Association rule mining that must satisfy a variety of measures is regarded as a multi-objective optimization problem rather than a single-objective optimization problem. Traditional multi-objective algorithms such as the genetic algorithm converge slowly and have low efficiency. Furthermore, the rule sets generated by traditional multi-objective algorithms are too large to be efficiently analyzed and explored in any further process. The bat algorithm is a new, efficient global optimization algorithm whose convergence is superior to that of binary particle swarm optimization (BPSO) and the genetic algorithm. This paper discusses the application of a multi-objective bat algorithm to association rule mining. We propose a multi-objective binary bat algorithm (MBBA) based on Pareto for association rule mining. This algorithm is independent of minimum support and minimum confidence. To evaluate the association rules mined by the MBBA algorithm, we propose a new method to discover interesting association rules without favoring or excluding any measure. Compared with the single-objective BPSO, the binary bat algorithm (BBA) and the Apriori algorithm, the experimental results on six datasets show that the new algorithm is feasible and highly effective. It can compensate for the shortcomings of single-objective algorithms and traditional association rule mining algorithms.
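The phrase "based on Pareto" refers to keeping rules that no other rule beats on every measure at once. The sketch below shows only that non-dominated filtering step, with hypothetical rule names and made-up (support, confidence) values; the bat-algorithm search that would generate candidate rules is not reproduced, and the measures actually used by MBBA may differ.

```python
def dominates(a, b):
    """True if measure vector a is at least as good as b on every
    objective and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(
        x > y for x, y in zip(a, b)
    )


def pareto_front(rules):
    """Return the names of rules that no other rule dominates.

    rules: dict mapping a rule name to a tuple of objective values.
    """
    return [
        name
        for name, measures in rules.items()
        if not any(dominates(other, measures) for other in rules.values())
    ]


if __name__ == "__main__":
    candidate_rules = {
        "r1": (0.6, 0.9),  # (support, confidence), illustrative values
        "r2": (0.8, 0.7),
        "r3": (0.5, 0.6),
    }
    print(pareto_front(candidate_rules))
```

Because no single measure is used as a threshold, a rule with modest support but high confidence can survive alongside one with the opposite profile, which matches the stated goal of not favoring or excluding any measure.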
TL;DR: The unique contributions of this paper are defining a test framework, defining multiple distortion profiles, defining a stress test suite, and the evaluation and comparison of different transfer learning and traditional machine learning algorithms over a wide range of distributions.
Abstract: Previous research focusing on the evaluation of transfer learning algorithms has predominantly used real-world datasets to measure an algorithm's performance. A test with a real-world dataset exposes an algorithm to a single instance of distribution difference between the training (source) and test (target) datasets. These previous works have not measured performance over a wide range of source and target distribution differences. We propose to use a test framework that creates many source and target datasets from a single base dataset, representing a diverse range of distribution differences. These datasets will be used as a stress test to measure an algorithm's performance. The stress test process will measure and compare different transfer learning algorithms and traditional learning algorithms. The unique contributions of this paper, with respect to transfer learning, are defining a test framework, defining multiple distortion profiles, defining a stress test suite, and the evaluation and comparison of different transfer learning and traditional machine learning algorithms over a wide range of distributions.
TL;DR: A novel technique called the biased Random-Key Genetic Algorithm is employed here; it allows all the parameters of the algorithm to be calibrated automatically, hence reducing the fine-tuning effort required and enhancing the performance of the algorithm itself.
TL;DR: A novel multiclass AdaBoost-based extreme learning machine (ELM) ensemble algorithm is proposed, in which the weighted ELM is selected as the basic weak classifier because of its much faster learning speed and much better generalisation performance than traditional support vector machines.
Abstract: A novel multiclass AdaBoost-based extreme learning machine (ELM) ensemble algorithm is proposed, in which the weighted ELM is selected as the basic weak classifier because of its much faster learning speed and much better generalisation performance than traditional support vector machines. AdaBoost acts as an ensemble learning method of a number of weighted ELMs. Then, an ensemble strong classifier is constructed by the weighted majority vote of all the weighted ELMs. Compared with the existing ELM methods, the proposed algorithm solves the problem of how to train the weighted samples by ELM in multiclass classification directly. Experiments on the German Traffic Sign Recognition Benchmark database demonstrate that the proposed algorithm can achieve a high recognition accuracy of 99.12% with a relatively lower computational complexity than many state-of-the-art algorithms.
TL;DR: The proposed hybrid K-means and Support Vector Machine algorithm for disease prediction helps in choosing the initial centroids and the number of clusters, and improves the efficiency of the K-means algorithm.
Abstract: Medical data mining is a significant research field, as medical organizations produce large volumes of data on a daily basis. Handling this vast amount of data in the medical field is challenging, so there is a need to mine this data in order to extract useful patterns for disease prediction. A hybrid K-means and Support Vector Machine algorithm for disease prediction is proposed in this paper. The proposed hybrid K-means algorithm helps in choosing the initial centroids and the number of clusters, and improves the efficiency of the K-means algorithm. The hybrid K-means algorithm is used for dimensionality reduction of the dataset, which is then given as input to the Support Vector Machine classifier. The simulation is performed in MATLAB, and the results show that the accuracy of the classification is improved and the processing time to obtain the final output is reduced.
TL;DR: This paper presents a preliminary theoretical model and analysis of the mutual interaction between humans and algorithms, based on an iterated learning framework that is inspired from the study of human language evolution, and defines the concepts of human and algorithm blind spots.
Abstract: Early supervised machine learning algorithms have relied on reliable expert labels to build predictive models. However, the gates of data generation have recently been opened to a wider base of users who started participating increasingly with casual labeling, rating, annotating, etc. The increased online presence and participation of humans has led not only to a democratization of unchecked inputs to algorithms, but also to a wide democratization of the "consumption" of machine learning algorithms' outputs by general users. Hence, these algorithms, many of which are becoming essential building blocks of recommender systems and other information filters, started interacting with users at unprecedented rates. The result is machine learning algorithms that consume more and more data that is unchecked, or at the very least, not fitting conventional assumptions made by various machine learning algorithms. These include biased samples, biased labels, diverging training and testing sets, and cyclical interaction between algorithms, humans, information consumed by humans, and data consumed by algorithms. Yet, the continuous interaction between humans and algorithms is rarely taken into account in machine learning algorithm design and analysis. In this paper, we present a preliminary theoretical model and analysis of the mutual interaction between humans and algorithms, based on an iterated learning framework that is inspired from the study of human language evolution. We also define the concepts of human and algorithm blind spots and outline machine learning approaches to mend iterated bias through two novel notions: antidotes and reactive learning.
TL;DR: The hybrid algorithm proposed in this paper uses the concept of clustering and decision tree induction to classify the data samples and shows improved accuracy in most cases.
Abstract: Data mining is a powerful concept with great potential to predict future trends and behavior. It refers to the extraction of hidden knowledge from large datasets using techniques like statistical analysis, machine learning, clustering, neural networks and genetic algorithms. Hybrid algorithms for data mining are a logical combination of multiple pre-existing techniques to enhance performance and provide better results [11]. The hybrid algorithm proposed in this paper uses the concept of clustering and decision tree induction to classify the data samples. When the proposed approach is tested on real-life datasets, the results obtained show improved accuracy in most cases.
TL;DR: This work proposes and compares 16 variants of the PART algorithm from the perspectives of discriminating capacity, complexity of the models, and the computational cost, for 36 real-world problems obtained from the UCI repository and finds the best-performing variant ranks first when compared to the well-established C4.5 algorithm.
TL;DR: When used as a replacement for pre-existing meta-algorithms, the neural network brings about a 68% runtime improvement in Maple and a 49% improvement in Mathematica. Random forests, k-nearest neighbors, and both linear and RBF kernel SVMs are also compared to the neural network model, which offers the best performance of the tested machine learning methods.
Abstract: Computational software programs, such as Maple and Mathematica, heavily rely on superfunctions and meta-algorithms to select the optimal algorithm for a given task. These meta-algorithms may require intensive mathematical proof to formulate, incur large computational overhead, or fail to consistently select the best algorithm. Machine learning demonstrates a promising alternative for automatic algorithm selection by easing the design process and overhead while also attaining high accuracy in selection. In a case study on the resultant superfunction, a trained neural network is able to select the best algorithm out of the four available 86% of the time in Maple and 78% of the time in Mathematica. When used as a replacement for pre-existing meta-algorithms, the neural network brings about a 68% runtime improvement in Maple and 49% improvement in Mathematica. Random forests, k-nearest neighbors, and both linear and RBF kernel SVMs are also compared to the neural network model, the latter of which offers the best performance out of the tested machine learning methods.
TL;DR: This work asks what stream size is required to emulate a pool algorithm with a given pool size, and shows that a maximal gap between the two settings exists also in the special case of active learning for binary classification.
Abstract: We consider interactive algorithms in the pool-based setting, and in the stream-based setting. Interactive algorithms observe suggested elements (representing actions or queries), and interactively select some of them and receive responses. Pool-based algorithms can select elements at any order, while stream-based algorithms observe elements in sequence, and can only select elements immediately after observing them. We assume that the suggested elements are generated independently from some source distribution, and ask what is the stream size required for emulating a pool algorithm with a given pool size. We provide algorithms and matching lower bounds for general pool algorithms, and for utility-based pool algorithms. We further show that a maximal gap between the two settings exists also in the special case of active learning for binary classification.
TL;DR: The improved algorithm applies the EM algorithm to generate a constrained matrix, then combines the constrained matrix with the q-DAEM algorithm to reduce the search range, so that a better Gaussian mixture model can be derived from this algorithm.
Abstract: Network traffic classification based on machine learning has attracted more and more attention. The traditional EM algorithm has the disadvantage that it is sensitive to its initial values and easily converges to a local optimum. This paper proposes an improved EM algorithm based on q-DAEM. The improved algorithm applies the EM algorithm to generate a constrained matrix, then combines the constrained matrix with the q-DAEM algorithm to reduce the search range, so that a better Gaussian mixture model can be derived. The algorithm was evaluated on the Moore datasets; the experimental results show that, applied to network traffic classification, the improved algorithm achieves higher precision and overall accuracy.
TL;DR: Combining the methods of Multi Class Instance Selection and Ho-Kashyap has not only reduced the starting time of the algorithm but has also improved its accuracy, using proper parameters.
Abstract: This article focuses on optimization of the Ho-Kashyap classification algorithm. Choosing a proper learning sample plays a significant role in the runtime and accuracy of supervised classification algorithms, especially the Ho-Kashyap classification algorithm. By combining the methods of Multi Class Instance Selection (MCIS) and Ho-Kashyap, this article has not only reduced the starting time of the algorithm but has also improved its accuracy, using proper parameters. The results of the suggested method are evaluated in terms of accuracy and time, and simulations show that the MCIS method can choose the data that are more effective for classification, using proper measures. If the Ho-Kashyap algorithm classifies using the more important data, it can save time in the classification process and even increase the accuracy of classification.
TL;DR: Experimental results demonstrate that the proposed selective ensemble learning algorithm based on differential evolution (DE) for classification problem can effectively improve the classification accuracy and generalization ability.
Abstract: Extreme learning machine (ELM) for single-hidden-layer feedforward neural networks has been widely used in classification and regression for its fast learning speed. However, a single ELM suffers from problems of stability and overfitting. Ensemble approach can effectively resolve these problems. This paper proposes a selective ensemble learning algorithm based on differential evolution (DE) for classification problem. In the proposed algorithm, ELM is selected as base classifier, and then DE algorithm is employed as the optimization technique to construct an ensemble learning model by combining base classifiers. The weights of each base classifier in the ensemble are optimized by DE algorithm. Finally, several base classifiers with larger weights are selected to form the ensemble for making decision. Experimental results on 14 benchmark datasets demonstrate that the proposed algorithm can effectively improve the classification accuracy and generalization ability.
TL;DR: The Aggregation Algorithm, which is a generalisation of the famous weighted majority algorithm, is investigated and performs very well in comparison to average.
Abstract: Learning with expert advice as a scheme of on-line learning has been very successfully applied to various learning problems due to its strong theoretical basis. In this paper, for the purpose of time series prediction, we investigate the application of the Aggregation Algorithm, which is a generalisation of the famous weighted majority algorithm. The results of the experiments show that the Aggregation Algorithm performs very well in comparison to average.
TL;DR: The unique contribution of this paper is the definition of a test framework that measures a more complete profile of a transfer learning algorithm's capability, facilitating the identification of relatively poor and good performance areas.
Abstract: Most works covering the topic of transfer learning propose an algorithm to solve a given domain adaptation problem, then test the algorithm using real-world datasets. A test with a real-world dataset represents a single transfer learning test condition, which only partially measures an algorithm's performance. Previous research has placed little emphasis on developing a comprehensive and uniform test for transfer learning algorithms. With this in mind, a test framework is proposed, comprising distortion profiles which define a comprehensive test suite. The unique contribution of this paper is the definition of a test framework that measures a more complete profile of a transfer learning algorithm's capability, facilitating the identification of relatively poor and good performance areas. As a proof of concept, the test framework is used to test a homogeneous transfer learning algorithm. The test framework will be the basis for a number of future applications.
TL;DR: From the experimental results, the decay-weighted ELM obtains better results in solving imbalanced classification tasks, particularly multiclass tasks.
Abstract: Extreme learning machine (ELM) is a simple and effective method for single-hidden-layer feedforward neural networks. On this basis, many other methods have been proposed to improve ELM. Weighted extreme learning machine is one of those methods. Weighted ELM is simple in theory and convenient to implement, and it can be applied directly to multiclass classification tasks. This paper improves the previous weighted ELM for balance and optimization learning. From the experimental results, the improved weighted ELM obtains better results in solving multiclass classification tasks.
TL;DR: This paper proposes a cascading version of RWM to achieve not only better experimental results but also a better error bound for sufficiently large datasets.
Abstract: With the increasing volume of data in the world, the best approach for learning from this data is to exploit an online learning algorithm. Online ensemble methods are online algorithms which take advantage of an ensemble of classifiers to predict labels of data. Prediction with expert advice is a well-studied problem in the online ensemble learning literature. The Weighted Majority algorithm and the randomized weighted majority (RWM) algorithm are the most well-known solutions to this problem, aiming to converge to the best expert. Since the best expert among a set of experts does not necessarily have the minimum error in all regions of the data space, defining specific regions and converging to the best expert in each of these regions will lead to a better result. In this paper, we aim to resolve this defect of RWM algorithms by proposing a novel online ensemble algorithm for the problem of prediction with expert advice. We propose a cascading version of RWM to achieve not only better experimental results but also a better error bound for sufficiently large datasets.
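For readers unfamiliar with the baseline being improved, the following is a minimal sketch of the classic deterministic Weighted Majority algorithm for binary predictions. The function name and the penalty factor `beta=0.5` are illustrative choices. The randomized variant (RWM) would instead follow a single expert drawn with probability proportional to its weight, and the cascading version proposed in the paper is not shown.

```python
def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """Run the deterministic Weighted Majority algorithm.

    expert_predictions: list of rounds; each round is a list holding one
                        0/1 prediction per expert.
    outcomes: the true 0/1 label revealed after each round.
    beta: factor in (0, 1) by which a wrong expert's weight is multiplied.
    Returns the final weights and the number of mistakes the learner made.
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    mistakes = 0
    for preds, truth in zip(expert_predictions, outcomes):
        weight_for_1 = sum(w for w, p in zip(weights, preds) if p == 1)
        weight_for_0 = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if weight_for_1 >= weight_for_0 else 0
        if guess != truth:
            mistakes += 1
        # Penalize every expert that was wrong in this round.
        weights = [
            w * beta if p != truth else w for w, p in zip(weights, preds)
        ]
    return weights, mistakes


if __name__ == "__main__":
    rounds = [[1, 0, 0], [1, 0, 1], [0, 1, 1]]
    truths = [1, 1, 0]
    print(weighted_majority(rounds, truths))
```

In this toy run the first expert is always right, so its weight stays at 1 while the others decay; the single global weight vector is exactly what cannot adapt when different experts are best in different regions of the data space.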
TL;DR: A genetic algorithm (GA-AbDG) is proposed which explores the possible benefits of inducing part of the structure of AbDGs by evolving suitable edge sets for them, and the results obtained with the GA-AbDG algorithm are shown to outperform a prior proposal.
Abstract: Attribute-based Decision Graphs (AbDG) have been recently proposed as a novel and effective way to represent data as weighted labeled graphs. However, for some domains, the definition of a graph structure that best fits the data can be a hard task. In machine learning it is very common to rely on evolutionary algorithms to guide the model selection phase of learning processes. Particularly, as far as classification tasks are concerned, evolutionary algorithms can be of great help when searching for characteristics which would promote the induction of a suitable classifier, without the need to exhaustively test all possibilities. This paper proposes a genetic algorithm (GA-AbDG) which explores the possible benefits of inducing part of the structure of AbDGs, by evolving suitable edge sets for them. In addition, the paper shows that the results obtained with the GA-AbDG algorithm outperform a prior proposal, with a fixed p-partite structure of AbDGs, as well as the C4.5.
TL;DR: This article examines current popular machine learning algorithms with various on-line algorithms on pseudo-random generated data in order to find out which machine learning approach is more suitable for prediction on this kind of data.
Abstract: A pseudo-random generator is an algorithm that generates a sequence of objects which is determined by a truly random seed but is not itself truly random. It has been widely used in many applications, such as cryptography and simulations. In this article, we examine current popular machine learning algorithms with various on-line algorithms on pseudo-random generated data in order to find out which machine learning approach is more suitable for prediction on this kind of data. To further improve the prediction performance, we propose a novel sample-weighted algorithm that takes the generalization errors in each iteration into account. We perform an intensive evaluation on real Baccarat data generated by casino machines and on random numbers generated by a popular Java program, which are two typical examples of pseudo-random generated data. The experimental results show that support vector machine and k-nearest neighbors perform better than the other methods on the evaluation data set, both with and without the sample-weighted algorithm.
TL;DR: A feature selection algorithm called multiobjective genetic local search (MOGLS) is proposed, which integrates a 3-objective genetic algorithm with a local search heuristic to find feature subsets with the maximum prediction accuracy, the smallest sizes and the minimum redundancy.
Abstract: Feature selection algorithms select the most relevant features of a data set to improve the classification performance of the machine learning classifiers trained using the data set. This paper proposes a feature selection algorithm called multiobjective genetic local search (MOGLS) which integrates a 3-objective genetic algorithm with a local search heuristic to find feature subsets with the maximum prediction accuracy, the smallest sizes and the minimum redundancy. The performance of MOGLS is compared with 4 algorithms: a wrapper genetic algorithm, correlation-based feature selection, mutual information ranking and C4.5 on 8 datasets from the UCI machine learning repository. MOGLS performs better than or as well as the 4 algorithms on the 8 datasets.
TL;DR: By utilizing Meta-Learning and Artificial Neural Networks (ANNs) this work is able to achieve an average accuracy of 90% when automatically choosing the most appropriate algorithm when applied to over a hundred different rulesets ranging in size from 1K to 5K.
Abstract: Many packet classification algorithms with variable performances and capabilities are available. However, no single algorithm is guaranteed to outperform every other one in every case. Meta-Learning is a subfield in Machine Learning that aims to apply statistical techniques to automate the algorithm selection process. In this work, we propose a novel framework for efficient, automatic packet classification algorithm selection. By utilizing Meta-Learning and Artificial Neural Networks (ANNs) we are able to achieve an average accuracy of 90% when automatically choosing the most appropriate algorithm when applied to over a hundred different rulesets ranging in size from 1K to 5K.
TL;DR: This work proposes an algorithm based on ideas similar to the Weighted A* algorithm in heuristic search that is more accurate than the current state of the art in identifying a small number of features in data.
Abstract: Identifying a small number of features that can represent the data is believed to be NP-hard. Previous approaches exploit algebraic structure and use randomization. We propose an algorithm based on ideas similar to the Weighted A* algorithm in heuristic search. Our experiments show this new algorithm to be more accurate than the current state of the art.