TL;DR: This paper presents three new aspects of KEEL: KEEL-dataset, a data set repository that includes the data set partitions in the KEEL format; guidelines for including new algorithms in KEEL, helping researchers to compare their results with the many approaches already included within the KEEL software; and a module of statistical procedures for contrasting the results obtained in an experimental study.
Abstract: KEEL (Knowledge Extraction based on Evolutionary Learning) is an open source software tool that supports data management and a designer of experiments. KEEL pays special attention to the implementation of evolutionary learning and soft computing based techniques for Data Mining problems including regression, classification, clustering, pattern mining and so on. The aim of this paper is to present three new aspects of KEEL: KEEL-dataset, a data set repository which includes the data set partitions in the KEEL format and shows some results of algorithms in these data sets; some guidelines for including new algorithms in KEEL, helping researchers to make their methods easily accessible to other authors and to compare the results of many approaches already included within the KEEL software; and a module of statistical procedures developed in order to provide the researcher with a suitable tool to contrast the results obtained in any experimental study. A case study is given to illustrate a complete case of application within this experimental analysis framework.
TL;DR: The ABC algorithm is applied to data clustering on benchmark problems and compared with the Particle Swarm Optimization (PSO) algorithm and nine other classification techniques from the literature; simulation results indicate that it can be used efficiently for multivariate data clustering.
Abstract: The Artificial Bee Colony (ABC) algorithm, one of the most recently introduced optimization algorithms, simulates the intelligent foraging behavior of a honey bee swarm. Clustering analysis, used in many disciplines and applications, is an important tool and a descriptive task seeking to identify homogeneous groups of objects based on the values of their attributes. In this work, ABC is used for data clustering on benchmark problems and the performance of the ABC algorithm is compared with the Particle Swarm Optimization (PSO) algorithm and nine other classification techniques from the literature. Thirteen typical test data sets from the UCI Machine Learning Repository are used to demonstrate the results of the techniques. The simulation results indicate that the ABC algorithm can efficiently be used for multivariate data clustering.
TL;DR: The performance of EPSDE is evaluated on a set of bound-constrained problems and is compared with conventional DE and several state-of-the-art parameter adaptive DE variants.
Abstract: Differential evolution (DE) has attracted much attention recently as an effective approach for solving numerical optimization problems. However, the performance of DE is sensitive to the choice of the mutation strategy and associated control parameters. Thus, to obtain optimal performance, time-consuming parameter tuning is necessary. Different mutation strategies with different parameter settings can be appropriate during different stages of the evolution. In this paper, we propose to employ an ensemble of mutation strategies and control parameters with the DE (EPSDE). In EPSDE, a pool of distinct mutation strategies along with a pool of values for each control parameter coexists throughout the evolution process and competes to produce offspring. The performance of EPSDE is evaluated on a set of bound-constrained problems and is compared with conventional DE and several state-of-the-art parameter adaptive DE variants.
TL;DR: The empirical studies on fifteen static test problems, a dynamic function and a real world engineering problem show that the proposed particle swarm optimization model is quite effective in adapting the value of w in the dynamic and static environments.
Abstract: Particle swarm optimization (PSO) is a stochastic population-based algorithm motivated by the intelligent collective behavior of some animals. The most important advantages of PSO are that it is easy to implement and there are few parameters to adjust. The inertia weight (w) is one of PSO's parameters, originally proposed by Shi and Eberhart to bring about a balance between the exploration and exploitation characteristics of PSO. Since the introduction of this parameter, there have been a number of proposals of different strategies for determining the value of the inertia weight during the course of a run. This paper presents the first comprehensive review of the various inertia weight strategies reported in the related literature. These approaches are classified and discussed in three main groups: constant, time-varying and adaptive inertia weights. A new adaptive inertia weight approach is also proposed which uses the success rate of the swarm as its feedback parameter to ascertain the particles' situation in the search space. The empirical studies on fifteen static test problems, a dynamic function and a real world engineering problem show that the proposed particle swarm optimization model is quite effective in adapting the value of w in the dynamic and static environments.
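The success-rate feedback described in this abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so the linear mapping from success rate to w, the bounds `w_min` and `w_max`, and the function names below are all assumptions made for illustration, not the authors' formulation.

```python
def success_rate(prev_pbest, new_pbest):
    """Fraction of particles whose personal-best fitness improved (minimization)."""
    improved = sum(1 for old, new in zip(prev_pbest, new_pbest) if new < old)
    return improved / len(prev_pbest)


def adaptive_inertia(rate, w_min=0.0, w_max=1.0):
    """Assumed linear map: a high success rate gives a larger w (more exploration)."""
    return w_min + (w_max - w_min) * rate


# Hypothetical personal-best fitness values before and after one iteration.
before = [3.0, 2.0, 5.0, 4.0]
after = [2.0, 2.0, 4.0, 4.0]
w = adaptive_inertia(success_rate(before, after))
```

Here two of the four particles improved, so the success rate is 0.5 and, with the default bounds, w is also 0.5.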
TL;DR: Empirical results with three well-known real data sets indicate that the proposed model can be an effective way to improve on the forecasting accuracy achieved by traditional hybrid models and by either of the component models used separately.
Abstract: Improving forecasting accuracy, especially time series forecasting accuracy, is an important yet often difficult task facing decision makers in many areas. Both theoretical and empirical findings have indicated that integration of different models can be an effective way of improving upon their predictive performance, especially when the models in combination are quite different. Artificial neural networks (ANNs) are flexible computing frameworks and universal approximators that can be applied to a wide range of forecasting problems with a high degree of accuracy. However, using ANNs to model linear problems has yielded mixed results, and hence it is not wise to apply ANNs blindly to any type of data. Autoregressive integrated moving average (ARIMA) models are one of the most popular linear models in time series forecasting, and have been widely applied in order to construct more accurate hybrid models during the past decade. Although hybrid techniques, which decompose a time series into its linear and nonlinear components, have recently been shown to be successful for single models, these models have some disadvantages. In this paper, a novel hybridization of artificial neural networks and the ARIMA model is proposed in order to overcome the mentioned limitation of ANNs and yield a more general and more accurate forecasting model than traditional hybrid ARIMA-ANN models. In our proposed model, the unique advantages of ARIMA models in linear modeling are used in order to identify and magnify the existing linear structure in the data, and then a neural network is used in order to determine a model to capture the underlying data generating process and predict, using preprocessed data. Empirical results with three well-known real data sets indicate that the proposed model can be an effective way to improve on the forecasting accuracy achieved by traditional hybrid models and by either of the component models used separately.
TL;DR: A survey of some of the most important lines of hybridization of metaheuristics with other optimization techniques, including, for example, the combination of exact algorithms and metaheuristics.
Abstract: Research in metaheuristics for combinatorial optimization problems has lately experienced a noteworthy shift towards the hybridization of metaheuristics with other techniques for optimization. At the same time, the focus of research has changed from being rather algorithm-oriented to being more problem-oriented. Nowadays the focus is on solving the problem at hand in the best way possible, rather than promoting a certain metaheuristic. This has led to an enormously fruitful cross-fertilization of different areas of optimization. This cross-fertilization is documented by a multitude of powerful hybrid algorithms that were obtained by combining components from several different optimization techniques. Hereby, hybridization is not restricted to the combination of different metaheuristics but includes, for example, the combination of exact algorithms and metaheuristics. In this work we provide a survey of some of the most important lines of hybridization. The literature review is accompanied by the presentation of illustrative examples.
TL;DR: This paper describes a modified ABC algorithm for constrained optimization problems and compares the performance of the modified ABC algorithm against those of state-of-the-art algorithms for a set of constrained test problems.
Abstract: The Artificial Bee Colony (ABC) algorithm was first proposed for unconstrained optimization problems, on which it showed superior performance. This paper describes a modified ABC algorithm for constrained optimization problems and compares the performance of the modified ABC algorithm against those of state-of-the-art algorithms for a set of constrained test problems. For constraint handling, the ABC algorithm uses Deb's rules, consisting of three simple heuristic rules, and a probabilistic selection scheme for feasible solutions based on their fitness values and infeasible solutions based on their violation values. The ABC algorithm is tested on thirteen well-known test problems and the results obtained are compared to those of the state-of-the-art algorithms and discussed. Moreover, a statistical parameter analysis of the modified ABC algorithm is conducted and appropriate values for each control parameter are obtained using analysis of variance (ANOVA) and analysis of mean (ANOM) statistics.
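Deb's three feasibility rules mentioned in this abstract can be written as a single pairwise comparison. The sketch below assumes minimization and that each solution carries a precomputed total constraint violation (zero meaning feasible); the function name and argument layout are illustrative and are not taken from the paper.

```python
def deb_better(f_a, viol_a, f_b, viol_b):
    """Return True if solution a is preferred over solution b under Deb's rules.

    f_*    : objective value (lower is better)
    viol_* : total constraint violation, 0 for a feasible solution
    """
    feasible_a = viol_a <= 0
    feasible_b = viol_b <= 0
    if feasible_a and feasible_b:
        # Rule 1: both feasible, the better objective value wins.
        return f_a < f_b
    if feasible_a != feasible_b:
        # Rule 2: a feasible solution beats an infeasible one.
        return feasible_a
    # Rule 3: both infeasible, the smaller violation wins.
    return viol_a < viol_b
```

For example, `deb_better(5.0, 0.0, 1.0, 0.3)` is true: the first solution has a worse objective but is feasible, while the second violates a constraint.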
TL;DR: A fast image encryption algorithm with combined permutation and diffusion is proposed and an efficient method for generating pseudorandom numbers from spatiotemporal chaos is suggested, which further increases the encryption speed.
Abstract: In recent years, various image encryption algorithms based on the permutation-diffusion architecture have been proposed where, however, permutation and diffusion are considered as two separate stages, both requiring image-scanning to obtain pixel values. If these two stages are combined, the duplicated scanning effort can be reduced and the encryption can be accelerated. In this paper, a fast image encryption algorithm with combined permutation and diffusion is proposed. First, the image is partitioned into blocks of pixels. Then, spatiotemporal chaos is employed to shuffle the blocks and, at the same time, to change the pixel values. Meanwhile, an efficient method for generating pseudorandom numbers from spatiotemporal chaos is suggested, which further increases the encryption speed. Theoretical analyses and computer simulations both confirm that the new algorithm has high security and is very fast for practical image encryption.
TL;DR: Support Vector Machine (SVM) is used along with continuous wavelet transform (CWT), an advanced signal-processing tool, to analyze the frame vibrations during start-up to set up a base for condition monitoring technique of induction motor which will be simple, fast and overcome the limitations of traditional data-based models/techniques.
Abstract: Condition monitoring of induction motors is a fast emerging technology in the field of electrical equipment maintenance and has attracted more and more attention worldwide, as a number of unexpected failures of critical systems can be avoided. Keeping this in mind, a bearing fault detection scheme for a three-phase induction motor has been attempted. In the present study, Support Vector Machine (SVM) is used along with continuous wavelet transform (CWT), an advanced signal-processing tool, to analyze the frame vibrations during start-up. CWT has not been widely applied in the field of condition monitoring, although much better results can be obtained compared to the widely used DWT based techniques. The encouraging results obtained from the present analysis are hoped to set up a base for a condition monitoring technique for induction motors which will be simple, fast and overcome the limitations of traditional data-based models/techniques.
TL;DR: Results obtained from the proposed approach have been compared to those obtained from Pareto differential evolution, nondominated sorting genetic algorithm-II and strength Pareto evolutionary algorithm 2.
Abstract: Economic environmental dispatch (EED) is an important optimization task in fossil fuel fired power plant operation for allocating generation among the committed units such that fuel cost and emission level are optimized simultaneously while satisfying all operational constraints. It is a highly constrained multiobjective optimization problem involving conflicting objectives with both equality and inequality constraints. In this paper, multi-objective differential evolution has been proposed to solve the EED problem. Numerical results of three test systems demonstrate the capabilities of the proposed approach. Results obtained from the proposed approach have been compared to those obtained from Pareto differential evolution, nondominated sorting genetic algorithm-II and strength Pareto evolutionary algorithm 2.
TL;DR: The proposed modified method for solution update of the onlooker bees is able to produce higher quality solutions with faster convergence than either the original ABC or the current state-of-the-art ABC-based algorithm.
Abstract: The Artificial Bee Colony (ABC) algorithm is inspired by the behavior of honey bees. The algorithm is one of the Swarm Intelligence algorithms explored in recent literature. ABC is an optimization technique, which is used in finding the best solution from all feasible solutions. However, ABC can sometimes be slow to converge. In order to improve the algorithm performance, we present a modified method for solution update of the onlooker bees in this paper. In our method, the best feasible solutions found so far are shared globally among the entire population. Thus, the new candidate solutions are more likely to be close to the current best solution. In other words, we bias the solution direction toward the best-so-far position. Moreover, in each iteration, we adjust the radius of the search for new candidates using a larger radius earlier in the search process and then reduce the radius as the process comes closer to converging. Finally, we use a more robust calculation to determine and compare the quality of alternative solutions. We empirically assess the performance of our proposed method on two sets of problems: numerical benchmark functions and image registration applications. The results demonstrate that the proposed method is able to produce higher quality solutions with faster convergence than either the original ABC or the current state-of-the-art ABC-based algorithm.
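Two of the ideas in this abstract, biasing new candidates toward the best-so-far solution and shrinking the search radius over time, can be sketched as follows. The abstract gives no formulas, so the linear radius schedule, the bounds `r_max` and `r_min`, and the per-coordinate update are assumptions chosen only to make the idea concrete.

```python
import random


def search_radius(iteration, max_iterations, r_max=1.0, r_min=0.2):
    """Assumed schedule: radius shrinks linearly from r_max to r_min."""
    frac = iteration / max_iterations
    return r_max - (r_max - r_min) * frac


def onlooker_candidate(x, best, iteration, max_iterations, rng=random):
    """Move each coordinate of x a random fraction of the way toward best.

    The fraction is drawn from [0, radius], so early iterations can take
    larger steps than late ones.
    """
    radius = search_radius(iteration, max_iterations)
    return [xi + rng.uniform(0.0, radius) * (bi - xi) for xi, bi in zip(x, best)]


rng = random.Random(0)
candidate = onlooker_candidate([0.0, 10.0], [1.0, 0.0], 10, 100, rng)
```

Because the radius never exceeds 1 in this sketch, every coordinate of the candidate lies between the current solution and the best-so-far solution.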
TL;DR: An integrated system where wavelet transforms and a recurrent neural network (RNN) based on the artificial bee colony (ABC) algorithm are combined for stock price forecasting is presented; it can be implemented in a real-time trading system for forecasting stock prices and maximizing profits.
Abstract: This study presents an integrated system where wavelet transforms and a recurrent neural network (RNN) based on the artificial bee colony (ABC) algorithm (called ABC-RNN) are combined for stock price forecasting. The system comprises three stages. First, the wavelet transform using the Haar wavelet is applied to decompose the stock price time series and thus eliminate noise. Second, the RNN, which has a simple architecture and uses numerous fundamental and technical indicators, is applied to construct the input features chosen via Stepwise Regression-Correlation Selection (SRCS). Third, the Artificial Bee Colony algorithm (ABC) is utilized to optimize the RNN weights and biases under a parameter space design. For illustration and evaluation purposes, this study refers to the simulation results of several international stock markets, including the Dow Jones Industrial Average Index (DJIA), London FTSE-100 Index (FTSE), Tokyo Nikkei-225 Index (Nikkei), and Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX). As these simulation results demonstrate, the proposed system is highly promising and can be implemented in a real-time trading system for forecasting stock prices and maximizing profits.
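The first stage described in this abstract, Haar wavelet decomposition used to remove noise, can be sketched with a single-level transform. The threshold-based denoising step, the function names and the sample prices are assumptions for illustration; the abstract does not state the decomposition level or the thresholding rule used in the study.

```python
import math


def haar_forward(signal):
    """Single-level Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / s for a, b in pairs]   # local averages (scaled)
    detail = [(a - b) / s for a, b in pairs]   # local differences (scaled)
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from approximation and detail coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def haar_denoise(signal, threshold):
    """Zero out small detail coefficients, then reconstruct (assumed rule)."""
    approx, detail = haar_forward(signal)
    kept = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_inverse(approx, kept)


prices = [10.0, 10.2, 11.0, 10.8]          # hypothetical closing prices
smoothed = haar_denoise(prices, threshold=1.0)
```

With a threshold larger than every detail coefficient, each pair of prices is replaced by its mean, which is the coarsest smoothing a single Haar level can give.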
TL;DR: Experimental results show the effectiveness of the proposed method in contrast to conventional fuzzy C means algorithms and also type II fuzzy algorithm.
Abstract: This paper presents a novel intuitionistic fuzzy C means clustering method using intuitionistic fuzzy set theory. The intuitionistic fuzzy set theory considers another uncertainty parameter, the hesitation degree that arises while defining the membership function, and thus the cluster centers may converge to a more desirable location than the cluster centers obtained using the fuzzy C means algorithm. Also, a new objective function, the intuitionistic fuzzy entropy, is incorporated in the conventional fuzzy C means clustering algorithm. This is done to maximize the good points in the class. This clustering method is used in clustering different regions of CT scan brain images, and these may be used to identify abnormalities in the brain. Experimental results show the effectiveness of the proposed method in contrast to conventional fuzzy C means algorithms and also the type II fuzzy algorithm.
TL;DR: The results of the ABC-AP compared with results of other optimization algorithms from the literature show that this algorithm is a powerful search and optimization technique for structural design.
Abstract: The main goal of structural optimization is to minimize the weight of structures while satisfying all design requirements imposed by design codes. In this paper, the Artificial Bee Colony algorithm with an adaptive penalty function approach (ABC-AP) is proposed to minimize the weight of truss structures. The ABC algorithm is a swarm intelligence based optimization technique inspired by the intelligent foraging behavior of honeybees. Five truss examples with fixed geometry and up to 200 elements were studied to verify that the ABC algorithm is an effective optimization algorithm in the creation of an optimal design for truss structures. The results of the ABC-AP compared with results of other optimization algorithms from the literature show that this algorithm is a powerful search and optimization technique for structural design.
TL;DR: This article attempts to fill the gap in the current literature by establishing a fuzzy weighted SERVQUAL model for evaluating airline service quality, and a case study of a Taiwanese airline is conducted to demonstrate its effectiveness.
Abstract: Airline service quality is an important issue in the international air travel transportation industry. Although a number of studies have focused on the subject of airline service quality evaluation in the past, most of these studies applied the SERVQUAL method to evaluate airline service quality, and only a few have attempted to evaluate it using the weighted SERVQUAL method. Furthermore, human judgments are often vague and it is not easy for passengers to express the weights of evaluation criteria and their satisfaction with airline service quality using an exact numerical value. It is more realistic to use linguistic terms to describe the expectation value, perception value and importance weight of evaluation criteria. Due to this type of existing fuzziness in airline service quality evaluation, fuzzy set theory is an appropriate method for dealing with uncertainty. The subjective evaluation data can be more adequately expressed in linguistic variables. Thus this article attempts to fill this gap in the current literature by establishing a fuzzy weighted SERVQUAL model for evaluating airline service quality. A case study of a Taiwanese airline is conducted to demonstrate the effectiveness of the fuzzy weighted SERVQUAL model. Finally, some interesting conclusions and useful suggestions are given to airlines to improve service quality.
TL;DR: The performance of ABC is evaluated in comparison with other nature inspired techniques, namely Particle Swarm Optimization (PSO), Artificial Immune System (AIS) and Genetic Algorithm (GA), and is on par with all three for all the loading configurations.
Abstract: In this paper, we present a generic method/model for multi-objective design optimization of laminated composite components, based on the Vector Evaluated Artificial Bee Colony (VEABC) algorithm. VEABC is a parallel vector evaluated type, swarm intelligence multi-objective variant of the Artificial Bee Colony algorithm (ABC). In the current work a modified version of the VEABC algorithm for discrete variables has been developed and implemented successfully for the multi-objective design optimization of composites. The problem is formulated with multiple objectives of minimizing weight and the total cost of the composite component to achieve a specified strength. The primary optimization variables are the number of layers, their stacking sequence (the orientation of the layers) and the thickness of each layer. The classical lamination theory is utilized to determine the stresses in the component and the design is evaluated based on three failure criteria: failure mechanism based failure criteria, maximum stress failure criteria and the Tsai-Wu failure criteria. The optimization method is validated for a number of different loading configurations: uniaxial, biaxial and bending loads. The design optimization has been carried out for both variable stacking sequences and fixed standard stacking schemes, and a comparative study of the different design configurations evolved has been presented. Finally, the performance is evaluated in comparison with other nature inspired techniques, which include Particle Swarm Optimization (PSO), Artificial Immune System (AIS) and Genetic Algorithm (GA). The performance of ABC is on par with that of PSO, AIS and GA for all the loading configurations.
TL;DR: An analysis of the most commonly used problems, methods and measures, together with the newer approaches and trends and their interrelations and common ideas, is presented.
Abstract: This paper provides a survey of the research done on optimization in dynamic environments over the past decade. We show an analysis of the most commonly used problems, methods and measures together with the newer approaches and trends, as well as their interrelations and common ideas. The survey is supported by a public web repository, located at http://www.dynamic-optimization.org, where the collected bibliography is manually organized and tagged according to different categories.
TL;DR: The objective of this review paper is to summarize the well-known approaches used in keystroke dynamics in the last two decades.
Abstract: Authentication is the process of determining whether someone or something is, in fact, who or what it is declared to be. As the dependence upon computers and computer networks grows, the need for authentication has increased. Biometrics is the science and technology of authentication by identifying the living individual's physiological or behavioral attributes. Keystroke dynamics is a behavioral measurement and it utilizes the manner and rhythm in which each individual types. The approaches in keystroke dynamics can be categorized by the selection of features and the classification methods employed. The objective of this review paper is to summarize the well-known approaches used in keystroke dynamics in the last two decades.
TL;DR: This paper proposes an effective local search algorithm based on simulated annealing and greedy search techniques to solve the traveling salesman problem and shows that the proposed algorithm provides better compromise between CPU time and accuracy among some recent algorithms for the TSP.
Abstract: The traveling salesman problem (TSP) is a classical problem in discrete or combinatorial optimization and belongs to the NP-complete class, which means that it may require an infeasible processing time to be solved by an exhaustive search method; therefore, heuristics that are less expensive in processing time are commonly used in order to obtain satisfactory solutions in a short running time. This paper proposes an effective local search algorithm based on simulated annealing and greedy search techniques to solve the TSP. In order to obtain more accurate solutions, the proposed algorithm, based on the standard simulated annealing algorithm, adopts the combination of three kinds of mutations with different probabilities during its search. Then a greedy search technique is used to speed up the convergence rate of the proposed algorithm. Finally, parameters such as the cooling coefficient of the temperature, the number of greedy searches, the number of compulsory acceptances and the probability of accepting a new solution are adapted according to the size of the TSP instance. Experimental results show that the proposed algorithm provides a better compromise between CPU time and accuracy than some recent algorithms for the TSP.
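The combination described in this abstract (simulated annealing, three mutation types chosen with different probabilities, and a greedy step) can be sketched as below. The abstract does not name the mutations or give parameter values, so the choice of inversion, swap and insertion, the probabilities 0.6/0.3/0.1, the "best of several mutations" greedy step and all default parameters are assumptions, not the authors' settings.

```python
import math
import random


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def mutate(tour, rng):
    """Apply one of three assumed mutations, chosen with fixed probabilities."""
    i, j = sorted(rng.sample(range(len(tour)), 2))
    new = tour[:]
    r = rng.random()
    if r < 0.6:                      # inversion of the segment i..j
        new[i:j + 1] = reversed(new[i:j + 1])
    elif r < 0.9:                    # swap of two cities
        new[i], new[j] = new[j], new[i]
    else:                            # insertion: move city at j to position i
        new.insert(i, new.pop(j))
    return new


def anneal(pts, t0=10.0, cooling=0.95, steps=50, greedy_tries=5, t_min=1e-3, seed=0):
    rng = random.Random(seed)
    cur = list(range(len(pts)))
    cur_len = tour_length(cur, pts)
    best, best_len = cur[:], cur_len
    t = t0
    while t > t_min:
        for _ in range(steps):
            # Greedy step (assumed form): keep the shortest of several mutations.
            cand = min((mutate(cur, rng) for _ in range(greedy_tries)),
                       key=lambda c: tour_length(c, pts))
            cand_len = tour_length(cand, pts)
            delta = cand_len - cur_len
            # Standard Metropolis acceptance rule.
            if delta < 0 or rng.random() < math.exp(-delta / t):
                cur, cur_len = cand, cand_len
                if cur_len < best_len:
                    best, best_len = cur[:], cur_len
        t *= cooling                 # geometric cooling schedule
    return best, best_len
```

Calling `anneal(points)` on a list of `(x, y)` tuples returns the best tour found and its length. The abstract's size-dependent parameter adaptation is not modeled here; the defaults are fixed for simplicity.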
TL;DR: Chaotic binary particle swarm optimization (CBPSO) is proposed to implement the feature selection, in which the K-nearest neighbor (K-NN) method with leave-one-out cross-validation (LOOCV) serves as a classifier for evaluating classification accuracies.
Abstract: Feature selection is a useful pre-processing technique for solving classification problems. The challenge of solving the feature selection problem lies in applying evolutionary algorithms capable of handling the huge number of features typically involved. Generally, given classification data may contain useless, redundant or misleading features. To increase classification accuracy, the primary objective is to remove irrelevant features in the feature space and to correctly identify relevant features. Binary particle swarm optimization (BPSO) has been applied successfully to solving feature selection problems. In this paper, two kinds of chaotic maps (so-called logistic maps and tent maps) are embedded in BPSO. The purpose of chaotic maps is to determine the inertia weight of the BPSO. We propose chaotic binary particle swarm optimization (CBPSO) to implement the feature selection, in which the K-nearest neighbor (K-NN) method with leave-one-out cross-validation (LOOCV) serves as a classifier for evaluating classification accuracies. The proposed feature selection method shows promising results with respect to the number of feature subsets. The classification accuracy is superior to other methods from the literature.
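How a chaotic map can drive the inertia weight, as this abstract describes, is illustrated by the sketch below. The abstract does not give the map parameters, so the logistic map with coefficient 4, the skew tent map with its peak at 0.7, and the starting value are assumptions; the paper's exact map definitions may differ.

```python
def logistic_map(w):
    """Logistic map with coefficient 4 (assumed), mapping [0, 1] to [0, 1]."""
    return 4.0 * w * (1.0 - w)


def skew_tent_map(w, peak=0.7):
    """Skew tent map with an assumed peak position, mapping [0, 1] to [0, 1]."""
    return w / peak if w < peak else (1.0 - w) / (1.0 - peak)


def inertia_sequence(chaotic_map, w0, steps):
    """Iterate the map to get one inertia weight per PSO iteration."""
    weights = [w0]
    for _ in range(steps):
        weights.append(chaotic_map(weights[-1]))
    return weights


logistic_weights = inertia_sequence(logistic_map, 0.48, 20)
tent_weights = inertia_sequence(skew_tent_map, 0.48, 20)
```

Each entry of the returned list would be used as the inertia weight for one iteration of the velocity update, in place of a constant or linearly decreasing value.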
TL;DR: A new taxonomy for classifying software-based parallel ACO algorithms is introduced and a systematic and comprehensive survey of the current state-of-the-art on parallel ACO implementations is presented.
Abstract: Ant colony optimization (ACO) is a well-known swarm intelligence method, inspired by the social behavior of ant colonies, for solving optimization problems. When facing large and complex problem instances, parallel computing techniques are usually applied to improve the efficiency, allowing ACO algorithms to achieve high quality results in reasonable execution times, even when tackling hard-to-solve optimization problems. This work introduces a new taxonomy for classifying software-based parallel ACO algorithms and also presents a systematic and comprehensive survey of the current state-of-the-art on parallel ACO implementations. Each parallel model reviewed is categorized in the new taxonomy proposed, and an insight on trends and perspectives in the field of parallel ACO implementations is provided.
TL;DR: A hybrid model developed by integrating a case-based data clustering method and a fuzzy decision tree for medical data classification can produce accurate yet comprehensible decision rules that could potentially help medical doctors to extract effective conclusions in medical diagnosis.
Abstract: In this research, a hybrid model is developed by integrating a case-based data clustering method and a fuzzy decision tree for medical data classification. Two datasets from the UCI Machine Learning Repository, i.e., the liver disorders dataset and Breast Cancer Wisconsin (Diagnosis), are employed for benchmark tests. Initially, a case-based clustering method is applied to preprocess the dataset so that more homogeneous data within each cluster are attained. A fuzzy decision tree is then applied to the data in each cluster and genetic algorithms (GAs) are further applied to construct a decision-making system based on the selected features and diseases identified. Finally, a set of fuzzy decision rules is generated for each cluster. As a result, the FDT model can accurately react to the test data by the inductions derived from the case-based fuzzy decision tree. The average forecasting accuracy of the CBFDT model is 98.4% for breast cancer and 81.6% for liver disorders. The accuracy of the hybrid model is the highest among those models compared. The hybrid model can produce accurate yet comprehensible decision rules that could potentially help medical doctors to extract effective conclusions in medical diagnosis.
TL;DR: A multi-objective artificial immune algorithm is used to optimize the kernel and penalty parameters of SVM in this paper, and successful results are obtained.
Abstract: Support vector machine (SVM) is a classification method based on the structural risk minimization principle. The penalty parameter, C, and the kernel parameter, σ, of SVM must be carefully selected in establishing an efficient SVM model. These parameters are selected by trial and error or human experience. Artificial immune system (AIS) can be defined as a soft computing method inspired by the theoretical immune system in order to solve science and engineering problems. A multi-objective artificial immune algorithm has been used to optimize the kernel and penalty parameters of SVM in this paper. In the training stage of SVM, multiple solutions are found by using the multi-objective artificial immune algorithm and then these parameters are evaluated in the test stage. The proposed algorithm is applied to fault diagnosis of induction motors and anomaly detection problems, and successful results are obtained.
TL;DR: A two-step approach to evaluate classification algorithms for financial risk prediction is developed and the construction of a knowledge-rich financial risk management process to increase the usefulness of classification results in financial risk detection is discussed.
Abstract: A wide range of classification methods have been used for the early detection of financial risks in recent years. How to select an adequate classifier (or set of classifiers) for a given dataset is an important task in financial risk prediction. Previous studies indicate that classifiers' performances in financial risk prediction may vary using different performance measures and under different circumstances. The main goal of this paper is to develop a two-step approach to evaluate classification algorithms for financial risk prediction. It constructs a performance score to measure the performance of classification algorithms and introduces three multiple criteria decision making (MCDM) methods (i.e., TOPSIS, PROMETHEE, and VIKOR) to provide a final ranking of classifiers. An empirical study is designed to assess various classification algorithms over seven real-life credit risk and fraud risk datasets from six countries. The results show that linear logistic, Bayesian Network, and ensemble methods are ranked as the top-three classifiers by TOPSIS, PROMETHEE, and VIKOR. In addition, this work discusses the construction of a knowledge-rich financial risk management process to increase the usefulness of classification results in financial risk detection.
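Of the three MCDM methods named in this abstract, TOPSIS is the simplest to show in a few lines. The sketch below follows the textbook procedure (vector normalization, weighting, distance to the ideal and anti-ideal points, closeness score). The decision matrix, criteria and weights are hypothetical and are not data from the paper.

```python
import math


def topsis(matrix, weights, benefit):
    """Closeness score in [0, 1] for each alternative (row); higher is better.

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller is
    Note: the score is undefined (division by zero) if all rows are identical.
    """
    n_crit = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    columns = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in v:
        d_ideal = math.dist(row, ideal)
        d_anti = math.dist(row, anti)
        scores.append(d_anti / (d_ideal + d_anti))
    return scores


# Hypothetical classifiers scored on accuracy (benefit) and error cost (cost).
performance = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
scores = topsis(performance, weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy matrix the first classifier is best on both criteria, so it coincides with the ideal point and is ranked first.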
TL;DR: The usefulness of the data complexity measures is analyzed in order to evaluate the behavior of undersampling and oversampling methods and to derive rules from the intervals that describe both good and bad behaviors of C4.5 and PART for the different preprocessing approaches.
Abstract: In the classification framework there are problems in which the number of examples per class is not equitably distributed, formally known as imbalanced data sets. This situation is a handicap when trying to identify the minority classes, as the learning algorithms are not usually adapted to such characteristics. A usual approach to deal with the problem of imbalanced data sets is the use of a preprocessing step. In this paper we analyze the usefulness of the data complexity measures in order to evaluate the behavior of undersampling and oversampling methods. Two classical learning methods, C4.5 and PART, are considered over a wide range of imbalanced data sets built from real data. Specifically, oversampling techniques and an evolutionary undersampling one have been selected for the study. We extract behavior patterns from the results in the data complexity space defined by the measures, coding them as intervals. Then, we derive rules from the intervals that describe both good and bad behaviors of C4.5 and PART for the different preprocessing approaches, thus obtaining a complete characterization of the data sets and the differences between the oversampling and undersampling results.
TL;DR: The results show that the jDElscop algorithm can deal with large-scale continuous optimization effectively and, in most cases, behaves significantly better than the other three algorithms used in the comparison.
Abstract: Many real-world optimization problems are large-scale in nature. In order to solve these problems, an optimization algorithm is required that is able to apply a global search regardless of the problems’ particularities. This paper proposes a self-adaptive differential evolution algorithm, called jDElscop, for solving large-scale optimization problems with continuous variables. The proposed algorithm employs three strategies and a population size reduction mechanism. The performance of the jDElscop algorithm is evaluated on a set of benchmark problems provided for the Special Issue on the Scalability of Evolutionary Algorithms and other Metaheuristics for Large Scale Continuous Optimization Problems. Non-parametric statistical procedures were performed for multiple comparisons between the proposed algorithm and three well-known algorithms from the literature. The results show that the jDElscop algorithm can deal with large-scale continuous optimization effectively. In most cases, it also behaves significantly better than the other three algorithms used in the comparison.
TL;DR: New approaches are presented for handling drift and shift in on-line data streams with the help of evolving fuzzy systems (EFS), whose structure is not fixed and not pre-determined, but is extracted from data streams on-line and in an incremental manner.
Abstract: In this paper, we present new approaches to handling drift and shift in on-line data streams with the help of evolving fuzzy systems (EFS), which are characterized by the fact that their structure (rule base and parameters) is not fixed and not pre-determined, but is extracted from data streams on-line and in an incremental manner. When dealing with so-called drifts and shifts in data streams, one needs to take into account (1) automatic detection of drifts and shifts and (2) automatic reaction to the drifts and shifts. This is important to avoid interruptions in the learning process and downtrends in predictive accuracy. To address the first problem, we propose an approach based on the concept of fuzzy rule age. The second problem is addressed by including gradual forgetting of (1) antecedent parts and (2) consequent parameters. The latter can be achieved by including a forgetting factor in the recursive local learning process of the parameters, whose value is automatically extracted based on the intensity of the shift/drift. For addressing the former problem, we introduce two alternative methods: one is based on the evolving density-based clustering (eClustering) used to form the antecedents in the eTS approach; the other is based on the automatic adaptation of the learning rate of the evolving vector quantization (eVQ) method used to form the antecedent in the FLEXFIS approach. The paper concludes with an empirical evaluation of the impact of the proposed approaches in (on-line) real-world data sets in which drifts and shifts occur.
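To make the idea of gradual forgetting in recursive parameter learning concrete, here is a minimal sketch of generic scalar recursive least squares with an exponential forgetting factor. It is only an illustration under simplifying assumptions: the function name `rls_slope`, the synthetic stream, and the hand-picked forgetting value are hypothetical, whereas the paper extracts the factor automatically from the drift intensity and uses a fuzzily weighted local learning scheme that is not reproduced here.

```python
def rls_slope(samples, forgetting=1.0, p0=1000.0):
    """Estimate theta in y = theta * x recursively from (x, y) pairs.

    forgetting = 1.0 weights all samples equally; values below 1.0
    down-weight older samples exponentially.
    """
    theta, p = 0.0, p0          # initial estimate and (large) initial covariance
    for x, y in samples:
        gain = p * x / (forgetting + x * p * x)
        theta += gain * (y - theta * x)          # correct by the prediction error
        p = (p - gain * x * p) / forgetting      # inflate covariance when forgetting
    return theta

# Synthetic stream with an abrupt shift: slope 2 for 50 samples, then slope 5.
stream = [(1.0 + t % 2, (2.0 if t < 50 else 5.0) * (1.0 + t % 2))
          for t in range(100)]

no_forgetting = rls_slope(stream, forgetting=1.0)    # averages both regimes
with_forgetting = rls_slope(stream, forgetting=0.9)  # tracks the newer regime
```

Without forgetting, the estimate settles between the two regimes; with a factor of 0.9 the old regime is largely discounted and the estimate ends close to the new slope.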
TL;DR: The objective of this study was to select an optimal alternative in the presence of incomplete information and linguistic preferences using multiple GSCM criteria, with the linguistic preferences resolved through fuzzy set theory.
Abstract: As firms move toward environmental sustainability, management must extend its efforts to improve environmental practices across the supply chain. The selection of a suitable green supplier according to green supply chain management (GSCM) criteria is essential for the sustainable development of manufacturing firms. The objective of this study was to select an optimal alternative in the presence of incomplete information and linguistic preferences using multiple GSCM criteria. The goal of GSCM is to reduce a firm's pollution and other environmental impacts. In the proposed method, the weights of GSCM criteria and alternatives are described using linguistic preferences that can be resolved with fuzzy set theory. Subsequently, the rank of each alternative was calculated from incomplete information by applying a grey degree. Moreover, a case study was used to apply the proposed method, and the results and managerial implications of the analysis are discussed in detail.
TL;DR: This study aims at investigating the prediction performance that utilizes the classifier ensembles method to analyze stock returns and indicates that multiple classifiers outperform single classifiers in terms of prediction accuracy and returns on investment.
Abstract: The problem of predicting stock returns has been an important issue for many years. Advancement in computer technology has allowed many recent studies to utilize machine learning techniques such as neural networks and decision trees to predict stock returns. In the area of machine learning, classifier ensembles (i.e. combining multiple classifiers) have proven to be a method superior to single classifiers. In order to build a better model for predicting stock returns effectively and efficiently, this study aims at investigating the prediction performance that utilizes the classifier ensembles method to analyze stock returns. In particular, the hybrid methods of majority voting and bagging are considered. Moreover, performance using two types of classifier ensembles is compared with those using single baseline classifiers (i.e. neural networks, decision trees, and logistic regression). These two types of ensembles are 'homogeneous' classifier ensembles (e.g. an ensemble of neural networks) and 'heterogeneous' classifier ensembles (e.g. an ensemble of neural networks, decision trees and logistic regression). Average prediction accuracy, Type I and II errors, and return on investment of these models are also examined. Our results indicate that multiple classifiers outperform single classifiers in terms of prediction accuracy and returns on investment. In addition, heterogeneous classifier ensembles offer slightly better performance than the homogeneous ones. However, there is no significant difference between majority voting and bagging in prediction accuracy, but the former has better stock returns prediction accuracy than the latter. Finally, the homogeneous multiple classifiers using neural networks by majority voting perform best when predicting stock returns.
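The majority-voting combination mentioned above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `majority_vote`, the tie-breaking rule, and the sample labels are hypothetical and are not taken from the study.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists (all the same length) by majority vote.

    Ties are broken in favour of the label seen first, i.e. the label
    predicted by the earliest classifier in the list (an assumption of
    this sketch, not a rule stated in the study).
    """
    combined = []
    for votes in zip(*predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical outputs of three base classifiers on four samples.
neural_net = ["up", "down", "up", "down"]
tree       = ["up", "up",   "down", "down"]
logistic   = ["down", "up", "up",   "down"]

ensemble = majority_vote([neural_net, tree, logistic])
# ensemble == ["up", "up", "up", "down"]
```

The same function covers both ensemble types described in the abstract: passing outputs from models of one kind gives a homogeneous ensemble, while mixing model kinds, as in the example, gives a heterogeneous one.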
TL;DR: The design of a fuzzy decision support system using a multi-criteria analysis approach for selecting the best plan alternatives or strategies in an environment watershed is described, and the overall performance value of each alternative can be obtained based on the concept of fuzzy multiple-criteria decision-making (FMCDM).
Abstract: In the real world, decision-making problems are vague and uncertain in a number of ways. Most of the criteria have interdependent and interactive features, so they cannot be evaluated by conventional measurement methods. Thus, to approximate the human subjective evaluation process, it would be more suitable to apply a fuzzy method to the environment-watershed plan topic. This paper describes the design of a fuzzy decision support system using a multi-criteria analysis approach for selecting the best plan alternatives or strategies in an environment watershed. The fuzzy analytic hierarchy process (FAHP) method is used to determine the preference weightings of criteria for decision makers by subjective perception (natural language). A questionnaire was administered to three related groups comprising 15 experts: 5 expert scholars from universities (including Water Resources Engineering and Conservation, Landscape and Recreation, Urban Planning, Environment Engineering, Architectural Engineering, etc.), 5 from government departments, and 5 from industry. Subjectivity and vagueness in the criteria and alternatives for the selection process and simulation results are dealt with by using fuzzy numbers with linguistic terms. The approach incorporates the decision-makers' attitude towards the preference, and the overall performance value of each alternative can be obtained based on the concept of fuzzy multiple-criteria decision-making (FMCDM). This research also gives an evaluation example consisting of five alternatives, solicited from an environment-watershed plan work in Taiwan, to demonstrate the effectiveness and usefulness of the proposed approach. The result is also useful for destination planning and the sustainability of watershed tourism resources.