TL;DR: The proposed fuzzy prioritisation method uses fuzzy pairwise comparison judgements rather than exact numerical values of the comparison ratios and transforms the initial fuzzy prioritisation problem into a non-linear program, which eliminates the need for additional aggregation and ranking procedures.
Abstract: This paper proposes a new approach for tackling the uncertainty and imprecision of the service evaluation process. Identifying suitable service offers, evaluating the offers and choosing the best alternatives are activities that set the scene for the consequent stages in negotiations and influence in a unique manner the following deliberations. The pre-negotiation problem in negotiations over services is regarded as decision-making under uncertainty, based on multiple criteria of quantitative and qualitative nature, where the imprecise decision-maker’s judgements are represented as fuzzy numbers. A new fuzzy modification of the analytic hierarchy process is applied as an evaluation technique. The proposed fuzzy prioritisation method uses fuzzy pairwise comparison judgements rather than exact numerical values of the comparison ratios and transforms the initial fuzzy prioritisation problem into a non-linear program. Unlike the known fuzzy prioritisation techniques, the proposed method derives crisp weights from consistent and inconsistent fuzzy comparison matrices, which eliminates the need for additional aggregation and ranking procedures. A detailed numerical example, illustrating the application of our approach to service evaluation, is given.
TL;DR: This paper addresses the well-known classification task of data mining, where the objective is to predict the class which an example belongs to, and proposes a hybrid decision tree/genetic algorithm method specifically designed for discovering rules covering examples belonging to small disjuncts.
Abstract: This paper addresses the well-known classification task of data mining, where the objective is to predict the class which an example belongs to. Discovered knowledge is expressed in the form of high-level, easy-to-interpret classification rules. In order to discover classification rules, we propose a hybrid decision tree/genetic algorithm method. The central idea of this hybrid method involves the concept of small disjuncts in data mining, as follows. In essence, a set of classification rules can be regarded as a logical disjunction of rules, so that each rule can be regarded as a disjunct. A small disjunct is a rule covering a small number of examples. Due to their nature, small disjuncts are error prone. However, although each small disjunct covers just a few examples, the set of all small disjuncts can cover a large number of examples, so that it is important to develop new approaches to cope with the problem of small disjuncts. In our hybrid approach, we have developed two genetic algorithms (GA) specifically designed for discovering rules covering examples belonging to small disjuncts, whereas a conventional decision tree algorithm is used to produce rules covering examples belonging to large disjuncts. We present results evaluating the performance of the hybrid method in 22 real-world data sets.
TL;DR: The factor oracle, a data structure proposed by Crochemore et al. for string matching, is presented; the paper shows how this structure relates to variable Markov models and how it can be adapted for learning musical sequences and generating improvisations in a real-time context.
Abstract: We describe variable Markov models we have used for statistical learning of musical sequences, then we present the factor oracle, a data structure proposed by Crochemore et al. for string matching. We show the relation between this structure and the previous models and indicate how it can be adapted for learning musical sequences and generating improvisations in a real-time context.
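As an illustration of the data structure named in this abstract, the following is a minimal Python sketch of the standard on-line factor oracle construction. The abstract does not give the authors' implementation, so the function names, the use of characters as stand-ins for musical symbols, and the helper `accepts` are assumptions made for this example only.

```python
def build_factor_oracle(seq):
    """Build a factor oracle for `seq` with the on-line construction.

    Returns (trans, supply): trans[state] maps a symbol to a target state,
    and supply[state] is the suffix ("supply") link of that state.
    Symbols are assumed to be hashable (e.g. characters or pitch numbers).
    """
    n = len(seq)
    trans = [dict() for _ in range(n + 1)]
    supply = [-1] * (n + 1)          # supply link of the initial state is -1
    for i, sym in enumerate(seq):
        new = i + 1
        trans[i][sym] = new          # internal transition i -> i + 1
        k = supply[i]
        # Walk the supply links, adding external transitions where missing.
        while k > -1 and sym not in trans[k]:
            trans[k][sym] = new
            k = supply[k]
        supply[new] = 0 if k == -1 else trans[k][sym]
    return trans, supply


def accepts(trans, word):
    """Return True if `word` labels a path starting at state 0."""
    state = 0
    for sym in word:
        if sym not in trans[state]:
            return False
        state = trans[state][sym]
    return True


if __name__ == "__main__":
    trans, supply = build_factor_oracle("abbbaab")
    print(supply)
    print(accepts(trans, "bba"))
```

Every factor (contiguous subsequence) of the learned sequence is accepted, but the oracle may also accept some sequences that are not factors. An improviser could exploit both the transitions and the supply links to recombine learned material; that step is not shown here.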
TL;DR: A general framework for data fusion based on a voting-like process that tries to adjudicate conflict among the data, together with a concept of reasonableness as a means for including in the fusion process any information available other than that provided by the sources.
Abstract: A general view of the multi-source data fusion process is presented. Some of the considerations and information that must go into the development of a multi-source data fusion algorithm are described. Features that play a role in expressing user requirements are also described. We provide a general framework for data fusion based on a voting-like process that tries to adjudicate conflict among the data. We discuss the idea of a compatibility relationship and introduce several important examples of these relationships. We show that our formulation results in some conditions on the fused value implying that the fusion process has the nature of a mean type aggregation. Situations in which the sources have different credibility weights are considered. We present a concept of reasonableness as a means for including in the fusion process any information available other than that provided by the sources. We consider the situation where we allow our fused values to be granular objects such as linguistic terms or subsets.
TL;DR: A framework for selection and reordering of input variables to reduce generalization error in classification and probabilistic inference is presented and a generic fitness function for validation of input specification is designed and used to develop two genetic algorithm wrappers.
Abstract: In this paper, we address the automated tuning of input specification for supervised inductive learning and develop combinatorial optimization solutions for two such tuning problems. First, we present a framework for selection and reordering of input variables to reduce generalization error in classification and probabilistic inference. One purpose of selection is to control overfitting using validation set accuracy as a criterion for relevance. Similarly, some inductive learning algorithms, such as greedy algorithms for learning probabilistic networks, are sensitive to the evaluation order of variables. We design a generic fitness function for validation of input specification, then use it to develop two genetic algorithm wrappers: one for the variable selection problem for decision tree inducers and one for the variable ordering problem for Bayesian network structure learning. We evaluate the wrappers, using real-world data for the selection wrapper and synthetic data for both, and discuss their limitations and generalizability to other inducers.
TL;DR: A new fuzzy mining algorithm based on the AprioriTid approach to find fuzzy association rules from given quantitative transactions by focusing on the most important linguistic terms for reduced time complexity is proposed.
Abstract: Due to the increasing use of very large databases and data warehouses, mining useful information and helpful knowledge from transactions is evolving into an important research area. Most conventional data mining algorithms identify the relation among transactions with binary values. Transactions with quantitative values are, however, commonly seen in real-world applications. In the past, we proposed a fuzzy mining algorithm based on the Apriori approach to explore interesting knowledge from transactions with quantitative values. This paper proposes another new fuzzy mining algorithm based on the AprioriTid approach to find fuzzy association rules from given quantitative transactions. Each item uses only the linguistic term with the maximum cardinality in later mining processes, thus making the number of fuzzy regions to be processed the same as that of the original items. The algorithm therefore focuses on the most important linguistic terms for reduced time complexity. Experimental results from the data in a supermarket of a department store show the feasibility of the proposed mining algorithm.
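The step in which each item keeps only the linguistic term with the maximum cardinality can be sketched as follows. This is a hedged illustration, not the authors' algorithm: the triangular membership functions, the breakpoints in `REGIONS`, the item names and the function names are all hypothetical, and the AprioriTid candidate generation itself is not shown.

```python
# Hypothetical fuzzy regions, each a triangle given as (left foot, peak, right foot).
REGIONS = {
    "Low": (-4.0, 1.0, 6.0),
    "Middle": (1.0, 6.0, 11.0),
    "High": (6.0, 11.0, 16.0),
}


def triangular(x, a, b, c):
    """Membership degree of x in a triangle with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def dominant_terms(transactions, regions=REGIONS):
    """For each item, return (term, cardinality) of its max-cardinality term.

    `transactions` is a list of dicts mapping item name -> purchased quantity.
    The scalar cardinality of a fuzzy region is the sum of its membership
    degrees over all transactions.
    """
    counts = {}
    for transaction in transactions:
        for item, qty in transaction.items():
            for term, (a, b, c) in regions.items():
                key = (item, term)
                counts[key] = counts.get(key, 0.0) + triangular(qty, a, b, c)
    best = {}
    for (item, term), card in counts.items():
        if item not in best or card > best[item][1]:
            best[item] = (term, card)
    return best


if __name__ == "__main__":
    data = [{"milk": 5, "bread": 10}, {"milk": 7}, {"bread": 9, "milk": 6}]
    print(dominant_terms(data))
```

Keeping one region per item means later passes handle as many fuzzy regions as there are original items, which is the source of the reduced time complexity described in the abstract.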
TL;DR: This paper presents some syntactical and semantical aspects of FAUST (Functional AUdio STreams), a programming language for real-time sound processing and synthesis, based on a block-diagram algebra.
Abstract: This paper presents some syntactical and semantical aspects of FAUST (Functional AUdio STreams), a programming language for real-time sound processing and synthesis. The programming model of FAUST combines two approaches: functional programming and block-diagram composition. It is based on a block-diagram algebra. It has a well-defined formal semantics and can be compiled into efficient C/C++ code.
TL;DR: A genetic algorithm is presented which considers different possible groupings of the data into outlier and non-outlier observations, and a genetic algorithm for a simultaneous outlier detection and variable selection is suggested.
Abstract: This article addresses some problems in outlier detection and variable selection in linear regression models. First, in outlier detection there are problems known as smearing and masking. Smearing means that one outlier makes another, non-outlier observation appear as an outlier, and masking that one outlier prevents another one from being detected. Detecting outliers one by one may therefore give misleading results. In this article a genetic algorithm is presented which considers different possible groupings of the data into outlier and non-outlier observations. In this way all outliers are detected at the same time. Second, it is known that outlier detection and variable selection can influence each other, and that different results may be obtained, depending on the order in which these two tasks are performed. It may therefore be useful to consider these tasks simultaneously, and a genetic algorithm for a simultaneous outlier detection and variable selection is suggested. Two real data sets are used to illustrate the algorithms, which are shown to work well. In addition, the scalability of the algorithms is considered with an experiment using generated data.
TL;DR: The results obtained for evolving a schedule of 400 customers’ orders on an experimental model of FPIM indicate that business delays on the order of half an hour can be achieved.
Abstract: This paper presents an approach for scheduling customers’ orders in factories of plastic injection machines (FPIM) as a case of a real-world flexible job shop scheduling problem. The objective of the work discussed is to provide FPIM with high business speed, which implies (a) providing customers with a convenient way to access the factory’s database remotely online and (b) developing an efficient scheduling routine for planning the assignment of the submitted customers’ orders to FPIM machines. Remote online access to the FPIM database is proposed by delivering the software as a Web service in accordance with the application service provider (ASP) paradigm. To address the issue of an efficient scheduling routine, a hybrid evolutionary algorithm (HEA) combining priority-dispatching rules (PDRs) with GA is developed. An implementation of HEA as a database stored procedure is discussed. Performance evaluation results are presented. The results obtained for evolving a schedule of 400 customers’ orders on an experimental model of FPIM indicate that business delays on the order of half an hour can be achieved.
TL;DR: Merits of fuzzy granular computation, in terms of performance and computation time, for the task of case generation in large scale case-based reasoning systems are illustrated through an example.
Abstract: Data mining and knowledge discovery is described from pattern recognition point of view along with the relevance of soft computing. Key features of the computational theory of perceptions and its significance in pattern recognition and knowledge discovery problems are explained. Role of fuzzy-granulation (f-granulation) in machine and human intelligence, and its modeling through rough-fuzzy integration are discussed. Merits of fuzzy granular computation, in terms of performance and computation time, for the task of case generation in large scale case-based reasoning systems are illustrated through an example.
TL;DR: The paper describes a hybrid inductive machine learning algorithm called CLIP4 that first partitions data into subsets using a tree structure and then generates production rules only from subsets stored at the leaf nodes; its unique feature is the generation of rules that involve inequalities.
Abstract: The paper describes a hybrid inductive machine learning algorithm called CLIP4. The algorithm first partitions data into subsets using a tree structure and then generates production rules only from subsets stored at the leaf nodes. The unique feature of the algorithm is the generation of rules that involve inequalities. The algorithm works with data that have a large number of examples and attributes, can cope with noisy data, and can use numerical, nominal, continuous, and missing-value attributes. The algorithm's flexibility and efficiency are shown on several well-known benchmarking data sets, and the results are compared with other machine learning algorithms. The benchmarking results in each instance show CLIP4's accuracy, CPU time, and rule complexity. CLIP4 has built-in features like tree pruning, methods for partitioning the data (for data with a large number of examples and attributes, and for data containing noise), a data-independent mechanism for dealing with missing values, genetic operators to improve accuracy on small data, and discretization schemes. CLIP4 generates a model of the data that consists of well-generalized rules, and ranks attributes and selectors that can be used for feature selection.
TL;DR: A new, improved generalized neuron model is proposed, which has both summation and product as aggregation functions and which has flexibility at both the aggregation and activation function level to cope with the non-linearity involved in the type of applications dealt with.
Abstract: Conventional neural networks consisting of simple neuron models have various drawbacks, such as long training times for complex problems, the large amount of data required to train non-linear complex problems, unknown ANN structure, the relatively large number of hidden nodes required, the problem of local minima, etc. To make the Artificial Neural Network more efficient and to overcome the above-mentioned problems, a new improved generalized neuron model is proposed in this work. The proposed neuron models have both summation (Σ) and product (π) as aggregation functions. The generalized neuron models have flexibility at both the aggregation and activation function level to cope with the non-linearity involved in the type of applications dealt with. The training and testing performance of these models has been compared for the Short Term Load Forecasting problem.
TL;DR: This paper proposes a restricted, polynomial time structure learning algorithm that is not as restrictive as both other approaches, and allows researchers to determine the right balance between classification performance and quality of the underlying probability distribution.
Abstract: Learning the structure of a Bayesian network from data is a difficult problem, as its associated search space is superexponentially large. As a consequence, researchers have studied learning Bayesian networks with a fixed structure, notably naive Bayesian networks and tree-augmented Bayesian networks, which involves no search at all. There is substantial evidence in the literature that the performance of such restricted networks can be surprisingly good. In this paper, we propose a restricted, polynomial time structure learning algorithm that is not as restrictive as both other approaches, and allows researchers to determine the right balance between classification performance and quality of the underlying probability distribution. The results obtained with this algorithm allow drawing some conclusions with regard to Bayesian-network structure learning in general.
TL;DR: When one enumerates periodic musical structures, the computation is done up to a cyclic shift, which means that two solutions which are cyclic shifts of one another are considered the same.
Abstract: When one enumerates periodic musical structures, the computation is done up to a cyclic shift. This means that two solutions which are cyclic shifts of one another are considered the same. Lyndon words provide a powerful way to do so. We illustrate this by two examples taken from African traditional music.
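To make the role of Lyndon words concrete, here is a short Python sketch that enumerates rhythmic patterns of a fixed period up to cyclic shift. It uses Duval's generation algorithm, which the abstract does not name, so treat the choice of algorithm as an assumption. The encoding of a rhythm as a tuple over the alphabet {0, ..., k-1} (for example 0 for a rest and 1 for an onset) and the function names are likewise assumptions for illustration.

```python
def lyndon_words(k, n):
    """Yield all Lyndon words of length <= n over {0, ..., k-1}.

    Words are produced in lexicographic order (Duval's algorithm).
    """
    w = [-1]
    while w:
        w[-1] += 1                 # increment the last symbol
        yield tuple(w)
        m = len(w)
        while len(w) < n:          # extend periodically up to length n
            w.append(w[-m])
        while w and w[-1] == k - 1:
            w.pop()                # drop trailing maximal symbols


def patterns_up_to_rotation(k, n):
    """One representative per class of length-n words under cyclic shift.

    Each class corresponds to a Lyndon word whose length d divides n,
    repeated n // d times.
    """
    return [w * (n // len(w)) for w in lyndon_words(k, n) if n % len(w) == 0]


if __name__ == "__main__":
    for pattern in patterns_up_to_rotation(2, 4):
        print("".join(map(str, pattern)))
```

Each printed pattern stands for all of its cyclic shifts, so no two outputs are rotations of one another.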
TL;DR: The aim of this contribution is to investigate the domination of OWA operators over t-norms whereas the main emphasis is on the domination over the Łukasiewicz t-norm.
Abstract: The fusion of transitive fuzzy relations preserving the transitivity is linked to the domination of the involved aggregation operator. The aim of this contribution is to investigate the domination of OWA operators over t-norms whereas the main emphasis is on the domination over the Łukasiewicz t-norm. The domination of OWA operators and related operators over continuous Archimedean t-norms will also be discussed.
TL;DR: The triple I methods of FMP and FMT for fuzzy reasoning and their consistency are formalized, thus fuzzy reasoning is put completely and rigorously into the logic framework of fuzzy logic.
Abstract: This paper focuses on the logic foundation of fuzzy reasoning. At first, a new complete first-order fuzzy predicate calculus system K* corresponding to the formal system L* is built. Based on the many-sort system Kms* corresponding to K*, the triple I methods of FMP and FMT for fuzzy reasoning and their consistency are formalized, thus fuzzy reasoning is put completely and rigorously into the logic framework of fuzzy logic.
TL;DR: Some industrial applications of fuzzy parametric tests are reviewed and some new algorithms for fuzzy nonparametric tests, namely a fuzzy sign test and a fuzzy Wilcoxon signed-ranks test are presented.
TL;DR: A new system, called GeneScout, for predicting gene structures in vertebrate genomic DNA, which contains specially designed hidden Markov models (HMMs) for detecting functional sites including protein translation start sites, mRNA splicing junction donor and acceptor sites, etc.
Abstract: Automated detection or prediction of coding sequences from within genomic DNA has been a major rate-limiting step in the pursuit of vertebrate genes. Programs currently available are far from being powerful enough to elucidate a gene structure completely. In this paper, we present a new system, called GeneScout, for predicting gene structures in vertebrate genomic DNA. The system contains specially designed hidden Markov models (HMMs) for detecting functional sites including protein translation start sites, mRNA splicing junction donor and acceptor sites, etc. An HMM model is also proposed for exon coding potential computation. Our main hypothesis is that, given a vertebrate genomic DNA sequence S, it is always possible to construct a directed acyclic graph G such that the path for the actual coding region of S is in the set of all paths on G. Thus, the gene detection problem is reduced to that of analyzing the paths in the graph G. A dynamic programming algorithm is used to find the optimal path in G. The proposed system is trained using an expectation-maximization algorithm and its performance on vertebrate gene prediction is evaluated using the 10-way cross-validation method. Experimental results show that the proposed system performs well and is comparable to existing gene discovery tools.
TL;DR: It is demonstrated that a robust genetic algorithm for the traveling salesman problem (TSP) should preserve and add good edges efficiently, and at the same time, maintain the population diversity well.
Abstract: This paper demonstrates that a robust genetic algorithm for the traveling salesman problem (TSP) should preserve and add good edges efficiently, and at the same time, maintain the population diversity well. We analyzed the strengths and limitations of several well-known genetic operators for TSPs through experiments. To evaluate these factors, we propose a new genetic algorithm integrating two genetic operators and a heterogeneous pairing selection. The former can preserve and add good edges efficiently and the latter is able to keep the population diversity. The proposed approach was evaluated on 15 well-known TSPs whose numbers of cities range from 101 to 13509. Experimental results indicated that our approach, although somewhat slower, performs very robustly and is very competitive with the other approaches we surveyed. We believe that a genetic algorithm can be a stable approach for TSPs if its operators can preserve and add edges efficiently and it maintains population diversity.
TL;DR: In this article, clustering and Pareto optimisation are combined into a single evolutionary design algorithm to prevent the system from converging prematurely to a local minimum and to encourage a number of different designs that fulfil the design criteria.
Abstract: Evolutionary approaches have been used in a large variety of design domains, from aircraft engineering to the designs of analog filters. Many of these approaches use measures to improve the variety of solutions in the population. One such measure is clustering. In this paper, clustering and Pareto optimisation are combined into a single evolutionary design algorithm. The population is split into a number of clusters, and parent and offspring selection, as well as fitness calculation, are performed on a per-cluster basis. The objective of this is to prevent the system from converging prematurely to a local minimum and to encourage a number of different designs that fulfil the design criteria. Our approach is demonstrated in the domain of digital filter design. Using a polar coordinate based pole-zero representation, two different lowpass filter design problems are explored. The results are compared to designs created by a human expert. They demonstrate that the evolutionary process is able to create designs that are competitive with those created using a conventional design process by a human expert. They also demonstrate that each evolutionary run can produce a number of different designs with similar fitness values, but very different characteristics.
TL;DR: The objective of this issue is to assemble a set of high-quality original contributions that reflect the advances and the state-of-the-art in the area of Data Mining and Knowledge Discovery with Soft Computing Methodologies, thereby presenting a consolidated view to the interested researchers in the aforesaid fields and readers of the journal Information Sciences.
Abstract: Soft computing is a consortium of methodologies (like fuzzy logic, neural networks, genetic algorithms, rough sets) that works synergistically and provides, in one form or another, flexible information processing capabilities for handling real-life problems. Its aim is to exploit the tolerance for imprecision, uncertainty, approximate reasoning and partial truth in order to achieve tractability, robustness, low solution cost, and close resemblance with human-like decision-making. The process of knowledge discovery from databases (KDD), on the other hand, is a real-life problem-solving paradigm and is defined as the non-trivial process of identifying valid, novel, potentially useful and understandable patterns from large databases, where the data is frequently ambiguous, incomplete, noisy, redundant and changes with time. Data mining is one of the fundamental steps in the KDD process and is concerned with the algorithmic means by which patterns or structures are enumerated from the data under acceptable computational efficiency. Soft computing tools, individually or in an integrated manner, are turning out to be strong candidates for performing data mining tasks efficiently. At present, the results of these investigations, integrating soft computing and data mining, both theory and applications, are available in different journals and conference proceedings, mainly in the fields of computer science, information technology, engineering and mathematics. The objective of this issue is to assemble a set of high-quality original contributions that reflect the advances and the state of the art in the area of Data Mining and Knowledge Discovery with Soft Computing Methodologies, thereby presenting a consolidated view to the interested researchers in the aforesaid fields, in general, and readers of the journal Information Sciences, in particular. It has ten articles. The first one is a title article. While the next four articles deal with classificatory rule analysis, the sixth one concerns association rule mining. The utility of the Self Organizing Map (SOM) in the context of text mining and clustering is discussed in the seventh and eighth articles. The next contribution describes a novel multi-source data fusion strategy. The last article demonstrates an application of data mining in biological data analysis.
TL;DR: It is shown that under certain reasonable symmetry conditions, Lp metrics d(I,Ĩ) = ∫|I(x) − Ĩ(x)|^p dx are the best, and that the optimal value of p can be selected depending on the expected relative size r of the informative part of the image.
Abstract: Images take a lot of computer space; in many practical situations, we cannot store all original images, so we have to use compression. Moreover, in many such situations, the compression ratio provided by even the best lossless compression is not sufficient, so we have to use lossy compression. In a lossy compression, the reconstructed image Ĩ is, in general, different from the original image I. There exist many different lossy compression methods, and most of these methods have several tunable parameters. In different situations, different methods lead to different quality reconstruction, so it is important to select, in each situation, the best compression method. A natural idea is to select the compression method for which the average value of some metric d(I,Ĩ) is the smallest possible. The question is then: which quality metric should we choose? In this paper, we show that under certain reasonable symmetry conditions, Lp metrics d(I,Ĩ) = ∫|I(x) − Ĩ(x)|^p dx are the best, and that the optimal value of p can be selected depending on the expected relative size r of the informative part of the image.
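For discrete images, the integral in the Lp metric becomes a sum over pixels. The sketch below shows one way to use it to pick between candidate reconstructions; the flattened-list image representation, the function names and the sample values are assumptions for illustration, and the abstract's rule for choosing p from r is not reproduced here.

```python
def lp_distance(original, reconstructed, p):
    """Discrete analogue of d(I, I~) = integral of |I(x) - I~(x)|^p dx.

    Both images are flat sequences of pixel intensities of equal length.
    As in the formula quoted in the abstract, no p-th root is taken.
    """
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(a - b) ** p for a, b in zip(original, reconstructed))


def best_reconstruction(original, candidates, p):
    """Return the name of the candidate with the smallest Lp distance.

    `candidates` maps a (hypothetical) method name to its reconstruction.
    """
    return min(candidates, key=lambda name: lp_distance(original, candidates[name], p))


if __name__ == "__main__":
    image = [0, 2, 4, 4]
    candidates = {
        "method_a": [1, 3, 5, 5],   # small error spread over every pixel
        "method_b": [0, 2, 4, 1],   # one large localized error
    }
    for p in (1, 2):
        print(p, best_reconstruction(image, candidates, p))
```

In this toy case the preferred method changes with p: a larger p penalizes a single large error more heavily than many small ones, which is why the choice of p matters.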
TL;DR: To the best of the knowledge, this is the first tool performing Web-based predictive analysis in Materials Science, using the Apriori Algorithm to derive association rules that represent relationships between input conditions and results of domain experiments.
Abstract: Experimental data in many domains serves as a basis for predicting useful trends. If the data and analysis are available over the Web this promotes E-Business by connecting clientele worldwide. This paper describes such a predictive tool "QuenchMiner™" in the domain "Materials Science". Data mining, more specifically the "Apriori Algorithm", is used to derive association rules that represent relationships between input conditions and results of domain experiments. This enables the tool to answer questions such as "Given cooling medium and agitation during material heat treatment, predict cooling rate". This allows users to perform case studies on the Web and use their results to optimize the involved processes, thus increasing customer satisfaction. Another interesting aspect is predicting material microstructure during heat treatment. Microstructure controls material properties such as hardness. Hence its prediction helps in making decisions about materials selection. Microstructure prediction has similarities to an artificial intelligence process called "Game-of-Life". Some challenges in our work are incorporating domain expert judgement while mining association rules, simulating microstructure evolution under different conditions, and dealing with uncertainty. These challenges and associated research issues are outlined here. To the best of our knowledge, this is the first tool performing Web-based predictive analysis in Materials Science.
TL;DR: The topological aspects of such reducts are focused on, identifying some of their limitations and introducing alternative definitions that do not suffer from these limitations.
Abstract: One of the main notions in the Rough Sets Theory (RST) is that of a reduct. According to its classic definition, the reduct is a minimal subset of the attributes that retains some important properties of the whole set of attributes. The idea of the reduct proved to be interesting enough to inspire a great deal of research and resulted in introducing various reduct-related ideas and notions. First of all, depending on the character of the attributes involved in the analysis, so called absolute and relative reducts can be defined. The more interesting of these, relative reducts, are minimal subsets of attributes that retain discernibility between objects belonging to different classes. This paper focuses on the topological aspects of such reducts, identifying some of their limitations and introducing alternative definitions that do not suffer from these limitations. The modified subsets of attributes, referred to as constructs, are intended to assist the subsequent inductive process of data generalisation and knowledge acquisition, which, in the context of RST, usually takes the form of decision rule generation. Usefulness of both reducts and constructs in this role is examined and evaluated in a massive computational experiment, which was carried out for a collection of real-life data sets.
TL;DR: This paper presents an evolvable hardware platform for the automated design and adaptation of multiplierless digital filters based on the Primitive Operator Filter design principle and shows that the functionality of filters evolved on the PLA was maintained despite an increasing number of faults.
Abstract: Finite impulse response filters (FIRs) are crucial devices for robust data communication and manipulation. Multiplierless filters have been shown to produce high performance systems with fast signal processing and reduced area. Furthermore, the distributed architecture inherent in multiplierless filters makes it a suitable candidate for fault tolerant design. Alternative approaches to the design of fault tolerant systems have been proposed using evolutionary algorithms (EAs) and the concept of evolvable hardware (EHW). This paper presents an evolvable hardware platform for the automated design and adaptation of multiplierless digital filters. Filters are realised within a dedicated programmable logic array (PLA) based on the Primitive Operator Filter design principle. The platform employs a genetic algorithm to autonomously configure the PLA for a given set of coefficients. The ability of the platform to adapt to increasing numbers of faults was investigated through the “evolution” of a 31-tap low-pass FIR filter. Results show that the functionality of filters evolved on the PLA was maintained despite an increasing number of faults covering up to 25% of the PLA area. Additionally, three PLA initialisation methods were investigated to ascertain which produced the fastest fault recovery times. It was shown that seeding a population of random configuration-strings with the best configuration currently obtained resulted in a 6-fold increase in fault recovery speed over other methods investigated.
TL;DR: This work applies a new approach to modeling uncertain probabilities to queuing theory and the optimal design of web servers by using fuzzy, finite, regular Markov chains to determine the fuzzy steady state probabilities and then computing the fuzzy numbers for system performance.
Abstract: We apply our new approach to modeling uncertain probabilities to queuing theory and the optimal design of web servers. This involves using fuzzy, finite, regular Markov chains to determine the fuzzy steady state probabilities and then computing the fuzzy numbers for system performance. We first ignore revenues and costs in determining an optimal system and then we incorporate these factors for optimal design. Then we add two new phenomena associated with the web in our optimization models: “burstiness” and “long tailed distributions”.
TL;DR: The papers in this special issue of the Soft Computing Journal deal with quite different aspects of neuro-fuzzy techniques, with a stress on data analysis and rule based systems.
Abstract: Modern information technology makes it possible to collect, store, transfer, and combine huge amounts of data at very low cost. Thus more and more companies and scientific and governmental institutions build up large archives of all kinds of data, such as numbers, tables, documents, images, and sounds. Although users often have a vague understanding of their data and can usually formulate hypotheses and guess dependencies, turning these often abundantly available data into useful information turns out to be rather difficult. In response to these challenges a new area of research has emerged, called "knowledge discovery in databases" or "data mining", that tries to provide tools to extract valid, useful, understandable, unknown, and unexpected relationships from large databases [1]. This current form of data analysis is an interdisciplinary field that employs methods from statistics, soft computing, artificial intelligence and machine learning. The stress lies on the development of techniques that produce human-understandable results and are suited for large, real-world datasets. The data in real-world applications in most cases has characteristics that challenge classical analysis approaches. Besides being heterogeneous, which in its simplest form can mean that we have to deal with numerical and symbolic features, the data is often of low quality. Algorithms must thus be able to deal with uncertainty and imprecision. The characteristics of the data sources (their quantity, complexity, dimensionality and imperfection), the need to extract understandable patterns from them, and the need to incorporate available background knowledge in that process make us assume that (neuro-)fuzzy techniques will play a considerable role in the future of data mining [3].
It is the ability of fuzzy sets to translate between computer representations and (naturally linguistic) human concepts that makes them so valuable for meeting advanced data mining demands, and this is what Zadeh meant when he promoted his idea of computing with words [4]. The inherent imprecision of words is not necessarily a weakness but, on the contrary, can be crucial for modelling complex systems. In our own experience, many practical applications have a certain robustness where full precision is not necessary. In such cases, exaggerated precision can be a waste of resources, and solutions obtained using fuzzy approaches might be easier to understand and to apply, and gain their strength by explicitly taking into account vagueness, imprecision or uncertainty. One prominent way to use fuzzy systems in data analysis is to induce fuzzy if-then rules from data by neuro-fuzzy techniques. The use of linguistic variables eases the readability and interpretability of the rule base. If we apply such techniques, we must be aware of the trade-off between precision and interpretability. However, the results in data mining are judged not only for their accuracy but also for their interpretability, as the ultimate goal is to extract human-understandable patterns [2]. The papers in this special issue of the Soft Computing Journal deal with quite different aspects of neuro-fuzzy techniques, with a stress on data analysis and rule based systems. Although the intention of this issue cannot be to give a survey of this wide area, it shall give an insight into current trends and research topics in this field. The papers can roughly be divided into three groups:
TL;DR: A heuristic argument is given to show how the statistical information inside the population can be used to tune the mutation rate at each individual locus, resulting in higher overall performance.
Abstract: The biological observation that alleles at different loci have different mutation rates is implemented in a genetic algorithm, so that the mutation rate is both time and locus dependent. The performance of this new locus oriented adaptive genetic algorithm (LOAGA) is evaluated on the zero/one knapsack test problem for various sizes. It is found that LOAGA can solve the single-constraint zero/one knapsack problem with high speed, a high success rate, and a small memory requirement. A heuristic argument is given to show how the statistical information inside the population can be used to tune the mutation rate at each individual locus, resulting in higher overall performance.
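The idea of a mutation rate that depends on both time and locus can be sketched as follows. The abstract does not state the rule LOAGA uses, so the rule below is purely hypothetical: the rate at each locus is scaled by the allele variance 4f(1 - f), where f is the fraction of ones at that locus in the current population. Recomputing the rates every generation makes them time dependent. The names `locus_mutation_rates`, `mutate`, and `base_rate` are illustrative assumptions.

```python
import random
from typing import List


def locus_mutation_rates(
    population: List[List[int]], base_rate: float = 0.05
) -> List[float]:
    """Per-locus mutation rates from population statistics.

    Hypothetical rule (not the LOAGA rule itself): scale base_rate by
    4 * f * (1 - f), where f is the fraction of ones at the locus.
    A locus on which the whole population agrees gets rate 0; a locus
    split evenly between 0 and 1 gets the full base_rate.
    """
    n = len(population)
    length = len(population[0])
    rates = []
    for locus in range(length):
        f = sum(individual[locus] for individual in population) / n
        rates.append(base_rate * 4.0 * f * (1.0 - f))
    return rates


def mutate(
    individual: List[int], rates: List[float], rng: random.Random
) -> List[int]:
    """Flip each bit independently with the rate assigned to its locus."""
    return [
        1 - bit if rng.random() < rate else bit
        for bit, rate in zip(individual, rates)
    ]


population = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [1, 1, 1]]
rates = locus_mutation_rates(population)
child = mutate(population[0], rates, random.Random(0))
```

Whether converged loci should receive lower or higher rates is a design choice; this sketch only shows how population statistics can feed a per-locus rate.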
TL;DR: Algorithm A and the well-known fuzzy clustering algorithm FCM have the same clustering track. This fact builds a bridge between probabilistic clustering and fuzzy clustering, and fruitful research results on the Rényi entropy measure may help to further understand the essence of fuzzy clustering.
Abstract: In this short communication, a new Rényi-information-based clustering algorithm, Algorithm A, is presented, built on the Rényi entropy measure. Algorithm A and the well-known fuzzy clustering algorithm FCM have the same clustering track. This fact builds a bridge between probabilistic clustering and fuzzy clustering, and fruitful research results on the Rényi entropy measure may help us to further understand the essence of fuzzy clustering.
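For readers unfamiliar with FCM, the sketch below shows the standard fuzzy c-means iteration (membership update followed by centre update) on one-dimensional data. It illustrates FCM only, not Algorithm A, whose update rules are not given in the abstract. The function names and the default fuzzifier `m = 2.0` are choices made for this example.

```python
from typing import List, Tuple


def fcm_memberships(
    points: List[float], centers: List[float], m: float = 2.0
) -> List[List[float]]:
    """Standard FCM membership update: u[k][i] is the degree of point k in cluster i."""
    power = 2.0 / (m - 1.0)
    u = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point coincides with one or more centres: share full membership there.
            zeros = dists.count(0.0)
            u.append([1.0 / zeros if d == 0.0 else 0.0 for d in dists])
            continue
        u.append([1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists])
    return u


def fcm_centers(points: List[float], u: List[List[float]], m: float = 2.0) -> List[float]:
    """Standard FCM centre update: mean of the points weighted by u ** m."""
    centers = []
    for i in range(len(u[0])):
        weights = [u[k][i] ** m for k in range(len(points))]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers


def fcm(
    points: List[float], centers: List[float], m: float = 2.0, iters: int = 50
) -> Tuple[List[float], List[List[float]]]:
    """Alternate the two updates for a fixed number of iterations."""
    u = fcm_memberships(points, centers, m)
    for _ in range(iters):
        centers = fcm_centers(points, u, m)
        u = fcm_memberships(points, centers, m)
    return centers, u


data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
final_centers, memberships = fcm(data, [0.5, 4.0])
```

The sequence of centres produced by successive iterations is one natural reading of a "clustering track"; the sketch does not guard against a cluster whose memberships are all zero.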