Top 68 papers presented at Intelligent Data Analysis in 1997

Journal Article•10.1016/S1088-467X(97)00008-5•

Feature Selection for Classification

Manoranjan Dash¹, Huan Liu¹•Institutions (1)

1 May 1997

TL;DR: This survey identifies the future research areas in feature selection, introduces newcomers to this field, and paves the way for practitioners who search for suitable methods for solving domain-specific real-world applications.

Abstract: Feature selection has been the focus of interest for quite some time and much work has been done. With the creation of huge databases and the consequent requirements for good machine learning techniques, new problems arise and novel approaches to feature selection are in demand. This survey is a comprehensive overview of many existing methods from the 1970's to the present. It identifies four steps of a typical feature selection method, and categorizes the different existing methods in terms of generation procedures and evaluation functions, and reveals hitherto unattempted combinations of generation procedures and evaluation functions. Representative methods are chosen from each category for detailed explanation and discussion via example. Benchmark datasets with different characteristics are used for comparative study. The strengths and weaknesses of different methods are explained. Guidelines for applying feature selection methods are given based on data types and domain characteristics. This survey identifies the future research areas in feature selection, introduces newcomers to this field, and paves the way for practitioners who search for suitable methods for solving domain-specific real-world applications.

3,443 citations
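
The abstract above classifies methods by generation procedure and evaluation function. As a rough illustration only, the sketch below pairs a sequential forward generation procedure with a simple consistency-style evaluation function. The function names, the stopping rule, and the toy data are assumptions made for this example and are not taken from the survey.

```python
from collections import Counter, defaultdict

def consistency(rows, labels, features):
    """Evaluation function (illustrative): fraction of rows whose label is the
    majority label among rows that agree on the chosen features."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[tuple(row[f] for f in features)][label] += 1
    return sum(max(c.values()) for c in groups.values()) / len(rows)

def forward_select(rows, labels, n_features, evaluate=consistency):
    """Generation procedure: grow the subset one feature at a time.
    Assumed stopping criterion: stop when no candidate improves the score."""
    selected = []
    best = evaluate(rows, labels, selected)
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        # Ties between equally good candidates go to the lowest feature index.
        score, feature = max(
            ((evaluate(rows, labels, selected + [f]), f) for f in candidates),
            key=lambda pair: (pair[0], -pair[1]),
        )
        if score <= best:
            break
        selected.append(feature)
        best = score
    return selected, best

# Toy data: the label is a copy of feature 0; features 1 and 2 are noise.
rows = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
labels = [0, 0, 1, 1]
print(forward_select(rows, labels, 3))  # ([0], 1.0)
```

Swapping `consistency` for another scoring function, or the greedy loop for a different search, gives other generation/evaluation combinations of the kind the survey categorizes.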

Journal Article•10.1016/S1088-467X(98)00007-9•

Data Preprocessing and Intelligent Data Analysis

A. Famili¹, Wei-Min Shen², Richard Weber, Evangelos Simoudis³•Institutions (3)

National Research Council¹, University of Southern California², IBM³

1 Jan 1997

TL;DR: This paper gives an overview of data preprocessing, focusing on the problems of real-world data, and details the preprocessing techniques that address problems with the data and prepare it for analysis.

Abstract: This paper first provides an overview of data preprocessing, focusing on problems of real-world data. These are primarily problems that have to be carefully understood and solved before any data analysis process can start. The paper discusses in detail two main reasons for performing data preprocessing: (i) problems with the data and (ii) preparation for data analysis. The paper continues with details of data preprocessing techniques achieving each of the above-mentioned objectives. A total of 14 techniques are discussed. Two examples of data preprocessing applications from two of the most data-rich domains are given at the end. The applications are related to semiconductor manufacturing and aerospace domains where large amounts of data are available, and they are fairly reliable. Future directions and some challenges are discussed at the end.

344 citations

Book Chapter•10.1007/BFB0052868•

Techniques for Dealing with Missing Values in Classification

Wei Zhong Liu¹, Allan P. White², Simon Thompson¹, Max Bramer¹•Institutions (2)

University of Portsmouth¹, University of Birmingham²

4 Aug 1997

TL;DR: The technique of dynamic path generation is described in the context of tree-based classification methods and the waste of data which can result from casewise deletion of missing values in statistical algorithms is discussed and alternatives proposed.

Abstract: A brief overview of the history of the development of decision tree induction algorithms is followed by a review of techniques for dealing with missing attribute values in the operation of these methods. The technique of dynamic path generation is described in the context of tree-based classification methods. The waste of data which can result from casewise deletion of missing values in statistical algorithms is discussed and alternatives proposed.

101 citations
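
To make the "waste of data" from casewise deletion concrete, here is a small hedged sketch. It assumes missing values are marked with `None`; the helper names and the toy table are invented for illustration and do not come from the chapter.

```python
def casewise_delete(rows):
    """Keep only rows with no missing value (None marks a missing entry)."""
    return [row for row in rows if None not in row]

def wasted_fraction(rows):
    """Fraction of the observed (non-missing) cells discarded by casewise deletion."""
    observed = sum(value is not None for row in rows for value in row)
    kept = sum(len(row) for row in casewise_delete(rows))
    return (observed - kept) / observed

# Three of the four rows miss a single value each, yet all three are dropped.
table = [(1, 2, 3), (None, 2, 3), (1, None, 3), (1, 2, None)]
print(len(casewise_delete(table)), wasted_fraction(table))
```

In this toy table only three of twelve cells are missing, but casewise deletion throws away two thirds of the values that were actually observed.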

Book Chapter•10.1007/BFB0052867•

The BANG-Clustering System: Grid-Based Data Analysis

Erich Schikuta¹, Martin Erhart¹•Institutions (1)

University of Vienna¹

4 Aug 1997

TL;DR: The BANG-Clustering system presented in this paper is a novel approach to hierarchical data analysis and uses a multidimensional grid data structure to organize the value space surrounding the pattern values.

Abstract: For the analysis of large images the clustering of the data set is a common technique to identify correlation characteristics of the underlying value space. In this paper a new approach to hierarchical clustering of very large data sets is presented. The BANG-Clustering system presented in this paper is a novel approach to hierarchical data analysis. It is based on the BANG-Clustering method ([Sch96]) and uses a multidimensional grid data structure to organize the value space surrounding the pattern values. The patterns are grouped into blocks and clustered with respect to the blocks by a topological neighbor search algorithm.

85 citations

Book Chapter•10.1007/BFB0052869•

The Use of Exogenous Knowledge to Learn Bayesian Networks from Incomplete Databases

Marco F. Ramoni¹, Paola Sebastiani•Institutions (1)

Open University¹

4 Aug 1997

TL;DR: A method, called Bound and Collapse (BC), to learn Bayesian Belief Networks from incomplete databases, which allows the analyst to efficiently integrate information provided by the database and exogenous knowledge about the pattern of missing data.

Abstract: Current methods to learn Bayesian Belief Networks (BBNs) from incomplete databases share the common assumption that the unreported data are missing at random. This paper describes a method, called Bound and Collapse (BC), to learn BBNs from incomplete databases which allows the analyst to efficiently integrate information provided by the database and exogenous knowledge about the pattern of missing data. BC starts by bounding the set of estimates consistent with the information conveyed by the database and then collapses the resulting set to a point via a convex combination of the extreme points, with weights depending on the assumed pattern of missing data. Experiments comparing BC to Gibbs Sampling are provided.

71 citations
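
The bound-then-collapse idea in the abstract can be sketched for the simplest possible case: the probability that a single binary variable equals 1. This is a relative-frequency simplification written for illustration; the function names and the weight `p_missing_is_one` are assumptions, and the chapter's actual estimator is not given in the abstract.

```python
def bound(ones, zeros, missing):
    """Bound step: extreme estimates of P(X = 1) consistent with the data.
    Lower bound: every missing entry is 0. Upper bound: every missing entry is 1."""
    total = ones + zeros + missing
    return ones / total, (ones + missing) / total

def collapse(lower, upper, p_missing_is_one):
    """Collapse step: convex combination of the two extreme points.
    The weight is the analyst's assumed chance that a missing entry is 1."""
    return (1 - p_missing_is_one) * lower + p_missing_is_one * upper

lower, upper = bound(ones=30, zeros=50, missing=20)
print(lower, upper, collapse(lower, upper, p_missing_is_one=0.25))
```

With 30 ones, 50 zeros and 20 missing entries the interval is 0.3 to 0.5, and an assumed weight of 0.25 collapses it to about 0.35.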

Book Chapter•10.1007/BFB0052847•

ECG Segmentation Using Time-Warping

H. J. L. M. Vullings¹, M. H. G. Verhaegen¹, H. B. Verbruggen¹•Institutions (1)

Delft University of Technology¹

4 Aug 1997

TL;DR: A method to segment the electrocardiogram (ECG) using time-warping, a technique commonly used in speech recognition: the ECG is cut into distinct periods (R-R intervals), which are compared using time-warping before the most similar pair is segmented into subpatterns.

Abstract: We present a method to segment the electrocardiogram (ECG) using time-warping, a technique commonly used in speech recognition. First, the ECG is transformed to a piecewise linear approximation. Next, the slope amplitude is used to cut the ECG into distinct periods (R-R intervals). These periods are then compared to each other using time-warping, and the pair which is most similar is selected. Finally, this pair is segmented into the different subpatterns usually encountered in the ECG, such as the QRS complex, the T wave, and the P wave.

65 citations
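
The comparison step described in the abstract (compare periods by time-warping, keep the most similar pair) might look like the sketch below. It uses the textbook dynamic time warping recurrence with an absolute-difference local cost, which is an assumption; the chapter's exact warping constraints and its piecewise linear representation are not reproduced here, and the sample periods are invented.

```python
from itertools import combinations

def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a, step in b, or step in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def most_similar_pair(periods):
    """Indices of the two periods with the smallest warping cost."""
    return min(
        combinations(range(len(periods)), 2),
        key=lambda ij: dtw_distance(periods[ij[0]], periods[ij[1]]),
    )

# Hypothetical periods: the first and third have the same shape at different speeds.
periods = [[0, 1, 2, 1, 0], [0, 3, 0], [0, 1, 1, 2, 1, 0]]
print(most_similar_pair(periods))  # (0, 2)
```

Because warping lets one sample align with several, the first and third periods match at zero cost even though their lengths differ.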

Journal Article•10.1016/S1088-467X(98)00009-2•

Summary SQL - A Fuzzy Tool for Data Mining

Dan Rasmussen¹, Ronald R. Yager¹•Institutions (1)

Iona College¹

1 Jan 1997

TL;DR: An extension of a fuzzy query language called Summary SQL is introduced which can be used for knowledge discovery and data mining and it is shown how it could be used to search for fuzzy rules.

Abstract: The increasing use of computers for transactions and communication has created mountains of data that contain potentially valuable knowledge. To search for this knowledge we have to develop a new generation of tools capable of flexible querying and intelligent searching. In this paper we will introduce an extension of a fuzzy query language called Summary SQL which can be used for knowledge discovery and data mining. We show how it can be used to search for fuzzy rules.

61 citations
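
The abstract does not show Summary SQL syntax, so the sketch below does not attempt to reproduce it. It only illustrates, in Python, the general kind of fuzzy linguistic summary such a tool evaluates: the truth of a statement like "most records are tall", computed as a fuzzy quantifier applied to the mean membership. The membership functions `tall` and `most`, and their breakpoints, are hypothetical choices for this example.

```python
def tall(height_cm):
    """Hypothetical fuzzy predicate: 0 at 160 cm or below, 1 at 180 cm or above."""
    return min(1.0, max(0.0, (height_cm - 160) / 20))

def most(proportion):
    """Hypothetical fuzzy quantifier 'most': 0 up to 0.3, 1 from 0.8 upward."""
    return min(1.0, max(0.0, (proportion - 0.3) / 0.5))

def summary_truth(values, predicate, quantifier):
    """Truth of 'Q of the records satisfy S' as quantifier(mean membership in S)."""
    proportion = sum(predicate(v) for v in values) / len(values)
    return quantifier(proportion)

heights = [150, 160, 170, 180, 190]
print(summary_truth(heights, tall, most))
```

For these five heights the mean membership in "tall" is 0.5, so "most records are tall" holds only to a degree of about 0.4.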

Journal Article•10.1016/S1088-467X(97)00004-8•

PCA of Wavelet Transformed Process Data for Monitoring

Karlene A. Kosanovich¹, Michael J. Piovoso²•Institutions (2)

University of South Carolina¹, Pennsylvania State University²

1 Mar 1997

TL;DR: This work demonstrates that the correlations and resulting monitoring models can be improved greatly with the addition of pre-filtering the time signals using a median filter, and time-scale decomposition using a multi-resolution wavelet function.

Abstract: Producing a uniform product is important for several reasons, such as maintenance of a competitive position, reduction in the number of shutdowns and startups, and the elimination of the sources of variability. Multivariate statistical methods can assist in the identification of process correlations and the development of process monitoring models. This work extends these concepts by demonstrating that the correlations and resulting monitoring models can be improved greatly with the addition of pre-filtering the time signals using a median filter, and time-scale decomposition using a multi-resolution wavelet function. After the data are filtered and decomposed, the multivariate statistical method of principal component analysis (PCA) is used to develop a process monitoring model. Data taken from a difficult-to-operate industrial process are used to demonstrate these ideas.

59 citations
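
The two preprocessing steps named in the abstract, median filtering followed by a time-scale decomposition, can be sketched as follows. The abstract does not name the wavelet, so the Haar wavelet used here is an assumption, as are the window width, the edge handling, and the function names. The PCA step is omitted.

```python
import math
from statistics import median

def median_filter(signal, width=3):
    """Sliding median with an odd window; edges are padded by repeating the ends."""
    half = width // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [median(padded[i:i + width]) for i in range(len(signal))]

def haar_step(signal):
    """One level of a Haar decomposition: (approximation, detail) coefficients.
    A trailing unpaired sample of an odd-length signal is ignored."""
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) / math.sqrt(2))
        detail.append((a - b) / math.sqrt(2))
    return approx, detail

noisy = [1, 1, 9, 1, 1, 2, 2, 2]
smooth = median_filter(noisy)   # the isolated spike at 9 is removed
print(smooth, haar_step(smooth))
```

Applying `haar_step` again to the approximation coefficients gives the next, coarser scale, which is the sense in which such a decomposition is multi-resolution.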

Journal Article•10.1016/S1088-467X(98)00008-0•

Compound Key Word Generation from Document Databases Using A Hierarchical Clustering ART Model

Alberto Muñoz¹•Institutions (1)

Carlos III Health Institute¹

1 Jan 1997

TL;DR: This paper addresses the specific problem of creating semantic term associations from a text database by using a hierarchical model made up of Fuzzy Adaptive Resonance Theory (ART) neural networks to cluster isolated words into semantic classes.

Abstract: The growing availability of databases on the information highways motivates the development of new processing tools able to deal with a heterogeneous and changing information environment. A highly desirable feature of data processing systems handling this type of information is the ability to automatically extract their own key words. In this paper we address the specific problem of creating semantic term associations from a text database. The proposed method uses a hierarchical model made up of Fuzzy Adaptive Resonance Theory (ART) neural networks. First, the system uses several Fuzzy ART modules to cluster isolated words into semantic classes, starting from the raw text of the database. Next, this knowledge is used together with co-occurrence information to extract semantically meaningful term associations. These associations are asymmetric and one-to-many due to the phenomenon of polysemy. The strength of the associations between words can be measured numerically. Besides this, they implicitly define a hierarchy between descriptors. The underlying algorithm is appropriate for use on large databases. The operation of the system is illustrated on several real databases.

57 citations

Book Chapter•10.1007/BFB0052860•

Parallel Induction Algorithms for Data Mining

John Darlington¹, Yike Guo¹, Janjao Sutiwaraphun¹, Hing Wing To¹•Institutions (1)

Imperial College London¹

4 Aug 1997

TL;DR: Preliminary results are presented on experiments in parallelising C4.5, a classification-rule learning system that uses decision trees as its model representation and serves as a base model for investigating methods for parallelising induction algorithms.

Abstract: In the last decade, there has been an explosive growth in the generation and collection of data. Nonetheless, the quality of information inferred from this voluminous data has not been proportional to its size. One of the reasons for this is that the computational complexities of the algorithms used to extract information from the data are normally proportional to the number of input data items resulting in prohibitive execution time on large data sets. Parallelism is one solution to this problem. In this paper we present preliminary results on experiments in parallelising C4.5, a classification-rule learning system using decision-trees as a model representation, which has been used as a base model for investigating methods for parallelising induction algorithms. The experiments assess the potential for improving the execution time by exploiting parallelism in the algorithm.

26 citations

Book Chapter•10.1007/BFB0052840•

Oblique Linear Tree

João Gama¹•Institutions (1)

University of Porto¹

4 Aug 1997

TL;DR: Ltree is able to define decision surfaces both orthogonal and oblique to the axes defined by the attributes of the input space by combining a decision tree with a linear discriminant by means of constructive induction.

Abstract: In this paper we present the system Ltree for propositional supervised learning. Ltree is able to define decision surfaces both orthogonal and oblique to the axes defined by the attributes of the input space. This is done by combining a decision tree with a linear discriminant by means of constructive induction. At each decision node Ltree defines a new instance space by insertion of new attributes that are projections of the examples that fall at this node over the hyper-planes given by a linear discriminant function. This new instance space is propagated down through the tree. Tests based on those new attributes are oblique with respect to the original input space. Ltree is a probabilistic tree in the sense that it outputs a class probability distribution for each query example. The class probability distribution is computed at learning time, taking into account the different class distributions on the path from the root to the actual node. We have carried out experiments on sixteen benchmark datasets and compared our system with other well-known decision tree systems (orthogonal and oblique) like C4.5, OC1 and LMDT. On these datasets we have observed that our system has advantages in accuracy and tree size at statistically significant confidence levels.

Book Chapter•10.1007/BFB0052844•

Forming Categories in Exploratory Data Analysis and Data Mining

Paul D. Scott¹, R. J. Williams¹, K. M. Ho¹•Institutions (1)

University of Essex¹

4 Aug 1997

TL;DR: The techniques used for categorizing variables in Snout, an intelligent assistant for exploratory data analysis of survey and similar data sets that is currently under development, are described.

Abstract: This paper describes the techniques used for categorizing variables in Snout, an intelligent assistant for exploratory data analysis of survey and similar data sets that is currently under development. We begin by reviewing existing work on category formation in data mining, which has been mainly concerned with enabling decision tree programs to handle numeric variables. It is argued that there are other important but neglected aspects of category formation, notably the formation of new categorizations of nominal variables. We report the limited success achieved in categorizing variables from survey data using either endogenous methods or exogenous methods that maximise the association with only one dependent variable. We then describe the categorization technique used in Snout: a procedure that selects a partition that both maximises the number of variables associated with the partitioned variable and maximises the strength of those associations. We report on the success achieved using this procedure in exploring real survey data.

Book Chapter•10.1007/BFB0052872•

Modelling Discrete Event Sequences as State Transition Diagrams

Adele E. Howe¹, Gabriel L. Somlo¹•Institutions (1)

Colorado State University¹

4 Aug 1997

TL;DR: A new algorithm for constructing one type of overview model, state transition diagrams, is described. Called State Transition Dependency Detection (STDD), it is the latest in a family of statistics-based algorithms for modeling event sequences called Dependency Detection.

Abstract: Discrete event sequences have been modeled with two types of representation: snapshots and overviews. Snapshot models describe the process as a collection of relatively short sequences. Overview models collect key relationships into a single structure, providing an integrated but abstract view. This paper describes a new algorithm for constructing one type of overview model: state transition diagrams. The algorithm, called State Transition Dependency Detection (STDD), is the latest in a family of statistics-based algorithms for modeling event sequences called Dependency Detection. We present accuracy results for the algorithm on synthetic data and data from the execution of two AI systems.

Book Chapter•10.1007/BFB0052861•

Data Analysis for Query Processing

Jerome Robinson¹, Barry G. T. Lowden¹•Institutions (1)

University of Essex¹

4 Aug 1997

TL;DR: The goal of an intelligent analyser is to produce robust rules, stable in the presence of data change, which allow easy rule maintenance as data changes, and provide rapid query reformulation, refutation or answering.

Abstract: Data analysis is needed in connection with query processing, to produce data summary information in the form of rules or assertions that allow semantic query optimisation or direct query answering without consulting the data itself. The goal of an intelligent analyser in this context is to produce robust rules, stable in the presence of data changes, which allow easy rule maintenance as data changes, and provide rapid query reformulation, refutation or answering. It must also limit the rule set to rules useful for query processing.

Book Chapter•10.1007/BFB0052864•

A Modulated Parzen-Windows Approach for Probability Density Estimation

G. C. van den Eijkel¹, Jan C. A. van der Lubbe¹, E. Backer¹•Institutions (1)

Delft University of Technology¹

4 Aug 1997

TL;DR: Experiments show that the modulated Parzen-windows approach is more efficient in probability density function estimation, without costly preprocessing or severe loss of accuracy.

Abstract: The Parzen-window approach is a well-known technique for estimating probability density functions. This paper introduces a modulated Parzen-windows approach. This approach uses kernels at equidistant samples to obtain a probability density function more efficiently. Experiments on both artificial and real data show that the modulated Parzen-windows approach is more efficient in probability density function estimation, without costly preprocessing or severe loss of accuracy.
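
For orientation, the sketch below shows the standard one-dimensional Parzen-window estimator with a Gaussian kernel, followed by a gridded variant that places one kernel at each equidistant grid point and weights it by the number of samples nearest to that point. The gridded variant is only one plausible reading of "kernels at equidistant samples"; the abstract does not define the modulation, so this should not be taken as the authors' method. All names and parameters are assumptions.

```python
import math

def gaussian(u):
    """Standard normal density, used as the window (kernel) function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def parzen(x, samples, h):
    """Standard Parzen-window estimate at x: one kernel of width h per sample."""
    return sum(gaussian((x - s) / h) for s in samples) / (len(samples) * h)

def gridded_parzen(x, samples, h, start, step, n_points):
    """Assumed variant: one kernel per equidistant grid point, weighted by the
    number of samples whose nearest grid point it is."""
    counts = [0] * n_points
    for s in samples:
        index = min(max(round((s - start) / step), 0), n_points - 1)
        counts[index] += 1
    total = sum(
        count * gaussian((x - (start + i * step)) / h)
        for i, count in enumerate(counts)
        if count
    )
    return total / (len(samples) * h)

data = [0.1, 0.9, 1.1, 2.0]
print(parzen(0.7, data, h=1.0), gridded_parzen(0.7, data, 1.0, start=0.0, step=1.0, n_points=3))
```

The gridded version evaluates at most `n_points` kernels per query instead of one per sample, which is where an efficiency gain of this general kind would come from, at the cost of some accuracy.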

Book Chapter•10.1007/BFB0052866•

Robustness of Clustering under Outliers

Yurij S. Kharin¹•Institutions (1)

Belarusian State University¹

4 Aug 1997

TL;DR: The problem of clustering multivariate random data in the presence of outliers is considered, and a new clustering algorithm with smoothing is presented.

Abstract: The problem of clustering multivariate random data is considered in the presence of outliers. The hypothetical model of the data is described by a mixture of regular m-parametric probability densities. Data are clustered by a decision rule often used in practice, derived by substituting ML-estimators of the parameters (computed on the unclassified sample) for their unknown true values in the Bayesian decision rule. The robustness of the probability of classification error is evaluated. A new clustering algorithm with smoothing is presented. Illustrations are given for the Gaussian hypothetical model and for Fisher's data under outliers.

Journal Article•10.1016/S1088-467X(97)00009-7•

Decision Combination Based on the Characterisation of Predictive Accuracy

Kai Ming Ting¹•Institutions (1)

University of Waikato¹

1 May 1997

TL;DR: Empirical results show that the composite learner strategy is capable of partially overcoming the problem of locally low predictive accuracy, and at the same time improving the overall performance of its constituent algorithms in most of the domains studied.

Abstract: In this article, we first explore an intrinsic problem that exists in the models induced by learning algorithms. Regardless of the selected algorithm, search methodology and hypothesis representation by which the model is induced, one would expect the model to make better predictions in some regions of the description space than others. We present the fact that an induced model will have some regions of relatively poor performance: the problem of locally low predictive accuracy. Holte, Acker, Porter [21] addressed this intrinsic problem in learning systems that describe the induced model as a disjunction of conjunctions of conditions. In this article, we investigate the characterisation of the problem in instance-based and Naive Bayesian classifiers. Having characterised the problem of locally low predictive accuracy, we propose to counter the problem in these two types of learning algorithms, using a composite learner framework. The strategy is to select an estimated better performing model to do the final prediction during classification. Empirical results from fifteen real-world domains show that the strategy is capable of partially overcoming the problem of locally low predictive accuracy, and at the same time improving the overall performance of its constituent algorithms in most of the domains studied. The composite learner is also found to outperform four methods of stacked generalisation, and also a model selection method based on cross-validation, in most of the experimental domains studied.

Book Chapter•10.1007/BFB0052843•

Exploiting Symbolic Learning in Visual Inspection

Massimo Piccardi¹, Rita Cucchiara¹, Michele Bariani¹, Paola Mello¹•Institutions (1)

University of Ferrara¹

4 Aug 1997

TL;DR: Intelligent data analysis techniques based on symbolic learning by examples have been explored in order to automatically devise and parametrize effective quantitative models in the computer-vision inspection of industrial workpieces.

Abstract: The paper describes the use of data analysis techniques in the computer-vision inspection of industrial workpieces. Computer-vision inspection aims at accomplishing quality verification of fabricated parts by means of automated visual procedures. Gathering the visual information into models proves a critical task, especially when subjective judgement is involved in quality verification. In this work, intelligent data analysis techniques based on symbolic learning by examples have been explored in order to automatically devise and parametrize effective quantitative models. The paper reports and discusses the experimental results achieved in an industrial application.

Journal Article•10.1016/S1088-467X(98)00010-9•

A Classification Approach Incorporating Misclassification Costs

Jutta Schiffers

1 Jan 1997

TL;DR: An algorithm for learning a classification procedure to minimize the cost of misclassified examples is explored, based on the generation of oblique decision trees, which seems very promising.

Abstract: We explore an algorithm for learning a classification procedure to minimize the cost of misclassified examples. The described approach is based on the generation of oblique decision trees. The various misclassification costs are defined by a cost matrix. A special splitting criterion is defined to determine the next node for splitting. Clustering techniques are used to process the splitting. The specific splitting criterion is based on cost histograms that count the misclassification costs per class. To avoid overfitting, cross-validation techniques are directly integrated into the training cycle to terminate the splitting process. Several successful tests with different data sets suggest that this method is very promising.
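
How a cost matrix changes the label assigned at a tree node can be shown with a short sketch. It totals, for each candidate label, the cost of every example at the node that the label would misclassify, then picks the cheapest label. This is a generic cost-sensitive rule written for illustration; the layout `cost[actual][predicted]`, the function name, and the numbers are assumptions, and the paper's own splitting criterion and cost histograms are not reproduced.

```python
def cheapest_label(class_counts, cost):
    """Return (label, totals), where totals[p] is the summed misclassification
    cost of labelling every example at the node as p.
    cost[actual][predicted] is the cost of predicting `predicted` for `actual`."""
    labels = sorted(cost)
    totals = {
        predicted: sum(
            class_counts.get(actual, 0) * cost[actual][predicted]
            for actual in labels
        )
        for predicted in labels
    }
    return min(labels, key=lambda p: totals[p]), totals

# Hypothetical node: 8 "ok" parts and 2 "faulty" parts; missing a fault costs 10.
cost = {"ok": {"ok": 0, "faulty": 1}, "faulty": {"ok": 10, "faulty": 0}}
print(cheapest_label({"ok": 8, "faulty": 2}, cost))
```

Although "ok" is the majority class at this node, labelling it "faulty" costs 8 while labelling it "ok" costs 20, so the cost-sensitive choice differs from the majority vote.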

Book Chapter•10.1007/BFB0052873•

Detecting and Describing Patterns in Time-Varying Data Using Wavelets

Sarah Boyd¹•Institutions (1)

Macquarie University¹

4 Aug 1997

TL;DR: Techniques for detecting patterns in time-varying data, with the ultimate aim of generating textual descriptions of the data, are described, along with preliminary experiments in which the visually significant features in weather data are extracted and compared against hand-written expert descriptions.

Abstract: Reasoning effectively about time-varying data requires sophisticated pattern detection mechanisms. This paper describes techniques developed for detecting patterns in time-varying data with the ultimate aim of generating textual descriptions of the data. Preliminary experiments are described in which the visually significant features in weather data are extracted and compared against hand-written expert descriptions.

Journal Article•10.1016/S1088-467X(97)00005-X•

Possibilistic Testing of Distribution Functions for Change Detection

Olaf Wolkenhauer¹, John M. Edmunds¹•Institutions (1)

University of Manchester¹

1 Mar 1997

TL;DR: Change detection algorithms are proposed that are based on the comparison of distribution functions; fuzzy concepts are used to combine partial evaluations into a measure that indicates the departure of a signal from its reference.

Abstract: Change detection algorithms are proposed that are based on the comparison of distribution functions. Estimated values of distributions are associated with a binomial distribution that is used to define fuzzy similarity classes. Fuzzy concepts are used to combine partial evaluations into a measure that indicates the departure of a signal from its reference.

Book Chapter•10.1007/BFB0052870•

Reasoning about Outliers by Modelling Noisy Data

J.X. Wu, Gongxian Cheng¹, Xiaohui Liu¹•Institutions (1)

University of London¹

4 Aug 1997

TL;DR: This paper distinguishes measurement errors from outliers that represent phenomena of interest by modelling the measurement errors rather than the real measurements, an approach better suited to applications where it is difficult to obtain relevant knowledge about real measurements.

Abstract: Outliers are difficult to handle because some of them can be measurement errors, while others may represent phenomena of interest, something “significant” from the viewpoint of the application domain. Statistical methods for managing outliers do not distinguish between these two possibilities. In our previous work, we suggested a method for distinguishing these two possibilities by modelling “real measurements” — how measurements should be distributed in a domain of interest. In this paper, we make this distinction by modelling measurement errors instead. The proposed method is better suited to those applications where it is difficult to obtain relevant knowledge about real measurements. The test data collected from a recent glaucoma case finding study in a general practice are used to evaluate the method.

Journal Article•10.1016/S1088-467X(97)00006-1•

Monotonicity of Entropy Computations in Belief Functions

David A. Maluf¹•Institutions (1)

Stanford University¹

1 May 1997

TL;DR: Within the Dempster-Shafer belief function formalism, the entropy measure is presented as a monotonically decreasing function, symmetrical to the measure of dissonance.

Abstract: This article addresses the issue of quantitative information measurement within the Dempster-Shafer belief function formalism. Entropy computation in Dempster-Shafer depends on the way uncertainty measures are conceptualized. However, freed of most probability constraints, uncertainty measures in Dempster-Shafer theory can lead to further advances in optimization in information theory, which in turn may have a wide impact on decision and control. This article examines one form of current development regarding the entropy measure induced from the measure of dissonance. For a significant period, the measure of dissonance has been taken as a measure of entropy. We present in this article the entropy measure as a monotonically decreasing function, symmetrical to the measure of dissonance.
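
As background for the abstract above, the measure of dissonance usually attributed to Yager is E(m) = -Σ m(A) log2 Pl(A), where Pl(A) is the plausibility of focal element A. The sketch below computes it for a mass function stored as a dictionary from `frozenset` focal elements to masses. That representation and the function names are assumptions for this example, and the article's own entropy measure is not given in the abstract, so it is not implemented here.

```python
import math

def plausibility(mass, subset):
    """Pl(subset): total mass of the focal elements that intersect `subset`."""
    return sum(m for focal, m in mass.items() if focal & subset)

def dissonance(mass):
    """Measure of dissonance: E(m) = -sum over focal A of m(A) * log2(Pl(A))."""
    return -sum(
        m * math.log2(plausibility(mass, focal))
        for focal, m in mass.items()
        if m > 0
    )

# Two conflicting singletons: maximal disagreement between the two pieces of evidence.
conflicting = {frozenset("a"): 0.5, frozenset("b"): 0.5}
# Vacuous belief: all mass on the whole frame, so no conflict at all.
vacuous = {frozenset("ab"): 1.0}
print(dissonance(conflicting), abs(dissonance(vacuous)))
```

For a Bayesian mass function, where every focal element is a singleton, this expression reduces to Shannon entropy, which is why it has been read as an entropy measure.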

Journal Article•10.1016/S1088-467X(97)00002-4•

Pattern Recognition by Splitting Images Into Trees of Fuzzy Regions

Laurent Wendling¹, Jacky Desachy¹, Alain Paries²•Institutions (2)

Paul Sabatier University¹, University of Bordeaux²

1 Mar 1997

TL;DR: A new tree compression method is introduced to reduce the complexity when a large set of samples has to be handled.

Abstract: In this paper, a new method of pattern recognition based on splitting images into a set of trees composed of fuzzy regions is presented. First, either a gradient inverse function is applied to the raster image to define the supports of the fuzzy regions, or the basic grey-level image is used directly if the regions are easily separable topologically. Then, topological features are computed on these sets. A tree description of the image, consisting of fuzzy regions with associated topological features, is thus obtained. A set of sample trees is obtained by applying the fuzzy segmentation algorithm to small images of characteristic objects. A tree isomorphism is then defined to recognize a particular object in an image. Finally, a new tree compression method is introduced to reduce the complexity when a large set of samples has to be handled.

Book Chapter•10.1007/BFB0052848•

Interpreting Longitudinal Data through Temporal Abstractions: An Application to Diabetic Patients Monitoring

Riccardo Bellazzi¹, Cristiana Larizza¹, Alberto Riva•Institutions (1)

University of Pavia¹

4 Aug 1997

TL;DR: A new approach for the intelligent analysis of longitudinal data from the home monitoring of diabetic patients is presented, exploiting temporal abstractions to pre-process the raw data and to obtain a new time series of abstract episodes, whose features are then interpreted through statistical and probabilistic techniques.

Abstract: In this paper we present a new approach for the intelligent analysis of longitudinal data from the home monitoring of diabetic patients. This approach consists of exploiting temporal abstractions to pre-process the raw data and to obtain a new time series of abstract episodes, whose features are then interpreted through statistical and probabilistic techniques. We finally show the application of this methodology to the data of two diabetic patients monitored for six months.

Book Chapter•10.1007/BFB0052842•

Building Simple Models: A Case Study with Decision Trees

David Jensen¹, Tim Oates¹, Paul R. Cohen¹•Institutions (1)

University of Massachusetts Amherst¹

4 Aug 1997

TL;DR: Building correctly-sized models is a central challenge for induction algorithms. Under a broad range of circumstances, many approaches to decision tree induction exhibit a nearly linear relationship between training set size and tree size, even after accuracy has ceased to increase.

Abstract: Building correctly-sized models is a central challenge for induction algorithms. Many approaches to decision tree induction fail this challenge. Under a broad range of circumstances, these approaches exhibit a nearly linear relationship between training set size and tree size, even after accuracy has ceased to increase. These algorithms fail to adjust for the statistical effects of comparing multiple subtrees. Adjusting for these effects produces trees with little or no excess structure.
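
As an illustration of what adjusting for multiple comparisons can look like, the sketch below applies a Bonferroni correction before deciding whether a subtree is worth keeping. The Bonferroni correction is one standard adjustment and is used here as an assumption; the abstract does not say which adjustment the paper applies, and the function names are hypothetical.

```python
def bonferroni_adjust(p_value, n_comparisons):
    """Scale a p-value by the number of candidates compared, capped at 1."""
    return min(1.0, p_value * n_comparisons)


def keep_subtree(p_value, n_comparisons, alpha=0.05):
    """Keep a subtree only if it stays significant after the adjustment."""
    return bonferroni_adjust(p_value, n_comparisons) <= alpha


# The same raw p-value passes when 3 subtrees were compared
# and fails when 10 were compared.
print(keep_subtree(0.01, 3), keep_subtree(0.01, 10))
```

The more candidate subtrees an algorithm compares, the stricter the test for any single one becomes, which counteracts growth in tree size that is driven by chance alone.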

Proceedings Article•10.5555/2639267.2639270•

Compound Key Word Generation from Document Databases Using A Hierarchical Clustering ART Model

Alberto Muñoz

1 Jan 1997

TL;DR: The growing availability of databases on the information highways motivates the development of new processing tools able to deal with a heterogeneous and changing information environment.

Book Chapter•10.1007/BFB0052856•

Genetic Fuzzy Clustering by Means of Discovering Membership Functions

Meltem Turhan¹•Institutions (1)

Middle East Technical University¹

4 Aug 1997

TL;DR: A comparison of the results against those reported in the literature reveals that the proposed technique outperforms the standard c-means FC technique in terms of the number of misclassifications.

Abstract: It has been observed that in previous Genetic Algorithm (GA) based Fuzzy Clustering (FC) work only some of the parameters of an FC system are developed. Here, a new approach is proposed that uses a GA to develop the membership functions for the clusters directly. This new technique is implemented and tested on common test data. A comparison of the results against those reported in the literature reveals that the proposed technique outperforms the standard c-means FC technique in terms of the number of misclassifications.
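
For reference, the baseline mentioned here, standard fuzzy c-means, assigns each point a membership in every cluster from its distances to the cluster centres. The sketch below shows that textbook membership formula for one-dimensional data; it is not the GA-based technique proposed in the paper, and the function name and the default fuzzifier `m=2.0` are assumptions.

```python
def fcm_memberships(x, centers, m=2.0):
    """Standard fuzzy c-means memberships of a 1-D point x.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with d_i = |x - c_i|.
    """
    dists = [abs(x - c) for c in centers]
    # A point lying exactly on one or more centres belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


print(fcm_memberships(1.0, [0.0, 4.0]))
```

The memberships of a point always sum to one, and a point is counted as misclassified when its largest membership falls in the wrong cluster.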

Book Chapter•10.1007/BFB0052845•

A Systematic Description of Greedy Optimisation Algorithms for Cost Sensitive Generalisation

Maarten van Someren¹, Cristina Torres¹, Floor Verdenius•Institutions (1)

University of Amsterdam¹

4 Aug 1997

TL;DR: A framework is presented that systematically describes problems that involve construction of decision trees or rules, optimising accuracy as well as measurement and misclassification costs, and it is shown how this framework can be used to configure greedy algorithms for constructing such trees or rules.

Abstract: This paper defines a class of problems involving combinations of induction and (cost) optimisation. A framework is presented that systematically describes problems that involve construction of decision trees or rules, optimising accuracy as well as measurement- and misclassification costs. It does not present any new algorithms but shows how this framework can be used to configure greedy algorithms for constructing such trees or rules. The framework covers a number of existing algorithms. Moreover, the framework can also be used to define algorithm configurations with new functionalities, as expressed in their evaluation functions.

Book Chapter•10.1007/BFB0052875•

Qualitative Uncertainty Models from Random Set Theory

Olaf Wolkenhauer¹•Institutions (1)

University of Manchester¹

4 Aug 1997

TL;DR: Random Set Theory is used to build possibilistic uncertainty models from sampled data and Goodman's one-point coverage function of a class of random sets is estimated from data.

Abstract: When only incomplete information about the probability distribution of an experiment is available, we may have to admit imprecision in the formulation of an uncertainty model. In this paper Random Set Theory is used to build possibilistic uncertainty models from sampled data. In particular, Goodman's one-point coverage function of a class of random sets is estimated from data. Finally, we focus on an example to illustrate how possibility distributions induced from random sets may be used in the detection of changes in time-series data.
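
The one-point coverage function of a random set S is the probability that a given point x is covered, P(x ∈ S). A minimal empirical estimate is the fraction of observed sets that contain x. The sketch below assumes the observed sets are closed intervals given as (lower, upper) pairs; that representation and the function name are illustrative assumptions, not details taken from the paper.

```python
def one_point_coverage(intervals, x):
    """Empirical one-point coverage: share of observed intervals containing x."""
    if not intervals:
        raise ValueError("need at least one observed interval")
    hits = sum(1 for lower, upper in intervals if lower <= x <= upper)
    return hits / len(intervals)


observed = [(0.0, 2.0), (1.0, 3.0), (2.0, 4.0), (1.5, 2.5)]
for point in (0.5, 2.0, 5.0):
    print(point, one_point_coverage(observed, point))
```

Evaluated over a grid of points, the estimate gives a function with values in [0, 1] that can be read as a possibility distribution when at least one point is covered by every observed set.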
