Open Access Proceedings Article
Efficient algorithms for attribute-oriented induction
Hoi-Yee Hwang, Wai-Chee Fu +1 more
- 20 Aug 1995
- pp 168-173
TL;DR: The static property of the database schema and the concept hierarchies are used to derive more efficient algorithms, and the amount of disk I/O is reduced compared with previous methods.
Abstract: Data mining, or knowledge discovery in databases, is the search for relationships and global patterns that exist but are hidden in large databases. Many different methods have been proposed, and one of them is the attribute-oriented induction method. In this method, domain knowledge in the form of concept hierarchies helps to generalize the concepts of the attributes in the database relations. This approach has been generalized to rule-based attribute-oriented induction. The time complexity of the original algorithms is O(N log N), where N is the number of relevant tuples in the database. In this paper, we make use of the static property of the database schema and the concept hierarchies to derive more efficient algorithms. Given that the concept hierarchies and the resulting knowledge are small in size compared to the database, the complexity of our algorithm is O(N). The amount of disk I/O is reduced by a factor of O(log N) compared to the previous methods. We believe that this improvement in performance will give extra power to the attribute-oriented method.
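To make the idea in the abstract concrete, the following is a minimal sketch of attribute-oriented generalization done in one pass over the tuples. It is not the authors' algorithm: the hierarchies (`CITY_HIERARCHY`, `MAJOR_HIERARCHY`), the sample tuples, and the function names `generalize` and `induce` are all hypothetical and chosen only for illustration. Each tuple is replaced by its higher-level concepts and identical generalized tuples are merged in a hash table, so the work per tuple depends on the hierarchy depth rather than on N, which is the kind of setting in which a linear bound is plausible when hierarchies and results are small.

```python
from collections import Counter

# Hypothetical concept hierarchies: each maps a value to its parent concept.
CITY_HIERARCHY = {
    "Vancouver": "British Columbia",
    "Victoria": "British Columbia",
    "Toronto": "Ontario",
    "Ottawa": "Ontario",
    "British Columbia": "Canada",
    "Ontario": "Canada",
}
MAJOR_HIERARCHY = {
    "physics": "science",
    "biology": "science",
    "history": "arts",
    "music": "arts",
}


def generalize(value, hierarchy, levels):
    """Climb up to `levels` steps in the hierarchy, stopping early at a root."""
    for _ in range(levels):
        if value not in hierarchy:
            break
        value = hierarchy[value]
    return value


def induce(tuples, hierarchies, levels):
    """Generalize every tuple once and merge identical results, keeping counts."""
    counts = Counter()
    for row in tuples:
        generalized = tuple(
            generalize(value, hierarchy, level)
            for value, hierarchy, level in zip(row, hierarchies, levels)
        )
        counts[generalized] += 1  # hash-based merge, no sorting of the relation
    return counts


# Illustrative data only: (city, major) pairs for five students.
students = [
    ("Vancouver", "physics"),
    ("Victoria", "biology"),
    ("Toronto", "history"),
    ("Ottawa", "music"),
    ("Vancouver", "music"),
]
summary = induce(students, [CITY_HIERARCHY, MAJOR_HIERARCHY], [1, 1])
for generalized_tuple, count in sorted(summary.items()):
    print(generalized_tuple, count)
```

Merging through a hash table instead of sorting the generalized relation is the design choice that avoids the log N factor in this sketch; how the paper itself achieves its bound and its disk I/O savings is not described on this page.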
Citations
Efficient attribute-oriented generalization for knowledge discovery from large databases
TL;DR: GDBR and FIGR, two enhancements of Attribute-Oriented Generalization (a well-known technique for knowledge discovery from databases), are described as optimal and are compared with two previous algorithms, LCHR and AOI, which are O(n log n) and O(np), respectively.
85
Soft computing in ontologies and semantic web
Zongmin Ma
- 01 Jan 2006
TL;DR: A Probabilistic, Logic-Based Framework for Automated Web Directory Alignment and Automatic Thematic Categorization of Multimedia Documents using Ontological Information and Fuzzy Algebra are presented.
51
Mining Market Basket Data Using Share Measures and Characterized Itemsets
Robert J. Hilderman, Colin L. Carter, Howard J. Hamilton, Nick Cercone +3 more
- 15 Apr 1998
TL;DR: The share-confidence framework for knowledge discovery from databases is proposed, which addresses the problem of mining itemsets from market basket data and suggests how characterized itemsets can be generalized according to concept hierarchies associated with the characteristic attributes.
49
Data Mining in Large Databases Using Domain Generalization Graphs
Robert J. Hilderman, Howard J. Hamilton, Nick Cercone +2 more
- 01 Nov 1999
TL;DR: This work presents serial and parallel versions of the Multi-Attribute Generalization algorithm for traversing the generalization state space described by joining the domain generalization graphs for multiple attributes, and presents the interestingness of the resulting summaries using measures based upon variance and relative entropy.
45
Efficient Rule-Based Attribute-Oriented Induction for Data Mining
David W. Cheung, H. Y. Hwang, Ada W. Fu, Jiawei Han +3 more
- 01 Sep 2000
TL;DR: An efficient algorithm is developed to facilitate induction in the rule-based case while avoiding the anomaly, and performance studies show that the algorithm is superior to a previously proposed algorithm based on backtracking.
43