Top 38 papers published in the topic of Weighted Majority Algorithm in 2011

Showing papers on "Weighted Majority Algorithm" published in 2011

Journal Article•10.1007/S10472-011-9230-5•

Discovering the suitability of optimisation algorithms by learning from evolved instances


Kate Smith-Miles¹, Jano van Hemert²•Institutions (2)

Monash University¹, University of Edinburgh²

01 Feb 2011-Annals of Mathematics and Artificial Intelligence

TL;DR: This work uses an evolutionary algorithm to evolve instances that are uniquely easy or hard for each algorithm, thus providing a more direct method for studying the relative strengths and weaknesses of each algorithm.


Abstract: The suitability of an optimisation algorithm selected from within an algorithm portfolio depends upon the features of the particular instance to be solved. Understanding the relative strengths and weaknesses of different algorithms in the portfolio is crucial for effective performance prediction, automated algorithm selection, and to generate knowledge about the ideal conditions for each algorithm to influence better algorithm design. Relying on well-studied benchmark instances, or randomly generated instances, limits our ability to truly challenge each of the algorithms in a portfolio and determine these ideal conditions. Instead we use an evolutionary algorithm to evolve instances that are uniquely easy or hard for each algorithm, thus providing a more direct method for studying the relative strengths and weaknesses of each algorithm. The proposed methodology ensures that the meta-data is sufficient to be able to learn the features of the instances that uniquely characterise the ideal conditions for each algorithm. A case study is presented based on a comprehensive study of the performance of two heuristics on the Travelling Salesman Problem. The results show that prediction of search effort as well as the best performing algorithm for a given instance can be achieved with high accuracy.


108 citations

Proceedings Article•10.1109/ITSC.2011.6082823•

Reinforcement learning with average cost for adaptive control of traffic lights at intersections


L A Prashanth¹, Shalabh Bhatnagar¹•Institutions (1)

Indian Institute of Science¹

18 Nov 2011

TL;DR: It is observed that whereas (as expected) on a two-junction corridor the full state representation algorithm shows the best results, that algorithm is not implementable on larger road networks, and the proposed PG-AC-TLC algorithm shows the best overall performance.


Abstract: We propose for the first time two reinforcement learning algorithms with function approximation for average cost adaptive control of traffic lights. One of these algorithms is a version of Q-learning with function approximation while the other is a policy gradient actor-critic algorithm that incorporates multi-timescale stochastic approximation. We show performance comparisons on various network settings of these algorithms with a range of fixed timing algorithms, as well as a Q-learning algorithm with full state representation that we also implement. We observe that whereas (as expected) on a two-junction corridor, the full state representation algorithm shows the best results, this algorithm is not implementable on larger road networks. The algorithm PG-AC-TLC that we propose is seen to show the best overall performance.


96 citations

Book Chapter•10.1007/978-3-642-23780-5_14•

Adaptive boosting for transfer learning using dynamic updates


Samir Al-Stouhi¹, Chandan K. Reddy¹•Institutions (1)

Wayne State University¹

5 Sep 2011

TL;DR: A dynamic factor is incorporated into TrAdaBoost to make it meet its intended design of incorporating the advantages of both AdaBoost and the "Weighted Majority Algorithm", and is applied as a "correction factor" that significantly improves the classification performance.


Abstract: Instance-based transfer learning methods utilize labeled examples from one domain to improve learning performance in another domain via knowledge transfer. Boosting-based transfer learning algorithms are a subset of such methods and have been applied successfully within the transfer learning community. In this paper, we address some of the weaknesses of such algorithms and extend the most popular transfer boosting algorithm, TrAdaBoost. We incorporate a dynamic factor into TrAdaBoost to make it meet its intended design of incorporating the advantages of both AdaBoost and the "Weighted Majority Algorithm". We theoretically and empirically analyze the effect of this important factor on the boosting performance of TrAdaBoost and we apply it as a "correction factor" that significantly improves the classification performance. Our experimental results on several real-world datasets demonstrate the effectiveness of our framework in obtaining better classification results.
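For readers unfamiliar with the "Weighted Majority Algorithm" named in this abstract, the following is a minimal sketch of the classical deterministic version on binary predictions. It is not the paper's TrAdaBoost variant or its dynamic correction factor; the function name `weighted_majority`, the penalty `beta=0.5`, and the toy data are assumptions chosen for illustration.

```python
def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """Run the deterministic Weighted Majority Algorithm on binary predictions.

    expert_predictions: list of rounds, each a list of 0/1 predictions (one per expert).
    outcomes: list of true 0/1 labels, one per round.
    beta: penalty factor in (0, 1); an assumed value for illustration.
    Returns (final weights, number of mistakes made by the combined predictor).
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    mistakes = 0
    for preds, truth in zip(expert_predictions, outcomes):
        # Predict with the label that carries the larger total weight.
        vote_for_1 = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_for_0 = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_for_1 >= vote_for_0 else 0
        if guess != truth:
            mistakes += 1
        # Shrink the weight of every expert that was wrong this round.
        weights = [w * beta if p != truth else w for w, p in zip(weights, preds)]
    return weights, mistakes


# Hypothetical toy data: three experts, four rounds.
rounds = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 1, 0]]
labels = [1, 0, 1, 0]
final_weights, n_mistakes = weighted_majority(rounds, labels)
```

In this toy run the first expert is never wrong and keeps weight 1.0, while the other two are each penalised twice.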


91 citations

Proceedings Article•10.1145/1993806.1993814•

Toward more localized local algorithms: removing assumptions concerning global knowledge


Amos Korman¹, Jean-Sébastien Sereni¹, Laurent Viennot¹•Institutions (1)

Paris Diderot University¹

6 Jun 2011

TL;DR: In this article, the authors proposed a method for transforming a non-uniform local algorithm into a uniform one, and the resulting algorithm enjoys the same asymptotic running time as the original local algorithm.


Abstract: Numerous sophisticated local algorithms have been suggested in the literature for various fundamental problems. Notable examples are the MIS and (Δ+1)-coloring algorithms by Barenboim and Elkin [6], by Kuhn [22], and by Panconesi and Srinivasan [33], as well as the O(Δ²)-coloring algorithm by Linial [27]. Unfortunately, most known local algorithms (including, in particular, the aforementioned algorithms) are non-uniform, that is, they assume that all nodes know good estimations of one or more global parameters of the network, e.g., the maximum degree Δ or the number of nodes n. This paper provides a rather general method for transforming a non-uniform local algorithm into a uniform one. Furthermore, the resulting algorithm enjoys the same asymptotic running time as the original non-uniform algorithm. Our method applies to a wide family of both deterministic and randomized algorithms. Specifically, it applies to almost all of the state-of-the-art non-uniform algorithms regarding MIS and Maximal Matching, as well as to many results concerning the coloring problem. (In particular, it applies to all aforementioned algorithms.) To obtain our transformations we introduce a new distributed tool called pruning algorithms, which we believe may be of independent interest.


41 citations

Book Chapter•10.1007/978-3-642-18129-0_1•

Some Analysis and Research of the AdaBoost Algorithm


Peng Wu¹, Hui Zhao¹•Institutions (1)

Henan University¹

8 Jan 2011

TL;DR: This paper introduces AdaBoost and analyzes several aspects of the algorithm itself.


Abstract: The AdaBoost algorithm enables weak classifiers to improve their performance by combining multiple classifiers into an ensemble. Because it automatically adapts to the error rate of the base algorithm during training by dynamically adjusting the weight of each sample, it has attracted wide attention. This paper introduces AdaBoost and analyzes several aspects of the algorithm itself.
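The sample reweighting mentioned in this abstract can be illustrated with one round of the standard discrete AdaBoost update for labels in {-1, +1}. This is a sketch of the textbook rule, not code from the chapter; the function name `adaboost_round` and the toy data are assumptions, and the input weights are assumed to sum to 1.

```python
import math


def adaboost_round(weights, predictions, labels):
    """One round of discrete AdaBoost reweighting for labels in {-1, +1}.

    weights: current sample weights, assumed to sum to 1.
    Returns (alpha, new normalised weights).
    """
    # Weighted error of the weak classifier on the current distribution.
    err = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    # Assumes 0 < err < 0.5, i.e. the weak classifier beats random guessing.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified samples (p * y == -1) gain weight; correct ones lose weight.
    new_w = [w * math.exp(-alpha * p * y)
             for w, p, y in zip(weights, predictions, labels)]
    total = sum(new_w)
    return alpha, [w / total for w in new_w]


# Hypothetical toy data: four samples, the last one is misclassified.
labels = [1, 1, -1, -1]
preds = [1, 1, -1, 1]
w0 = [0.25] * 4
alpha, w1 = adaboost_round(w0, preds, labels)
```

After this round the single misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right.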


37 citations

Journal Article•10.1016/J.JSS.2011.02.038•

Controversy corner: Optimized QoS-aware replica placement heuristics and applications in astronomy data grid


Zhihui Du¹, Jingkun Hu², Yinong Chen³, Zhili Cheng¹, Xiaoying Wang¹•Institutions (3)

Tsinghua University¹, Peking University², Arizona State University³

01 Jul 2011-Journal of Systems and Software

TL;DR: This paper proposes two algorithms that can obtain better results within the given time period in the Quality-of-Service (QoS)-aware replica placement problem in a general graph model and uses random heuristic algorithms to generate initial population to avoid enormous useless searching.


24 citations

Journal Article•10.1016/J.NEUCOM.2011.06.011•

Three-parameter sequential minimal optimization for support vector machines


Yih-Lon Lin¹, Jer-Guang Hsieh¹, Hsu-Kun Wu², Jyh-Horng Jeng¹•Institutions (2)

I-Shou University¹, National Sun Yat-sen University²

01 Oct 2011-Neurocomputing

TL;DR: Simulation results demonstrate that the 3PSMO algorithm significantly outperforms the 2PSMO algorithm in both execution time and computational complexity, which implies that the maximum can be attained more efficiently by the 3PSMO algorithm.


15 citations

Journal Article•

Polyceptron: a polyhedral learning algorithm


Naresh Manwani, P. S. Sastry

08 Jul 2011-arXiv: Learning

TL;DR: A new algorithm for learning polyhedral classifiers is proposed; it is a Perceptron-like algorithm that updates the parameters only when the current classifier misclassifies a training example.


Abstract: In this paper we propose a new algorithm for learning polyhedral classifiers, which we call Polyceptron. It is a Perceptron-like algorithm that updates the parameters only when the current classifier misclassifies a training example. We give both batch and online versions of the Polyceptron algorithm. Finally, we give experimental results to show the effectiveness of our approach.


14 citations

Proceedings Article•10.1061/41173(414)34•

Alternative Approaches for Solving the Sensor Placement Problem in Large Networks


R. Pinzinger¹, Jochen Deuerlein, Andreas Wolters, Angus R. Simpson¹•Institutions (1)

University of Adelaide¹

19 May 2011

TL;DR: The second Greedy algorithm addresses the question of finding the nodes that are most sensitive to variations in pressure and are thereby ideal places to monitor the hydraulic state of a water distribution network.


Abstract: Positioning sensors in a water supply network is an NP-hard task. We propose three algorithms: one based on integer linear programming (ILP) and the other two based on the Greedy paradigm. We apply these algorithms to real case networks and compare their results with the results of an algorithm based on NSGA II, a genetic algorithm. We come to the conclusion that our algorithms outperform NSGA II in every single case. The algorithm based on integer linear programming may be applied as a competitor to the algorithm implemented in TEVA-SPOT (Berry, 2009), while the first Greedy algorithm may replace the ILP algorithm in large networks due to its faster running time. The second Greedy algorithm addresses the question of finding the nodes that are most sensitive to variations in pressure and are thereby ideal places to monitor the hydraulic state of a water distribution network.

Keywords: Graph Theory, Sensor location layout, Greedy Algorithm, Genetic Algorithm, Integer Linear Programming, Sensitivity


14 citations

Proceedings Article•10.5591/978-1-57735-516-8/IJCAI11-456•

Learning linear and kernel predictors with the 0-1 loss function


Shai Shalev-Shwartz¹, Ohad Shamir², Karthik Sridharan³•Institutions (3)

Hebrew University of Jerusalem¹, Microsoft², Toyota Technological Institute³

16 Jul 2011

TL;DR: This paper describes and analyzes a new algorithm for learning linear or kernel predictors with respect to the 0-1 loss function, and proves a hardness result, showing that under a certain cryptographic assumption, no algorithm can learn such classifiers in time polynomial in L.


Abstract: Some of the most successful machine learning algorithms, such as Support Vector Machines, are based on learning linear and kernel predictors with respect to a convex loss function, such as the hinge loss. For classification purposes, a more natural loss function is the 0-1 loss. However, using it leads to a non-convex problem for which there is no known efficient algorithm. In this paper, we describe and analyze a new algorithm for learning linear or kernel predictors with respect to the 0-1 loss function. The algorithm is parameterized by L, which quantifies the effective width around the decision boundary in which the predictor may be uncertain. We show that without any distributional assumptions, and for any fixed L, the algorithm runs in polynomial time, and learns a classifier which is worse than the optimal such classifier by at most ε. We also prove a hardness result, showing that under a certain cryptographic assumption, no algorithm can learn such classifiers in time polynomial in L.


12 citations

Journal Article•10.1007/S11063-011-9173-1•

AVLR-EBP: A Variable Step Size Approach to Speed-up the Convergence of Error Back-Propagation Algorithm


Arman Didandeh¹, Nima Mirbakhsh¹, Ali Amiri², Mahmood Fathy³•Institutions (3)

Institute for Advanced Studies in Basic Sciences¹, University of Zanjan², Iran University of Science and Technology³

01 Apr 2011-Neural Processing Letters

TL;DR: An Adaptive Variable Learning Rate EBP algorithm is proposed to attack the challenging problem of reducing the convergence time of the EBP algorithm, aiming for faster convergence than the standard EBP algorithm.


Abstract: A critical issue for neural-network-based large-scale data mining algorithms is how to speed up their learning algorithm. This problem is particularly challenging for the Error Back-Propagation (EBP) algorithm in Multi-Layered Perceptron (MLP) neural networks, given their significant applications in many scientific and engineering problems. In this paper, we propose an Adaptive Variable Learning Rate EBP (AVLR-EBP) algorithm to attack the challenging problem of reducing the convergence time of the EBP algorithm, aiming for faster convergence than the standard EBP algorithm. The idea is inspired by adaptive filtering, which led us to two semi-similar methods of calculating the learning rate. Mathematical analysis of the AVLR-EBP algorithm confirms its convergence property. The AVLR-EBP algorithm is applied to data classification. Simulation results on many well-known data sets demonstrate that this algorithm achieves a considerable reduction in convergence time compared with the standard EBP algorithm. In classifying the IRIS, Wine, Breast Cancer, Semeion and SPECT Heart datasets, the proposed algorithm shows a reduction in the number of learning epochs relative to the standard EBP algorithm.


Posted Content•

Suboptimal Solution Path Algorithm for Support Vector Machine


Masayuki Karasuyama¹, Ichiro Takeuchi¹•Institutions (1)

Nagoya Institute of Technology¹

03 May 2011-arXiv: Learning

TL;DR: It is shown that the authors' suboptimal solutions can be interpreted as the solution of a perturbed version of the original optimization problem, and some theoretical analyses of the algorithm are provided based on this novel interpretation.


Abstract: We consider a suboptimal solution path algorithm for the Support Vector Machine. The solution path algorithm is an effective tool for solving a sequence of parametrized optimization problems in machine learning. The path of solutions provided by this algorithm is very accurate and satisfies the optimality conditions more strictly than other SVM optimization algorithms. In many machine learning applications, however, this strict optimality is often unnecessary, and it adversely affects the computational efficiency. Our algorithm can generate the path of suboptimal solutions within an arbitrary user-specified tolerance level. It allows us to control the trade-off between the accuracy of the solution and the computational cost. Moreover, we show that our suboptimal solutions can be interpreted as the solution of a perturbed version of the original optimization problem. We provide some theoretical analyses of our algorithm based on this novel interpretation. The experimental results also demonstrate the effectiveness of our algorithm.


Proceedings Article•10.1109/CECNET.2011.5768284•

The research of test-suite reduction technique


Cui Donghua¹, Yin Wenjie¹•Institutions (1)

Taiyuan University of Technology¹

16 Apr 2011

TL;DR: The results show that this algorithm can significantly reduce the size and the cost of the test suite and achieves higher effectiveness of test-suite minimization.


Abstract: The ant colony algorithm is a bionic optimization algorithm that can solve combinatorial problems effectively. For the test-suite reduction problem, this algorithm can find a balance between the speed and the accuracy of the solution. Unlike other existing algorithms, it uses test cost criteria as well as test coverage criteria. Finally, the paper presents results comparing this algorithm with other classical algorithms. The results show that this algorithm can significantly reduce the size and the cost of the test suite and achieves higher effectiveness of test-suite minimization.


Proceedings Article•10.1109/HICSS.2011.25•

A Randomized Algorithm for Maximizing the Diversity of Recommendations


Khalid Alodhaibi¹, Alexander Brodsky¹, George A. Mihaila²•Institutions (2)

George Mason University¹, IBM²

4 Jan 2011

TL;DR: The experimental results show that the proposed algorithm is highly efficient computationally and that in terms of diversity, it consistently outperforms the two competitive algorithms and converges to the optimal solutions on cases run with the exhaustive algorithm in under 100 ms.


Abstract: This paper proposes a new approach, and studies an algorithm to address the Maximum Diversity Problem (MDP) of recommendations for composite products or services. First, the proposed approach is based on constructing and using a multi-dimensional diversity feature space, which is separate from the utility space used for utility elicitation. Second, we introduce a randomized algorithm, which is based on iterative relaxation of selections by the Greedy algorithm with an exponential probability distribution. The algorithm produces a competitive solution with respect to finding a diverse set from candidate recommendations. Finally, we conduct an experimental study to compare the efficacy and efficiency of the proposed algorithm with two broadly used diversity algorithms, as well as with the exhaustive algorithm, which we could only compute for sets of up to seven returned recommendations. The experimental results show that the proposed algorithm is highly efficient computationally and that in terms of diversity, it consistently outperforms the two competitive algorithms and converges to the optimal solutions on cases run with the exhaustive algorithm in under 100 ms.


Book Chapter•10.1007/978-3-642-29116-6_4•

An online algorithm optimally self-tuning to congestion for power management problems


Wolfgang W. Bein¹, Naoki Hatta², Nelson Hernandez-Cons², Hiro Ito², Shoji Kasahara², Jun Kawahara•Institutions (2)

University of Nevada, Las Vegas¹, Kyoto University²

8 Sep 2011

TL;DR: By relaxing the worst-case competitive ratio of the online algorithm to 2+ε, where ε is an arbitrarily small constant, the algorithm automatically tunes itself to the slackness degree and gives better performance than the optimal 2-competitive algorithm for real-world inputs.


Abstract: We consider the classical power management problem: there is a device which has two states, ON and OFF, and one has to develop a control algorithm for changing between these states so as to minimize (energy) cost when given a sequence of service requests. Although an optimal 2-competitive algorithm exists, that algorithm does not have good performance in many practical situations, especially when the device is not used frequently. To take the frequency of device usage into account, we construct an algorithm based on the concept of "slackness degree." Then, by relaxing the worst-case competitive ratio of our online algorithm to 2+ε, where ε is an arbitrarily small constant, we make the algorithm flexible to slackness. The algorithm thus automatically tunes itself to the slackness degree and gives better performance than the optimal 2-competitive algorithm for real-world inputs. In addition to worst-case competitive ratio analysis, a queueing model analysis is given and computer simulations are reported, confirming that the performance of the algorithm is high.
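As background for the 2-competitive baseline mentioned in this abstract, the sketch below shows the classical break-even timeout policy under an assumed cost model: idling in ON costs `power_rate` per unit time, OFF costs nothing, and switching back ON costs `wake_cost`. The function names, the cost model, and the idle-gap data are illustrative assumptions; the paper's slackness-degree algorithm is not reproduced here.

```python
def timeout_policy_cost(idle_gaps, wake_cost, power_rate=1.0):
    """Energy cost of the break-even timeout policy over a list of idle gaps.

    The device stays ON during an idle gap until the energy spent idling
    equals wake_cost, then switches OFF and pays wake_cost at the next request.
    """
    threshold = wake_cost / power_rate
    cost = 0.0
    for gap in idle_gaps:
        if gap <= threshold:
            cost += gap * power_rate                    # stayed ON the whole gap
        else:
            cost += threshold * power_rate + wake_cost  # idled, then OFF, then woke up
    return cost


def offline_optimal_cost(idle_gaps, wake_cost, power_rate=1.0):
    """Cost of an offline optimum that knows each gap length in advance."""
    return sum(min(gap * power_rate, wake_cost) for gap in idle_gaps)


# Hypothetical idle-gap lengths between consecutive service requests.
gaps = [1.0, 10.0, 3.0, 50.0]
online = timeout_policy_cost(gaps, wake_cost=4.0)
offline = offline_optimal_cost(gaps, wake_cost=4.0)
```

Under this model the online cost on each gap is at most twice the offline cost, which is where the factor 2 comes from. Long gaps are the bad case, consistent with the abstract's remark about infrequently used devices.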


Journal Article•

A classification method for class-imbalanced data


Zhou Wei-xiong

01 Jan 2011-Journal of Shandong University

TL;DR: In this study, the improved NaiveBayes algorithm was the base classifier, and the base classifiers were fused by the AdaBoost algorithm with improved weights for voting to improve the classification performance for the minority class in an unbalanced dataset.


Abstract: To improve the classification performance for the minority class in an unbalanced dataset, an improved AdaBoost algorithm (the UnAdaBoost algorithm) for unbalanced datasets was proposed. This algorithm could improve the base classifier in order to raise the classification efficiency for the minority class, while to a certain extent losing accuracy for the majority class. This algorithm could also ensemble the base classifiers to make up for the loss of accuracy in the majority class. The performance for the minority class could be improved and the accuracy for the majority class would not be lost. In this study, the improved NaiveBayes algorithm was the base classifier, and the base classifiers were fused by the AdaBoost algorithm with improved weights for voting. Experimental results showed that the UnAdaBoost algorithm was effective for an unbalanced dataset compared with the AdaBoost algorithm.


Proceedings Article•10.1109/MLSP.2011.6064551•

Multi-parametric solution-path algorithm for instance-weighted support vector machines


Masayuki Karasuyama¹, Naoyuki Harada², Masashi Sugiyama¹, Ichiro Takeuchi²•Institutions (2)

Tokyo Institute of Technology¹, Nagoya Institute of Technology²

1 Nov 2011

TL;DR: This paper develops an algorithm that can efficiently and exactly update the weighted SVM solutions for arbitrary change of instance weights and introduces a parametrization which allows us to find the breakpoints in high-dimensional space easily.


Abstract: An instance-weighted variant of the support vector machine (SVM) has attracted considerable attention recently since it is useful in various machine learning tasks such as non-stationary data analysis, heteroscedastic data modeling, transfer learning, learning to rank, and transduction. An important challenge in these scenarios is to overcome the computational bottleneck: instance weights often change dynamically or adaptively, and thus the weighted SVM solutions must be repeatedly computed. In this paper, we develop an algorithm that can efficiently and exactly update the weighted SVM solutions for arbitrary changes of instance weights. Technically, this contribution can be regarded as an extension of the conventional solution-path algorithm for a single regularization parameter to multiple instance-weight parameters. However, this extension gives rise to a significant problem: breakpoints (at which the solution path turns) have to be identified in high-dimensional space. To facilitate this, we introduce a parametric representation of instance weights which allows us to find the breakpoints in high-dimensional space easily. Despite its simplicity, our parametrization covers various important machine learning tasks and it widens the applicability of the solution-path algorithm. Through extensive experiments on various practical applications, we demonstrate the usefulness of the proposed algorithm.


Proceedings Article•

Adaptive functional module selection using machine learning: Framework for intelligent robotics


Martin Lukac¹, Michitaka Kameyama¹•Institutions (1)

Tohoku University¹

27 Oct 2011

TL;DR: This paper proposes a machine learning based approach for the real-time selection of computational resources (algorithms) based on both the high level objectives of the robot as well as on the low level environmental requirements (image quality, etc.).


Abstract: In robotics, it is a common problem that for a given task many algorithms are available. For a particular environmental context and some computational constraints some algorithms will perform better and others will perform worse. Consequently, a robot, evolving in a real world environment where both the context and the constraints change in real time, should be able to select in real time algorithms that will provide it with the most accurate world description as well as will allow it to extract the currently most vital information and artifacts. In this paper we propose a machine learning based approach for the real-time selection of computational resources (algorithms) based on both the high level objectives of the robot as well as on the low level environmental requirements (image quality, etc.). The learning mechanism described is using a Genetic Algorithm and the learning method is based on supervised learning; an initial set of algorithms with input data is provided as examples that are used for learning.


Book Chapter•10.1007/978-3-642-21765-4_2•

A Community Detecting Algorithm in Directed Weighted Networks


Hongtao Liu¹, Xiao Qin¹, Hongfeng Yun¹, Yu Wu¹•Institutions (1)

Chongqing University of Posts and Telecommunications¹

1 Jan 2011

TL;DR: The impact factors of in-degree and out-degree are introduced into community detection, and the directed weighted degree is used to measure the importance of the node to meet the trend of standard entropy better.


Abstract: In this paper, the impact factors of in-degree and out-degree are introduced into community detection, and the directed weighted degree is used to measure the importance of the node. Based on the core nodes, a community detecting algorithm for directed and weighted networks is proposed. Then the community detection on the blog site of Sciencenet is conducted with standard structure entropy as a measure. Experimental results demonstrate that in directed and weighted networks, the proposed algorithm is efficient with shorter execution time. By comparing with the classical algorithm, the detecting results of our algorithm meet the trend of standard entropy better. It means the algorithm proposed is improved to some extent.


Proceedings Article•10.1109/ICECTECH.2011.5941667•

Scaling k-medoid algorithm for clustering large categorical dataset and its performance analysis


Ritesh Joshi, Anil Patidar, Surendra Mishra

8 Apr 2011

TL;DR: Experimental results show that the proposed k-medoid algorithm may reduce the number of distance calculations by a factor of more than a thousand when compared to existing algorithms, while producing clusters of comparable quality.


Abstract: Scalable data mining algorithms have become crucial to efficiently support KDD processes on large datasets. The k-medoid algorithm is one of the partitioning algorithms used for clustering. We show that the basic k-medoid algorithm is very time-consuming for large datasets. Instead, we present an advanced algorithm that performs much better than the known algorithm. In addition to presenting detailed experimental results for the advanced k-medoid algorithm, we also conduct an experimental study with real-life data sets to demonstrate the effectiveness of our technique. We address the task of scaling up k-medoid-based algorithms through the use of memoization. Experimental results based on several datasets, including synthetic and real data, show that the proposed algorithm may reduce the number of distance calculations by a factor of more than a thousand when compared to existing algorithms, while producing clusters of comparable quality.
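The memoization idea in this abstract can be illustrated by caching pairwise distances so that repeated evaluations of candidate medoid sets do not recompute them. This is only a sketch of the general technique using Python's `functools.lru_cache`; the point set, the Manhattan distance, and all function names are assumptions, and the paper's actual algorithm is not reproduced.

```python
from functools import lru_cache

# Hypothetical 2-D points; any hashable index-based dataset would work.
points = [(0, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
calls = {"raw": 0}  # counts how many distances are actually computed


@lru_cache(maxsize=None)
def _dist_sorted(i, j):
    """Manhattan distance between points i and j, computed once per pair."""
    calls["raw"] += 1
    (x1, y1), (x2, y2) = points[i], points[j]
    return abs(x1 - x2) + abs(y1 - y2)


def dist(i, j):
    # Normalise argument order so (i, j) and (j, i) share one cache entry.
    return _dist_sorted(min(i, j), max(i, j))


def assignment_cost(medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(dist(i, m) for m in medoids) for i in range(len(points)))


cost_a = assignment_cost((0, 2))
raw_after_first = calls["raw"]
cost_b = assignment_cost((0, 2))  # served entirely from the cache
```

A k-medoid search re-evaluates many overlapping medoid sets, so the share of distance lookups answered from the cache grows with the number of swap candidates examined.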


Proceedings Article•10.1109/AISP.2011.5960991•

An incremental spam detection algorithm


Elham Ghanbari¹, Hamid Beigy¹•Institutions (1)

Sharif University of Technology¹

15 Jun 2011

TL;DR: A new algorithm based on incremental learning is introduced that composes new knowledge from new training data with previous knowledge by combining classifiers based on weighted majority voting and outperforms other related incremental algorithms and non-incremental algorithms.


Abstract: A large proportion of e-mail is spam. Several algorithms based on batch learning have been presented for spam detection. In this paper, a new algorithm based on incremental learning is introduced. The algorithm combines new knowledge from new training data with previous knowledge by combining classifiers through weighted majority voting. The experimental results show that the proposed algorithm outperforms other related incremental algorithms and non-incremental algorithms.
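A minimal sketch of weighted majority voting over classifier outputs follows. The abstract does not state how the voting weights are chosen, so the log-odds weight `log((1 - err) / err)` used here is an assumption borrowed from common ensemble practice, and `weighted_vote` and the example error rates are hypothetical.

```python
import math
from collections import defaultdict


def weighted_vote(predictions, error_rates):
    """Combine class predictions from several classifiers by weighted voting.

    predictions: one predicted label per classifier.
    error_rates: each classifier's estimated error in (0, 0.5);
                 lower error gives a larger (assumed) log-odds voting weight.
    """
    scores = defaultdict(float)
    for label, err in zip(predictions, error_rates):
        scores[label] += math.log((1.0 - err) / err)
    # Return the label with the largest total voting weight.
    return max(scores, key=scores.get)


# One accurate classifier can outvote two weak ones.
decision = weighted_vote(["spam", "ham", "ham"], [0.1, 0.4, 0.4])
# With equal error rates the vote reduces to a plain majority.
plain = weighted_vote(["spam", "ham", "ham"], [0.3, 0.3, 0.3])
```

In an incremental setting, a classifier trained on each new batch would be appended to the ensemble with its own weight, leaving earlier classifiers in place.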


Proceedings Article•10.1109/ICSMC.2011.6084214•

Patterned Growth algorithm using Hub-Averaging without pre-assigned weights


B. Chandra¹, Shalini Bhaskar¹•Institutions (1)

Indian Institutes of Technology¹

21 Nov 2011

TL;DR: A new approach, HAP-Growth (Hub-Averaging Pattern-Growth), has been proposed for WARM without pre-assigned weights. For large datasets, there is a drastic reduction in the computational time for the proposed algorithm, and at the same time the drift effect is reduced to a great extent.


Abstract: The concept of finding frequent itemsets without pre-assigned weights is of great importance in Association Rule Mining (ARM). The prime advantage of this approach is that weights can be derived from the dataset itself rather than being given by domain expert. The modification of Apriori algorithm for Weighted Association Rule Mining (WARM) without pre-assigned weights using HITS algorithm has been attempted in the past. However, drift effect is a major limitation of HITS algorithm. In this paper, a new approach HAP-Growth (Hub-Averaging Pattern-Growth) has been proposed for WARM without pre-assigned weights. HAP-Growth algorithm generates frequent itemsets using Hub-Averaging in conjunction with pattern tree approach. Performance of the proposed algorithm has been compared with HITS algorithm in conjunction with pattern tree approach and the existing algorithm. Experimental studies have been carried out on large number of synthetic datasets of varying sizes (generated using IBM Synthetic Data Generator) and real life datasets (taken from UCI Machine Learning Repository and other sources). It is observed that for large datasets, there is drastic reduction in the computational time for the proposed algorithm and at the same time drift effect is reduced to a great extent.


Implication of image processing algorithm in remote sensing and GIS applications


Mubeena Pathan, Kamaruzaman Jusoff, Alias Mohd Sood, Yaakob Razali, Barkatullah Qureshi

1 Jan 2011

TL;DR: This paper generally analyzed and branch out algorithms to perceive their limitations and delimitation and concluded that Greedy algorithm is comparatively better than other algorithms regarding the optimal solution.

Abstract: Algorithms solve complex problems more efficiently and consistently. Traditional ways of solving problems have been replaced by several new algorithms. Selecting an appropriate algorithm for a given task is an important issue because different algorithms are based on different concepts. One problem can be solved in more than one way; in this regard, many alternative algorithms have been developed with computational efficiency in mind. This review presents the evaluation and utilization of different algorithms such as the Simple Recursive Algorithm, Backtracking Algorithm, Divide and Conquer Algorithm, Dynamic Algorithm, Branch and Bound Algorithm, Brute Force Algorithm and Randomized Algorithm. The paper analyzes and categorizes these algorithms to identify their limitations and delimitations. The review emphasizes the effects and use of different algorithms in different image processing applications. The Minimum Spanning Tree (MST), described as the most functional algorithm, is defined on an undirected graph in which all nodes are connected. Greedy algorithms are described as simple algorithms that choose a locally optimal solution at each step in order to reach a global optimum. We considered the drawbacks and advantages of the various algorithms and concluded that the Greedy algorithm is comparatively better than the other algorithms with regard to the optimal solution.
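
To illustrate the link the abstract draws between greedy choice and the Minimum Spanning Tree, here is a minimal sketch of Kruskal's algorithm, a standard greedy MST method (the abstract does not say which MST algorithm the review covers, so this choice is an assumption). At each step it takes the cheapest remaining edge that does not close a cycle. For the MST problem this local rule is known to yield a global optimum, although greedy strategies are not optimal for every problem.

```python
def kruskal(num_nodes, edges):
    """Return the edges of a minimum spanning tree.

    edges: iterable of (weight, u, v) with nodes numbered 0..num_nodes-1.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):  # greedy: cheapest edge first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # skip edges that would form a cycle
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree
```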

A new algorithm for knowledge discovery from data sets using cross-entropy measurement

Ömer Akgöbek

1 Jan 2011

TL;DR: A new algorithm named REX-1C is derived from the REX-1 algorithm, which uses entropy, in order to test the effects of cross-entropy on learning (measured by accuracy and number of rules); REX-1C was observed to produce better results than the Rules-3 Plus, Rules-6, REX-1 and C5.0 algorithms with respect to accuracy.

Abstract: This study suggests a new method for selecting attributes in algorithms used for generating rules for data mining. The most common measure used for attribute selection is entropy. Entropy is defined as a measure of uncertainty; accordingly, the entropy of a system increases with the uncertainty in the system. Entropy is usually used to measure the uncertainty of attributes in data mining algorithms such as C4.5, CN2 and CART, whereas cross-entropy is not used frequently. Therefore, a new algorithm named REX-1C is derived from the REX-1 algorithm, which uses entropy, in order to test the effects of cross-entropy on learning (measured by accuracy and number of rules). Twenty data sets of different specifications and sizes, which are commonly used in the machine learning field and sampled from real life, were chosen to test the success of the algorithm. Using those data sets, the effects of norms on the accuracy of the algorithm and the number of rules it produces were calculated, and the results were compared to the Rules-3 Plus, Rules-6, REX-1 and C5.0 algorithms. According to the results, REX-1C produced better results than the Rules-3 Plus, Rules-6, REX-1 and C5.0 algorithms with respect to accuracy.
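
The two measures contrasted in the abstract can be written down directly. The sketch below uses the standard definitions of Shannon entropy and cross-entropy over discrete distributions; how REX-1C applies cross-entropy to attribute selection is not stated in the abstract, so nothing here should be read as that algorithm.

```python
import math


def entropy(p):
    """Shannon entropy H(p) in bits of a discrete distribution p."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def cross_entropy(p, q):
    """Cross-entropy H(p, q) in bits; infinite if q assigns 0 where p does not."""
    total = 0.0
    for x, y in zip(p, q):
        if x > 0:
            if y == 0:
                return math.inf
            total -= x * math.log2(y)
    return total
```

Cross-entropy equals entropy when the two distributions coincide and exceeds it otherwise, so it can serve as a measure of how poorly `q` describes `p`.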

Proceedings Article•10.1109/IJCNN.2011.6033284•

On the structure of algorithm spaces

Adam H. Peterson¹, Tony Martinez², George L. Rudolph³•Institutions (3)

Adobe Systems¹, Brigham Young University², The Citadel, The Military College of South Carolina³

3 Oct 2011

TL;DR: This research uses the COD (Classifier Output Difference) distance metric for measuring how similar or different learning algorithms are, and constructs a distance matrix from the individual COD values, and uses the matrix to show the spectrum of differences among families of learning algorithms.

Abstract: Many learning algorithms have been developed to solve various problems. Machine learning practitioners must use their knowledge of the merits of the algorithms they know to decide which to use for each task. This process often raises questions such as: (1) If performance is poor after trying certain algorithms, which should be tried next? (2) Are some learning algorithms the same in terms of actual task classification? (3) Which algorithms are most different from each other? (4) How different? (5) Which algorithms should be tried for a particular problem? This research uses the COD (Classifier Output Difference) distance metric for measuring how similar or different learning algorithms are. The COD quantifies the difference in output behavior between pairs of learning algorithms. We construct a distance matrix from the individual COD values, and use the matrix to show the spectrum of differences among families of learning algorithms. Results show that individual algorithms tend to cluster along family and functional lines. Our focus, however, is on the structure of relationships among algorithm families in the space of algorithms, rather than on individual algorithms. A number of visualizations illustrate these results. The uniform numerical representation of COD data lends itself to human visualization techniques.
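
As a rough illustration of the distance matrix described above, the sketch below assumes COD is the fraction of test instances on which two classifiers give different outputs; the abstract itself only says that COD quantifies the difference in output behavior, so this definition, the function names and the input layout are all assumptions.

```python
def cod(predictions_a, predictions_b):
    """Fraction of instances on which two classifiers disagree (assumed COD)."""
    if len(predictions_a) != len(predictions_b) or not predictions_a:
        raise ValueError("prediction lists must be non-empty and equally long")
    disagreements = sum(a != b for a, b in zip(predictions_a, predictions_b))
    return disagreements / len(predictions_a)


def cod_matrix(predictions):
    """Pairwise COD values for a dict mapping classifier name -> predictions."""
    names = list(predictions)
    return {a: {b: cod(predictions[a], predictions[b]) for b in names}
            for a in names}
```

A matrix built this way is symmetric with a zero diagonal, so it can be passed to standard clustering or embedding tools to visualize families of algorithms.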

Journal Article•10.4028/WWW.SCIENTIFIC.NET/AMM.52-54.1976•

Analysis and Improvement for K-Means Algorithm

Jing Zhong Xiao¹, Li Xiao¹•Institutions (1)

Southwest University for Nationalities¹

01 Mar 2011-Applied Mechanics and Materials

TL;DR: This paper presents improvements to the K-Means algorithm, including how to choose the initial center points and how to calculate the means, which improve the algorithm to a certain extent.

Abstract: The K-Means algorithm is one of the most widely used foundational algorithms in data mining; it is based on a greedy clustering approach. This paper introduces and analyzes the algorithm, proves its correctness, and then examines its efficiency. Finally, the paper presents some improvements to the K-Means algorithm, including how to choose the initial center points and how to calculate the means, which improve the algorithm to a certain extent.
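
The abstract does not say how the initial centers are chosen, so the sketch below pairs the usual K-Means iteration (Lloyd's algorithm) with a farthest-first initialization purely as an illustrative stand-in. Points are tuples of numbers; `farthest_first` and `kmeans` are hypothetical names.

```python
import math


def farthest_first(points, k):
    """Pick k initial centers: start at points[0], then repeatedly take the
    point farthest from all centers chosen so far (illustrative choice)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = farthest_first(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Recompute each center as the coordinate-wise mean of its cluster;
        # an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:  # assignments can no longer change
            break
        centers = new_centers
    return centers, clusters
```

Spreading the initial centers apart tends to avoid the poor local optima that arise when two random starting centers fall into the same cluster.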

Journal Article•10.14569/IJACSA.2011.021224•

Solving the MDBCS Problem Using the Metaheuric–Genetic Algorithm

Milena Bogdanović

01 Jan 2011-International Journal of Advanced Computer Science and Applications

TL;DR: The paper gives an ILP model for the MDBCS problem, as well as a genetic algorithm that computes a good enough solution for input graphs with a larger number of nodes.

Abstract: Degree-limited subgraph problems consider the weights of the vertices or of the edges, with the aim of finding the optimal weighted subgraph subject to certain restrictions on the degree of the vertices in the subgraph. This class of combinatorial problems has been studied extensively because of its application in network design, connection of networks, and routing algorithms. It is likely that a solution of the MDBCS problem will find application in these areas. The paper gives an ILP model for the MDBCS problem, as well as a genetic algorithm that computes a good enough solution for input graphs with a larger number of nodes. An important feature of heuristic algorithms is that they can solve problems of exponential complexity approximately, but still well enough. However, heuristic algorithms may not lead to a satisfactory solution, and for some problems they give relatively poor results. This is particularly true of problems for which no exact algorithm of polynomial complexity exists. Also, heuristic algorithms are not all the same, because some of their parts differ depending on the situation and the problem in which they are used. These parts are usually the objective function (transformation), and their definition significantly affects the efficiency of the algorithm. By mode of action, genetic algorithms belong to the methods of directed random search of the solution space that look for a global optimum.

Journal Article•

An Improved Bayesian Network Structure Learning Algorithm Based on the Conditional Independence Test

Fan Jing¹•Institutions (1)

Yunnan University¹

01 Jan 2011-Journal of Yunnan University of Nationalities

TL;DR: To address the defects of the Bayesian network structure learning algorithm based on the conditional independence test, the paper proposes an improved algorithm that adds the mutual information between nodes x and y; experimental results show that it is effective and feasible.

Abstract: To address the defects of the Bayesian network structure learning algorithm based on the conditional independence test, the paper proposes an improved algorithm that adds the mutual information between nodes x and y. The algorithm takes adequate account of the three graphical structures in the theory of d-separation. The algorithm can reduce the triangulated clique and the probability of a cyclic route in a directed graph. The network structure produced by the algorithm is closer to the solution. The experimental results show that the algorithm is effective and feasible.
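
The quantity the improved algorithm adds, the mutual information between two nodes x and y, can be estimated from paired discrete samples with the standard definition I(X;Y) = Σ p(x,y) log₂[p(x,y) / (p(x)p(y))]. The sketch below shows only that estimate; how the paper uses it within structure learning is not described in the abstract.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete variables,
    given paired observations xs[i], ys[i]."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    # (c/n) * log2( (c/n) / ((cx/n) * (cy/n)) ) simplifies to the form below.
    return sum((c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
               for (x, y), c in joint.items())
```

A value of zero indicates that the samples look independent, while larger values suggest that an edge between the two nodes is worth considering.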

Book Chapter•10.1007/978-3-642-21524-7_43•

A research of reduction algorithm for support vector machine

Susu Liu, Limin Sun

12 Jun 2011

TL;DR: A reduction algorithm that combines SVM with the KNN algorithm is presented, and the experimental results show that it can reduce the size of the training dataset and the number of support vectors while keeping the classification accuracy obtained with the original training dataset.

Abstract: The support vector machine is a new field of machine learning. Generalization accuracy and response time are two important criteria for support vector machines used in practical applications. The goal is to minimize the size of the training dataset and the number of support vectors, and to simplify the implementation of the algorithm, while keeping the classification accuracy. Based on these considerations, a reduction algorithm that combines SVM with the KNN algorithm is presented. The experimental results show that the algorithm can reduce the size of the training dataset and the number of support vectors while keeping the classification accuracy obtained with the original training dataset.

Proceedings Article•10.1109/MMAR.2011.6031363•

CN2-R: Faster CN2 with randomly generated complexes

Janis Zuters¹•Institutions (1)

University of Latvia¹

29 Sep 2011

TL;DR: The proposed modification, CN2-R, substitutes the star concept of the original algorithm with a technique of randomly generated complexes in order to substantially improve on running times without significant loss in accuracy.

Abstract: Among the rule induction algorithms, the classic CN2 is still one of the most popular; the large number of enhancements and improvements to it bears witness to this. Despite the growth in computing capacity since the algorithm was proposed, one of its main issues is resource demand. The proposed modification, CN2-R, substitutes the star concept of the original algorithm with a technique of randomly generated complexes in order to substantially improve running times without significant loss in accuracy.
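
The abstract gives no details of how CN2-R generates or scores its complexes, so the following is only one possible reading of "randomly generated complexes". It assumes a complex is a conjunction of attribute=value tests copied from a randomly chosen example of the target class, and it scores candidates with the Laplace accuracy estimate commonly associated with CN2. All names, the `"class"` key and the `max_selectors` and `trials` parameters are hypothetical.

```python
import random


def random_complex(example, attributes, rng, max_selectors=2):
    """Build a complex from a random subset of one example's attribute values."""
    size = rng.randint(1, min(max_selectors, len(attributes)))
    return {a: example[a] for a in rng.sample(attributes, size)}


def covers(cx, example):
    """A complex covers an example if every attribute test matches."""
    return all(example[a] == v for a, v in cx.items())


def laplace_accuracy(cx, examples, target, num_classes):
    """Laplace estimate (hits + 1) / (covered + number of classes)."""
    covered = [e for e in examples if covers(cx, e)]
    hits = sum(e["class"] == target for e in covered)
    return (hits + 1) / (len(covered) + num_classes)


def best_random_complex(examples, attributes, target, trials=200, seed=0):
    """Sample random complexes and keep the best one (replaces a star search)."""
    rng = random.Random(seed)
    num_classes = len({e["class"] for e in examples})
    seeds = [e for e in examples if e["class"] == target]
    best, best_score = None, -1.0
    for _ in range(trials):
        cx = random_complex(rng.choice(seeds), attributes, rng)
        score = laplace_accuracy(cx, examples, target, num_classes)
        if score > best_score:
            best, best_score = cx, score
    return best, best_score
```

Sampling a fixed number of candidates bounds the work per rule, whereas a star search specializes and re-evaluates a whole set of complexes at every step.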
