Proceedings Article
DOI: 10.1109/SMARTIOT.2018.00055
A Hybrid Feature Selection Algorithm For Classification Unbalanced Data Processing
Xue Zhang, Zhiguo Shi, Xuan Liu, Xueni Li +3 more
- 01 Aug 2018
- pp 269-275
17
TL;DR: A hybrid feature selection algorithm is proposed for the two-class unbalanced data problem and the multi-class classification problem; the results show that the area under the receiver operating characteristic curve for two-class problems and the accuracy for multi-class problems improve compared with other models.
Abstract: Feature selection directly affects the performance and accuracy of a classifier. Building on one-class F-Score feature selection, an improved F-Score feature selection, and a genetic algorithm, and combining them with machine learning methods such as K-nearest neighbor, support vector machine, random forest, and naive Bayes, a hybrid feature selection algorithm is proposed for the two-class unbalanced data problem and the multi-class classification problem. Compared with traditional machine learning algorithms, it can search a wider feature space and, guided by heuristic rules, help the classifier adapt to the characteristics of unbalanced data sets, so it handles unbalanced classification better. The experimental results show that the area under the receiver operating characteristic curve for two-class problems and the accuracy for multi-class problems improve compared with other models.
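The abstract names F-Score feature selection as one building block but does not give its formula. As a minimal sketch, assuming the commonly used two-class F-score (squared distance of each class mean from the overall mean, divided by the sum of the within-class sample variances), a filter-style ranking might look like the following. The function names `f_score` and `top_k_features` and the toy data are illustrative assumptions; the authors' one-class and improved variants, and the genetic algorithm stage, are not reproduced here.

```python
import numpy as np

def f_score(X, y):
    """Standard two-class F-score per feature (assumed definition,
    not necessarily the variant used in the paper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]
    mean_all = X.mean(axis=0)
    # Between-class separation: squared distance of each class mean
    # from the overall mean, summed over the two classes.
    numer = (pos.mean(axis=0) - mean_all) ** 2 + (neg.mean(axis=0) - mean_all) ** 2
    # Within-class spread: sum of the two sample variances (ddof=1).
    denom = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    # Guard against features that are constant within both classes.
    denom = np.where(denom == 0, np.finfo(float).eps, denom)
    return numer / denom

def top_k_features(X, y, k):
    """Indices of the k features with the largest F-score."""
    return np.argsort(f_score(X, y))[::-1][:k]

# Toy unbalanced data: 10 positive and 90 negative samples, 3 features.
rng = np.random.default_rng(0)
y = np.array([1] * 10 + [0] * 90)
X = rng.normal(size=(100, 3))
X[:, 0] += 3.0 * y  # only feature 0 carries class signal

scores = f_score(X, y)
selected = top_k_features(X, y, k=1)
print(scores, selected)
```

In a hybrid scheme of the kind the abstract describes, a ranking like this would typically prune the feature set before a wrapper search (for example a genetic algorithm scored by a classifier's AUC) explores subsets of the remaining features.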
Citations
A fog computing data reduce level to enhance the cloud of things performance
Tarek Moulahi, Salim El Khediri, Rehan Ullah Khan, Salah Zidi +6 more
TL;DR: It is believed that data reduction can provide a blueprint for avoiding unnecessary data storage and processing and demonstrates the efficacy of the reduced ML models.
14
Enhancing cloud of things performance by avoiding unnecessary data through artificial intelligence tools
Sami Mahfoudhi, Musheera Frehat, Tarek Moulahi +2 more
- 24 Jun 2019
TL;DR: This research focuses on how to use artificial intelligence to segregate unnecessary data collected by things, to avoid unnecessary charging of storage and processing resources of things as well as of cloud.
12
An intent classification method for questions in "Treatise on Febrile diseases" based on TinyBERT-CNN fusion model
TL;DR: In this paper, a knowledge distillation-based bidirectional Transformer encoder combined with a convolutional neural network model (TinyBERT-CNN) was used for the task of question intent classification in "Treatise on Febrile Diseases", which used TinyBERT as an embedding and encoding layer to obtain the global vector information of the text and then completed the intent classification by feeding the encoded feature information into the CNN.
11
Comparative Study of Embedded Feature Selection Methods on Microarray Data
Hind Hamla, Khadoudja Ghanem +1 more
- 25 Jun 2021
TL;DR: In this paper, the authors compared the performance of five embedded feature selection methods namely decision tree, random forest, lasso, ridge, and SVM-RFE in the classification of microarray data.
7
Feature selection based on a hybrid simplified particle swarm optimization algorithm with maximum separation and minimum redundancy
TL;DR: A hybrid simplified PSO-based feature selection algorithm with the elite strategy (HECSPSO) is proposed, which can achieve a feature subset with better performance, and is a highly competitive algorithm for feature selection.
7
References
A study of the behavior of several methods for balancing machine learning training data
TL;DR: This work performs a broad experimental evaluation involving ten methods, three of them proposed by the authors, to deal with the class imbalance problem in thirteen UCI data sets, and shows that, in general, over-sampling methods provide more accurate results than under-sampling methods considering the area under the ROC curve (AUC).
Mining with rarity: a unifying framework
TL;DR: It is demonstrated that rare classes and rare cases are very similar phenomena---both forms of rarity are shown to cause similar problems during data mining and benefit from the same remediation methods.
A Branch and Bound Algorithm for Feature Subset Selection
TL;DR: In this paper, a branch and bound-based feature subset selection algorithm is proposed to select the best subset of m features from an n-feature set without exhaustive search, which is computationally infeasible.
1.4K
A survey of network anomaly detection techniques
TL;DR: This paper presents an in-depth analysis of four major categories of anomaly detection techniques which include classification, statistical, information theory and clustering and evaluates effectiveness of different categories of techniques.
1.4K
Related Papers (5)
Cecille Freeman, Dana Kulic, Otman A. Basir +2 more
- 21 Nov 2011
Eva Tuba, Ivana Strumberger, Nebojsa Bacanin, Raka Jovanovic, Milan Tuba +4 more
- 10 Jun 2019