Efficient algorithm for big data clustering on single machine
103
TL;DR: A new parallel clustering algorithm based on the k-means algorithm that significantly reduces the exponential growth of computations and splits a dataset into batches while preserving the characteristics of the initial dataset and increasing the clustering speed.
About: This article was published in CAAI Transactions on Intelligence Technology on 01 Mar 2020 and is currently open access. It focuses on the topics: Cluster analysis & k-means clustering.
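The batch-splitting idea in the TL;DR can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given on this page. It assumes one common scheme: shuffle the data, run plain k-means on each batch independently, then cluster the size-weighted batch centroids. The names `kmeans`, `batched_kmeans`, and `n_batches` are hypothetical, and the batches are processed sequentially here although each could run in a separate worker.

```python
import numpy as np


def kmeans(points, k, n_iter=20, seed=0, weights=None):
    """Plain Lloyd's k-means (illustrative); returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    if weights is None:
        weights = np.ones(len(points))
    # Initialise centroids from k distinct input points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            mask = labels == j
            if mask.any():  # an empty cluster keeps its previous centroid
                centroids[j] = np.average(points[mask], axis=0, weights=weights[mask])
    # Final assignment against the last centroid update.
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dists.argmin(axis=1)


def batched_kmeans(data, k, n_batches=4, seed=0):
    """Assumed scheme: cluster each batch, then cluster the batch centroids."""
    rng = np.random.default_rng(seed)
    # Shuffle so that each batch is a random sample of the whole dataset.
    batches = np.array_split(data[rng.permutation(len(data))], n_batches)
    local_centroids, local_sizes = [], []
    for i, batch in enumerate(batches):  # independent, so parallelisable
        centroids, labels = kmeans(batch, k, seed=seed + i)
        local_centroids.append(centroids)
        local_sizes.append(np.bincount(labels, minlength=k))
    candidates = np.vstack(local_centroids)
    sizes = np.concatenate(local_sizes).astype(float)
    keep = sizes > 0  # drop centroids of empty local clusters
    # Weight each local centroid by the number of points it represents.
    final, _ = kmeans(candidates[keep], k, seed=seed, weights=sizes[keep])
    return final
```

Each batch needs only its own points in memory, and the final step clusters at most `n_batches * k` candidate centroids rather than the full dataset, which is where a scheme of this kind would save computation.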
Citations
The Revolution of Blockchain: State-of-the-Art and Research Challenges
TL;DR: The fundamentals of Blockchain, its working procedure, and its applications in several fields are discussed, along with future work directions and open research challenges in the domain.
175
EHDHE: Enhancing security of healthcare documents in IoT-enabled digital healthcare ecosystems using blockchain
Pratima Sharma, Suyel Namasudra, Rubén González Crespo, Javier Parra Fuente, Munesh Chandra Trivedi
TL;DR: In this paper, a secure blockchain-based application (PA) is designed to generate, maintain, and validate healthcare certificates. It acts as a communication medium between the backend blockchain network and application entities such as hospitals, patients, doctors, and IoT devices to create and verify medical certificates.
130
Securing Multimedia by Using DNA-Based Encryption in the Cloud Computing Environment
TL;DR: A novel DNA-based encryption scheme is proposed in this article for protecting multimedia files in the cloud computing environment and the efficiency of the proposed scheme over some well-known existing schemes is shown.
112
Nonlinear Neural Network Based Forecasting Model for Predicting COVID-19 Cases.
TL;DR: In this paper, a Nonlinear Autoregressive Neural Network Time Series (NAR-NNTS) model is proposed for predicting confirmed, recovered and death cases of COVID-19 outbreak.
References
Data clustering: 50 years beyond K-means
Anil K. Jain
- 01 Jun 2010
TL;DR: A brief overview of clustering is provided, well known clustering methods are summarized, the major challenges and key issues in designing clustering algorithms are discussed, and some of the emerging and useful research directions are pointed out.
8.4K
Web-scale k-means clustering
D. Sculley
- 26 Apr 2010
TL;DR: This work proposes the use of mini-batch optimization for k-means clustering, which reduces computation cost by orders of magnitude compared to the classic batch algorithm while yielding significantly better solutions than online stochastic gradient descent.
Data Classification: Algorithms and Applications
Charu C. Aggarwal
- 25 Jul 2014
TL;DR: Data Classification: Algorithms and Applications explores the underlying algorithms of classification as well as applications of classification in a variety of problem domains, including text, multimedia, social network, and biological data.
Data Clustering: Algorithms and Applications
Charu C. Aggarwal, Chandan K. Reddy
- 21 Aug 2013
TL;DR: Top researchers from around the world explore the characteristics of clustering problems in a variety of application areas and explain how to glean detailed insight from the clustering process including how to verify the quality of the underlying cluster through supervision, human intervention, or the automated generation of alternative clusters.
796
Smart Devices are Different: Assessing and Mitigating Mobile Sensing Heterogeneities for Activity Recognition
Allan Stisen, Henrik Blunck, Sourav Bhattacharya, Thor Siiger Prentow, Mikkel Baun Kjærgaard, Anind K. Dey, Tobias Sonne, Mads Møller Jensen
- 01 Nov 2015
TL;DR: The results indicate that on-device sensor and sensor-handling heterogeneities significantly impair human activity recognition (HAR) performance, and a novel clustering-based mitigation technique is proposed that is suitable for large-scale deployment of HAR, where heterogeneity of devices and their usage scenarios is intrinsic.
796