Top 543 papers published in the topic of Unsupervised learning in 2024

Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models

Yang Liu, Dingkang Yang, Yan Wang, Jing Liu, Jun Li, Azzedine Boukerche, Peng Sun, Liang Song

09 Apr 2024-ACM Computing Surveys

TL;DR: GVAED survey encompassing supervised and weakly-supervised approaches for video anomaly detection, introducing a taxonomy and discussing challenges and future directions.

Abstract: Video Anomaly Detection (VAD) serves as a pivotal technology in intelligent surveillance systems, enabling the temporal or spatial identification of anomalous events within videos. While existing reviews predominantly concentrate on conventional unsupervised methods, they often overlook the emergence of weakly-supervised and fully-unsupervised approaches. To address this gap, this survey extends the conventional scope of VAD beyond unsupervised methods, encompassing a broader spectrum termed Generalized Video Anomaly Event Detection (GVAED). By skillfully incorporating recent advancements rooted in diverse assumptions and learning frameworks, this survey introduces an intuitive taxonomy that seamlessly navigates through unsupervised, weakly-supervised, supervised and fully-unsupervised VAD methodologies, elucidating the distinctions and interconnections within these research trajectories. In addition, this survey facilitates prospective researchers by assembling a compilation of research resources, including public datasets, available codebases, programming tools, and pertinent literature. Furthermore, this survey quantitatively assesses model performance, delves into research challenges and directions, and outlines potential avenues for future exploration.

32 citations

Preprint•10.31219/osf.io/qtmcs•

Supervised Learning - A Systematic Literature Review

Salim Dridi

13 Jun 2024

TL;DR: Supervised learning involves pre-training a model on a labeled dataset and optimizing class label models using predictor features. It entails classification and regression tasks.

Abstract: Machine Learning (ML) is a rapidly emerging field that enables a plethora of innovative approaches to solving real-world problems. It enables machines to learn without human intervention from data and is used in a variety of applications, from fraud detection to recommendation systems and medical imaging. Supervised learning, unsupervised learning, and reinforcement learning are the 3 main categories of ML. Supervised learning involves pre-training the model on a labeled dataset and entails two distinct types of learning: classification and regression. Regression is used when the output is continuous. By contrast, classification is used when the output is categorical. Supervised learning aims to optimize class label models using predictor features. Following that, a second classifier is used to assign class labels to the test data in cases where the values of the predictor characteristics are known but the value of the class label is unknown. In classification, the label identifies the class to which the training set belongs. However, in regression, the label is a real-value response that corresponds to the example.

16 citations

Journal Article•10.1038/s41592-024-02200-1•

A-SOiD, an active-learning platform for expert-guided, data-efficient discovery of behavior

Jens Schweihoff, Alexander Hsu, Martin K. Schwarz, Eric A. Yttri

21 Feb 2024-Nature Methods

15 citations

Preprint•10.31219/osf.io/mpkht•

Unsupervised Learning - A Systematic Literature Review

Slimane Dridi

13 Jun 2024

TL;DR: Unsupervised learning involves training a model on an unlabeled dataset and learning features on its own to make predictions on test data. It includes clustering algorithms such as k-means and hierarchical clustering.

Abstract: Machine learning (ML) is a data-driven approach in which machines learn from the data without the involvement of humans. Several domains take advantage of mind-boggling applications of ML. There are three main learning problems in ML: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning involves the training of the model on a labelled dataset. Unsupervised learning involves the training of a model on an unlabeled dataset. The model learns on its own by learning the features of the training dataset. Based on those learned features, the model makes predictions on test data. Unsupervised learning approaches and algorithms range from clustering (k-means, agglomerative, Fuzzy C-means) to principal component analysis. Clustering involves the grouping of objects based on their similar features. Clustering algorithms fall into two broad categories: hierarchical clustering and partitional clustering.

15 citations

Journal Article•10.3390/asi7020018•

Unsupervised Learning Approach for Anomaly Detection in Industrial Control Systems

Woo-Hyun Choi, Jong-Won Kim

21 Feb 2024-Applied system innovation

TL;DR: Unsupervised learning approach for anomaly detection in industrial control systems effectively detects and classifies anomalous behavior without labeled data.

Abstract: Industrial control systems (ICSs) play a crucial role in managing and monitoring critical processes across various industries, such as manufacturing, energy, and water treatment. The connection of equipment from various manufacturers, complex communication methods, and the need for the continuity of operations in a limited environment make it difficult to detect system anomalies. Traditional approaches that rely on supervised machine learning require time and expertise due to the need for labeled datasets. This study suggests an alternative approach to identifying anomalous behavior within ICSs by means of unsupervised machine learning. The approach employs unsupervised machine learning to identify anomalous behavior within ICSs. This study shows that unsupervised learning algorithms can effectively detect and classify anomalous behavior without the need for pre-labeled data using a composite autoencoder model. Based on a dataset that utilizes HIL-augmented ICSs (HAIs), this study shows that the model is capable of accurately identifying important data characteristics and detecting anomalous patterns related to both value and time. Intentional error data injection experiments could potentially be used to validate the model’s robustness in real-time monitoring and industrial process performance optimization. As a result, this approach can improve system reliability and operational efficiency, which can establish a foundation for safe and sustainable ICS operations.

15 citations

Journal Article•10.62836/jcmea.v4i1.040105•

Credit Risk Assessment Using a Combined Approach of Supervised and Unsupervised Learning

Tianyi Xu

25 Jun 2024

TL;DR: Findings indicate that the combination of unsupervised learning with Kohonen's Self-Organizing Maps and supervised learning with Random Forest can effectively improve the accuracy of credit scoring, providing financial institutions with a more reliable tool for credit risk assessment.

Abstract: In the financial industry, credit scoring is a crucial tool for assessing credit risk. The study aims to enhance the accuracy and reliability of credit scoring by combining supervised and unsupervised learning methods. We propose an integrated model that combines Kohonen's Self-Organizing Maps (SOM) with the Random Forest algorithm to provide a more comprehensive analysis of credit card user data. Key features for model training were identified through feature selection and extraction. Experimental results show that the integrated model improved the AUC from 0.82 to 0.89, increased user satisfaction from a score of 3.8 to 4.35, and boosted usage rates by 12.5%. Additionally, the integrated model significantly enhanced the discrimination and prediction accuracy of user credit risk. These findings indicate that the combination of unsupervised learning with Kohonen's Self-Organizing Maps and supervised learning with Random Forest can effectively improve the accuracy of credit scoring, providing financial institutions with a more reliable tool for credit risk assessment.

14 citations

Book Chapter•10.1007/978-981-99-6906-7_4•

Prediction Model for the Healthcare Industry Using Machine Learning

Birendra Kumar Saraswat, Aditya Saxena, Prem Chand Vashist

1 Jan 2024

TL;DR: Machine learning is revolutionizing healthcare by enabling accurate disease prediction and diagnostics through the analysis of large datasets.

Abstract: The role of machine learning in health care in emerging times, the field of research is industry. In machine learning, there are various forms of learning, including supervised, unsupervised, and reinforcement learning. These strategies are necessary to discover previously unknown relationships in data that are beneficial to society. In predictive modeling, historical data are used to predict a result variable. The uses of machine learning in medical care are turning into a benefit for disease identification and diagnostics. The healthcare industry can benefit from machine learning's capacity to assist in the intelligent analysis of huge amounts of data. Different methods of machine learning, including supervised, unsupervised, and semi-supervised, reinforcement learning for health care, such as SVM, KNN, K-Mean clustering, neural network, and decision tree, provide varying levels of accuracy, precision, and sensitivity. The area of machine learning (ML) is on the rise. The purpose of machine learning is to automatically discover patterns and reason with data. ML offers tailored therapy-dubbed precision medicine. Health care has benefited from the application of machine learning approaches. Within a few years, machine learning will alter the healthcare industry.

11 citations

Journal Article•10.1016/j.bspc.2023.105769•

USL-Net: Uncertainty self-learning network for unsupervised skin lesion segmentation

Xiaofan Li, Bo Peng, Jie Hu, Chao Ma, Daipeng Yang, Zhuyang Xie

01 Mar 2024-Biomedical Signal Processing and Control

TL;DR: USL-Net is an unsupervised skin lesion segmentation network that eliminates the need for manual labeling guidance. It utilizes contrastive learning, CAMs, and uncertainty self-learning to achieve performance comparable to supervised methods.

Abstract: Unsupervised skin lesion segmentation offers several benefits, such as conserving expert human resources, reducing discrepancies caused by subjective human labeling, and adapting to novel environments. However, segmenting dermoscopic images without manual labeling guidance is a challenging task due to image artifacts such as hair noise, blister noise, and subtle edge differences. In this paper, we introduce an innovative Uncertainty Self-Learning Network (USL-Net) to eliminate the need for manual labeling guidance for the segmentation. Initially, features are extracted using contrastive learning, followed by the generation of Class Activation Maps (CAMs) as saliency maps. High-saliency regions in the map serve as pseudo-labels for lesion regions while low-saliency regions represent the background. Besides, intermediate regions can be hard to classify, often due to their proximity to lesion edges or interference from hair or blisters. Rather than risking potential pseudo-labeling errors or learning confusion by forcefully classifying these regions, they are treated as uncertainty regions: they are exempted from pseudo-labeling, and the network is left to self-learn on them. Further, we employ connectivity detection and centrality detection to refine foreground pseudo-labels and reduce noise-induced errors. The performance is further enhanced by the iterated refinement process. The experimental validation on ISIC-2017, ISIC-2018, and PH2 datasets demonstrates that its performance is comparable to supervised methods, and exceeds that of other existing unsupervised methods. On the typical ISIC-2017 dataset, our method outperforms state-of-the-art unsupervised methods by 1.7% in accuracy, 6.6% in Dice coefficient, 4.0% in Jaccard index, and 10.6% in sensitivity.
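
The three-way split described in this abstract (high saliency as lesion, low saliency as background, everything in between left unlabeled) can be sketched roughly as follows. This is an illustration only, not the USL-Net implementation: the thresholds `low` and `high`, the function names, and the choice of a masked binary cross-entropy are all assumptions, since the abstract does not give these details.

```python
import math

FOREGROUND, BACKGROUND, UNCERTAIN = 1, 0, -1


def pseudo_labels(saliency, low=0.3, high=0.7):
    """Turn a saliency map (values in [0, 1]) into three-way pseudo-labels.

    `low` and `high` are hypothetical thresholds, not values from the paper.
    """
    return [
        [FOREGROUND if s >= high else BACKGROUND if s <= low else UNCERTAIN
         for s in row]
        for row in saliency
    ]


def masked_bce(pred, labels, eps=1e-7):
    """Mean binary cross-entropy over labeled pixels only.

    Pixels marked UNCERTAIN contribute nothing to the loss (an assumed way
    of exempting them from pseudo-label supervision).
    """
    total, count = 0.0, 0
    for p_row, l_row in zip(pred, labels):
        for p, y in zip(p_row, l_row):
            if y == UNCERTAIN:
                continue
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -math.log(p) if y == FOREGROUND else -math.log(1.0 - p)
            count += 1
    return total / count if count else 0.0


saliency = [[0.9, 0.5], [0.1, 0.75]]
labels = pseudo_labels(saliency)  # the 0.5 pixel falls in the uncertain band
loss = masked_bce(saliency, labels)
```

Skipping the uncertain pixels in the loss, rather than forcing them into one class, is one simple way to keep ambiguous edge regions from injecting wrong supervision.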

9 citations

Journal Article•10.1109/tpami.2023.3296600•

OPAL: Occlusion Pattern Aware Loss for Unsupervised Light Field Disparity Estimation

Peng Li, Jing Zhao, Jingyao Wu, Chao Deng, Yan Han, Haoqian Wang, Tao Yu

01 Feb 2024-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: OPAL achieves accurate and robust disparity estimation by effectively handling occlusions and significantly reducing network parameters. It significantly improves accuracy compared with SOTA unsupervised methods and possesses stronger generalization capacity on real-world data compared with SOTA supervised methods.

Abstract: Light field disparity estimation is an essential task in computer vision. Currently, supervised learning-based methods have achieved better performance than both unsupervised and optimization-based methods. However, the generalization capacity of supervised methods on real-world data, where no ground truth is available for training, remains limited. In this paper, we argue that unsupervised methods can achieve not only much stronger generalization capacity on real-world data but also more accurate disparity estimation results on synthetic datasets. To fulfill this goal, we present the Occlusion Pattern Aware Loss, named OPAL, which successfully extracts and encodes general occlusion patterns inherent in the light field for calculating the disparity loss. OPAL enables: i) accurate and robust disparity estimation by teaching the network how to handle occlusions effectively and ii) significantly reduced network parameters required for accurate and efficient estimation. We further propose an EPI transformer and a gradient-based refinement module for achieving more accurate and pixel-aligned disparity estimation results. Extensive experiments demonstrate our method not only significantly improves the accuracy compared with SOTA unsupervised methods, but also possesses stronger generalization capacity on real-world data compared with SOTA supervised methods. Last but not least, the network training and inference efficiency are much higher than existing learning-based methods. Our code will be made publicly available.

8 citations

Journal Article•10.1109/tip.2024.3380243•

Weakly-supervised Contrastive Learning for Unsupervised Object Discovery

Yunqiu Lv, Jing Zhang, Nick Barnes, Yuchao Dai

01 Jan 2024-IEEE Transactions on Image Processing

TL;DR: Unsupervised object discovery using weakly-supervised contrastive learning for bounding-box-level localization and pixel-level segmentation.

Abstract: Unsupervised object discovery (UOD) refers to the task of discriminating the whole region of objects from the background within a scene without relying on labeled datasets, which benefits the task of bounding-box-level localization and pixel-level segmentation. This task is promising due to its ability to discover objects in a generic manner. We roughly categorize existing techniques into two main directions, namely the generative solutions based on image resynthesis, and the clustering methods based on self-supervised models. We have observed that the former heavily relies on the quality of image reconstruction, while the latter shows limitations in effectively modeling semantic correlations. To directly target object discovery, we focus on the latter approach and propose a novel solution by incorporating weakly-supervised contrastive learning (WCL) to enhance semantic information exploration. We design a semantic-guided self-supervised learning model to extract high-level semantic features from images, which is achieved by fine-tuning the feature encoder of a self-supervised model, namely DINO, via WCL. Subsequently, we introduce Principal Component Analysis (PCA) to localize object regions. The principal projection direction, corresponding to the maximal eigenvalue, serves as an indicator of the object region(s). Extensive experiments on benchmark unsupervised object discovery datasets demonstrate the effectiveness of our proposed solution. The source code and experimental results are publicly available via our project page at https://github.com/npucvr/WSCUOD.git.
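
As a rough illustration of the PCA step mentioned in this abstract, the sketch below projects per-patch feature vectors onto their first principal direction and uses the sign of the projection as an object indicator. It is not the authors' code: the feature vectors stand in for encoder outputs, the function names are hypothetical, the principal direction is found by plain power iteration to avoid dependencies, and the sign is resolved with the assumption that objects cover fewer patches than the background.

```python
import math


def first_pc_projection(features, iters=200):
    """Project mean-centered features onto their first principal direction.

    Uses power iteration on X^T X, so no linear-algebra library is needed.
    """
    n, d = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    centered = [[f[j] - mean[j] for j in range(d)] for f in features]
    v = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-zero start vector
    for _ in range(iters):
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(centered[i][j] * proj[i] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all features identical: no principal direction
            break
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


def object_mask(features):
    """Mark patches on the minority side of the first principal direction.

    The eigenvector sign is arbitrary, so the mask is flipped when more than
    half of the patches are selected (assumption: objects are the minority).
    """
    mask = [p > 0.0 for p in first_pc_projection(features)]
    if sum(mask) > len(mask) / 2:
        mask = [not m for m in mask]
    return mask


# Toy 2-D "patch features": six background-like patches, two object-like ones.
patches = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
           [0.0, 0.2], [0.2, 0.0], [5.0, 5.0], [5.1, 4.9]]
mask = object_mask(patches)
```

In practice the mask would be reshaped back to the patch grid to obtain a bounding box or a coarse segmentation.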

8 citations

Journal Article•10.1109/tmi.2024.3351201•

Unsupervised CT Metal Artifact Reduction by Plugging Diffusion Priors in Dual Domains

X. Liu, Yaoqin Xie, Songhui Diao, Shan Tan, Xiaokun Liang

01 Jan 2024-IEEE Transactions on Medical Imaging

TL;DR: Unsupervised CT metal artifact reduction by plugging diffusion priors in dual domains achieves superior performance compared to existing methods.

Abstract: During the process of computed tomography (CT), metallic implants often cause disruptive artifacts in the reconstructed images, impeding accurate diagnosis. Many supervised deep learning-based approaches have been proposed for metal artifact reduction (MAR). However, these methods heavily rely on training with paired simulated data, which are challenging to acquire. This limitation can lead to decreased performance when applying these methods in clinical practice. Existing unsupervised MAR methods, whether based on learning or not, typically work within a single domain, either in the image domain or the sinogram domain. In this paper, we propose an unsupervised MAR method based on the diffusion model, a generative model with a high capacity to represent data distributions. Specifically, we first train a diffusion model using CT images without metal artifacts. Subsequently, we iteratively introduce the diffusion priors in both the sinogram domain and image domain to restore the degraded portions caused by metal artifacts. Besides, we design temporally dynamic weight masks for the image-domain fusion. The dual-domain processing empowers our approach to outperform existing unsupervised MAR methods, including another MAR method based on diffusion model. The effectiveness has been qualitatively and quantitatively validated on synthetic datasets. Moreover, our method demonstrates superior visual results among both supervised and unsupervised methods on clinical datasets. Codes are available in github.com/DeepXuan/DuDoDp-MAR.

Preprint•10.48550/arxiv.2402.15734•

Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning

Wuyang Chen, Jialin Song, Pu Ren, Shashank Subramanian, Dmitriy Morozov, Michael W. Mahoney

24 Feb 2024

TL;DR: Data-efficient operator learning via unsupervised pretraining and in-context learning significantly reduces the data requirements for PDE operator learning, improving generalizability and performance.

Abstract: Recent years have witnessed the promise of coupling machine learning methods and physical domain-specific insight for solving scientific problems based on partial differential equations (PDEs). However, being data-intensive, these methods still require a large amount of PDE data. This reintroduces the need for expensive numerical PDE solutions, partially undermining the original goal of avoiding these expensive simulations. In this work, seeking data efficiency, we design unsupervised pretraining and in-context learning methods for PDE operator learning. To reduce the need for training data with simulated solutions, we pretrain neural operators on unlabeled PDE data using reconstruction-based proxy tasks. To improve out-of-distribution performance, we further assist neural operators in flexibly leveraging in-context learning methods, without incurring extra training costs or designs. Extensive empirical evaluations on a diverse set of PDEs demonstrate that our method is highly data-efficient, more generalizable, and even outperforms conventional vision-pretrained models.

Journal Article•10.38124/ijisrt/ijisrt24may2087•

Synergizing Unsupervised and Supervised Learning: A Hybrid Approach for Accurate Natural Language Task Modeling

Wrick Talukdar, Aritra Biswas

03 Jun 2024-International journal of innovative science and research technology

TL;DR: Synergizing unsupervised and supervised learning for accurate natural language task modeling achieves state-of-the-art results on text classification and named entity recognition tasks.

Abstract: While supervised learning models have shown remarkable performance in various natural language processing (NLP) tasks, their success heavily relies on the availability of large-scale labeled datasets, which can be costly and time-consuming to obtain. Conversely, unsupervised learning techniques can leverage abundant unlabeled text data to learn rich representations, but they do not directly optimize for specific NLP tasks. This paper presents a novel hybrid approach that synergizes unsupervised and supervised learning to improve the accuracy of NLP task modeling. While supervised models excel at specific tasks, they rely on large labeled datasets. Unsupervised techniques can learn rich representations from abundant unlabeled text but don't directly optimize for tasks. Our methodology integrates an unsupervised module that learns representations from unlabeled corpora (e.g., language models, word embeddings) and a supervised module that leverages these representations to enhance task-specific models [4]. We evaluate our approach on text classification and named entity recognition (NER), demonstrating consistent performance gains over supervised baselines. For text classification, contextual word embeddings from a language model pretrain a recurrent or transformer-based classifier. For NER, word embeddings initialize a BiLSTM sequence labeler. By synergizing techniques, our hybrid approach achieves SOTA results on benchmark datasets, paving the way for more data-efficient and robust NLP systems.

Journal Article•10.1016/j.sciaf.2024.e02386•

Anomaly detection using unsupervised machine learning algorithms: A simulation study

Edmund Fosu Agyemang

01 Sep 2024-Scientific African

Journal Article•10.1016/j.ndteint.2024.103175•

Enhancing Corrosion Detection in Pulsed Eddy Current Testing Systems through Autoencoder-Based Unsupervised Learning

Minhhuy Le, Phuong Thi Thu Pham, Le Quang Trung, S. Hoang, Duc Minh Le, Quang Vuong Pham, Van Su Luong

01 Jul 2024-Ndt & E International

Review•10.1016/j.ascom.2024.100851•

A review of unsupervised learning in astronomy

S. Fotopoulou

01 Jun 2024-Astronomy and Computing

TL;DR: A review of unsupervised learning in astronomy focuses on summarizing popular methods and their uses in the field. Unsupervised learning aims to organise and extract knowledge from datasets through dimensionality reduction, clustering, and complex frameworks.

Abstract: This review summarizes popular unsupervised learning methods, and gives an overview of their past, current, and future uses in astronomy. Unsupervised learning aims to organise the information content of a dataset, in such a way that knowledge can be extracted. Traditionally this has been achieved through dimensionality reduction techniques that aid the ranking of a dataset, for example through principal component analysis or by using auto-encoders, or simpler visualisation of a high dimensional space, for example through the use of a self organising map. Other desirable properties of unsupervised learning include the identification of clusters, i.e. groups of similar objects, which has traditionally been achieved by the k-means algorithm and more recently through density-based clustering such as HDBSCAN. More recently, complex frameworks have emerged, that chain together dimensionality reduction and clustering methods. However, no dataset is fully unknown. Thus, nowadays a lot of research has been directed towards self-supervised and semi-supervised methods that stand to gain from both supervised and unsupervised learning.

Journal Article•10.1109/cvpr52733.2024.01596•

Shallow-Deep Collaborative Learning for Unsupervised Visible-Infrared Person Re-Identification

Bin Yang, Jun Chen, Mang Ye

16 Jun 2024

TL;DR: This paper proposes Shallow-Deep Collaborative Learning (SDCL) for unsupervised visible-infrared person re-identification, leveraging shallow and deep features through contrastive learning and neighbor alignment to improve cross-modality retrieval accuracy.

Abstract: Unsupervised visible-infrared person re-identification (US-VI-ReID) centers on learning a cross-modality retrieval model without labels, reducing the reliance on expensive cross-modality manual annotation. Previous US-VI-ReID works gravitate toward learning cross-modality information with the deep features extracted from the ultimate layer. Nevertheless, because of interference from multiple discrepancies, solely relying on deep features is insufficient for accurately learning modality-invariant features, resulting in negative optimization. The shallow feature from the shallow layers contains nuanced detail information, which is critical for effective cross-modality learning but is regrettably disregarded by existing methods. To address the above issues, we design a Shallow-Deep Collaborative Learning (SDCL) framework based on the transformer with shallow-deep contrastive learning, incorporating Collaborative Neighbor Learning (CNL) and Collaborative Ranking Association (CRA) modules. Specifically, CNL unveils the intrinsic homogeneous and heterogeneous collaboration which are harnessed for neighbor alignment, enhancing the robustness in a dynamic manner. Furthermore, CRA associates the cross-modality labels with the ranking association between shallow and deep features, furnishing valuable supervision for cross-modality learning. Extensive experiments validate the superiority of our method, even outperforming certain supervised counterparts.

Journal Article•10.1145/3696453•

Personalized Federated Mutual Learning for Unsupervised Camera-aware Person Re-identification

Jiabei Liu, Weiming Zhuang, Yuanyuan Liu, Yonggang Wen, Jun Huang, Wei Lin

21 Sep 2024-ACM Transactions on Multimedia Computing, Communications, and Applications

TL;DR: This study proposes PerFedDual, a personalized federated learning framework for unsupervised person re-identification, leveraging mutual learning and camera-centric clustering to improve accuracy and flexibility in diverse camera configurations while ensuring data privacy.

Abstract: Person re-identification (ReID) is essential for enhancing security and tracking in multi-camera surveillance systems. To achieve effective ReID performance across diverse datasets, the Federated Unsupervised Person Re-identification via Camera-aware Clustering (FedUCA) approach has made strides in utilizing distributed datasets while ensuring data privacy. Nevertheless, its uniform model may not adequately cater to the specific characteristics of each participant’s data, given the diversity in camera perspectives and client-specific data variances, thus obtaining degraded results. To address this issue, we propose an advanced framework, Personalized Federated Dual-Model Learning for Camera-Aware Person Re-Identification (PerFedDual), which introduces knowledge-sharing techniques inherent to mutual learning for FedUCA with the camera-centric clustering process. PerFedDual supports a dual-model training approach that creates a cooperative learning space that improves both the global model and client-specific models by exchanging knowledge both ways. The methodology adopted is precisely adjusted to the distinctive data environment of each client, ensuring the protection of privacy while simultaneously enhancing the accuracy and flexibility of ReID models in diverse camera configurations. The empirical evaluation reveals that PerFedDual outperforms FedUCA and alternative federated learning strategies, highlighting the benefits of our technique, which leverages collective intelligence to enhance unsupervised person re-identification.

Journal Article•10.3390/fi16070253•

A Novel Hybrid Unsupervised Learning Approach for Enhanced Cybersecurity in the IoT

K. Prabu, P. Sudhakar, T. Manikandan, Balamurugan Balusamy, Francesco Benedetto

18 Jul 2024-Future Internet

TL;DR: This approach enables the identification of novel attacks while mitigating the impact of imbalanced training data on model performance, and achieves accuracies exceeding 98% for the two datasets, thus confirming its efficacy and effectiveness for application in efficient intrusion detection systems.

Abstract: The proliferation of IoT services has spurred a surge in network attacks, heightening cybersecurity concerns. Essential to network defense, intrusion detection and prevention systems (IDPSs) identify malicious activities, including denial of service (DoS), distributed denial of service (DDoS), botnet, brute force, infiltration, and Heartbleed. This study focuses on leveraging unsupervised learning for training detection models to counter these threats effectively. The proposed method utilizes basic autoencoders (bAEs) for dimensionality reduction and encompasses a three-stage detection model: one-class support vector machine (OCSVM) and deep autoencoder (dAE) attack detection, complemented by density-based spatial clustering of applications with noise (DBSCAN) for attack clustering. Accurately delineated clusters aid in mapping attack tactics. The MITRE ATT&CK framework establishes a “Cyber Threat Repository”, cataloging attacks and tactics, enabling immediate response based on priority. Leveraging preprocessed and unlabeled normal network traffic data, this approach enables the identification of novel attacks while mitigating the impact of imbalanced training data on model performance. The autoencoder method utilizes reconstruction error, OCSVM employs a kernel function to establish a hyperplane for anomaly detection, while DBSCAN employs a density-based approach to identify clusters, manage noise, accommodate diverse shapes, automatically determine the cluster count, ensure scalability, and minimize false positives and false negatives. Evaluated on standard datasets such as CIC-IDS2017 and CSE-CIC-IDS2018, the proposed model outperforms existing state-of-the-art methods. Our approach achieves accuracies exceeding 98% for the two datasets, thus confirming its efficacy and effectiveness for application in efficient intrusion detection systems.
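
The clustering stage named in this abstract, DBSCAN, can be illustrated with a small textbook-style sketch. It shows only the generic algorithm (density-based clusters, noise points, cluster count found automatically), not the paper's pipeline; the parameter values and toy points are made up for the example, and `math.dist` requires Python 3.8 or later.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1).

    A point is a core point if at least `min_pts` points (itself included)
    lie within distance `eps`. Clusters grow outward from core points.
    """
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(j_nbrs)
    return labels


# Two dense groups and one isolated point (toy data, not network traffic).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

This brute-force neighbor search is quadratic in the number of points; a production system would more likely use an indexed implementation from an established library.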

Journal Article•10.1109/tgrs.2024.3515258•

AEKAN: Exploring Superpixel-based AutoEncoder Kolmogorov-Arnold Network for Unsupervised Multimodal Change Detection

Tongfei Liu, Jianjian Xu, Tao Lei, Yingbo Wang, Xiaogang Du, Weichuan Zhang, Zhiyong Lv, Maoguo Gong

01 Jan 2024-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: This study proposes AEKAN, a superpixel-based AutoEncoder Kolmogorov-Arnold Network for unsupervised multimodal change detection, leveraging commonality features between heterogeneous remote sensing images to assess change magnitude with improved performance over existing methods.

Abstract: Multimodal change detection (MCD) has garnered significant interest due to its capacity to address a variety of emergencies in a timely and effective manner. However, discrepancies in sensors and imaging techniques often hinder the direct comparison of heterogeneous remote sensing images (HRSIs), making it difficult to extract change information. To overcome this challenge, we propose a novel superpixel-based AutoEncoder Kolmogorov-Arnold Network (AEKAN) for unsupervised MCD. The primary objective of AEKAN is to excavate the latent commonality features between HRSIs. Notably, commonality features in unchanged regions are generally more pronounced than those in changed regions, which can be leveraged to assess change magnitude. To achieve this, the proposed method utilizes the Kolmogorov-Arnold Network (KAN), renowned for its capability to model data distributions, to extract these commonality features between HRSIs. Concretely, the proposed AEKAN consists of a Siamese KAN encoder and dual KAN decoders. The Siamese encoder aims to map HRSIs and extract latent commonality features, while the dual decoders reconstruct original bitemporal images from these features. In addition, we incorporate a hierarchical commonality loss function within the Siamese encoder to train AEKAN. This loss function is designed to intentionally guide the network in capturing commonality features by minimizing the discrepancies in features extracted from HRSIs at each layer of the Siamese encoder. The extracted commonality features are then adopted to quantify the change magnitude between images through mean square error (MSE). Extensive experiments on five MCD datasets demonstrate that the proposed AEKAN outperforms existing methods. The source code is available at: https://github.com/TongfeiLiu/AEKAN-for-MCD.
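
The final step described in this abstract, quantifying change magnitude as the mean square error between commonality features, can be sketched as follows. This is a sketch under stated assumptions: the feature arrays are hypothetical stand-ins for encoder outputs (one row per superpixel), and the min-max normalization is an added convenience that the abstract does not mention.

```python
import numpy as np

def change_magnitude(features_t1, features_t2):
    """Per-superpixel MSE between two (n_superpixels, n_features) arrays."""
    return np.mean((features_t1 - features_t2) ** 2, axis=1)

def normalize(scores):
    """Rescale scores to [0, 1]; an assumed post-processing step."""
    span = scores.max() - scores.min()
    return (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)

# Hypothetical features for three superpixels at two acquisition times.
features_t1 = np.array([[0.2, 0.4], [1.0, 1.0], [0.5, 0.5]])
features_t2 = np.array([[0.2, 0.4], [0.0, 0.0], [0.5, 0.7]])

magnitude = change_magnitude(features_t1, features_t2)
change_map = normalize(magnitude)
```

Superpixels whose features agree across the two images score near zero, while the second superpixel, whose features differ most, receives the largest change magnitude.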

Journal Article•10.1016/j.modpat.2024.100680•

Non-Generative Artificial Intelligence (AI) in Medicine: Advancements and Applications in Supervised and Unsupervised Machine Learning

Liron Pantanowitz, Thomas M. Pearce, Ibrahim Abukhiran, Matthew G. Hanna, Sarah Wheeler, T. Rinda Soong, Ahmad P. Tafti, Joshua Pantanowitz, Ming Y. Lu, Faisal Mahmood, Qiangqiang Gu, Hooman H. Rashidi

01 Dec 2024-Modern Pathology

Abstract: The use of Artificial Intelligence (AI) within pathology and healthcare has advanced extensively. We have accordingly witnessed increased adoption of various AI tools which are transforming our approach to clinical decision support, personalized medicine, predictive analytics, automation, and discovery. The familiar and more reliable AI tools that have been incorporated within healthcare thus far fall mostly under the non-generative AI domain, which includes supervised and unsupervised machine learning (ML) techniques. This review article explores how such non-generative AI methods, rooted in traditional rules-based systems, enhance diagnostic accuracy, efficiency, and consistency within medicine. Key concepts and the application of supervised learning models (i.e., classification and regression) such as decision trees, support vector machines, linear and logistic regression, K-nearest neighbor, and neural networks are explained along with the newer landscape of neural network-based non-generative foundation models. Unsupervised learning techniques including clustering, dimensionality reduction, and anomaly detection are also discussed for their role in uncovering novel disease subtypes or identifying outliers. Technical details related to the application of non-generative AI algorithms for analyzing whole slide images are also highlighted. The performance, explainability and reliability of non-generative AI models essential for clinical decision-making are also reviewed, as well as challenges related to data quality, model interpretability, and risk of data drift. An understanding of which AI-ML models to employ and which shortcomings need to be addressed is imperative to safely and efficiently leverage, integrate, and monitor these traditional AI tools in clinical practice and research.

Journal Article•10.1007/s11263-024-02027-5•

Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering

Guofeng Mei, Cristiano Saltori, Elisa Ricci, Nicu Sebe, Qiang Wu, Jian Zhang, Fabio Poiesi

08 Mar 2024-International Journal of Computer Vision

TL;DR: CluRender, an augmentation-free unsupervised approach for point clouds, learns point-level features by combining uni-modal soft clustering with cross-modal neural rendering and outperforms state-of-the-art techniques on several downstream tasks.

Abstract: Data augmentation has contributed to the rapid advancement of unsupervised learning on 3D point clouds. However, we argue that data augmentation is not ideal, as it requires a careful application-dependent selection of the types of augmentations to be performed, thus potentially biasing the information learned by the network during self-training. Moreover, several unsupervised methods only focus on uni-modal information, thus potentially introducing challenges in the case of sparse and textureless point clouds. To address these issues, we propose an augmentation-free unsupervised approach for point clouds, named CluRender, to learn transferable point-level features by leveraging uni-modal information for soft clustering and cross-modal information for neural rendering. Soft clustering enables self-training through a pseudo-label prediction task, where the affiliation of points to their clusters is used as a proxy under the constraint that these pseudo-labels divide the point cloud into approximately equal partitions. This allows us to formulate a clustering loss to minimize the standard cross-entropy between pseudo and predicted labels. Neural rendering generates photorealistic renderings from various viewpoints to transfer photometric cues from 2D images to the features. The consistency between rendered and real images is then measured to form a fitting loss, combined with the cross-entropy loss to self-train networks. Experiments on downstream applications, including 3D object detection, semantic segmentation, classification, part segmentation, and few-shot learning, demonstrate the effectiveness of our framework in outperforming state-of-the-art techniques.
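
One common way to obtain pseudo-labels that split points into roughly equal partitions is Sinkhorn-Knopp normalization. The abstract does not name the solver CluRender uses, so the sketch below is an assumption made for illustration; the function names, the temperature `epsilon`, and the iteration count are all hypothetical.

```python
import numpy as np

def balanced_pseudo_labels(logits, epsilon=0.5, n_iters=50):
    """Soft assignments whose cluster masses are pushed toward equality."""
    n_points, n_clusters = logits.shape
    q = np.exp((logits - logits.max()) / epsilon)
    for _ in range(n_iters):
        # Rescale columns so every cluster holds total mass 1 / n_clusters.
        q /= q.sum(axis=0, keepdims=True) * n_clusters
        # Rescale rows so every point holds total mass 1 / n_points.
        q /= q.sum(axis=1, keepdims=True) * n_points
    return q * n_points  # each row now sums to 1

def cross_entropy(pseudo_labels, predicted_logits):
    """Mean cross-entropy between soft pseudo-labels and predicted logits."""
    shifted = predicted_logits - predicted_logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(pseudo_labels * log_probs).sum(axis=1).mean())

rng = np.random.default_rng(0)
logits = rng.normal(size=(64, 4))  # hypothetical per-point cluster scores
pseudo = balanced_pseudo_labels(logits)
loss = cross_entropy(pseudo, logits)
```

Here the same logits serve as both the source of pseudo-labels and the prediction, purely to keep the example short; in a training loop the pseudo-labels would be treated as fixed targets.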

Journal Article•10.1016/j.procir.2024.08.288•

Anomaly Detection of Wire Arc Additively Manufactured Parts via Surface Tension Transfer through Unsupervised Machine Learning Techniques

Giulio Mattera, Joseph Polden, Alessandra Caggiano, Patrick Commins, Luigi Nele, Zengxi Pan

01 Jan 2024-Procedia CIRP

Journal Article•10.1109/tsg.2023.3325276•

Unsupervised Anomaly Detection and Diagnosis in Power Electronic Networks: Informative Leverage and Multivariate Functional Clustering Approaches

Shushan Wu, Lu Fang, Jinan Zhang, T.N. Sriram, Stephen J. Coshatt, Feraidoon Zahiri, Alan Mantooth, Jin Ye, Wenxuan Zhong, Ping Ma, Wen-Zhan Song

01 Mar 2024-IEEE Transactions on Smart Grid

TL;DR: Unsupervised anomaly detection and diagnosis in power electronic networks using TFD features, ILAD, and MFPCA clustering.

Abstract: We propose a novel unsupervised anomaly detection and diagnosis algorithm in power electronic networks. Since most anomaly detection and diagnosis algorithms in the literature are based on supervised methods that can hardly be generalized to broader scenarios, we propose unsupervised algorithms. Our algorithm extracts the Time-Frequency Domain (TFD) features from the three-phase currents and three-phase voltages of the point of coupling (PCC) nodes to detect anomalies and distinguish between different types of anomalies, such as cyber-attacks and physical faults. To detect anomalies through TFD features, we propose a novel Informative Leveraging for Anomaly Detection (ILAD) algorithm. The proposed unsupervised ILAD algorithm automatically extracts noise-reduced anomalous signals, resulting in more accurate anomaly detection results than other score-based methods. To assign anomaly types for anomaly diagnosis, we apply a novel Multivariate Functional Principal Component Analysis (MFPCA) clustering method. Unlike the deep learning methods, the MFPCA clustering method does not require labels for training and provides more accurate results than other deep embedding-based clustering approaches. Furthermore, it is even comparable to supervised algorithms in both offline and online experiments. To the best of our knowledge, the proposed unsupervised framework accomplishing anomaly detection and anomaly diagnosis tasks is the first of its kind in power electronic networks.

Proceedings Article•10.1609/aaai.v38i2.27866•

Unsupervised Group Re-identification via Adaptive Clustering-Driven Progressive Learning

Hongxu Chen, Quan Zhang, Jianhuang Lai, Xiaohua Xie

24 Mar 2024-Proceedings of the ... AAAI Conference on Artificial Intelligence

TL;DR: Unsupervised G-ReID via adaptive clustering-driven progressive learning achieves state-of-the-art performance without requiring labeled samples.

Abstract: Group re-identification (G-ReID) aims to correctly associate groups with the same members captured by different cameras. However, supervised approaches for this task often suffer from the high cost of cross-camera sample labeling. Unsupervised methods based on clustering can avoid sample labeling, but the problem of member variations often makes clustering unstable, leading to incorrect pseudo-labels. To address these challenges, we propose an adaptive clustering-driven progressive learning approach (ACPL), which consists of a group adaptive clustering (GAC) module and a global dynamic prototype update (GDPU) module. Specifically, GAC designs the quasi-distance between groups, thus fully capitalizing on both individual-level and holistic information within groups. In the case of great uncertainty in intra-group members, GAC effectively minimizes the impact of non-discriminative features and reduces the noise in the model's pseudo-labels. Additionally, our GDPU devises a dynamic weight to update the prototypes and effectively mine the hard samples with complex member variations, which improves the model's robustness. Extensive experiments conducted on four popular G-ReID datasets demonstrate that our method not only achieves state-of-the-art performance on unsupervised G-ReID but also performs comparably to several fully supervised approaches.

Journal Article•10.1016/j.compmedimag.2024.102351•

Quasi-supervised learning for super-resolution PET

Guangtong Yang, Chen Li, Yu‐Dong Yao, Ge Wang, Yueyang Teng

01 Apr 2024-Computerized Medical Imaging and Graphics

TL;DR: A novel quasi-supervised learning method for super-resolution PET recovers HR PET images from LR counterparts by leveraging similarity between unpaired LR and HR image patches.

Abstract: Low resolution of positron emission tomography (PET) limits its diagnostic performance. Deep learning has been successfully applied to achieve super-resolution PET. However, commonly used supervised learning methods in this context require many pairs of low- and high-resolution (LR and HR) PET images. Although unsupervised learning utilizes unpaired images, the results are not as good as those obtained with supervised deep learning. In this paper, we propose a quasi-supervised learning method, which is a new type of weakly-supervised learning method, to recover HR PET images from LR counterparts by leveraging similarity between unpaired LR and HR image patches. Specifically, LR image patches are taken from a patient as inputs, while the most similar HR patches from other patients are found as labels. The similarity between the matched HR and LR patches serves as a prior for network construction. Our proposed method can be implemented by designing a new network or modifying an existing network. As an example in this study, we have modified the cycle-consistent generative adversarial network (CycleGAN) for super-resolution PET. Our numerical and experimental results qualitatively and quantitatively show the merits of our method relative to the state-of-the-art methods. The code is publicly available at https://github.com/PigYang-ops/CycleGAN-QSDL.

Journal Article•10.1109/tgrs.2024.3354118•

Unsupervised CD in satellite image time series by contrastive learning and feature tracking

Yuxing Chen, Lorenzo Bruzzone

01 Jan 2024-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: Unsupervised change detection in satellite image time series using contrastive learning and feature tracking achieves significant improvements over existing techniques by exploiting spatial-temporal information and addressing challenges related to seasonal changes and generalization.

Abstract: Unsupervised change detection using contrastive learning has significantly improved the performance of literature techniques. However, at present it only focuses on the bi-temporal change detection scenario. Previous state-of-the-art models for image time-series change detection have traditionally depended on features obtained either through clustering learning or by training models from scratch using pseudo labels tailored to each scene. However, these approaches fail to either exploit the spatial-temporal information of image time-series or generalize to unseen scenarios. In this work, we propose a two-stage approach to unsupervised change detection in satellite image time-series using contrastive learning with feature tracking. By deriving pseudo labels from pre-trained models and using feature tracking to propagate them within the image time-series, we improve the consistency of our pseudo labels and address the challenges of seasonal changes in long-term remote sensing image time-series. We adopt the self-training algorithm with ConvLSTM on the obtained pseudo labels, where we first use supervised contrastive loss and contrastive random walks to further improve the feature correspondence in space-time. Then a fully connected layer is fine-tuned on the pre-trained multi-temporal features for generating the final change maps. Through comprehensive experiments on two datasets, we demonstrate consistent improvements in accuracy on fitting and inference scenarios.

Journal Article•10.1109/icaaic60222.2024.10575444•

High Dimensional Text Classification using Unsupervised Machine Learning Algorithm

Prem Naresh, B R Akshay, B. Rajasree, G. Ramesh, K. Yashwanth Kumar

5 Jun 2024

TL;DR: This study proposes enhancements to the Naive Bayes classification method and K-means clustering methods, leveraging the inherent similarity property to simplify data representation while preserving essential features.

Abstract: Document Clustering is an automatic data organization technique that greatly minimizes complexity and time. The size of high-dimensional papers poses significant concerns and needs attention, as it can lead to both positive and negative consequences, particularly in the context of document classification. Inadequate dimensional reduction may hinder the desired outcomes of document classification using reduced dimensions. This research study primarily focuses on dimensional reduction techniques, leveraging the inherent similarity property to simplify data representation while preserving essential features. This study explores various strategies aimed at classification and grouping tasks, employing diverse datasets and clustering and classification techniques. Specifically, this study proposes enhancements to the Naive Bayes classification method and K-means clustering methods. The effectiveness of the proposed approach is evaluated using standard metrics such as recall, precision, and F-score, demonstrating its potential to improve classification accuracy and efficiency.

Journal Article•10.1016/j.chemgeo.2024.121997•

Tracking element-mineral associations with unsupervised learning and dimensionality reduction in chemical and optical image stacks of thin sections

Marco Andres Acevedo Zamora, Balz S. Kamber, Michael W. M. Jones, Christoph Schrank, C.G. Ryan, Daryl L. Howard, David L. Paterson, Teresa Ubide, David T. Murphy

01 Feb 2024-Chemical Geology

TL;DR: In-situ chemical analysis data interpretation in thin sections is improved using dimensionality reduction techniques and pixel-based classification.

Abstract: In-situ chemical analysis on thin sections is a cornerstone of geochemical research. Despite massive advances in digital image analysis, the interpretation of such data within the optical petrological context of the thin section has largely remained an analogue task in geochemistry. In this contribution, we registered optical and micro-chemical images from large thin section areas. Chemical datasets from scanning electron microscopy energy dispersive spectrometry (SEM-EDX) and Synchrotron X-ray fluorescence microscopy (S-XFM) were exported from proprietary software. We then evaluated two dimensionality reduction techniques against false-colour images and phase maps to test whether previously unnoticed features became discoverable. Principal component analysis (PCA) and deep sparse autoencoder (DSA) neural networks were used to summarise the multi-channel SEM-EDX and S-XFM into one single red-green-blue (RGB) representation each. Applied to an oceanic gabbro, a cratonic peridotite, and a pelagic limestone, the PCA-based dimensionality reduction was found to produce crisp RGB maps of phases with less noise than false-colour three-element RGB maps. They were registered with optical images to form a simple multi-channel input for pixel-based classification into classes (i.e., semantic segmentation). This was fast and worked well for typical igneous and metamorphic rocks (1–3 h routine). In the chemically quite homogenous limestone, the combined optical and S-XFM PCA input was successfully classified into many phases that were unclassifiable on the conventional SEM-EDX phase map. From the segmented optical and S-XFM PCA map, we exported pixel locations of classified accessory phases back into the original proprietary S-XFM software. This allowed quantification of trace elements whose X-ray peaks were invisible in the original S-XFM quantification. 
The DSA-based approach yielded phase maps with greater noise, but the technique was very strong at detecting subtle features. In the gabbro sample, the DSA image identified cryptic relict cores in clinopyroxene more clearly than in the previously used Cr concentration map. A DSA training set can be obtained on a small ROI to generate a consistent large area DSA image from which more fragmental cores were observed. For a ROI on the gabbro thin section, the DSA representation map and the blended optical image were segmented and inspected with a loupe tool. This allowed us to generate interactive histograms and bin scatter concentration plots of selected phases to search for co-variations of element concentrations. This approach can be expanded to include data from any in situ geochemical acquisition tools and may help discover previously unseen features in complex elemental imagery.
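
The PCA step described in this abstract, summarising a multi-channel element map into one RGB image, can be sketched roughly as below. This is a minimal illustration, not the authors' pipeline: the input is a synthetic random stack, the function name is hypothetical, and the per-component min-max scaling to [0, 1] is an assumed display choice.

```python
import numpy as np

def pca_to_rgb(stack):
    """Map an (H, W, C) stack, C >= 3, to an (H, W, 3) image scaled to [0, 1]."""
    height, width, channels = stack.shape
    pixels = stack.reshape(-1, channels).astype(float)  # copy; one row per pixel
    pixels -= pixels.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    scores = pixels @ vt[:3].T  # first three principal component scores
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    scores = (scores - lo) / np.where(hi > lo, hi - lo, 1.0)
    return scores.reshape(height, width, 3)

rng = np.random.default_rng(0)
stack = rng.random((32, 32, 8))  # synthetic stand-in for 8 element channels
rgb = pca_to_rgb(stack)
```

Each colour channel then carries one principal component, so pixels with similar multi-element chemistry receive similar colours.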

Journal Article•10.1109/access.2024.3425590•

A Deep Learning Framework for Net Load Forecasting With Unsupervised Behind-The-Meter Disaggregated Data

Chaichan Thepprom, Natawut Nupairoj, Peerapon Vateekul

01 Jan 2024-IEEE Access

TL;DR: Net load forecasting on the disaggregated series outperforms forecasting the net load series directly, owing to the accuracy of the unsupervised disaggregation of the BTM data, which proves superior to the semi-supervised technique.

Abstract: Recently, distributed photovoltaic (PV) generation has increased significantly, leading to a high penetration of behind-the-meter (BTM) solar generation systems. In this work, we aim to improve net load forecasting by disaggregating BTM components to provide better representation. For the disaggregation process, we propose an unsupervised contrastive-based optimization method for estimating BTM PV generation from the net load at the aggregated level. Our proposed method uses a deep neural network to leverage the strong correlation between solar irradiance and PV generation. This means that our proposed method is independent of the availability of BTM data and the assumption of a physical model. Furthermore, to obtain the best forecasted trends on the disaggregated series (pure load and PV generation), various recent forecasting models have been compared, i.e., DeepAR, Temporal Fusion Transformer (TFT), and Time-series Dense Encoder (TiDE). The experiment is conducted on two real-world electricity prosumption datasets collected from New York and Texas. Results show that net load forecasting on the disaggregated series outperforms forecasting the net load series directly. Such an improvement is due to the accuracy of our unsupervised disaggregation of the BTM data, which proves superior to the semi-supervised technique.
