Gustaf Ahdritz, Nazim Bouatta, Christina Floristean, Sachin Kadyan, Qinghui Xia, William Gerecke, Timothy J. O’Donnell, Daniel Berenberg, I. Fisk, Niccolò Zanichelli, Bo Zhang, Arkadiusz Nowaczynski, Bei Wang, Marta M. Stepniewska-Dziubinska, Shang Zhang, Adegoke Ojewole, Murat Efe Guney, Stella Biderman, Andrew Watkins, Stephen Ra, Pablo Ribalta Lorenzo, Lucas G. Nivón, Brian D. Weitzner, Yih‐En Andrew Ban, Shiyang Chen, Minjia Zhang, Conglong Li, Shuaiwen Leon Song, Yuxiong He, Peter K. Sorger, Emad Mostaque, Zhao Zhang, Richard Bonneau, Mohammed AlQuraishi
TL;DR: OpenFold is a trainable open-source implementation of AlphaFold2 that enables insights into its learning mechanisms and capacity for generalization.
Abstract: AlphaFold2 revolutionized structural biology with the ability to predict protein structures with exceptionally high accuracy. Its implementation, however, lacks the code and data required to train new models. These are necessary to (1) tackle new tasks, like protein–ligand complex structure prediction, (2) investigate the process by which the model learns and (3) assess the model's capacity to generalize to unseen regions of fold space. Here we report OpenFold, a fast, memory efficient and trainable implementation of AlphaFold2. We train OpenFold from scratch, matching the accuracy of AlphaFold2. Having established parity, we find that OpenFold is remarkably robust at generalizing even when the size and diversity of its training set is deliberately limited, including near-complete elisions of classes of secondary structure elements. By analyzing intermediate structures produced during training, we also gain insights into the hierarchical manner in which OpenFold learns to fold. In sum, our studies demonstrate the power and utility of OpenFold, which we believe will prove to be a crucial resource for the protein modeling community. OpenFold is a trainable open-source implementation of AlphaFold2. It is fast and memory efficient, and the code and training data are available under a permissive license.
TL;DR: RSPrompter leverages the SAM model and incorporates semantic category information to automate instance segmentation for remote sensing images.
Abstract: Leveraging the extensive training data from SA-1B, the Segment Anything Model (SAM) demonstrates remarkable generalization and zero-shot capabilities. However, as a category-agnostic instance segmentation method, SAM heavily relies on prior manual guidance, including points, boxes, and coarse-grained masks. Furthermore, its performance in remote sensing image segmentation tasks remains largely unexplored and unproven. In this paper, we aim to develop an automated instance segmentation approach for remote sensing images, based on the foundational SAM model and incorporating semantic category information. Drawing inspiration from prompt learning, we propose a method to learn the generation of appropriate prompts for SAM. This enables SAM to produce semantically discernible segmentation results for remote sensing images, a concept we have termed RSPrompter. We also propose several ongoing derivatives for instance segmentation tasks, drawing on recent advancements within the SAM community, and compare their performance with RSPrompter. Extensive experimental results, derived from the WHU building, NWPU VHR-10, and SSDD datasets, validate the effectiveness of our proposed method. The code for our method is publicly available at https://kychen.me/RSPrompter.
TL;DR: Medical imaging AI uses demographic shortcuts, leading to fairness discrepancies across subpopulations. Models with less encoding of demographic attributes are often most globally optimal.
Abstract: As artificial intelligence (AI) rapidly approaches human-level performance in medical imaging, it is crucial that it does not exacerbate or propagate healthcare disparities. Previous research established AI’s capacity to infer demographic data from chest X-rays, leading to a key concern: do models using demographic shortcuts have unfair predictions across subpopulations? In this study, we conducted a thorough investigation into the extent to which medical AI uses demographic encodings, focusing on potential fairness discrepancies within both in-distribution training sets and external test sets. Our analysis covers three key medical imaging disciplines (radiology, dermatology and ophthalmology) and incorporates data from six global chest X-ray datasets. We confirm that medical imaging AI leverages demographic shortcuts in disease classification. Although correcting shortcuts algorithmically effectively addresses fairness gaps to create ‘locally optimal’ models within the original data distribution, this optimality is not true in new test settings. Surprisingly, we found that models with less encoding of demographic attributes are often most ‘globally optimal’, exhibiting better fairness during model evaluation in new test environments. Our work establishes best practices for medical imaging models that maintain their performance and fairness in deployments beyond their initial training contexts, underscoring critical considerations for AI clinical deployments across populations and sites.
TL;DR: UCL-Dehaze utilizes contrastive learning to enhance image dehazing performance using unpaired real-world hazy and clean images, alleviating the domain shift problem.
Abstract: While the wisdom of training an image dehazing model on synthetic hazy data can alleviate the difficulty of collecting real-world hazy/clean image pairs, it brings the well-known domain shift problem. From a different yet new perspective, this paper explores contrastive learning with an adversarial training effort to leverage unpaired real-world hazy and clean images, thus alleviating the domain shift problem and enhancing the network's generalization ability in real-world scenarios. We propose an effective unsupervised contrastive learning paradigm for image dehazing, dubbed UCL-Dehaze. Unpaired real-world clean and hazy images are easily captured, and will serve as the important positive and negative samples respectively when training our UCL-Dehaze network. To train the network more effectively, we formulate a new self-contrastive perceptual loss function, which encourages the restored images to approach the positive samples and keep away from the negative samples in the embedding space. Besides the overall network architecture of UCL-Dehaze, adversarial training is utilized to align the distributions between the positive samples and the dehazed images. Compared with recent image dehazing works, UCL-Dehaze does not require paired data during training and utilizes unpaired positive/negative data to better enhance the dehazing performance. We conduct comprehensive experiments to evaluate our UCL-Dehaze and demonstrate its superiority over state-of-the-art methods, even when only 1,800 unpaired real-world images are used to train our network. Source code is publicly available at https://github.com/yz-wang/UCL-Dehaze.
TL;DR: A survey of meta-learning approaches for few-shot learning, covering recent advances that address deep learning's weak performance on unseen tasks and poor generalization from few samples.
Abstract: Despite its astounding success in learning deeper multi-dimensional data, the performance of deep learning declines on new unseen tasks mainly due to its focus on same-distribution prediction. Moreover, deep learning is notorious for poor generalization from few samples. Meta-learning is a promising approach that addresses these issues by adapting to new tasks with few-shot datasets. This survey first briefly introduces meta-learning and then investigates state-of-the-art meta-learning methods and recent advances in (i) metric-based, (ii) memory-based, and (iii) learning-based methods. Finally, current challenges and insights for future research are discussed.
TL;DR: A survey on model-based reinforcement learning focusing on recent advancements in deep RL. The survey discusses the challenges of model-based RL, including generalization error and the disparity between policy training in the environment model and the actual environment. It also covers recent developments in other forms of RL and the applicability and benefits of MBRL for real-world tasks.
Abstract: Reinforcement learning (RL) interacts with the environment to solve sequential decision-making problems via a trial-and-error approach. Errors are always undesirable in real-world applications, even though RL excels at playing complex video games that permit several trial-and-error attempts. To improve sample efficiency and thus reduce errors, model-based reinforcement learning (MBRL) is believed to be a promising direction, as it constructs environment models in which trial-and-errors can occur without incurring actual costs. In this survey, we investigate MBRL with a particular focus on the recent advancements in deep RL. There is a generalization error between the learned model of a non-tabular environment and the actual environment. Consequently, it is crucial to analyze the disparity between policy training in the environment model and that in the actual environment, guiding algorithm design for improved model learning, model utilization, and policy training. In addition, we discuss the recent developments of model-based techniques in other forms of RL, such as offline RL, goal-conditioned RL, multi-agent RL, and meta-RL. Furthermore, we discuss the applicability and benefits of MBRL for real-world tasks. Finally, this survey concludes with a discussion of the promising future development prospects for MBRL. We believe that MBRL has great unrealized potential and benefits in real-world applications, and we hope this survey will encourage additional research on MBRL.
TL;DR: Survey and benchmark of federated learning research on generalization, robustness, fairness. Covers history, terminology, lines of research, methods, datasets, benchmarks, open issues, and future directions.
Abstract: Federated learning has emerged as a promising paradigm for privacy-preserving collaboration among different parties. Recently, with the popularity of federated learning, an influx of approaches has been proposed to address different realistic challenges. In this survey, we provide a systematic overview of the important and recent developments of research on federated learning. First, we introduce the history and terminology of this area. Then, we comprehensively review three basic lines of research (generalization, robustness, and fairness) by introducing their respective background concepts, task settings, and main challenges. We also offer a detailed overview of representative literature on both methods and datasets. We further benchmark the reviewed methods on several well-known datasets. Finally, we point out several open issues in this field and suggest opportunities for further research. We also provide a public website to continuously track developments in this fast-advancing field: https://github.com/WenkeHuang/MarsFL.
TL;DR: The proposed Critical Forgery Mining (CFM) framework effectively detects face forgeries by mining critical clues, including noise patterns, blending boundaries, and frequency artifacts, without relying on prior expert knowledge.
Abstract: Face forgery detection is essential in combating malicious digital face attacks. Previous methods mainly rely on prior expert knowledge to capture specific forgery clues, such as noise patterns, blending boundaries, and frequency artifacts. However, these methods tend to get trapped in local optima, resulting in limited robustness and generalization capability. To address these issues, we propose a novel Critical Forgery Mining (CFM) framework, which can be flexibly assembled with various backbones to boost their generalization and robustness performance. Specifically, we first build a fine-grained triplet and suppress specific forgery traces through prior knowledge-agnostic data augmentation. Subsequently, we propose a fine-grained relation learning prototype to mine critical information in forgeries through instance and local similarity-aware losses. Moreover, we design a novel progressive learning controller to guide the model to focus on principal feature components, enabling it to learn critical forgery features in a coarse-to-fine manner. The proposed method achieves state-of-the-art forgery detection performance under various challenging evaluation settings. The source code is available at: https://github.com/LoveSiameseCat/CFM.
TL;DR: This study investigates the impact of varying train-test split ratios on machine learning model performance using the BraTS 2013 dataset, revealing significant variations in accuracies and emphasizing the need to strike a balance to avoid overfitting or underfitting.
Abstract: Artificial intelligence (AI) and machine learning (ML) aim to mimic human intelligence and enhance decision making processes across various fields. A key performance determinant in a ML model is the ratio between the training and testing dataset. This research investigates the impact of varying train-test split ratios on machine learning model performance and generalization capabilities using the BraTS 2013 dataset. Logistic regression, random forest, k nearest neighbors, and support vector machines were trained with split ratios ranging from 60:40 to 95:05. Findings reveal significant variations in accuracies across these ratios, emphasizing the critical need to strike a balance to avoid overfitting or underfitting. The study underscores the importance of selecting an optimal train-test split ratio that considers tradeoffs such as model performance metrics, statistical measures, and resource constraints. Ultimately, these insights contribute to a deeper understanding of how ratio selection impacts the effectiveness and reliability of machine learning applications across diverse fields.
TL;DR: The NSGA-II is less effective for large numbers of objectives due to its independent crowding distance calculation.
Abstract: The NSGA-II is one of the most prominent algorithms to solve multi-objective optimization problems. Despite numerous successful applications, several studies have shown that the NSGA-II is less effective for larger numbers of objectives. In this work, we use mathematical runtime analyses to rigorously demonstrate and quantify this phenomenon. We show that even on the simple m-objective generalization of the discrete OneMinMax benchmark, where every solution is Pareto optimal, the NSGA-II also with large population sizes cannot compute the full Pareto front (objective vectors of all Pareto optima) in sub-exponential time when the number of objectives is at least three. The reason for this unexpected behavior lies in the fact that in the computation of the crowding distance, the different objectives are regarded independently. This is not a problem for two objectives, where any sorting of a pair-wise incomparable set of solutions according to one objective is also such a sorting according to the other objective (in the inverse order).
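The point about objectives being "regarded independently" can be illustrated with a minimal sketch of the standard crowding-distance computation. This is an illustrative reimplementation, not the authors' code; the function name `crowding_distance` and the tuple-based representation of objective vectors are assumptions made for this example.

```python
from math import inf


def crowding_distance(points):
    """Crowding distance for each objective vector in `points`.

    `points` is a list of equal-length tuples (hypothetical representation).
    Each objective contributes separately: the set is sorted by that
    objective alone, and a point receives the normalized gap between its
    two neighbours in that one-dimensional ordering.
    """
    n = len(points)
    if n == 0:
        return []
    m = len(points[0])
    dist = [0.0] * n
    for k in range(m):  # objectives are processed one at a time
        order = sorted(range(n), key=lambda i: points[i][k])
        lo, hi = points[order[0]][k], points[order[-1]][k]
        # Boundary points of every objective get infinite distance.
        dist[order[0]] = dist[order[-1]] = inf
        if hi == lo:
            continue  # all values equal: no finite contribution
        for pos in range(1, n - 1):
            i = order[pos]
            gap = points[order[pos + 1]][k] - points[order[pos - 1]][k]
            dist[i] += gap / (hi - lo)
    return dist
```

With two objectives and pairwise incomparable points, sorting by the first objective is the reverse of sorting by the second, so both orderings agree on who the neighbours are. With three or more objectives the per-objective orderings are no longer tied together in this way, which is the situation the abstract analyzes.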
TL;DR: A novel DMOA based on HRS is proposed to handle dynamic multiobjective optimization problems effectively. The algorithm integrates diversity-, memory-, and prediction-based methods to make flexible responses to environmental changes. It outperforms popular baseline DMOAs and other state-of-the-art DMOAs in terms of convergence and diversity.
Abstract: In this article, a novel dynamic multiobjective optimization algorithm (DMOA) is proposed based on a designed hierarchical response system (HRS). Named HRS-DMOA, the proposed algorithm mainly aims at integrating merits from the mainstream ideas of dynamic behavior handling (i.e., the diversity-, memory-, and prediction-based methods) in order to make flexible responses to environmental changes. In particular, by two predefined thresholds, the environmental changes are quantified as three levels. In case of a slight environmental change, the previous Pareto set-based refinement strategy is recommended, while the diversity-based reinitialization method is applied in case of a dramatic environmental change. For changes occurring at a medium level, the transfer-learning-based response is adopted to make full use of the historical searching experiences. The proposed HRS-DMOA is comprehensively evaluated on a series of benchmark functions, and the results show improved overall performance compared with four popular baseline DMOAs in terms of both convergence and diversity. It also outperforms two other state-of-the-art DMOAs in ten out of 14 testing cases, exhibiting the competitiveness and superiority of the algorithm. Finally, extensive ablation studies are carried out, and the results show that, compared with randomly selecting the response methods, the proposed HRS enables more reasonable and efficient responses in most cases. In addition, the generalization ability of the proposed HRS as a flexible plug-and-play module to handle dynamic behaviors is demonstrated as well.
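The three-level dispatch described above can be sketched as follows. This is only an illustration of the threshold logic: the abstract does not say how change severity is measured or how boundary values are handled, so the scalar `change_severity`, the threshold names, the boundary convention, and the returned labels are all assumptions.

```python
def select_response(change_severity, low_threshold, high_threshold):
    """Pick a response strategy from a scalar change severity.

    Hypothetical convention: severity below `low_threshold` counts as
    slight, severity at or above `high_threshold` counts as dramatic,
    and everything in between counts as medium.
    """
    if low_threshold > high_threshold:
        raise ValueError("low_threshold must not exceed high_threshold")
    if change_severity < low_threshold:
        # Slight change: refine the previous Pareto set.
        return "refine_previous_pareto_set"
    if change_severity < high_threshold:
        # Medium change: reuse historical search experience.
        return "transfer_learning"
    # Dramatic change: reinitialize for diversity.
    return "diversity_reinitialization"
```

For example, with thresholds 0.1 and 0.5, a severity of 0.3 would select the transfer-learning response.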
TL;DR: Traffic flow prediction model based on multimodal data in cloud computing improves traffic flow prediction accuracy and provides reliable tools for real-time data analysis and traffic management decisions.
Abstract: This study uses a cloud computing platform to process multi-modal data and constructs a traffic flow prediction model based on an LSTM neural network by integrating data from multiple dimensions such as traffic flow, occupancy and speed. In the process of model construction, we fully consider the hourly characteristics and hysteresis characteristics, and carry out fine scaling and splitting of the data to improve the accuracy and generalization ability of the model. The experimental results show that our model outperforms the baseline on both the training set and the test set, which verifies its effectiveness in traffic flow prediction. By keeping our models in a cloud environment, we provide reliable tools and support for future real-time data analysis and traffic management decisions. This study provides an important reference for the development of traffic management systems based on cloud computing, and also provides new ideas and methods for other fields to solve practical problems by using multi-modal data.
TL;DR: Effective image tampering localization with multi-scale ConvNeXt feature fusion achieves high accuracy and robustness.
Abstract: With the widespread use of powerful image editing tools, image tampering has become easy and realistic. Existing image forensic methods still face challenges of low generalization performance and robustness. In this letter, we propose an effective image tampering localization scheme based on a ConvNeXt encoder and multi-scale Feature Fusion (ConvNeXtFF). Stacked ConvNeXt blocks are utilized as an encoder to capture hierarchical multi-scale features, which are then fused in the decoder to locate tampered pixels accurately. A combined loss function and effective data augmentation strategies are adopted to further improve the model performance. Extensive experimental results show that both the localization accuracy and the robustness of the ConvNeXtFF scheme exceed those of other state-of-the-art schemes. The source code is available at https://github.com/multimediaFor/ConvNeXtFF.
TL;DR: The generalization performance of quantum machine learning models is not well-explained by traditional approaches and is fundamentally different from classical models.
Abstract: Quantum machine learning models have shown successful generalization performance even when trained with few data. In this work, through systematic randomization experiments, we show that traditional approaches to understanding generalization fail to explain the behavior of such quantum models. Our experiments reveal that state-of-the-art quantum neural networks accurately fit random states and random labeling of training data. This ability to memorize random data defies current notions of small generalization error, problematizing approaches that build on complexity measures such as the VC dimension, the Rademacher complexity, and all their uniform relatives. We complement our empirical results with a theoretical construction showing that quantum neural networks can fit arbitrary labels to quantum states, hinting at their memorization ability. Our results do not preclude the possibility of good generalization with few training data but rather rule out any possible guarantees based only on the properties of the model family. These findings expose a fundamental challenge in the conventional understanding of generalization in quantum machine learning and highlight the need for a paradigm shift in the study of quantum models for machine learning tasks.
TL;DR: Multi-task aquatic toxicity prediction model based on multi-level features fusion accurately predicts aquatic toxicity across four fish species, outperforming single-task learning and previous algorithms.
Abstract: With the escalating menace of organic compounds in environmental pollution imperiling the survival of aquatic organisms, the investigation of organic compound toxicity across diverse aquatic species assumes paramount significance for environmental protection. Understanding how different species respond to these compounds helps assess the potential ecological impact of pollution on aquatic ecosystems as a whole. Compared with traditional experimental methods, deep learning methods have higher accuracy in predicting aquatic toxicity, faster data processing speed and better generalization ability. This article presents ATFPGT-multi, an advanced multi-task deep neural network prediction model for organic toxicity. The model integrates molecular fingerprints and molecule graphs to characterize molecules, enabling the simultaneous prediction of acute toxicity for the same organic compound across four distinct fish species. Furthermore, to validate the advantages of multi-task learning, we independently construct prediction models, named ATFPGT-single, for each fish species. We employ cross-validation in our experiments to assess the performance and generalization ability of ATFPGT-multi. The experimental results indicate, first, that ATFPGT-multi outperforms ATFPGT-single on four fish datasets with AUC improvements of 9.8%, 4%, 4.8%, and 8.2%, respectively, demonstrating the superiority of multi-task learning over single-task learning. Furthermore, in comparison with previous algorithms, ATFPGT-multi outperforms comparative methods, emphasizing that our approach exhibits higher accuracy and reliability in predicting aquatic toxicity. Moreover, ATFPGT-multi utilizes attention scores to identify molecular fragments associated with fish toxicity in organic molecules, as demonstrated by two organic molecule examples in the main text, demonstrating the interpretability of ATFPGT-multi. 
In summary, ATFPGT-multi provides important support and reference for the further development of aquatic toxicity assessment. All code and datasets are freely available online at https://github.com/zhaoqi106/ATFPGT-multi.
TL;DR: Deep-learning-based lithium battery defect detection via cross-domain generalization achieves high accuracy using a novel approach incorporating cross-domain augmentation, multi-task learning, and iteration learning.
Abstract: This research addresses the critical challenge of classifying surface defects in lithium electronic components, crucial for ensuring the reliability and safety of lithium batteries. With a scarcity of specific defect data, we introduce an innovative Cross-Domain Generalization (CDG) approach, incorporating Cross-domain Augmentation, Multi-task Learning, and Iteration Learning. Leveraging a steel surface defect dataset as foundational knowledge, our approach compensates for the limited lithium-specific data and enhances model generalization. We also introduce the Lithium Electronic Surface Defect Classification (LESDC) dataset, demonstrating significant accuracy improvements over baseline methods. Our comprehensive evaluation covers model interpretability, robustness, and adaptability. Beyond battery technology, this methodology offers a framework for data scarcity challenges in various industries, emphasizing the importance of adaptable learning methods.
TL;DR: This study proposes DeepAIPs-Pred, a novel computational model predicting anti-inflammatory peptides using local evolutionary transformation images, structural embedding, and self-normalized bidirectional temporal convolutional networks, achieving 94.92% accuracy and 0.97 AUC.
Abstract: Inflammation is a biological response to harmful stimuli, playing a crucial role in facilitating tissue repair by eradicating pathogenic microorganisms. However, when inflammation becomes chronic, it leads to numerous serious disorders, particularly in autoimmune diseases. Anti-inflammatory peptides (AIPs) have emerged as promising therapeutic agents due to their high specificity, potency, and low toxicity. However, identifying AIPs using traditional in vivo methods is time-consuming and expensive. Recent advancements in computational-based intelligent models for peptides have offered a cost-effective alternative for identifying various inflammatory diseases, owing to their selectivity toward targeted cells with low side effects. In this paper, we propose a novel computational model, namely, DeepAIPs-Pred, for the accurate prediction of AIP sequences. The training samples are represented using LBP-PSSM- and LBP-SMR-based evolutionary image transformation methods. Additionally, to capture contextual semantic features, we employed attention-based ProtBERT-BFD embedding and QLC for structural features. Furthermore, differential evolution (DE)-based weighted feature integration is utilized to produce a multiview feature vector. The SMOTE-Tomek Links are introduced to address the class imbalance problem, and a two-layer feature selection technique is proposed to reduce and select the optimal features. Finally, the novel self-normalized bidirectional temporal convolutional networks (SnBiTCN) are trained using optimal features, achieving a significant predictive accuracy of 94.92% and an AUC of 0.97. The generalization of our proposed model is validated using two independent datasets, demonstrating higher performance with the improvement of ∼2 and ∼10% of accuracies than the existing state-of-the-art model using Ind-I and Ind-II, respectively. 
The efficacy and reliability of DeepAIPs-Pred highlight its potential as a valuable and promising tool for drug development and academic research.
TL;DR: Aligning LLMs with diverse user preferences via system message generalization. A new paradigm where users specify their values within the system message to steer the LLM's behavior. The Multifaceted Collection dataset and the Janus LLM achieve high alignment rates on various benchmarks.
Abstract: Although humans inherently have diverse values, current large language model (LLM) alignment methods often assume that aligning LLMs with the general public's preferences is optimal. A major challenge in adopting a more individualized approach to LLM alignment is its lack of scalability, as it involves repeatedly acquiring preference data and training new reward models and LLMs for each individual's preferences. To address these challenges, we propose a new paradigm where users specify what they value most within the system message, steering the LLM's generation behavior to better align with the user's intentions. However, a naive application of such an approach is non-trivial since LLMs are typically trained on a uniform system message (e.g., "You are a helpful assistant") which limits their ability to generalize to diverse, unseen system messages. To improve this generalization, we create the Multifaceted Collection, a preference dataset with 192k combinations of values beyond generic helpfulness and harmlessness, spanning 65k user instructions. Using this dataset, we train a 7B LLM called Janus and test it on 921 prompts from 5 benchmarks (AlpacaEval 2.0, FLASK, Koala, MT-Bench, and Self-Instruct) by adding various unseen system messages that reflect user preferences. Janus achieves tie+win rate of 75.2%, 72.4%, and 66.4% against Mistral 7B Instruct v0.2, GPT-3.5 Turbo, and GPT-4, respectively. Unexpectedly, on three benchmarks focused on response helpfulness (AlpacaEval 2.0, MT-Bench, Arena Hard Auto v0.1), Janus also outperforms LLaMA 3 8B Instruct by a +4.0%, +0.1%, +3.0% margin, underscoring that training with a vast array of system messages could also enhance alignment to the general public's preference as well. Our code, dataset, benchmark, and models are available at https://github.com/kaistAI/Janus.
TL;DR: Analysis of VQA datasets via design of minimalistic video quality models reveals the limitations of current progress and provides insights into dataset and model design.
Abstract: Blind video quality assessment (BVQA) plays an indispensable role in monitoring and improving the end-users' viewing experience in various real-world video-enabled media applications. As an experimental field, the improvements of BVQA models have been measured primarily on a few human-rated VQA datasets. Thus, it is crucial to gain a better understanding of existing VQA datasets in order to properly evaluate the current progress in BVQA. Towards this goal, we conduct a first-of-its-kind computational analysis of VQA datasets via designing minimalistic BVQA models. By minimalistic, we restrict our family of BVQA models to build only upon basic blocks: a video preprocessor (for aggressive spatiotemporal downsampling), a spatial quality analyzer, an optional temporal quality analyzer, and a quality regressor, all with the simplest possible instantiations. By comparing the quality prediction performance of different model variants on eight VQA datasets with realistic distortions, we find that nearly all datasets suffer from the easy dataset problem of varying severity, some of which even admit blind image quality assessment (BIQA) solutions. We additionally justify our claims by comparing our model generalization capabilities on these VQA datasets, and by ablating a dizzying set of BVQA design choices related to the basic building blocks. Our results cast doubt on the current progress in BVQA, and meanwhile shed light on good practices of constructing next-generation VQA datasets and models.
TL;DR: One model to unite them all: pFLSynth personalizes federated learning of multi-contrast MRI synthesis to address data heterogeneity.
Abstract: Curation of large, diverse MRI datasets via multi-institutional collaborations can help improve learning of generalizable synthesis models that reliably translate source- onto target-contrast images. To facilitate collaborations, federated learning (FL) adopts decentralized model training while mitigating privacy concerns by avoiding sharing of imaging data. However, conventional FL methods can be impaired by the inherent heterogeneity in the data distribution, with domain shifts evident within and across imaging sites. Here we introduce the first personalized FL method for MRI Synthesis (pFLSynth) that improves reliability against data heterogeneity via model specialization to individual sites and synthesis tasks (i.e., source-target contrasts). To do this, pFLSynth leverages an adversarial model equipped with novel personalization blocks that control the statistics of generated feature maps across the spatial/channel dimensions, given latent variables specific to sites and tasks. To further promote communication efficiency and site specialization, partial network aggregation is employed over later generator stages while earlier generator stages and the discriminator are trained locally. As such, pFLSynth enables multi-task training of multi-site synthesis models with high generalization performance across sites and tasks. Comprehensive experiments demonstrate the superior performance and reliability of pFLSynth in MRI synthesis against prior federated methods.
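The partial network aggregation described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not pFLSynth's implementation: parameters are plain lists of floats keyed by name, the key prefixes (`gen.late.` and so on) are hypothetical, and the shared parameters are combined with an unweighted mean.

```python
def partial_aggregate(site_states, shared_prefixes):
    """Average only the parameters whose names start with a shared prefix.

    `site_states` is a list of dicts mapping parameter names to lists of
    floats, one dict per site (hypothetical representation). Parameters
    outside `shared_prefixes` stay site-specific and are not exchanged.
    """
    prefixes = tuple(shared_prefixes)
    n_sites = len(site_states)
    shared_keys = [k for k in site_states[0] if k.startswith(prefixes)]
    averaged = {
        k: [sum(vals) / n_sites for vals in zip(*(s[k] for s in site_states))]
        for k in shared_keys
    }
    # Each site keeps its local parameters and receives a copy of the
    # averaged shared parameters.
    return [
        {**state, **{k: list(v) for k, v in averaged.items()}}
        for state in site_states
    ]
```

A usage example with two sites, where only the later generator stage is shared:

```python
site_a = {"gen.early.w": [1.0, 2.0], "gen.late.w": [0.0, 4.0], "disc.w": [5.0]}
site_b = {"gen.early.w": [3.0, 4.0], "gen.late.w": [2.0, 0.0], "disc.w": [7.0]}
new_a, new_b = partial_aggregate([site_a, site_b], ["gen.late."])
# Both sites now hold the same "gen.late.w"; the early stage and the
# discriminator remain local to each site.
```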
TL;DR: Optimal Auctions through Deep Learning: Advances in Differentiable Economics. Optimal auction design using deep learning techniques to maximize expected revenue.
Abstract: Designing an incentive compatible auction that maximizes expected revenue is an intricate task. The single-item case was resolved in a seminal piece of work by Myerson in 1981, but more than 40 years later, a full analytical understanding of the optimal design still remains elusive for settings with two or more items. In this work, we initiate the exploration of the use of tools from deep learning for the automated design of optimal auctions. We model an auction as a multi-layer neural network, frame optimal auction design as a constrained learning problem, and show how it can be solved using standard machine learning pipelines. In addition to providing generalization bounds, we present extensive experimental results, recovering essentially all known solutions that come from the theoretical analysis of optimal auction design problems and obtaining novel mechanisms for settings in which the optimal mechanism is unknown.
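To make the constrained-learning framing more concrete, the sketch below computes one quantity that such a formulation might constrain: the ex post regret of a bidder, meaning the most utility the bidder could gain by misreporting instead of bidding truthfully. This is an assumption made for illustration. The abstract does not specify how incentive compatibility is measured, and the code uses two textbook single-item auctions rather than a neural network. All function names and the misreport grid are hypothetical.

```python
from typing import Callable, List, Tuple

# A mechanism maps a list of bids to (winner index, payment by the winner).
Mechanism = Callable[[List[float]], Tuple[int, float]]


def second_price(bids: List[float]) -> Tuple[int, float]:
    """Highest bidder wins and pays the highest competing bid."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    others = [b for i, b in enumerate(bids) if i != winner]
    return winner, (max(others) if others else 0.0)


def first_price(bids: List[float]) -> Tuple[int, float]:
    """Highest bidder wins and pays their own bid."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    return winner, bids[winner]


def utility(mech: Mechanism, values: List[float], bids: List[float], i: int) -> float:
    """Quasi-linear utility of bidder i: value minus payment if i wins, else 0."""
    winner, payment = mech(bids)
    return values[i] - payment if winner == i else 0.0


def ex_post_regret(mech: Mechanism, values: List[float], i: int, grid: List[float]) -> float:
    """Best gain for bidder i from misreporting on a grid, others bidding truthfully."""
    truthful = utility(mech, values, list(values), i)
    best = truthful
    for misreport in grid:
        bids = list(values)
        bids[i] = misreport
        best = max(best, utility(mech, values, bids, i))
    return best - truthful


values = [0.8, 0.5]
grid = [k / 10 for k in range(11)]  # candidate misreports 0.0, 0.1, ..., 1.0

regret_second = ex_post_regret(second_price, values, 0, grid)
regret_first = ex_post_regret(first_price, values, 0, grid)
```

On this toy profile the second-price auction gives zero regret, consistent with truthful bidding being a dominant strategy there, while the first-price auction gives positive regret because the winner gains by shading their bid. A learning pipeline of the kind described could, under this assumption, maximize revenue while driving an estimate of this regret toward zero.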
TL;DR: Optimized self-supervised learning strategies for unlabeled data enhance accuracy and generalization capabilities, leading to improved model training.
Abstract: This study explores optimization strategies for self-supervised learning in the use of unlabeled data. By deeply analyzing existing research, we propose a novel method that significantly enhances the performance of algorithms on unlabeled data, achieving improved accuracy and generalization capabilities. Our method is validated across multiple datasets, demonstrating superior performance compared to traditional approaches. We also discuss how to optimize self-supervised learning strategies in the use of unlabeled data. Through improvements and optimizations of self-supervised learning algorithms, we introduce a new method for effectively utilizing unlabeled data for model training. Experimental results show significant performance improvements across various datasets, highlighting the method's robust generalization ability. This research is significant for advancing self-supervised learning technologies, providing valuable insights for related fields.
TL;DR: A framework for simulating quantum dynamics using quantum machine learning with provable generalization bounds. The algorithm is resource efficient and exhibits efficient scaling with problem size.
Abstract: Much attention has been paid to dynamical simulation and quantum machine learning (QML) independently as applications for quantum advantage, while the possibility of using QML to enhance dynamical simulations has not been thoroughly investigated. Here we develop a framework for using QML methods to simulate quantum dynamics on near-term quantum hardware. We use generalization bounds, which bound the error a machine learning model makes on unseen data, to rigorously analyze the training data requirements of an algorithm within this framework. Our algorithm is thus resource efficient in terms of qubit and data requirements. Furthermore, our preliminary numerics for the XY model exhibit efficient scaling with problem size, and we simulate 20 times longer than Trotterization on IBMQ-Bogota.
TL;DR: Utilizing data science and AI for customer churn prediction in marketing yields high accuracy using ensemble learning algorithms, particularly CatBoost, which outperforms single algorithms and showcases superior generalization capabilities.
Abstract: This study explores the application of data science and AI techniques in predicting customer churn within the telecommunications industry, a sector characterized by intense competition and high customer turnover rates. By analyzing historical customer data, including usage patterns and service preferences, the study aims to identify factors contributing to churn and propose targeted retention strategies to mitigate losses. Traditional classification algorithms and ensemble techniques are evaluated using the Telecom-Customer-Churn dataset, with emphasis on the underutilized Stacking ensemble method. The results demonstrate that ensemble learning algorithms, particularly the Stacking model, outperform single algorithms, with CatBoost exhibiting the highest accuracy at 0.8119, followed closely by RandomForest at 0.7902 and XGBoost at 0.7820. These findings underscore CatBoost's superior generalization capabilities, likely attributed to its adept handling of categorical features and missing values, and its ability to model complex data relationships. The study contributes to advancing understanding of ensemble models and offers valuable insights for predicting telecom customer churn, thereby aiding in the development of effective retention strategies and enhancing customer satisfaction and loyalty.
TL;DR: Selective knowledge sharing for privacy-preserving federated distillation without a good teacher significantly improves the generalization capabilities of the framework.
Abstract: While federated learning (FL) is promising for efficient collaborative learning without revealing local data, it remains vulnerable to white-box privacy attacks, suffers from high communication overhead, and struggles to adapt to heterogeneous models. Federated distillation (FD) emerges as an alternative paradigm to tackle these challenges, which transfers knowledge among clients instead of model parameters. Nevertheless, challenges arise due to variations in local data distributions and the absence of a well-trained teacher model, which leads to misleading and ambiguous knowledge sharing that significantly degrades model performance. To address these issues, this paper proposes a selective knowledge sharing mechanism for FD, termed Selective-FD, to identify accurate and precise knowledge from local and ensemble predictions, respectively. Empirical studies, backed by theoretical insights, demonstrate that our approach enhances the generalization capabilities of the FD framework and consistently outperforms baseline methods. We anticipate our study to enable a privacy-preserving, communication-efficient, and heterogeneity-adaptive federated training framework.
TL;DR: Deep learning is a powerful tool for complex data mining analysis and pattern recognition. It introduces basic concepts, explores application methods, analyzes challenges, and proposes future trends and research directions.
Abstract: This paper systematically investigates and discusses the application of deep learning in complex data mining analysis and pattern recognition. Firstly, it introduces the basic concepts of deep learning and commonly used models, including artificial neural networks, convolutional neural networks, and recurrent neural networks. Then, it elaborates on the application methods and techniques of deep learning in mining different types of complex data (such as images, text, time series, etc.), and explores the latest research progress in the field of pattern recognition. Furthermore, it analyzes the challenges faced by deep learning in practical applications, such as data scarcity and model generalization capabilities, and proposes future development trends and research directions. Finally, it summarizes the research content and significance of this paper, emphasizing the importance and application prospects of deep learning in the field of complex data mining and pattern recognition.
TL;DR: This survey paper explores meta-learning for domain generalization, introducing a taxonomy and decision graph to navigate methodologies, and provides a comprehensive review of existing methods and theories, highlighting promising research directions for improving DNNs' performance with out-of-distribution data.
Abstract: Deep neural networks (DNNs) have revolutionized artificial intelligence but often lack performance when faced with out-of-distribution data, a common scenario due to the inevitable domain shifts in real-world applications. This limitation stems from the common assumption that training and testing data share the same distribution, an assumption frequently violated in practice. Despite their effectiveness with large amounts of data and computational power, DNNs struggle with distributional shifts and limited labeled data, leading to overfitting and poor generalization across various tasks and domains. Meta-learning presents a promising approach by employing algorithms that acquire transferable knowledge across various tasks for fast adaptation, eliminating the need to learn each task from scratch. This survey paper delves into the realm of meta-learning with a focus on its contribution to domain generalization. We first clarify the concept of meta-learning for domain generalization and introduce a novel taxonomy based on the feature extraction strategy and the classifier learning methodology, offering a granular view of methodologies. Additionally, we present a decision graph to assist readers in navigating the taxonomy based on data availability and domain shifts, enabling them to select and develop a proper model tailored to their specific problem requirements. Through an exhaustive review of existing methods and underlying theories, we map out the fundamentals of the field. Our survey provides practical insights and an informed discussion on promising research directions.