Data Integration Challenges for Machine Learning in Precision Medicine
TL;DR: A number of challenges arise in the design of successful designs to medical data analytics under currently demanding conditions of performance in personalized medicine, while also subject to time, computational power, and bioethical constraints as discussed by the authors .
read more
Abstract: A main goal of Precision Medicine is that of incorporating and integrating the vast corpora on different databases about the molecular and environmental origins of disease, into analytic frameworks, allowing the development of individualized, context-dependent diagnostics, and therapeutic approaches. In this regard, artificial intelligence and machine learning approaches can be used to build analytical models of complex disease aimed at prediction of personalized health conditions and outcomes. Such models must handle the wide heterogeneity of individuals in both their genetic predisposition and their social and environmental determinants. Computational approaches to medicine need to be able to efficiently manage, visualize and integrate, large datasets combining structure, and unstructured formats. This needs to be done while constrained by different levels of confidentiality, ideally doing so within a unified analytical architecture. Efficient data integration and management is key to the successful application of computational intelligence approaches to medicine. A number of challenges arise in the design of successful designs to medical data analytics under currently demanding conditions of performance in personalized medicine, while also subject to time, computational power, and bioethical constraints. Here, we will review some of these constraints and discuss possible avenues to overcome current challenges.
read more
Chat with Paper
AI Agents for this Paper
Find similar papers on Google Scholar, PubMed and Arxiv
Write a critical review of this paper
Analyze citations of this paper to find unaddressed research gaps
Citations
Advancing Precision Medicine: A Review of Innovative In Silico Approaches for Drug Development, Clinical Pharmacology and Personalized Healthcare
Lara Marques,Bárbara Costa,M. Fernando R. Pereira,Abigail Silva,Julia M. Santos,Leonor Saldanha,Isabel Silva,Paulo J. Magalhães,Stephan Schmidt,Nuno Vale +9 more
TL;DR: In silico approaches are revolutionizing precision medicine by driving individual healthcare through tailored diagnostics and treatments based on genetic variations, environmental factors, and lifestyle.
58
Data analytics in healthcare: A review of patient-centric approaches and healthcare delivery
Andrew Ifesinachi Daraojimba,Chidera Victoria,Ibeh,Oluwafunmi Adijat Elufioye,Temidayo Olorunsogo,Onyeka Franca Asuzu,Ndubuisi Leonard Nduubuisi +6 more
TL;DR: This review explores patient-centric data analytics in healthcare, leveraging EHRs, wearable devices, and advanced analytics to enhance patient care, optimize healthcare delivery, and predict disease outcomes, while addressing ethical considerations and challenges.
24
Omics-Based Investigations of Breast Cancer
TL;DR: Omics-and epiomics-based multidimensional approaches provide opportunities to gain insights into BC heterogeneity and its underlying mechanisms in predictive, precision, and personalized oncology as discussed by the authors .
AI-Based Decision Support Systems in Industry 4.0, A Review
Mohsen Soori,Fooad Karimi Ghaleh Jough,Roza Dastres,Behrooz Arezoo +3 more
17
Improving child health through Big Data and data science
TL;DR: In this article , the authors highlight existing pediatric Big Data initiatives and identify areas of future research, highlighting the need for new strategies that intentionally address institutional, ethical, regulatory, cultural, technical, and systemic barriers as well as developing partnerships with children and families from diverse backgrounds.
References
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Stephen F. Altschul,Thomas L. Madden,Alejandro A. Schäffer,Jinghui Zhang,Zheng Zhang,Webb Miller,David J. Lipman +6 more
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks
Paul Shannon,Andrew Markiel,Owen Ozier,Nitin S. Baliga,Jonathan T. Wang,Daniel Ramage,Nada Amin,Benno Schwikowski,Trey Ideker +8 more
TL;DR: Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
Gene Ontology: tool for the unification of biology
M Ashburner,Catherine A. Ball,Judith A. Blake,David Botstein,Heather Butler,J. M. Cherry,Allan Peter Davis,Kara Dolinski,Selina S. Dwight,J.T. Eppig,Midori A. Harris,David P. Hill,Laurie Issel-Tarver,Andrew Kasarskis,Suzanna E. Lewis,John C. Matese,Joel E. Richardson,M. Ringwald,Gerald M. Rubin,Gavin Sherlock +19 more
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses
Shaun Purcell,Shaun Purcell,Benjamin M. Neale,Benjamin M. Neale,Kathe Todd-Brown,Lori Thomas,Manuel A. R. Ferreira,David Bender,David Bender,Julian Maller,Julian Maller,Pamela Sklar,Pamela Sklar,Paul I.W. de Bakker,Paul I.W. de Bakker,Mark J. Daly,Mark J. Daly,Pak C. Sham +17 more
TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation, which focuses on the estimation and use of identity- by-state and identity/descent information in the context of population-based whole-genome studies.
33.2K
The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data
Aaron McKenna,Matthew Hanna,Eric Banks,Andrey Sivachenko,Kristian Cibulskis,Andrew Kernytsky,Kiran V. Garimella,David Altshuler,Stacey Gabriel,Mark J. Daly,Mark A. DePristo +10 more
TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.