Network-based group variable selection for detecting expression quantitative trait loci (eQTL).
Weichen Wang, Xuegong Zhang +1 more
TL;DR: The proposed network-based group variable selection (NGVS) method for QTL detection outperforms the classical Lasso method and is appropriate for problems with high-dimensional data and a high-noise background.
Abstract: Analysis of expression quantitative trait loci (eQTL) aims to identify the genetic loci associated with the expression levels of genes. Penalized regression with a suitable penalty is well suited to such high-dimensional biological data, and its performance should improve when biological knowledge of the gene expression network and the linkage disequilibrium (LD) structure between loci is incorporated, particularly against a high-noise background. We propose a network-based group variable selection (NGVS) method for QTL detection. The method simultaneously maps highly correlated expression traits that share the same biological function to marker sets formed by LD. Grouping markers allows the complex joint activity of multiple SNPs to be considered and reduces the dimensionality of the eQTL problem dramatically. To demonstrate the power and flexibility of the method, we applied it to two simulations and to a mouse obesity and diabetes dataset. We considered the gene co-expression network, grouped markers into marker sets, and treated the additive and dominant effects of each locus as a group; as a consequence, we were able to replicate results previously obtained on the mouse linkage dataset. Furthermore, we observed several possible sex-dependent loci and interactions of multiple SNPs. The proposed NGVS method is appropriate for problems with high-dimensional data and a high-noise background. On the eQTL problem, it outperforms the classical Lasso method, which does not consider biological knowledge. Introducing appropriate gene expression and loci correlation information makes the detection of causal markers more accurate. With reasonable model settings, NGVS can lead to novel biological findings.
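The idea of treating the additive and dominant effects of each locus as one group can be illustrated with a plain group lasso. The sketch below is not the NGVS method itself: it omits the co-expression network and the LD-based marker sets, and the function names, the 0/1/2 genotype coding, the penalty weight `lam`, and the simulated data are all assumptions made for illustration. It fits the model by proximal gradient descent, so each marker's two coefficients are either both kept or both set to zero.

```python
import numpy as np

def encode_additive_dominant(genotypes):
    """Turn genotype codes 0/1/2 into one additive and one dominance column per marker."""
    g = np.asarray(genotypes, dtype=float)
    n, m = g.shape
    X = np.empty((n, 2 * m))
    X[:, 0::2] = g - 1.0                  # additive effect: -1, 0, 1
    X[:, 1::2] = (g == 1).astype(float)   # dominance effect: 1 for heterozygotes
    # Columns 2j and 2j+1 form the group for marker j.
    groups = [np.array([2 * j, 2 * j + 1]) for j in range(m)]
    return X, groups

def group_soft_threshold(v, t):
    """Shrink the whole vector v toward zero; return exactly zero if its norm is at most t."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v

def group_lasso(X, y, groups, lam, n_iter=500):
    """Minimize (1/2n)||y - Xb||^2 + lam * sum_g sqrt(p_g) * ||b_g||_2 by proximal gradient."""
    n, p = X.shape
    step = n / (np.linalg.norm(X, 2) ** 2)  # inverse Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        for idx in groups:
            beta[idx] = group_soft_threshold(z[idx], step * lam * np.sqrt(len(idx)))
    return beta

# Simulated example (hypothetical data): only marker 0 affects the trait.
rng = np.random.default_rng(0)
G = rng.integers(0, 3, size=(200, 10))
X, groups = encode_additive_dominant(G)
X = X - X.mean(axis=0)
y = 1.5 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.standard_normal(200)
y = y - y.mean()

beta = group_lasso(X, y, groups, lam=0.2)
selected = [j for j, idx in enumerate(groups) if np.linalg.norm(beta[idx]) > 0]
print("selected markers:", selected)
```

Because the penalty acts on the Euclidean norm of each two-coefficient block, selection happens at the marker level rather than per coefficient, which is the behavior the abstract attributes to grouping. An ordinary Lasso would instead penalize the additive and dominance coefficients independently.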
Citations
Group variable selection and estimation in the tobit censored response model
TL;DR: The proposed method selects the variables that contribute significantly to the regression model and gives consistent estimates of the parameters in the selected groups; the asymptotic properties of the resulting estimates are similar to oracle properties.
On predicting regulatory genes by analysis of functional networks in C. elegans
O. Valba, Sergei Nechaev, Mark G. Sterken, L. Basten Snoek, Jan E. Kammenga, Olga O. Vasieva +8 more
TL;DR: It is found that genes associated with eQTLs are highly clustered in a C. elegans co-expression sub-network, and their adjacent genetic interactions provide the optimal functional connectivity environment for application of the new SPF-based algorithm.
Expression QTLs Mapping and Analysis: A Bayesian Perspective.
TL;DR: The advantages of the Bayesian approach over frequentist methods are reviewed and an empirical example of polygenic eQTL mapping is provided to illustrate the different properties of frequentist and Bayesian methods.
Interpreting Functional Impact of Genetic Variations by Network QTL for Genotype–Phenotype Association Study
K. Yuan, Tao Zeng, Luonan Chen +2 more
TL;DR: The results show that the nQTL framework can efficiently capture associations between SNPs and network traits (i.e., edge traits) in various simulated data scenarios, compared with traditional eQTL methods and could also identify many network traits from human bulk expression data, validated by matched single-cell RNA-seq data in an independent or unsupervised manner.
•Dissertation
Statistical analysis of networks and biophysical systems of complex architecture
Olga Valba
- 15 Oct 2013
TL;DR: In this article, the authors developed a set of methods for studying statistical and dynamic objects of complex architecture and, in particular, scale-free structures, which have no characteristic spatial and/or time scale.
References
Regression Shrinkage and Selection via the Lasso
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Gene Ontology: tool for the unification of biology
M Ashburner, Catherine A. Ball, Judith A. Blake, David Botstein, Heather Butler, J. M. Cherry, Allan Peter Davis, Kara Dolinski, Selina S. Dwight, J.T. Eppig, Midori A. Harris, David P. Hill, Laurie Issel-Tarver, Andrew Kasarskis, Suzanna E. Lewis, John C. Matese, Joel E. Richardson, M. Ringwald, Gerald M. Rubin, Gavin Sherlock +19 more
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
KEGG: Kyoto Encyclopedia of Genes and Genomes
Minoru Kanehisa, Susumu Goto +1 more
TL;DR: The Kyoto Encyclopedia of Genes and Genomes (KEGG) as discussed by the authors is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules.
Regularization and variable selection via the elastic net
Hui Zou, Trevor Hastie +1 more
TL;DR: It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS‐EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
Exploration, normalization, and summaries of high density oligonucleotide array probe level data
Rafael A. Irizarry, Bridget G. Hobbs, Francois Collin, Yasmin Beazer-Barclay, Kristen J. Antonellis, Uwe Scherf, Terence P. Speed +6 more
TL;DR: Exploratory analyses of the probe-level data motivate a new summary measure, a robust multi-array average (RMA) of background-adjusted, normalized, and log-transformed PM values, and there is no obvious downside to using RMA and attaching a standard error (SE) to this quantity using a linear model that removes probe-specific affinities.