Top 68 papers published in the topic of Linear model in 2025

Showing papers on "Linear model" published in 2025

Journal Article•10.1017/psy.2024.2•

Deriving models of change with interpretable parameters: linear estimation with nonlinear inference

Ethan M. McCormick

03 Jan 2025-Psychometrika

Repository•10.1177/15578666251370766•

Separating Biological Variance from Noise by Applying Expectation-Maximization Algorithm to Modified General Linear Model.

Tien‐Wen Lee

5 Sep 2025

TL;DR: A modified General Linear Model (GLM) with Expectation-Maximization (EM) algorithm, EMSEV, distinguishes biological variance from noise, outperforming traditional GLM, with promising applications in biological science and statistical inference, despite deviations in noise estimation at similar variance levels.

Abstract: The general linear model (GLM) has been widely used in research, where the error term has been treated as noise. However, compelling evidence suggests that in biological systems, the target variables may possess their own innate variances. A modified GLM was proposed to explicitly model biological variance and nonbiological noise. An expectation-maximization (EM) scheme, termed EMSEV (EM for separating variances), can distinguish biological variance from noise. The performance of EMSEV was evaluated by varying noise levels, dimensions of the design matrix, and covariance structures of the target variables. The deviation between EMSEV outputs and the predefined distribution parameters increased with noise level. With a proper initial guess, when the noise magnitude and the variance of the target variables were similar, there were deviations of 3% and 10%-16% in the estimated mean and covariance of the target variables, respectively, along with a 1.7% deviation in noise estimation. EMSEV appears promising for distinguishing signal variance from noise in biological systems. The potential applications and implications in biological science and statistical inference are discussed.

Journal Article•10.3934/math.2025621•

An efficient iterative model averaging framework for ultrahigh-dimensional linear regression models with missing data

Xianwen Ding, Su Tong, Yunqi Zhang

01 Jan 2025-AIMS mathematics

Journal Article•10.1017/psy.2024.19•

Estimation of Linear models from Coarsened Observations: A Method of Moments Approach

Bernard M. S. van Praag, J. Peter Hop, William H. Greene

10 Mar 2025-Psychometrika

Journal Article•10.1017/s0266466624000355•

From model selection to model averaging: a comparison for nested linear models

Wenchao Xu, Xinyu Zhang

07 Jan 2025-Econometric Theory

TL;DR: This study compares model selection and model averaging for nested linear models, showing that model averaging can significantly improve estimation risk under certain conditions, particularly with heteroscedastic and autocorrelated errors and sparse coefficients.

Abstract: Model selection (MS) and model averaging (MA) are two popular approaches when many candidate models exist. Theoretically, the estimation risk of an oracle MA is not larger than that of an oracle MS because the former is more flexible, but a foundational issue is this: Does MA offer a substantial improvement over MS? Recently, seminal work by Peng and Yang (2022) has answered this question under nested models with linear orthonormal series expansion. In the current paper, we further respond to this question under linear nested regression models. A more general nested framework, heteroscedastic and autocorrelated random errors, and sparse coefficients are allowed in the current paper, giving a scenario that is more common in practice. A remarkable implication is that MS can be significantly improved by MA under certain conditions. In addition, we further compare MA techniques with different weight sets. Simulation studies illustrate the theoretical findings in a variety of settings.

Journal Article•10.48550/arxiv.2505.24293•

Equivalent Linear Mappings of Large Language Models

2 Jun 2025

TL;DR: Researchers develop an equivalent linear mapping for large language models, revealing low-dimensional semantic structures in next-token predictions, and demonstrate that LLMs operate in extremely low-dimensional subspaces, enabling interpretable semantic concept decoding.

Abstract: Despite significant progress in transformer interpretability, an understanding of the computational mechanisms of large language models (LLMs) remains a fundamental challenge. Many approaches interpret a network's hidden representations but remain agnostic about how those representations are generated. We address this by mapping LLM inference for a given input sequence to an equivalent and interpretable linear system which reconstructs the predicted output embedding with relative error below $10^{-13}$ at double floating-point precision, requiring no additional model training. We exploit a property of transformers wherein every operation (gated activations, attention, and normalization) can be expressed as $A(x) \cdot x$, where $A(x)$ represents an input-dependent linear transform and $x$ preserves the linear pathway. To expose this linear structure, we strategically detach components of the gradient computation with respect to an input sequence, freezing the $A(x)$ terms at their values computed during inference, such that the Jacobian yields an equivalent linear mapping. This detached Jacobian of the model reconstructs the output with one linear operator per input token, which is shown for Qwen 3, Gemma 3 and Llama 3, up to Qwen 3 14B. These linear representations demonstrate that LLMs operate in extremely low-dimensional subspaces where the singular vectors can be decoded to interpretable semantic concepts. The computation for each intermediate output also has a linear equivalent, and we examine how the linear representations of individual layers and their attention and multilayer perceptron modules build predictions, and use these as steering operators to insert semantic concepts into unrelated text. Despite their global nonlinearity, LLMs can be interpreted through equivalent linear representations that reveal low-dimensional semantic structures in the next-token prediction process.
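The $A(x) \cdot x$ idea in the abstract above can be illustrated on a toy network. The following is a minimal sketch, not the authors' code: it assumes a hypothetical bias-free network (linear layer, SiLU gate, RMS normalization, linear layer) with made-up weight names `W1` and `W2`, and it freezes the input-dependent gate and normalization scale by hand instead of detaching gradients in an autodiff framework.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def toy_forward(x, W1, W2):
    """Hypothetical bias-free toy network: linear -> SiLU -> RMS norm -> linear."""
    h = W1 @ x
    g = h * sigmoid(h)               # SiLU written as A(h) . h with A = diag(sigmoid(h))
    r = np.sqrt(np.mean(g ** 2))     # RMS normalization scale (input dependent)
    return W2 @ (g / r)

def detached_linear_map(x, W1, W2):
    """Freeze the gate and the norm scale at their values for this x.

    The result is a single matrix A(x) such that A(x) @ x equals the
    network output for this particular input.
    """
    h = W1 @ x
    gate = sigmoid(h)
    r = np.sqrt(np.mean((h * gate) ** 2))
    return W2 @ np.diag(gate / r) @ W1

rng = np.random.default_rng(0)
x = rng.normal(size=6)
W1 = rng.normal(size=(8, 6))
W2 = rng.normal(size=(4, 8))

y = toy_forward(x, W1, W2)
A = detached_linear_map(x, W1, W2)
rel_err = np.linalg.norm(A @ x - y) / np.linalg.norm(y)
print(f"relative reconstruction error: {rel_err:.2e}")
```

The matrix `A` is only valid for the input it was computed from; a different `x` changes the frozen gate and scale, which is where the global nonlinearity lives.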

Journal Article•10.5281/zenodo.15411541•

From Linear Regression to Deep Learning

14 May 2025

TL;DR: This comprehensive guide to linear regression covers its conceptual roots, practical implementation, and real-world applications, spanning from basic concepts to advanced topics, including deep learning, to equip readers with skills for diverse analytical scenarios.

Abstract: In the rapidly evolving field of data science and machine learning, Linear Regression remains one of the most foundational and widely applied statistical modeling techniques. Despite the emergence of advanced algorithms and deep learning architectures, linear regression continues to serve as the first step in understanding relationships among variables, making predictions, and drawing meaningful insights from data. This book is a comprehensive guide to linear regression, from its conceptual roots to practical implementation and real-world applications. The content has been meticulously structured to support both beginners and intermediate learners in gaining a deep understanding of linear regression. Chapter 1 introduces the basic concept of linear regression, tracing its historical development and emphasizing its relevance across various real-life domains such as economics, healthcare, and social sciences. Chapter 2 delves into Simple Linear Regression, explaining the mathematical formulation, the least squares approach, and essential assumptions underlying the model. Chapter 3 expands the discussion to Multiple Linear Regression, enabling readers to understand how models evolve when multiple predictors are introduced. Key concepts such as multicollinearity and model evaluation are covered to build a more robust analytical mindset. Chapter 4 provides the theoretical underpinnings of linear regression, including linear algebraic formulations, matrix operations, and solution techniques such as the normal equation and an introduction to gradient descent. In the practical section, Chapter 5 focuses on implementation, guiding the reader through real coding exercises using Python libraries such as NumPy and Scikit-learn. From preprocessing data to evaluating models and visualizing predictions, this chapter translates theory into hands-on learning. This is followed by Chapter 6, which presents a case study on house price prediction, demonstrating how the principles learned can be applied to a real-world dataset to build a predictive model. Chapter 7 offers a balanced view of the advantages and limitations of linear regression, helping readers critically assess when and how to use this technique effectively. Finally, Chapter 8 concludes the book by summarizing key insights and discussing the transition from linear to non-linear models and modern techniques such as deep learning, offering a bridge to more advanced topics in machine learning. This book is designed not only to explain linear regression but also to inspire critical thinking about model selection, performance evaluation, and the broader implications of statistical modeling. Whether you are a student, researcher, data analyst, or practitioner, the journey through these chapters will enhance your understanding and equip you with the skills to apply linear regression confidently in diverse analytical scenarios.

Journal Article•10.1371/journal.pone.0322101•

Modelling in-hospital length of stay: A comparison of linear and ensemble models for competing risk analysis

Juan Carlos Espinosa Moreno, Fernando García-García, Naia Mas-Bilbao, Susana García‐Gutiérrez, María José Legarreta, Dae‐Jin Lee

26 Aug 2025-PLOS ONE

TL;DR: This study compares linear and ensemble models for in-hospital length of stay prediction, employing competing risk analysis and evaluating four models, with Random Survival Forest using Gray's test split outperforming clinical early warning scores NEWS and MEWS.

Abstract: Length of Stay (LoS) for in-hospital patients is a relevant indicator of efficiency in healthcare. Moreover, it is often related to the occurrence of hospital-acquired complications. In this work, we aim to explore time-to-event analysis for modelling LoS. We employed competing risk (CR) models, as we considered two mutually exclusive outcomes: favorable discharge and deterioration. The explanatory variables included the patient's sex, age, and longitudinal vital signs collected from a dataset comprising [Formula: see text] admissions. To address sparse measurements, we transformed longitudinal vital signs into cross-sectional statistics. Our approach involves data pre-processing, imputation of missing data, and variable selection. We proposed four types of CR models: Cause-specific Cox, Sub-distribution hazard, and two variants of Random Survival Forests, with both generalised Log-Rank test (cause-specific hazard estimates) and Gray's test (cumulative incidence estimates) as node splitting rules. Performance in LoS CR models was evaluated over a time frame from 2 to 15 days. Additionally, we considered baselines with two well-established clinical early warning scores: the National Early Warning Score (NEWS) and the Modified Early Warning Score (MEWS). The best model was Random Survival Forest using Gray's test split, with Integrated Brier Score[×100] of 0.386, C-Index above 99%, and Brier Score below 0.006, along the entire time frame. Employing cross-sectional statistics derived from vital signs, along with rigorous data pre-processing, modelled LoS more accurately than NEWS and MEWS.

Repository•10.1371/journal.pone.0316728.s006•

Results of ANOVA following Generalized linear models (GLMs) for the behavioural outcome variables across months during the non-experimental year (2015) in vervet monkeys (Chlorocebus pygerythrus) at Lake Nabugabo, Uganda.

30 Jan 2025

Abstract: Predictors include month (June – December), and sex. The outcome variables include proportion of moving, feeding, grooming, and resting scans. (DOCX)

Journal Article•10.1080/03610918.2025.2558693•

Empirical likelihood based tests for the linear hypothesis in partially linear spatial autoregressive models

Peixin Zhao, Xiaoyan Liu, Xinrong Tang, Suli Cheng, Xiaoshuang Zhou

16 Sep 2025-Communications in Statistics - Simulation and Computation

Journal Article•10.1080/00031305.2025.2566251•

Linear Model Estimation and Prediction for p>n

Ronald Christensen

26 Sep 2025-The American Statistician

Journal Article•10.1007/s10985-024-09645-8•

A global kernel estimator for partially linear varying coefficient additive hazards models

Hoi Min Ng, Kin Yau Wong

09 Jan 2025-Lifetime Data Analysis

TL;DR: This paper develops a global kernel estimator for partially linear varying coefficient additive hazards models, leveraging non-varying nuisance parameters, and establishes consistency and asymptotic normality, outperforming local methods in simulations and a cancer genomic study.

Abstract: We study kernel-based estimation methods for partially linear varying coefficient additive hazards models, where the effects of one type of covariates can be modified by another. Existing kernel estimation methods for varying coefficient models often use a “local” approach, where only a small local neighborhood of subjects are used for estimating the varying coefficient functions. Such a local approach, however, is generally inefficient as information about some non-varying nuisance parameter from subjects outside the neighborhood is discarded. In this paper, we develop a “global” kernel estimator that simultaneously estimates the varying coefficients over the entire domains of the functions, leveraging the non-varying nature of the nuisance parameter. We establish the consistency and asymptotic normality of the proposed estimators. The theoretical developments are substantially more challenging than those of the local methods, as the dimension of the global estimator increases with the sample size. We conduct extensive simulation studies to demonstrate the feasibility and superior performance of the proposed methods compared with existing local methods and provide an application to a motivating cancer genomic study.

Journal Article•10.1038/s41598-025-13088-y•

Investigating the wave profiles to the linear quadratic model in mathematical biology

Usman Younas, Jan Muhammad, Hajar F. Ismael, Tukur Abdulkadir Sulaıman, Mohamed R. Ali, Flah Aymen

31 Jul 2025-Scientific Reports

TL;DR: This study investigates the linear quadratic model in radiation biology, employing advanced analytical techniques to derive exact solutions for wave profiles, revealing the model's nonlinear dynamics and enhancing understanding of cancer progression and treatment optimization.

Abstract: This study investigates the dynamic behavior of the linear quadratic model (LQM), a fundamental framework in radiation biology that describes cellular response to radiation, particularly in the context of DNA damage and cancer progression. The LQM was originally developed to quantify radiation-induced cell death and repair mechanisms, with a focus on double-stranded DNA breaks, the most critical type of radiation damage. Despite advances in tracking tumor cell dissemination, the mechanisms underlying cancer invasion remain poorly understood. Mathematical modeling, particularly through partial differential equations, has become an essential tool for simulating tumor growth and optimizing therapeutic strategies, bridging the gap between theoretical biology and clinical applications. In this work, we employ advanced analytical techniques, including the generalized Arnous method, modified F-expansion method, and generalized exponential rational function approaches to solve the model for the first time. By transforming the governing PDE into an ordinary differential equation using β-derivative and wave transformations, we derive exact solutions in the form of dark, bright, singular, mixed, complex, and combined soliton waves. These solutions, visualized through 2D and 3D plots, reveal the system's behavior under varying parameters, demonstrating the computational power and effectiveness of the applied methods. The results not only validate the proposed techniques but also enhance our understanding of the model's nonlinear dynamics. The novel findings presented here are expected to advance future research in radiation biology and cancer treatment optimization.

Journal Article•10.48550/arxiv.2510.08661•

CATS-Linear: Classification Auxiliary Linear Model for Time Series Forecasting

Fu Yingyi, Chen Xinyang, Chen Guoting

13 Oct 2025

Abstract: Recent research demonstrates that linear models achieve forecasting performance competitive with complex architectures, yet methodologies for enhancing linear models remain underexplored. Motivated by the hypothesis that distinct time series instances may follow heterogeneous linear mappings, we propose the Classification Auxiliary Trend-Seasonal Decoupling Linear Model (CATS-Linear), employing Classification Auxiliary Channel-Independence (CACI). CACI dynamically routes instances to dedicated predictors via classification, enabling supervised channel design. We further analyze the theoretical expected risks of different channel settings. Additionally, we redesign the trend-seasonal decomposition architecture by adding a decoupling -- linear mapping -- recoupling framework for trend components and complex-domain linear projections for seasonal components. Extensive experiments validate that CATS-Linear with fixed hyperparameters achieves state-of-the-art accuracy comparable to hyperparameter-tuned baselines while delivering SOTA accuracy against fixed-hyperparameter counterparts.

Repository•10.5281/zenodo.15704688•

Integrating Linear and Nonlinear Models for Enhanced Process Monitoring

Klauco Martin, Ľubušký Karol, Paulen Radoslav

20 Jun 2025

Abstract: This study explores data-driven approaches for modeling industrial processes by employing linear and nonlinear techniques to predict output variables based on available input measurements. Linear regression-based techniques are compared with nonlinear machine learning models to evaluate their predictive capabilities. The analysis considers models trained on high-accuracy & low-frequency laboratory data alongside models leveraging low-accuracy & high-frequency sensor measurements. A hybrid methodology enhances predictive performance by integrating additional process information in the training process. Our findings show that this hybrid approach reduces the RMSE from 0.74 to 0.38 compared to models that rely solely on sensor measurements.

Journal Article•10.3389/fnins.2025.1582080•

Exploring the suitability of piecewise-linear dynamical system models for cognitive neural dynamics

Jiemin Wu, Boateng Asamoah, Zhaodan Kong, Jochen Ditterich

12 May 2025-Frontiers in neuroscience

TL;DR: This study explores the suitability of piecewise-linear dynamical system models for cognitive neural dynamics, demonstrating their potential for modeling brain activity, particularly in controlled settings, and outperforming linear models in predicting future states.

Abstract: Dynamical system models have proven useful for decoding the current brain state from neural activity. So far, neuroscience has largely relied on either linear models or non-linear models based on artificial neural networks (ANNs). Piecewise linear approximations of non-linear dynamics have proven useful in other technical applications. Moreover, such explicit models provide a clear advantage over ANN-based models when the dynamical system is not only supposed to be observed, but also controlled, in particular when a controller with guarantees is needed. Here we explore whether piecewise-linear dynamical system models (recurrent Switching Linear Dynamical System or rSLDS models) could be useful for modeling brain dynamics, in particular in the context of cognitive tasks. These models have the advantage that they can be estimated not only from continuous observations like field potentials or smoothed firing rates, but also from sparser single-unit spiking data. We first generate artificial neural data based on a non-linear computational model of perceptual decision-making and demonstrate that piecewise-linear dynamics can be successfully recovered from these observations. We then demonstrate that the piecewise-linear model outperforms a linear model in terms of predicting future states of the system and associated neural activity. Finally, we apply our approach to a publicly available dataset recorded from monkeys performing perceptual decisions. Much to our surprise, the piecewise-linear model did not provide a significant advantage over a linear model for these particular data, although linear models that were estimated from different trial epochs showed qualitatively different dynamics. In summary, we present a dynamical system modeling approach that could prove useful in situations where the brain state needs to be controlled in a closed-loop fashion, for example, in new neuromodulation applications for treating cognitive deficits. Future work will have to show under what conditions the brain dynamics are sufficiently non-linear to warrant the use of a piecewise-linear model over a linear one.

Journal Article•10.1016/j.xpro.2025.104113•

Protocol to detect dilution cycles in chemostat experiments and estimate growth rate slopes with linear modeling with R software chemostat_regression

Samuel I. Koehler, Jennifer T. Pentz, Earl A. Middlebrook, Blake T. Hovde, Erik R. Hanschen

27 Sep 2025-STAR protocols

Journal Article•10.55214/2576-8484.v9i9.10056•

Modelling and prediction of economic growth for Nigeria under the violation of linear model assumptions: A robust principal component regression approach

Ayooluwade Ebiwonjumi, Adefemi A. Obalade

17 Sep 2025-Edelweiss applied science and technology

TL;DR: This study models Nigeria's economic growth using robust principal component regression, addressing multicollinearity and outliers, and finds that internal and external debt, interest rate, exchange rate, and economic openness significantly influence growth, with M-estimation providing the most reliable predictions.

Abstract: This study was conducted to model, estimate, and predict Nigeria’s economic growth (RGDP) by examining the influence of key macroeconomic drivers: internal debt (INDT), external debt (EXDT), interest rate (RINR), exchange rate (REXR), and the degree of economic openness (OPEN). Preliminary exploratory and diagnostic analyses revealed significant challenges to classical linear regression assumptions, particularly the presence of multicollinearity and outliers. To address these issues, robust principal component regression (PCR) estimation methods were employed. Principal component analysis (PCA) extracted two uncorrelated predictors (PC1 and PC2), which captured the joint variability of the original determinants while addressing collinearity. Subsequently, robust estimation techniques—namely M-estimation, S-estimation, and MM-estimation—were used to generate efficient estimated parameters. A comparative evaluation based on root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and Theil’s inequality coefficient established that the M-estimation method outperformed its alternatives, providing the most stable and reliable predictions of RGDP. Empirical findings revealed that both PC1 and PC2 had positive and statistically significant influences on RGDP, with contributions of 35.39% and 22.15%, respectively. These results highlight the importance of robust PCR in addressing econometric anomalies and offer valuable policy insights into how structural shocks—such as exchange rate volatility, oil price fluctuations, and COVID-19 disruptions—affected Nigeria’s economic performance.
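The two-stage procedure described in the abstract above (principal components first, then a robust fit on the component scores) can be sketched as follows. This is an illustrative sketch on synthetic data, not the authors' analysis: the function names `pca_scores` and `huber_irls` are hypothetical, only Huber M-estimation is shown (S- and MM-estimation are omitted), and the tuning constant 1.345 and the MAD scale estimate are conventional choices assumed here.

```python
import numpy as np

def pca_scores(X, n_components):
    """Standardize the columns, then project onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def huber_irls(X, y, c=1.345, n_iter=50):
    """Huber M-estimation of a linear model by iteratively reweighted least squares."""
    Xd = np.column_stack([np.ones(len(y)), X])       # prepend an intercept column
    beta = np.linalg.lstsq(Xd, y, rcond=None)[0]     # ordinary least squares start
    for _ in range(n_iter):
        resid = y - Xd @ beta
        # Robust residual scale: normalized median absolute deviation.
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        u = np.abs(resid) / max(scale, 1e-12)
        w = c / np.maximum(u, c)                     # Huber weights: 1 if u <= c, else c / u
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(Xd * sw[:, None], y * sw, rcond=None)[0]
    return beta

# Synthetic example: five collinear predictors driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(200, 5))
scores = pca_scores(X, n_components=2)               # uncorrelated predictors PC1, PC2
y = 1.0 + 2.0 * scores[:, 0] - 1.0 * scores[:, 1] + 0.1 * rng.normal(size=200)
y[:10] += 25.0                                       # inject a few gross outliers
beta_robust = huber_irls(scores, y)                  # [intercept, PC1 effect, PC2 effect]
print(beta_robust)
```

Because the component scores are mutually uncorrelated, the collinearity among the original predictors no longer inflates the variance of the fitted coefficients, and the downweighting step limits the influence of the injected outliers.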

Journal Article•10.1134/s0040579524601821•

Identification of Parameters of a Linear Regression Model by Simultaneous Optimization of Two Heterogeneous Criteria

S. I. Noskov, Ivan Vladimirovich Ovsyannikov

10 Jan 2025-Theoretical Foundations of Chemical Engineering

Journal Article•10.1007/s11749-025-00978-6•

Robust penalized estimators for high-dimensional generalized linear models

Marina Valdora, Claudio Agostinelli

08 Jul 2025-Test

TL;DR: This paper proposes a robust penalized estimator for high-dimensional generalized linear models, providing consistency and asymptotic normality under suitable assumptions, and evaluates its performance through Monte Carlo simulations and an empirical application.

Abstract: Robust estimators for generalized linear models (GLMs) are not easy to develop due to the nature of the distributions involved. Recently, there has been growing interest in robust estimation methods, particularly in contexts involving a potentially large number of explanatory variables. Transformed M-estimators (MT-estimators) provide a natural extension of M-estimation techniques to the GLM framework, offering robust methodologies. We propose a penalized variant of MT-estimators to address high-dimensional data scenarios. Under suitable assumptions, we demonstrate the consistency and asymptotic normality of this novel class of estimators. Our theoretical development focuses on redescending $\rho$-functions and penalization functions that satisfy specific regularity conditions. We present an Iterative re-weighted least-squares algorithm, together with a deterministic initialization procedure, which is crucial since the estimating equations may have multiple solutions. We evaluate the finite-sample performance of this method for Poisson distribution and well-known penalization functions through Monte Carlo simulations that consider various types of contamination, as well as an empirical application using a real dataset.

Journal Article•10.48550/arxiv.2502.11144•

Consistency of heritability estimation from summary statistics in high-dimensional linear models

David Azriel, Samuel Davenport, Armin Schwartzman

16 Feb 2025

TL;DR: This study examines the consistency of heritability estimation from summary statistics in high-dimensional linear models, specifically LDSC regression and GWAS heritability, under various conditions and modifications, including weighting and standardization, and population stratification.

Abstract: In Genome-Wide Association Studies (GWAS), heritability is defined as the fraction of variance of an outcome explained by a large number of genetic predictors in a high-dimensional polygenic linear model. This work studies the asymptotic properties of the most common estimator of heritability from summary statistics called linkage disequilibrium score (LDSC) regression, together with a simpler and closely related estimator called GWAS heritability (GWASH). These estimators are analyzed in their basic versions and under various modifications used in practice including weighting and standardization. We show that, with some variations, two conditions which we call weak dependence (WD) and bounded-kurtosis effects (BKE) are sufficient for consistency of both the basic LDSC with fixed intercept and GWASH estimators, for both Gaussian and non-Gaussian predictors. For Gaussian predictors it is shown that these conditions are also necessary for consistency of GWASH (with truncation) and simulations suggest that necessity holds too when the predictors are non-Gaussian. We also show that, with properly truncated weights, weighting does not change the consistency results, but standardization of the predictors and outcome, as done in practice, introduces bias in both LDSC and GWASH if the two essential conditions are violated. Finally, we show that, when population stratification is present, all the estimators considered are biased, and the bias is not remedied by using the LDSC regression estimator with free intercept, as originally suggested by the authors of that estimator.

Repository•10.1371/journal.pone.0317279.s004•

Results from the linear mixed-effects regression model for the sample not supplementing vitamin D.

24 Jan 2025

Abstract: Results from the linear mixed-effects regression model for the sample not supplementing vitamin D.

Repository•10.5167/uzh-278847•

Understanding diurnal changes in land surface temperature trends using MODIS and IMIS data

2 Jul 2025

Abstract: This study investigates the credibility of land surface temperature (LST) data retrieved from MODIS (Moderate Resolution Imaging Spectroradiometer) satellite images by comparing them with data from the ground-based stations of the Intercantonal Measurement and Information System (IMIS). The study covers the Swiss Alps for the period between 2000 and 2023 and includes four MODIS observation times (i.e., MOD21A1D, MOD21A1N, MYD21A1D, and MYD21A1N). The comparative analysis is based mainly on a Harmonic Regression Model, which combines harmonic and linear regressions and enables calculating the trends for both data sources. Several analytical approaches were applied to support the comparison and data analysis, and five research questions were identified to guide the study. To compare the actual measurements, plots were created to determine the median absolute deviation ("MAD-1"), R-squared, slope, and mean deviation values. The data means were compared using the mean absolute difference ("MAD-2") and Pearson’s correlation values, and the two datasets were found to be compatible, with a preference for nighttime. Trends were compared at each specific hour of the day during the four MODIS observation times using the MAD-2 and Pearson’s correlation values. The means of the trend data of both datasets were also compared using the mean absolute error (MAE) and standard deviation values. To determine the most representative observation time, IMIS trend data at each of the MODIS observation times were compared with the overall trends using MAD-2 and Pearson’s correlation values. This revealed that the MOD21A1D observation time is the most representative of the trends. To investigate the factors causing changes in the data, changes in IMIS data were compared with the elevation and aspect of the ground stations, while MODIS data were compared with the view angle of the satellites’ sensors. Elevation does not show any noticeable effect on IMIS data, except for limited LST trend means, which are rather low (< 0.05), at altitudes above 2000 m. This is also the case for aspect, for which no relationship with IMIS trends was found. An effect of the view angle on MODIS measurements was noticed, but it differs between observation times. In addition, Landsat 5, 7, and 8 observation times were used for comparison with the representativeness of MODIS observation times; Landsat images are not acquired at nighttime, which is a limitation affecting their accuracy. This study applied a comprehensive analytical approach that facilitates understanding the relation between MODIS LST and IMIS data and trends. It supports adopting MODIS data for calculating LST, which is significant for future research on hydroclimate analysis, notably because MODIS is a daily source of LST data.
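A harmonic regression that combines a linear trend with a seasonal cycle, as mentioned in the abstract above, can be sketched with ordinary least squares. The abstract does not give the exact model form, so this sketch assumes a single annual harmonic; the function name `harmonic_trend_fit` and the synthetic series are hypothetical.

```python
import numpy as np

def harmonic_trend_fit(t_years, y):
    """Least-squares fit of y = a + b*t + c*sin(2*pi*t) + d*cos(2*pi*t).

    t_years is time in years, so the harmonic terms have a one-year period
    (an assumption made for this sketch).
    """
    X = np.column_stack([
        np.ones_like(t_years),
        t_years,
        np.sin(2 * np.pi * t_years),
        np.cos(2 * np.pi * t_years),
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [intercept, trend per year, sine coefficient, cosine coefficient]

# Synthetic, noise-free series: 24 years of monthly values with a known trend.
t = np.arange(0, 24, 1 / 12)
y = 5.0 + 0.04 * t + 8.0 * np.sin(2 * np.pi * t) + 3.0 * np.cos(2 * np.pi * t)
coef = harmonic_trend_fit(t, y)
print("estimated trend per year:", coef[1])
```

The second coefficient is the trend estimate; fitting the seasonal terms jointly keeps the annual cycle from leaking into it.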


Repository•10.5281/zenodo.17467584•

Precise Linear Dependence: Calibrating Regression and Correlation Under Distributional Shifts

[...]

28 Oct 2025

Abstract: This paper investigates the degradation of linear regression and correlation models under distributional shifts, a prevalent issue in real-world data applications. Distributional shift, particularly covariate shift, occurs when the input data distribution changes between training and deployment, violating the standard assumption of independent and identically distributed data. This violation can lead to unreliable predictions and inaccurate measures of linear dependence. We propose a comprehensive framework for calibrating these models to maintain their precision and robustness. The methodology involves three core components: (1) a shift detection module using statistical distance measures and discriminative classifiers to identify the presence and severity of covariate shift; (2) an importance-weighting scheme for recalibrating the regression model by re-weighting the training samples to better reflect the target distribution; and (3) a covariate-adjusted correlation technique that recalibrates the Pearson correlation coefficient by accounting for the distorting effects of the distributional shift. We demonstrate through theoretical exposition and simulated experiments that standard models fail significantly when faced with even moderate shifts, whereas our proposed calibration techniques effectively restore predictive accuracy and the fidelity of correlation analysis. The findings underscore the critical need for explicit calibration mechanisms when deploying linear models in non-stationary environments, ensuring that statistical inferences remain valid and reliable over time.
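
The importance-weighting component mentioned in the abstract can be sketched as follows: a discriminative classifier separates training inputs from target inputs, its odds give a density-ratio estimate, and those ratios re-weight a least-squares fit. Everything here is a hypothetical illustration under stated assumptions (one-dimensional Gaussian inputs, a quadratic true relationship, a hand-written logistic regression); it is not the paper's implementation.

```python
import numpy as np

def sigmoid(s):
    return 0.5 * (1.0 + np.tanh(0.5 * s))  # numerically stable logistic function

def neg_log_lik(F, z, beta):
    s = F @ beta
    return np.sum(np.logaddexp(0.0, s) - z * s)

def fit_logistic(F, z, n_iter=50, ridge=1e-6):
    """Logistic regression by Newton's method with step halving."""
    beta = np.zeros(F.shape[1])
    for _ in range(n_iter):
        p = sigmoid(F @ beta)
        grad = F.T @ (p - z)
        # Small ridge on the Hessian only, for numerical stability.
        hess = (F * (p * (1.0 - p))[:, None]).T @ F + ridge * np.eye(F.shape[1])
        step = np.linalg.solve(hess, grad)
        t = 1.0
        while t > 1e-8 and neg_log_lik(F, z, beta - t * step) > neg_log_lik(F, z, beta):
            t *= 0.5
        beta = beta - t * step
    return beta

def features(x):
    # Quadratic features: sufficient for the log density ratio of two Gaussians.
    return np.column_stack([np.ones_like(x), x, x ** 2])

def fit_wls(x, y, w):
    """Weighted least squares for y ~ b0 + b1 * x."""
    X = np.column_stack([np.ones_like(x), x])
    Xw = X * w[:, None]
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

def mse(coef, x, y):
    return np.mean((coef[0] + coef[1] * x - y) ** 2)

# Synthetic covariate shift (assumed): inputs move from N(0, 1) to N(1.5, 0.5^2).
rng = np.random.default_rng(0)
x_tr = rng.normal(0.0, 1.0, 2000)
x_te = rng.normal(1.5, 0.5, 2000)
y_tr = x_tr ** 2 + rng.normal(0.0, 0.1, x_tr.size)
y_te = x_te ** 2 + rng.normal(0.0, 0.1, x_te.size)

# Classifier: label 0 = training input, label 1 = target input.
F = features(np.concatenate([x_tr, x_te]))
z = np.concatenate([np.zeros(x_tr.size), np.ones(x_te.size)])
beta = fit_logistic(F, z)

# Importance weights: classifier odds, rescaled by the sample-size ratio.
w = np.exp(features(x_tr) @ beta) * (x_tr.size / x_te.size)

coef_ols = fit_wls(x_tr, y_tr, np.ones_like(x_tr))
coef_iw = fit_wls(x_tr, y_tr, w)
mse_ols = mse(coef_ols, x_te, y_te)
mse_iw = mse(coef_iw, x_te, y_te)
print("target MSE, unweighted:", mse_ols, "importance-weighted:", mse_iw)
```

The example uses a misspecified linear model on purpose: when the model is correctly specified, re-weighting mainly costs effective sample size, whereas under misspecification it moves the fit toward the region where the target inputs lie.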


Repository•10.5281/zenodo.17437270•

Sparse Factorial Designs: A Regularized Linear Model Approach to High-Dimensional ANOVA

[...]

24 Oct 2025

Abstract: Traditional Analysis of Variance (ANOVA) faces significant challenges in high-dimensional settings where the number of factors and their interactions exceeds the number of observations. This paper introduces a framework for analyzing factorial designs using regularized linear models, specifically leveraging the Least Absolute Shrinkage and Selection Operator (LASSO) to enforce sparsity. By representing the full factorial model, including main effects and multi-way interactions, as a linear model, we can apply L1 regularization to simultaneously perform variable selection and parameter estimation. This approach is predicated on the sparsity-of-effects principle, which posits that only a small fraction of potential effects are active. The proposed method effectively identifies significant main effects and interactions from a vast pool of potential candidates, thus providing a tractable solution to the curse of dimensionality in experimental design. We discuss the formulation of the design matrix for a full factorial model, the application of the LASSO penalty, and strategies for interpreting the results in the context of ANOVA. The methodology is particularly suited for screening experiments in fields like genomics, manufacturing, and computer science, where identifying the vital few factors from the trivial many is paramount. We demonstrate through a detailed simulation that the regularized approach can recover the true sparse structure of effects with high probability, offering a powerful alternative to classical techniques.
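
As a rough illustration of the idea in the abstract, the sketch below builds the effect columns of a full two-level factorial model (main effects and all interactions, coded as -1/+1 products) and applies an L1 penalty through a simple coordinate-descent LASSO. The factor labels, the run count, the active effects, the penalty value `lam`, and the helper names are all hypothetical choices, and the solver is a generic one, not the paper's procedure.

```python
import itertools
import numpy as np

def factorial_effect_columns(levels):
    """Columns for all main effects and interactions of a -1/+1 coded design."""
    n, k = levels.shape
    names, cols = [], []
    for order in range(1, k + 1):
        for idx in itertools.combinations(range(k), order):
            names.append(":".join("ABCDEFGH"[i] for i in idx))
            cols.append(np.prod(levels[:, list(idx)], axis=1))
    return names, np.column_stack(cols)

def soft_threshold(value, gamma):
    return np.sign(value) * max(abs(value) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1 / 2n) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

# Hypothetical screening experiment: 6 factors, 48 random runs, 63 candidate effects.
rng = np.random.default_rng(1)
levels = rng.choice([-1.0, 1.0], size=(48, 6))
names, X = factorial_effect_columns(levels)

# Sparse truth (assumed): two main effects and their two-way interaction.
truth = {"A": 3.0, "B": -2.0, "A:B": 2.0}
beta_true = np.array([truth.get(name, 0.0) for name in names])
y = X @ beta_true + rng.normal(0.0, 0.5, X.shape[0])
y = y - y.mean()  # centered response, so no intercept is fitted in this sketch

beta_hat = lasso_cd(X, y, lam=0.4)
top3 = {names[j] for j in np.argsort(-np.abs(beta_hat))[:3]}
print("number of effects:", len(names), "largest estimated effects:", sorted(top3))
```

In practice the penalty would typically be chosen by cross-validation or an information criterion rather than fixed in advance, and the selected effects refitted without shrinkage before interpretation.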


Journal Article•10.1155/jom/6421789•

Parameter Estimation of the Partially Linear Quantile Regression Model Under Monotonic Constraints

[...]

Shujin Wu, Z. C. Yu, Shanshan Liang, Yanke Ren

01 Jan 2025-Journal of Mathematics

TL;DR: This paper introduces a partially linear quantile regression model with monotonic constraints, proposing two novel estimation methods: coordinate descent and profile likelihood, which simplify the estimation process and outperform traditional approaches in estimating nonparametric components.


Abstract: The paper brings forward the partially linear quantile regression model by incorporating monotonic constraints, which are common in real‐world relationships between variables. It introduces two novel parameter estimation methods, that is, the coordinate descent method and the profile likelihood method, which eliminate the extensive tuning and simplify the estimation process. Theoretical analysis confirms the estimator’s consistency and a convergence rate of $n^{-1/3}$. Numerical simulations and case studies demonstrate the superiority of these methods over traditional approaches, particularly in estimating the nonparametric components of the model, highlighting their potential for practical use in various fields.


Journal Article•10.54097/3f4g6n32•

Construction and Application of a Hybrid Predictive Model Incorporating Random Forest and Multiple Linear Regression

[...]

Z. Zhang, You Zhou, Zhicheng Ye

26 Jun 2025-Deleted Journal

TL;DR: A hybrid predictive model combining random forest and multiple linear regression is constructed to analyze and predict multidimensional features, enhancing stability and generalization ability through multi-level model collaboration and data processing optimization.


Abstract: In this study, a prediction framework integrating random forest and multiple linear regression is constructed for the quantitative analysis and prediction of multidimensional features. First, random forest decision trees are constructed based on the Gini index, and classification and regression tasks are carried out by ranking feature importance, setting the model parameters, and validating the error, so as to model and predict the non-linear relationships among multidimensional features. In addition, the study introduces independent variables such as dichotomous variables, occupancy category indicators and capacity values, fits a multiple linear regression model by the least squares method, and tests for multicollinearity through variance inflation factor (VIF) tests to quantify the extent to which specific factors influence the results. The framework enhances the stability and generalisation ability of multivariate system modelling through multi-level model collaboration and data processing optimisation, and provides a scalable technical paradigm for related fields.
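
For readers unfamiliar with the variance inflation factor check mentioned in the abstract, here is a minimal sketch of the standard definition: each predictor is regressed on the others, and VIF_j = 1 / (1 - R_j^2). The function name `vif` and the synthetic predictors are assumptions for illustration and are unrelated to the study's variables.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (regressions include an intercept)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        centered = target - target.mean()
        r_squared = 1.0 - (resid @ resid) / (centered @ centered)
        out[j] = 1.0 / (1.0 - r_squared)
    return out

# Synthetic predictors: x2 is nearly a copy of x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + 0.1 * rng.normal(size=500)
x3 = rng.normal(size=500)
vif_values = vif(np.column_stack([x1, x2, x3]))
print(vif_values)
```

A common rule of thumb treats values above roughly 5 or 10 as a sign of problematic multicollinearity, though the threshold is a convention rather than a formal test.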


Journal Article•10.21276/aatccreview.2025.13.03.323•

Evaluation of Predictive models for tomato production and cultivation area in Himachal Pradesh: A Linear and Non-linear approach

[...]

Sukhdeep Kaur, Ashu Chandel, Rajesh Kumar Gupta, Geeta Verma, Pawan Kumar, Nandni Vashish

1 Sep 2025

TL;DR: This study evaluates predictive models for tomato production and cultivation area in Himachal Pradesh, finding cubic and quadratic models best fit area and production, respectively, with 4.60% and 5.90% annual growth rates from 1995-2023.


Abstract: Analyzing the trend in the area and production of tomatoes over time is important for understanding past behavior and for future planning. Tomato cultivation is highly sensitive to seasonal fluctuations and climatic factors. Therefore, statistical models were applied to understand the past and future patterns of tomato cultivation area and production. The statistical study was carried out on different growth models, viz. linear, quadratic, cubic, compound, and power, for the area and production of tomatoes in Himachal Pradesh for the study period 1995-2023. The study revealed that the cubic and quadratic models best fit the area and production, respectively. The highest value of CDVI, 5.40, was obtained for the area, which indicates a higher level of instability, with the variable behaving more erratically over time. Using the compound model, the annual growth rate over the study period is 4.60 percent for tomato area and 5.90 percent for tomato production. The best-fit statistical models can be used to predict future values with greater accuracy.
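
The abstract reports annual growth rates obtained from a compound model. One common formulation, assumed here because the abstract does not spell it out, is Y = a * b^t, fitted as a straight line on the log scale, with growth rate (b - 1) * 100 percent. The sketch below applies it to a synthetic series; the function name and all numbers are illustrative and are not the study's data.

```python
import numpy as np

def compound_growth_rate(years, values):
    """Fit ln(Y) = ln(a) + t * ln(b) by least squares; return (b - 1) * 100 in percent."""
    years = np.asarray(years, dtype=float)
    t = years - years[0]
    slope, intercept = np.polyfit(t, np.log(np.asarray(values, dtype=float)), 1)
    return (np.exp(slope) - 1.0) * 100.0

# Synthetic series growing about 4.6 percent per year with small multiplicative noise.
rng = np.random.default_rng(0)
years = np.arange(1995, 2024)
values = 100.0 * 1.046 ** (years - 1995) * np.exp(rng.normal(0.0, 0.02, years.size))

rate = compound_growth_rate(years, values)
print("estimated annual compound growth rate (%):", rate)
```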


Journal Article•10.2139/ssrn.5142785•

Bert with Linear Regression Model for Captions Based Lecture Video Summarization

[...]

Vignesh Kumar, Balasundaram Ramakrishnan Sadhu

1 Jan 2025

Journal Article•10.2139/ssrn.5230376•

Kgmlp: A Domain-Knowledge Augmented Linear Model for Remaining Useful Life Estimation in Rotating Machinery

[...]

Peichao Qiu, Jianzhuo Yan, Yongchuan Yu, Hongxia Xu

1 Jan 2025
