Journal Article, DOI: 10.1007/PL00022715
Parallel MARS Algorithm Based on B-splines
12
TL;DR: The authors propose using B-splines instead of truncated power basis functions in MARS for flexible modeling of high-dimensional data, yielding models competitive with those of the original MARS.
Abstract: We investigate one of the possible ways for improving Friedman’s Multivariate Adaptive Regression Splines (MARS) algorithm designed for flexible modelling of high-dimensional data. In our version of MARS called BMARS we use B-splines instead of truncated power basis functions. The fact that B-splines have compact support allows us to introduce the notion of a “scale” of a basis function. The algorithm starts building up models by using large-scale basis functions and switches over to a smaller scale after the fitting ability of the large scale splines has been exhausted. The process is repeated until the prespecified number of basis functions has been produced. In addition, we discuss a parallelisation of BMARS as well as an application of the algorithm to processing of a large commercial data set. The results demonstrate the computational efficiency of our algorithm and its ability to generate models competitive with those of the original MARS.
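The coarse-to-fine idea described in the abstract can be illustrated with a minimal univariate sketch. It is not the authors' BMARS implementation: the linear (hat) B-splines, the uniform knot grid, the halving of the scale, and the names `hat`, `rss`, `fit_coarse_to_fine`, `n_knots` and `min_gain` are all assumptions made for illustration only.

```python
import numpy as np

def hat(x, center, half_width):
    """Linear B-spline (hat function); exactly zero outside center +/- half_width."""
    return np.clip(1.0 - np.abs(x - center) / half_width, 0.0, None)

def rss(B, y):
    """Residual sum of squares of the least-squares fit of y on the columns of B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    r = y - B @ coef
    return float(r @ r)

def fit_coarse_to_fine(x, y, max_terms=8, n_knots=17, min_gain=0.01):
    """Greedy forward selection that starts at the largest scale (assumed scheme)."""
    knots = np.linspace(x.min(), x.max(), n_knots)
    step = knots[1] - knots[0]
    scale = (n_knots - 1) // 2          # half-width measured in knot intervals
    B = np.ones((x.size, 1))            # model starts with an intercept only
    total = current = rss(B, y)
    chosen = []                         # selected (center, scale) pairs, in order
    while len(chosen) < max_terms and scale >= 1:
        best = None
        for c in knots[::scale]:        # candidate centers at the current scale
            if (c, scale) in chosen:
                continue
            col = hat(x, c, scale * step)[:, None]
            cand = rss(np.hstack([B, col]), y)
            if best is None or cand < best[0]:
                best = (cand, c, col)
        # If the best candidate no longer helps enough, this scale is
        # considered exhausted and the search moves to a smaller one.
        if best is None or current - best[0] < min_gain * total:
            scale //= 2
            continue
        current, c, col = best
        B = np.hstack([B, col])
        chosen.append((c, scale))
    return chosen, current

# Hypothetical test signal: a smooth trend plus one narrow local bump.
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + np.exp(-((x - 0.3) / 0.02) ** 2)
terms, final_rss = fit_coarse_to_fine(x, y)
```

In this sketch `terms` lists the selected basis functions in the order they were added, so the recorded scales can only stay the same or shrink. A truncated power function such as `np.maximum(x - t, 0)` is non-zero for every `x > t`, so it has no comparable notion of scale.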
Citations
A review on design, modeling and applications of computer experiments
TL;DR: This paper provides a review of statistical methods that are useful in conducting computer experiments and describes approaches for the two primary tasks of metamodeling: selecting an experimental design and fitting a statistical model.
337
Adaptive sparse grids
TL;DR: It is observed in first tests that these general adaptive sparse grids allow the identification of the ANOVA structure and thus provide comprehensible models, very important for data mining applications.
Ch. 7. A review of design and modeling in computer experiments
TL;DR: This chapter provides a review of statistical methods that are useful in conducting computer experiments and focuses on the task of metamodeling, which is driven by the goal of optimizing a complex system via a deterministic simulation model.
84
Regional vertical total electron content (VTEC) modeling together with satellite and receiver differential code biases (DCBs) using semi-parametric multivariate adaptive regression B-splines (SP-BMARS)
TL;DR: The results show that the SP-BMARS algorithm can be used to estimate satellite and receiver DCBs while adaptively and flexibly modeling the daily regional VTEC.
31
A Generalized Estimating Equation Approach to Multivariate Adaptive Regression Splines
Jakub Stoklosa, David I. Warton +1 more
TL;DR: The proposed MARGE algorithm has better predictive performance than the original MARS algorithm when using correlated and/or nonnormal response data and is also competitive with alternatives in the literature, especially for problems with multiple interacting predictors.
15
References
•Book
Classification and regression trees
Leo Breiman
- 01 Jan 1983
TL;DR: This monograph focuses on the methodology used to construct tree-structured rules, covering the use of trees as a data analysis method and, in a more mathematical framework, proving some of their fundamental properties.
22.7K
Multivariate Adaptive Regression Splines
TL;DR: In this article, a new method is presented for flexible regression modeling of high dimensional data, which takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations) are automatically determined by the data.
•Book
Spline models for observational data
Grace Wahba
- 01 Mar 1990
TL;DR: In this paper, a theory and practice for the estimation of functions from noisy data on functionals is developed, where convergence properties, data based smoothing parameter selection, confidence intervals, and numerical methods are established which are appropriate to a number of problems within this framework.
6.9K
Variable selection via Gibbs sampling
TL;DR: In this paper, the Gibbs sampler is used to indirectly sample from the multinomial posterior distribution on the set of possible subset choices to identify the promising subsets by their more frequent appearance in the Gibbs sample.
3.1K