TL;DR: In this article, the Lipschitz constant is viewed as a weighting parameter that indicates how much emphasis to place on global versus local search, which accounts for the fast convergence of the new algorithm on the test functions.
Abstract: We present a new algorithm for finding the global minimum of a multivariate function subject to simple bounds. The algorithm is a modification of the standard Lipschitzian approach that eliminates the need to specify a Lipschitz constant. This is done by carrying out simultaneous searches using all possible constants from zero to infinity. On nine standard test functions, the new algorithm converges in fewer function evaluations than most competing methods.
The motivation for the new algorithm stems from a different way of looking at the Lipschitz constant. In particular, the Lipschitz constant is viewed as a weighting parameter that indicates how much emphasis to place on global versus local search. In standard Lipschitzian methods, this constant is usually large because it must equal or exceed the maximum rate of change of the objective function. As a result, these methods place a high emphasis on global search and exhibit slow convergence. In contrast, the new algorithm carries out simultaneous searches using all possible constants, and therefore operates at both the global and local level. Once the global part of the algorithm finds the basin of convergence of the optimum, the local part of the algorithm quickly and automatically exploits it. This accounts for the fast convergence of the new algorithm on the test functions.
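The role of the constant as a global-versus-local weight can be illustrated with a minimal sketch. It assumes each search interval is summarized by the function value at its center and its half-width, and that the Lipschitz lower bound for an interval is `f(center) - K * half_width`; the names `lower_bound`, `best_interval`, and the sample data are hypothetical and not taken from the paper.

```python
def lower_bound(interval, K):
    """Lipschitz lower bound f(c) - K*d for an interval (f_center, half_width)."""
    f_center, half_width = interval
    return f_center - K * half_width


def best_interval(intervals, K):
    """Index of the interval with the lowest lower bound for a given constant K."""
    return min(range(len(intervals)), key=lambda i: lower_bound(intervals[i], K))


def selected_over_all_constants(intervals, constants):
    """Set of intervals that are best for at least one of the trial constants."""
    return {best_interval(intervals, K) for K in constants}


# Hypothetical data: (f at center, half-width) for three intervals.
intervals = [(1.0, 0.5), (0.2, 0.05), (0.6, 0.2)]
trial_constants = [0.5 * k for k in range(21)]  # K = 0.0, 0.5, ..., 10.0
chosen = selected_over_all_constants(intervals, trial_constants)
```

With a small K the interval with the best function value wins (local search); with a large K the widest interval wins (global search). Sweeping K selects both, while the third interval is never the best for any K in the sweep.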
TL;DR: In this article, the authors show that most of the characterizations that were reported thus far in the literature are special cases of the following general result: a standard multilayer feedforward network with a locally bounded piecewise continuous activation function can approximate any continuous function to any degree of accuracy if and only if the network's activation function is not a polynomial.
TL;DR: It is shown that an $AC^0$ Boolean function has almost all of its "power spectrum" on the low-order coefficients, implying several new properties of functions in $AC^0$: functions in $AC^0$ have low "average sensitivity"; they may be approximated well by a real polynomial of low degree; and they cannot be pseudorandom function generators.
Abstract: In this paper, Boolean functions in $AC^0$ are studied using harmonic analysis on the cube. The main result is that an $AC^0$ Boolean function has almost all of its "power spectrum" on the low-order coefficients. An important ingredient of the proof is Hastad's switching lemma (8). This result implies several new properties of functions in $AC^0$: functions in $AC^0$ have low "average sensitivity"; they may be approximated well by a real polynomial of low degree; and they cannot be pseudorandom function generators. Perhaps the most interesting application is an $O(n^{\mathrm{polylog}(n)})$-time algorithm for learning functions in $AC^0$. The algorithm observes the behavior of an $AC^0$ function on $O(n^{\mathrm{polylog}(n)})$ randomly chosen inputs, and derives a good approximation for the Fourier transform of the function. This approximation allows the algorithm to predict, with high probability, the value of the function on other randomly chosen inputs.
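The learning idea, estimating only the low-order Fourier coefficients from labeled samples and predicting with their sign, can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: inputs are 0/1 tuples, labels are ±1, and the names `chi`, `estimate_low_degree_coefficients`, and `predict` are hypothetical.

```python
from itertools import combinations, product


def chi(subset, x):
    """Parity character: +1 if x has an even number of ones on `subset`, else -1."""
    return -1 if sum(x[i] for i in subset) % 2 else 1


def estimate_low_degree_coefficients(samples, n, max_degree):
    """Empirical Fourier coefficients for all index subsets of size <= max_degree.

    `samples` is a list of (x, y) pairs with x a 0/1 tuple of length n and y in {-1, +1}.
    """
    coeffs = {}
    for degree in range(max_degree + 1):
        for subset in combinations(range(n), degree):
            coeffs[subset] = sum(y * chi(subset, x) for x, y in samples) / len(samples)
    return coeffs


def predict(coeffs, x):
    """Predict the label as the sign of the truncated Fourier expansion."""
    total = sum(c * chi(subset, x) for subset, c in coeffs.items())
    return 1 if total >= 0 else -1


# Toy target (hypothetical): the label depends only on the first input bit.
target = lambda x: -1 if x[0] else 1
all_inputs = list(product((0, 1), repeat=3))
samples = [(x, target(x)) for x in all_inputs]
coeffs = estimate_low_degree_coefficients(samples, n=3, max_degree=1)
```

In practice the samples would be drawn at random rather than enumerated; the full enumeration here only keeps the toy example deterministic.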
TL;DR: An algorithm based on a traditional genetic algorithm that involves iterating the GA but uses knowledge gained during one iteration to avoid re-searching, on subsequent iterations, regions of problem space where solutions have already been found.
Abstract: A technique is described that allows unimodal function optimization methods to be extended to locate all optima of multimodal problems efficiently. We describe an algorithm based on a traditional genetic algorithm (GA). This technique involves iterating the GA but uses knowledge gained during one iteration to avoid re-searching, on subsequent iterations, regions of problem space where solutions have already been found. This gain is achieved by applying a fitness derating function to the raw fitness function, so that fitness values are depressed in the regions of the problem space where solutions have already been found. Consequently, the likelihood of discovering a new solution on each iteration is dramatically increased. The technique may be used with various styles of GAs or with other optimization methods, such as simulated annealing. The effectiveness of the algorithm is demonstrated on a number of multimodal test functions. The technique is at least as fast as fitness sharing methods. It provides an acceleration of between 1 and l0p on a problem with p optima, depending on the value of p and the convergence time complexity.
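The derating step can be sketched for a one-dimensional maximization problem with non-negative fitness. The power-law shape of the derating function, the niche `radius`, and the exponent `alpha` are illustrative assumptions, as are the function names; the abstract states only that fitness is depressed near solutions already found.

```python
def derate(distance, radius, alpha=2.0):
    """Multiplier in [0, 1]: 0 at a known solution, rising to 1 at `radius` and beyond."""
    if distance >= radius:
        return 1.0
    return (distance / radius) ** alpha


def derated_fitness(raw_fitness, found, radius, alpha=2.0):
    """Wrap `raw_fitness` so that it is depressed near every solution in `found`."""
    def modified(x):
        value = raw_fitness(x)
        for solution in found:
            value *= derate(abs(x - solution), radius, alpha)
        return value
    return modified


# Hypothetical flat raw fitness with one solution already located at x = 0.
raw = lambda x: 2.0
modified = derated_fitness(raw, found=[0.0], radius=1.0)
```

Each iteration of the underlying optimizer would then be run on `modified` instead of `raw`, with newly located optima appended to `found`.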
TL;DR: The Sumudu transform is a new integral transform with many interesting properties that make its visualization easier; for example, differentiation and integration in the t-domain are equivalent to division and multiplication of the transformed function F(u) by u in the u-domain.
Abstract: A new integral transform called the Sumudu transform is introduced. This transform possesses many interesting properties which make its visualization easier. Some of these properties are: (1) The differentiation and integration in the t-domain is equivalent to division and multiplication of the transformed function F(u) by u in the u-domain. (2) The unit-step function in the t-domain is transformed to unity in the u-domain. (3) Scaling of the function f(t) in the t-domain is equivalent to scaling of F(u) in the u-domain by the same scale factor. (4) The limit of f(t) as t tends to zero is equal to the limit of F(u) as u tends to zero. (5) For several cases, the limit of f(t) as t tends to infinity is the same as the limit of F(u) as u tends to infinity. (6) The slope of the function f(t) at t = 0 is the same as the slope of F(u) at u = 0. Hence u and F(u) are no longer dummies but can be treated as replicas of t and f(t). It is even possible to express u and F(u) using the units of t and f(t) respectively.
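Properties (1) and (2) can be checked with a short worked example. The abstract does not state the definition; the sketch below assumes the commonly used form of the Sumudu transform.

```latex
% Assumed definition (not quoted in the abstract above):
F(u) = \int_0^\infty \frac{1}{u}\, e^{-t/u} f(t)\, dt
     = \int_0^\infty e^{-s} f(us)\, ds , \qquad s = t/u .

% Property (2): the unit step f(t) = 1 maps to unity.
F(u) = \int_0^\infty e^{-s}\, ds = 1 .

% Ramp f(t) = t maps to F(u) = u, so u carries the units of t.
F(u) = \int_0^\infty e^{-s}\, (us)\, ds = u .

% Property (1): differentiation in t becomes division by u.
% Integrating by parts in the first form gives
\int_0^\infty \frac{1}{u}\, e^{-t/u} f'(t)\, dt = \frac{F(u) - f(0)}{u} .
% Check with f(t) = t: f'(t) = 1 has transform 1, and (u - 0)/u = 1 .
```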
TL;DR: An alternative convergence proof of a proximal-like minimization algorithm using Bregman functions, recently proposed by Censor and Zenios, is presented and allows the establishment of a global convergence rate of the algorithm expressed in terms of function values.
Abstract: An alternative convergence proof of a proximal-like minimization algorithm using Bregman functions, recently proposed by Censor and Zenios, is presented. The analysis allows the establishment of a global convergence rate of the algorithm expressed in terms of function values.
TL;DR: A simple and effective method for finding good hinges is presented and it is shown that use of sums of hinge functions gives a powerful and efficient alternative to neural networks with computation times several orders of magnitude less than is obtained by fitting neural networks with a comparable number of parameters.
Abstract: A hinge function y=h(x) consists of two hyperplanes continuously joined together at a hinge. In regression (prediction), classification (pattern recognition), and noiseless function approximation, use of sums of hinge functions gives a powerful and efficient alternative to neural networks with computation times several orders of magnitude less than is obtained by fitting neural networks with a comparable number of parameters. A simple and effective method for finding good hinges is presented.
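Two hyperplanes joined continuously along their intersection can be written as the pointwise maximum (or minimum) of the two planes. The sketch below assumes that representation and a coefficient layout of `[intercept, slope_1, ..., slope_m]`; the names `plane` and `hinge` are hypothetical, and the method for finding good hinges is not shown.

```python
def plane(coeffs, x):
    """Evaluate the hyperplane coeffs[0] + sum(coeffs[i + 1] * x[i])."""
    return coeffs[0] + sum(c * xi for c, xi in zip(coeffs[1:], x))


def hinge(plane_a, plane_b, x, convex=True):
    """Hinge function: max of the two planes if convex, otherwise their min."""
    a, b = plane(plane_a, x), plane(plane_b, x)
    return max(a, b) if convex else min(a, b)


# The planes y = x and y = -x meet at x = 0; their max is |x|.
abs_like = lambda x: hinge([0.0, 1.0], [0.0, -1.0], [x])
```

A fitted model in this style would be a sum of several such hinge terms.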
TL;DR: In this article, a suitable variational formulation for the local error of scattered data interpolation by radial basis functions φ(r) is proposed, in which the error can be bounded by a term depending on the Fourier transform of the interpolated function f and a certain Kriging function.
Abstract: Introducing a suitable variational formulation for the local error of scattered data interpolation by radial basis functions φ(r), the error can be bounded by a term depending on the Fourier transform of the interpolated function f and a certain «Kriging function», which allows a formulation as an integral involving the Fourier transform of φ. The explicit construction of locally well-behaving admissible coefficient vectors makes the Kriging function bounded by some power of the local density h of data points.
TL;DR: The problem of optimal sequential learning is investigated, viewed as a problem of estimating an underlying function sequentially rather than estimating a set of parameters of the neural network, and a suboptimal solution to the sequential estimate is arrived at by a growing gaussian radial basis function (GaRBF) network.
Abstract: In this paper, we investigate the problem of optimal sequential learning, viewed as a problem of estimating an underlying function sequentially rather than estimating a set of parameters of the neural network. First, we arrive at a suboptimal solution to the sequential estimate that can be mapped by a growing gaussian radial basis function (GaRBF) network. This network adds hidden units for each observation. The function space approach in which the estimates are represented as vectors in a function space is used in developing a growth criterion to limit its growth. A simplification of the criterion leads to two joint criteria on the distance of the present pattern and the existing unit centers in the input space and on the approximation error of the network for the given observation to be satisfied together. This network is similar to the resource allocating network (RAN) (Platt 1991a) and hence RAN can be interpreted from a function space approach to sequential learning. Second, we present an enhancement to the RAN. The RAN either allocates a new unit based on the novelty of an observation or adapts the network parameters by the LMS algorithm. The function space interpretation of the RAN lends itself to an enhancement of the RAN in which the extended Kalman filter (EKF) algorithm is used in place of the LMS algorithm. The performance of the RAN and the enhanced network are compared in the experimental tasks of function approximation and time-series prediction demonstrating the superior performance of the enhanced network with fewer number of hidden units. The approach adopted here has led us toward the minimal network required for a sequential learning problem.
TL;DR: In this paper, a controller called a "supervisor" switches into feedback with a SISO process a sequence of linear positioning or set-point controllers from a family $\mathcal{F}_C$ of candidate controllers, so as to cause the output of the process to approach and track a constant reference input.
Abstract: This paper describes a simple, 'high-level' controller called a 'supervisor' which is capable of switching into feedback with a SISO process a sequence of linear positioning or set-point controllers from a family $\mathcal{F}_C$ of candidate controllers, so as to cause the output of the process to approach and track a constant reference input. The process is assumed to be modeled by a SISO linear system whose transfer function is in the union of a number of subclasses, each subclass being small enough so that one of the controllers in $\mathcal{F}_C$ would solve the positioning problem, were the process's transfer function to be one of the subclass's members. The supervisor decides which controller to put in feedback with the process, not by an exhaustive search (i.e., by experimentally evaluating each and every candidate controller's performance by briefly applying it to the process) but rather by continuously comparing in real time suitably defined 'output estimation errors' generated by the candidate controllers, whether or not they are in feedback with the process. It is shown that under reasonably mild conditions, the supervisor can successfully perform its function in spite of modeling errors, provided the errors are sufficiently small. It is also shown that the supervisor will invariably correctly classify the process in finite time, so long as the reference input is nonzero and the "dc gains" of the "nominal" candidate process model transfer functions are distinct.
TL;DR: The main results are the necessary and sufficient condition for a function of one variable to be qualified as an activation function in RBF network is that the function is not an even polynomial, and the capability of approximation to nonlinear functionals and operators by RBF networks is revealed.
Abstract: The purpose of this paper is to explore the representation capability of radial basis function (RBF) neural networks. The main results are: 1) the necessary and sufficient condition for a function of one variable to be qualified as an activation function in RBF network is that the function is not an even polynomial, and 2) the capability of approximation to nonlinear functionals and operators by RBF networks is revealed, using sample data either in frequency domain or in time domain, which can be used in system identification by neural networks.
TL;DR: In this paper, the authors specify the singularities of a function f that are visible in a stable way from limited X-ray tomographic data and determine which singularities can be stably recovere...
Abstract: Given a function f, the author specifies the singularities of f that are visible in a stable way from limited X-ray tomographic data. This determines which singularities of f can be stably recovere...
TL;DR: With the number of potential energy function evaluations as a measure, the genetic algorithm is more economical than either a set of traditional, local minimizations or a molecular dynamics simulated annealing approach.
Abstract: A genetic algorithm is used to find the global minimum energy structure for Si$_4$ on an empirical potential energy surface. Given a suitable encoding of the cluster geometry, and an exponential scaling of the potential energy values to obtain a fitness function, the genetic algorithm can successfully optimize all degrees of freedom. With the number of potential energy function evaluations as a measure, the genetic algorithm is more economical than either a set of traditional, local minimizations or a molecular dynamics simulated annealing approach.
TL;DR: In this article, the extension of strongly anisotropic or dynamical scaling to local scale invariance is investigated, and scaling forms for the two-point function close to a free surface are derived.
Abstract: The extension of strongly anisotropic or dynamical scaling to local scale invariance is investigated. For the special case of an anisotropy or dynamical exponent $\theta=z=2$, the group of local scale transformation considered is the Schrodinger group, which can be obtained as the non-relativistic limit of the conformal group. The requirement of Schrodinger invariance determines the two-point function in the bulk and reduces the three-point function to a scaling form of a single variable. Scaling forms are also derived for the two-point function close to a free surface which can be either space-like or time-like. These results are reproduced in several exactly solvable statistical systems, namely the kinetic Ising model with Glauber dynamics, lattice diffusion, Lifshitz points in the spherical model and critical dynamics of the spherical model with a non-conserved order parameter. For generic values of $\theta$, evidence from higher order Lifshitz points in the spherical model and from directed percolation suggests a simple scaling form of the two-point function.
TL;DR: It is shown how false operator responses due to missing or uncertain data can be significantly reduced or eliminated and how operators having a higher degree of selectivity and higher tolerance against noise can be constructed using simple combinations of appropriately chosen convolutions.
Abstract: It is shown how false operator responses due to missing or uncertain data can be significantly reduced or eliminated. It is shown how operators having a higher degree of selectivity and higher tolerance against noise can be constructed using simple combinations of appropriately chosen convolutions. The theory is based on linear operations and is general in that it allows for both data and operators to be scalars, vectors or tensors of higher order. Three new methods are presented: normalized convolution, differential convolution and normalized differential convolution. All three methods are examples of the power of the signal/certainty-philosophy, i.e., the separation of both data and operator into a signal part and a certainty part. Missing data are handled simply by setting the certainty to zero. In the case of uncertain data, an estimate of the certainty must accompany the data. Localization or windowing of operators is done using an applicability function, the operator equivalent to certainty, not by changing the actual operator coefficients. Spatially or temporally limited operators are handled by setting the applicability function to zero outside the window.
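The signal/certainty separation can be illustrated with the simplest scalar, one-dimensional case of normalized convolution: the certainty-weighted signal is filtered with the applicability function and divided by the filtered certainty. This is a sketch under those assumptions, not the general tensor formulation of the paper; the function name and sample data are hypothetical, and a symmetric applicability window is assumed so that correlation and convolution coincide.

```python
def normalized_convolution(signal, certainty, applicability):
    """Certainty-weighted local average of a 1-D signal.

    Samples with certainty 0 are ignored; positions with no certain
    neighbours inside the window are returned as 0.0.
    """
    half = len(applicability) // 2
    n = len(signal)
    output = []
    for i in range(n):
        numerator = 0.0
        denominator = 0.0
        for k, weight in enumerate(applicability):
            j = i + k - half  # window is centred on position i
            if 0 <= j < n:    # applicability is zero outside the data
                numerator += weight * certainty[j] * signal[j]
                denominator += weight * certainty[j]
        output.append(numerator / denominator if denominator > 0 else 0.0)
    return output


# A constant signal with one missing sample (certainty 0) is recovered exactly.
signal = [1.0, 1.0, 0.0, 1.0, 1.0]
certainty = [1.0, 1.0, 0.0, 1.0, 1.0]
filled = normalized_convolution(signal, certainty, applicability=[1.0, 2.0, 1.0])
```

An ordinary convolution of the same data would instead produce a dip at the missing sample, which is the kind of false response the abstract refers to.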
TL;DR: In this article, the constant $C_\mu$ and the near-wall damping function $f_\mu$ of the k-epsilon model are evaluated from direct numerical simulation (DNS) data for developed channel and boundary layer flow at two Reynolds numbers each, and the epsilon-budget is computed for fully developed channel flow.
Abstract: The constant $C_\mu$ and the near-wall damping function $f_\mu$ in the eddy-viscosity relation of the k-epsilon model are evaluated from direct numerical simulation (DNS) data for developed channel and boundary layer flow at two Reynolds numbers each. Various existing $f_\mu$ model functions are compared with the DNS data, and a new function is fitted to the high-Reynolds-number channel flow data. The epsilon-budget is computed for the fully developed channel flow. The relative magnitude of the terms in the epsilon-equation is analyzed with the aid of scaling arguments, and the parameter governing this magnitude is established. Models for the sum of all source and sink terms in the epsilon-equation are tested against the DNS data, and an improved model is proposed.
TL;DR: In this paper, the authors show that the minimum degree greedy algorithm achieves a performance ratio of (Δ+2)/3 for approximating independent sets in graphs with degree bounded by Δ.
Abstract: The minimum-degree greedy algorithm, or Greedy for short, is a simple and well-studied method for finding independent sets in graphs. We show that it achieves a performance ratio of (Δ+2)/3 for approximating independent sets in graphs with degree bounded by Δ. The analysis yields a precise characterization of the size of the independent sets found by the algorithm as a function of the independence number, as well as a generalization of Turan's bound. We also analyze the algorithm when run in combination with a known preprocessing technique, and obtain an improved $(2\bar d + 3)/5$ performance ratio on graphs with average degree $\bar d$, improving on the previous best $(\bar d + 1)/2$ of Hochbaum. Finally, we present an efficient parallel and distributed algorithm attaining the performance guarantees of Greedy.
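The minimum-degree greedy rule can be sketched in a few lines: repeatedly pick a vertex of minimum degree in the remaining graph, add it to the independent set, and delete it together with its neighbours. The sketch assumes an undirected graph given as a symmetric adjacency dictionary with every vertex present as a key, and breaks ties by vertex label (a hypothetical choice); the preprocessing technique and the parallel variant mentioned in the abstract are not shown.

```python
def greedy_independent_set(adjacency):
    """Minimum-degree greedy independent set for an undirected graph.

    `adjacency` maps each vertex to the set of its neighbours.
    """
    remaining = {v: set(neighbours) for v, neighbours in adjacency.items()}
    independent = []
    while remaining:
        # Vertex of minimum degree in the remaining graph; ties go to the smaller label.
        v = min(remaining, key=lambda u: (len(remaining[u]), u))
        independent.append(v)
        removed = remaining[v] | {v}
        for u in removed:
            del remaining[u]
        for neighbours in remaining.values():
            neighbours -= removed
    return independent


# Path graph 0 - 1 - 2 - 3 - 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
# Star graph with centre 0 and leaves 1, 2, 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

On the star graph the rule picks a leaf first, which removes the centre and leaves the other leaves free to join the set.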
TL;DR: It is shown that, to establish the existence of a relative value function, relative compactness in a space of real-valued functions on the state space can be replaced by pointwise relative compactness in the set of real numbers, provided one makes use of a generalized lower limit of functions.
Abstract: A Markovian decision model with general state space, compact action space, and the average cost as criterion is considered. The existence of an optimal policy is shown via an optimality inequality in terms of the minimal average cost g and a relative value function w. The existence of some w is usually shown via relative compactness in a space of real-valued functions on the state space. Here it shall be shown that one can instead do with pointwise relative compactness in the set of real numbers if one makes use of a generalized lower limit of functions. An application to an inventory model is given.
TL;DR: Inflationary models predict a definite, model-independent, angular dependence for the three-point correlation function of ΔT/T at large angles (≥ 1°) as mentioned in this paper.
Abstract: Inflationary models predict a definite, model-independent, angular dependence for the three-point correlation function of ΔT/T at large angles (≥1°) which we calculate. The overall amplitude is model dependent and generically unobservably small, but may be large in some specific models. We compare our results with other models of non-Gaussian fluctuations.
TL;DR: A recursive algorithm for solving the dynamical equations of motion for molecular systems using internal variable models which have been shown to reduce the computation times of molecular dynamics simulations by an order of magnitude when compared with Cartesian models.
TL;DR: In this article, the authors derive exact series representations of the chord-length distribution function p(z) for media comprised of spheres with a polydispersivity in size for arbitrary space dimension D. For the special case of spatially uncorrelated spheres (i.e., fully penetrable spheres), p(z) and the mean chord length, the first moment of p(z), are determined exactly.
Abstract: A statistical correlation function of basic importance in the study of two-phase random media (such as suspensions, porous media, and composites) is the chord-length distribution function p(z). We show that p(z) is related to another fundamentally important morphological descriptor studied by us previously, namely, the lineal-path function L(z), which gives the probability of finding a line segment of length z wholly in one of the phases when randomly thrown into the sample. We derive exact series representations of the chord-length distribution function for media comprised of spheres with a polydispersivity in size for arbitrary space dimension D. For the special case of spatially uncorrelated spheres (i.e., fully penetrable spheres), we determine exactly p(z) and the mean chord length $l_C$, the first moment of p(z).
TL;DR: An algorithm for finding the global maximum of a multimodal, multivariate function for which derivatives are available that assumes a bound on the second derivatives of the function and uses this to construct an upper envelope.
Abstract: We present an algorithm for finding the global maximum of a multimodal, multivariate function for which derivatives are available. The algorithm assumes a bound on the second derivatives of the function and uses this to construct an upper envelope. Successive function evaluations lower this envelope until the value of the global maximum is known to the required degree of accuracy. The algorithm has been implemented in RATFOR and execution times for standard test functions are presented at the end of the paper.
TL;DR: In this article, an occurrence capability which allows a first function to "go to sleep" while waiting for a second function to produce a result was proposed. In this manner, the first function does not consume any CPU time while waiting on the second function.
Abstract: An occurrence capability is described which allows a first function to "go to sleep" while waiting for a second function to produce a result. In this manner, the first function does not consume any CPU time while waiting for the second function. Three icons are provided with associated control software which implement the occurrence function. A Wait on Occurrence function icon is associated with the first function that is waiting on the result from the second function. A Set Occurrence function icon is typically associated with the second function icon and triggers an occurrence when the second function produces the desired result. A Generate Occurrence function icon is used to pass identifier values linking multiple sources and destinations having Set Occurrence and Wait on Occurrence function icons, respectively.
TL;DR: This paper proposes an architecture of neural networks that have interval weights and interval biases, and develops a learning algorithm derived from the cost function in a similar manner as the BP (Back-Propagation) algorithm.
TL;DR: A modified three-flat method is described in which, in a Cartesian coordinate system, a flat is expressed as the sum of even-odd, odd-even, even-even, and odd-odd functions.
Abstract: We describe a modified three-flat method. In a Cartesian coordinate system, a flat can be expressed as the sum of even-odd, odd-even, even-even, and odd-odd functions. The even-odd and the odd-even functions of each flat are obtained first, and then the even-even function is calculated. All three functions are exact. The odd-odd function is difficult to obtain. In theory, this function can be solved by rotating the flat 90°, 45°, 22.5°, etc. The components of the Fourier series of this odd-odd function are derived and extracted from each rotation of the flat. A flat is approximated by the sum of the first three functions and the known components of the odd-odd function. In the experiments, the flats are oriented in six configurations by rotating the flats 180°, 90°, and 45° with respect to one another, and six measurements are performed. The exact profiles along every 45° diameter are obtained, and the profile in the area between two adjacent diameters of these diameters is also obtained with some approximation. The theoretical derivation, experiment results, and error analysis are presented.
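The four-part symmetry decomposition that the method relies on can be sketched for any function of two Cartesian variables. The sketch assumes the naming convention that "even-odd" means even in x and odd in y; the function name `symmetry_parts` and the test surface are hypothetical, and the interferometric measurement procedure itself is not shown.

```python
def symmetry_parts(F, x, y):
    """Split F(x, y) into even-even, even-odd, odd-even and odd-odd parts at one point.

    The first word refers to the symmetry in x and the second to the symmetry in y.
    """
    a, b, c, d = F(x, y), F(-x, y), F(x, -y), F(-x, -y)
    return {
        "even-even": (a + b + c + d) / 4,
        "even-odd": (a + b - c - d) / 4,
        "odd-even": (a - b + c - d) / 4,
        "odd-odd": (a - b - c + d) / 4,
    }


# Hypothetical surface whose four parts are 1, y, x and x*y respectively.
surface = lambda x, y: 1 + x + y + x * y
parts = symmetry_parts(surface, 2.0, 3.0)
```

The four parts always sum back to F(x, y), which is why approximating the flat by the first three parts plus the recoverable components of the odd-odd part is meaningful.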
TL;DR: A general solution of the Ornstein-Zernike (OZ) equation was obtained by combining the perturbation theory with the application of the Hilbert transform as mentioned in this paper, which was based on the Percus-Yevick approximation or the mean spherical approximation for potentials consisting of a hard core and an arbitrary tail function.
Abstract: A general solution of the Ornstein–Zernike (OZ) equation was obtained by combining the perturbation theory with the application of the Hilbert transform. The development was based on the Percus–Yevick approximation or the mean spherical approximation for potentials consisting of a hard core and an arbitrary tail function. All terms of the solution are analytical and involve only the potential function and the hard sphere quantities.
TL;DR: This work examines the composition problem for the case when two functions are given in either Bézier or B-spline form, and develops efficient, tightly codable algorithms for this problem.
Abstract: In view of the fundamental role that functional composition plays in mathematics, it is not surprising that a variety of problems in geometric modeling can be viewed as instances of the following composition problem: given representations for two functions F and G, compute a representation of the function H = F o G. We examine this problem in detail for the case when F and G are given in either Bézier or B-spline form. Blossoming techniques are used to gain theoretical insight into the structure of the solution which is then used to develop efficient, tightly codable algorithms. From a practical point of view, if the composition algorithms are implemented as library routines, a number of geometric-modeling problems can be solved with a small amount of additional software.
TL;DR: In this article, the orthogonal expansion of the functions that map the input vector to the output vector is used to approximate any mapping function between the input and output vectors without the use of hidden layers.
Abstract: An architecture and data processing method for a neural network that can approximate any mapping function between the input and output vectors without the use of hidden layers. The data processing is done at the sibling nodes (second row). It is based on the orthogonal expansion of the functions that map the input vector to the output vector. Because the nodes of the second row are simply data processing stations, they remain passive during training. As a result the system is basically a single-layer linear network with a filter at its entrance. Because of this it is free from the problems of local minima. The invention also includes a method that reduces the sum of the square of errors over all the output nodes to zero (0.000000) in fewer than ten cycles. This is done by initialization of the synaptic links with the coefficients of the orthogonal expansion. This feature makes it possible to design a computer chip which can perform the training process in real time. Similarly, the ability to train in real time allows the system to retrain itself and improve its performance while executing its normal testing functions.
TL;DR: In this paper, a class of completely monotonic functions involving the gamma function as well as the derivative of the psi function is presented, and new upper and lower bounds for the ratio Γ(x + 1)/Γ(x + s) are obtained and compared with related bounds given in part by J. D. Keckic and P. M. Vasic.
Abstract: A class of completely monotonic functions is presented involving the gamma function as well as the derivative of the psi function. As a consequence, new upper and lower bounds for the ratio Γ(x + 1)/Γ(x + s) are obtained and compared with related bounds given in part by J. D. Keckic and P. M. Vasic. Our results are further applied to obtain functions which are Laplace transforms of infinitely divisible probability measures.