About: Differentiable function is a research topic. Over the lifetime, 10449 publications have been published within this topic receiving 241636 citations. The topic is also known as: derivable function.
TL;DR: This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
Abstract: Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly represented by its own function approximator, independent of the value function, and is updated according to the gradient of expected reward with respect to the policy parameters. Williams's REINFORCE method and actor-critic methods are examples of this approach. Our main new result is to show that the gradient can be written in a form suitable for estimation from experience aided by an approximate action-value or advantage function. Using this result, we prove for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
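As a minimal illustration of the policy-gradient idea described in this abstract, the Python sketch below uses a softmax policy on a single-state problem with fixed action values. The setting and the names `softmax`, `expected_reward`, `policy_gradient`, and `gradient_ascent` are assumptions made for this example, not taken from the paper. The policy parameters are updated along the gradient of expected reward, written as a sum over actions of the policy's derivative weighted by the action value.

```python
import math


def softmax(theta):
    """Action probabilities of a softmax policy with parameters theta."""
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(theta, q):
    """Expected reward sum_a pi(a) * q(a) for fixed action values q."""
    return sum(p * r for p, r in zip(softmax(theta), q))


def policy_gradient(theta, q):
    """Gradient of expected reward: sum_a [d pi(a) / d theta] * q(a).

    For a softmax policy this reduces to pi_i * (q_i - sum_a pi_a * q_a).
    """
    pi = softmax(theta)
    mean_q = sum(p * r for p, r in zip(pi, q))
    return [p * (r - mean_q) for p, r in zip(pi, q)]


def gradient_ascent(theta, q, step=0.5, iters=2000):
    """Repeatedly move the policy parameters along the gradient."""
    theta = list(theta)
    for _ in range(iters):
        g = policy_gradient(theta, q)
        theta = [t + step * gi for t, gi in zip(theta, g)]
    return theta
```

Calling `softmax(gradient_ascent([0.0, 0.0, 0.0], [1.0, 2.0, 0.5]))` should concentrate probability on the second action, which has the highest value. In the paper's setting the exact action values are replaced by an approximate action-value or advantage function estimated from experience. This sketch uses exact values only to keep the example small.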
TL;DR: In this article, a second-order algorithm is presented for finding points on a steepest descent path from the transition state of the reactants and products. The points are optimized so that the segment of the reaction path between any two adjacent points is an arc of a circle and the gradient at each point is tangent to the path.
Abstract: A new algorithm is presented for obtaining points on a steepest descent path from the transition state of the reactants and products. In mass‐weighted coordinates, this path corresponds to the intrinsic reaction coordinate. Points on the reaction path are found by constrained optimizations involving all internal degrees of freedom of the molecule. The points are optimized so that the segment of the reaction path between any two adjacent points is given by an arc of a circle, and so that the gradient at each point is tangent to the path. Only the transition vector and the energy gradients are needed to construct the path. The resulting path is continuous, differentiable and piecewise quadratic. In the limit of small step size, the present algorithm is shown to take a step with the correct tangent vector and curvature vector; hence, it is a second order algorithm. The method has been tested on the following reactions: HCN→CNH, SiH2+H2→SiH4, CH4+H→CH3+H2, F−+CH3F→FCH3+F−, and C2H5F→C2H4+HF. Reaction paths calculated with a step size of 0.4 a.u. are almost identical to those computed with a step size of 0.1 a.u. or smaller.
TL;DR: In this article, the authors extend their previous algorithm for following reaction paths downhill to use mass-weighted internal coordinates. The algorithm has the correct tangent vector and curvature vectors in the limit of small step size but requires only the transition vector and the energy gradients.
Abstract: Our previous algorithm for following reaction paths downhill (J. Chem. Phys. 1989, 90, 2154) has been extended to use mass-weighted internal coordinates. Points on the reaction path are found by constrained optimizations involving the internal degrees of freedom of the molecule. The points are optimized so that the segment of the reaction path between any two adjacent points is described by an arc of a circle in mass-weighted internal coordinates, and so that the gradients (in mass-weighted internals) at the end points of the arc are tangent to the path. The algorithm has the correct tangent vector and curvature vectors in the limit of small step size but requires only the transition vector and the energy gradients; the resulting path is continuous, differentiable, and piecewise quadratic.
TL;DR: This book covers the basic sample statistics, transformations of given statistics, asymptotic theory in parametric inference, U-statistics, von Mises differentiable statistical functions, M-, L-, and R-estimates, and asymptotic relative efficiency.
Abstract: Preliminary Tools and Foundations. The Basic Sample Statistics. Transformations of Given Statistics. Asymptotic Theory in Parametric Inference. U-Statistics. Von Mises Differentiable Statistical Functions. M-Estimates. L-Estimates. R-Estimates. Asymptotic Relative Efficiency. Appendix. References. Author Index. Subject Index.
TL;DR: In this article, a large class of recursion relations $x_{n+1} = \lambda f(x_n)$ exhibiting infinite bifurcation is shown to possess a rich quantitative structure essentially independent of the recursion function.
Abstract: A large class of recursion relations $x_{n+1} = \lambda f(x_n)$ exhibiting infinite bifurcation is shown to possess a rich quantitative structure essentially independent of the recursion function. The functions considered all have a unique differentiable maximum $\bar{x}$. With $f(\bar{x}) - f(x) \sim |x - \bar{x}|^z$ (for $|x - \bar{x}|$ sufficiently small), $z > 1$, the universal details depend only upon $z$. In particular, the local structure of high-order stability sets is shown to approach universality, rescaling in successive bifurcations, asymptotically by the ratio $\alpha$ ($\alpha = 2.5029078750957\ldots$ for $z = 2$). This structure is determined by a universal function $g^*(x)$, where the $2^n$th iterate of $f$, $f^{(n)}$, converges locally to $\alpha^{-n} g^*(\alpha^n x)$ for large $n$. For the class of $f$'s considered, there exists a $\lambda_n$ such that a $2^n$-point stable limit cycle including $\bar{x}$ exists; $\lambda_\infty - \lambda_n \sim \delta^{-n}$ ($\delta = 4.669201609103\ldots$ for $z = 2$). The numbers $\alpha$ and $\delta$ have been computationally determined for a range of $z$ through their definitions, for a variety of $f$'s for each $z$. We present a recursive mechanism that explains these results by determining $g^*$ as the fixed-point (function) of a transformation on the class of $f$'s. At present our treatment is heuristic. In a sequel, an exact theory is formulated and specific problems of rigor isolated.
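The convergence rate quoted in this abstract can be checked numerically for one concrete map. The Python sketch below assumes the logistic form f(x) = x(1 - x), so that the recursion is x_{n+1} = λ x_n (1 - x_n) with its maximum at 1/2 and z = 2. It locates the parameter values at which the 2^n-point cycle contains the maximum by Newton's method, then forms ratios of successive parameter gaps. The choice of map, the Newton scheme, and the function names are assumptions made for this example, not the paper's own procedure.

```python
def residual_and_slope(lam, n):
    """Return g(lam) = F^(2^n)(1/2) - 1/2 and dg/dlam for F(x) = lam*x*(1-x)."""
    x, dx = 0.5, 0.0
    for _ in range(2 ** n):
        # Chain rule for d/dlam of lam*x*(1-x), using x before it is updated.
        dx = x * (1.0 - x) + lam * (1.0 - 2.0 * x) * dx
        x = lam * x * (1.0 - x)
    return x - 0.5, dx


def newton_cycle_parameter(guess, n, max_iter=50, tol=1e-14):
    """Newton iteration for the lam whose 2^n-cycle contains x = 1/2."""
    lam = guess
    for _ in range(max_iter):
        g, dg = residual_and_slope(lam, n)
        step = g / dg
        lam -= step
        if abs(step) < tol:
            break
    return lam


def feigenbaum_delta_estimates(max_n=8):
    """Return the parameters lam_0..lam_max_n and the successive gap ratios."""
    # lam_0 = 2 is exact because F(1/2) = lam/4; 3.2 is a hand-picked start.
    lams = [2.0, newton_cycle_parameter(3.2, 1)]
    deltas = []
    delta = 4.7  # rough seed, used only to extrapolate the first guess
    for n in range(2, max_n + 1):
        guess = lams[-1] + (lams[-1] - lams[-2]) / delta
        lams.append(newton_cycle_parameter(guess, n))
        delta = (lams[-2] - lams[-3]) / (lams[-1] - lams[-2])
        deltas.append(delta)
    return lams, deltas
```

Calling `feigenbaum_delta_estimates(8)` returns the parameter sequence and the gap ratios. Under these assumptions the ratios should settle near the value of δ quoted in the abstract for z = 2. Pushing `max_n` much higher runs into double-precision limits, since successive gaps shrink geometrically.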