TL;DR: An efficient technique for supervised learning of universal value function approximators (UVFAs) V (s, g; θ) that generalise not just over states s but also over goals g is developed and it is demonstrated that a UVFA can successfully generalise to previously unseen goals.
Abstract: Value functions are a core component of reinforcement learning systems. The main idea is to construct a single function approximator V (s; θ) that estimates the long-term reward from any state s, using parameters θ. In this paper we introduce universal value function approximators (UVFAs) V (s, g; θ) that generalise not just over states s but also over goals g. We develop an efficient technique for supervised learning of UVFAs, by factoring observed values into separate embedding vectors for state and goal, and then learning a mapping from s and g to these factored embedding vectors. We show how this technique may be incorporated into a reinforcement learning algorithm that updates the UVFA solely from observed rewards. Finally, we demonstrate that a UVFA can successfully generalise to previously unseen goals.
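The factoring step described in this abstract can be illustrated with a small sketch. It assumes the observed values are arranged in a table with one row per state and one column per goal, and it uses a truncated SVD as the factorization; the function name `factor_values`, the table sizes, and the choice of SVD are assumptions made for illustration, not details taken from the paper. The second stage, learning a mapping from s and g to these embeddings, is not shown.

```python
import numpy as np

def factor_values(value_table, rank):
    """Split a (num_states x num_goals) value table into state and goal embeddings.

    Hypothetical helper: a truncated SVD is one possible low-rank factorization.
    """
    u, s, vt = np.linalg.svd(value_table, full_matrices=False)
    root = np.sqrt(s[:rank])
    state_emb = u[:, :rank] * root      # shape (num_states, rank)
    goal_emb = vt[:rank, :].T * root    # shape (num_goals, rank)
    return state_emb, goal_emb

# Synthetic rank-2 table of values for 6 states and 4 goals (made-up data).
rng = np.random.default_rng(0)
values = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 4))

phi, psi = factor_values(values, rank=2)
reconstructed = phi @ psi.T             # V(s, g) is approximated by phi(s) . psi(g)
```

Because the synthetic table has rank 2, the inner products of the two embeddings reproduce it up to floating-point error; real value tables would only be approximated.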
TL;DR: Deep Recurrent Q-Network (DRQN) as discussed by the authors replaces the first post-convolutional fully-connected layer with a recurrent LSTM, which integrates information through time and replicates DQN's performance on standard Atari games and partially observed equivalents featuring flickering game screens.
Abstract: Deep Reinforcement Learning has yielded proficient controllers for complex tasks. However, these controllers have limited memory and rely on being able to perceive the complete game screen at each decision point. To address these shortcomings, this article investigates the effects of adding recurrency to a Deep Q-Network (DQN) by replacing the first post-convolutional fully-connected layer with a recurrent LSTM. The resulting Deep Recurrent Q-Network (DRQN), although capable of seeing only a single frame at each timestep, successfully integrates information through time and replicates DQN's performance on standard Atari games and partially observed equivalents featuring flickering game screens. Additionally, when trained with partial observations and evaluated with incrementally more complete observations, DRQN's performance scales as a function of observability. Conversely, when trained with full observations and evaluated with partial observations, DRQN's performance degrades less than DQN's. Thus, given the same length of history, recurrency is a viable alternative to stacking a history of frames in the DQN's input layer and while recurrency confers no systematic advantage when learning to play the game, the recurrent net can better adapt at evaluation time if the quality of observations changes.
TL;DR: This paper investigates the partial and exact recovery of communities in the general SBM (in the constant and logarithmic degree regimes), and uses the generality of the results to tackle overlapping communities.
Abstract: New phase transition phenomena have recently been discovered for the stochastic block model, for the special case of two non-overlapping symmetric communities. This gives rise in particular to new algorithmic challenges driven by the thresholds. This paper investigates whether a general phenomenon takes place for multiple communities, without imposing symmetry. In the general stochastic block model SBM(n, p, W), n vertices are split into k communities of relative size {p_i}_{i∈[k]}, and vertices in community i and j connect independently with probability {W_ij}_{i,j∈[k]}. This paper investigates the partial and exact recovery of communities in the general SBM (in the constant and logarithmic degree regimes), and uses the generality of the results to tackle overlapping communities. The contributions of the paper are: (i) an explicit characterization of the recovery threshold in the general SBM in terms of a new f-divergence function D+, which generalizes the Hellinger and Chernoff divergences, and which provides an operational meaning to a divergence function analogous to the KL-divergence in the channel coding theorem, (ii) the development of an algorithm that recovers the communities all the way down to the optimal threshold and runs in quasi-linear time, showing that exact recovery has no information-theoretic to computational gap for multiple communities, (iii) the development of an efficient algorithm that detects communities in the constant degree regime with an explicit accuracy bound that can be made arbitrarily close to 1 when a prescribed signal-to-noise ratio (defined in terms of the spectrum of diag(p)W) tends to infinity.
TL;DR: The experimental results demonstrate that the KG2E method can effectively model the (un)certainties of entities and relations in a KG, and it significantly outperforms state-of-the-art methods (including TransH and TransR).
Abstract: The representation of a knowledge graph (KG) in a latent space has recently attracted more and more attention. To this end, some proposed models (e.g., TransE) embed entities and relations of a KG into a "point" vector space by optimizing a global loss function which ensures the scores of positive triplets are higher than negative ones. We notice that these models always regard all entities and relations in the same manner and ignore their (un)certainties. In fact, different entities and relations may contain different certainties, which makes identical certainty insufficient for modeling. Therefore, this paper switches to density-based embedding and proposes KG2E for explicitly modeling the certainty of entities and relations, which learns the representations of KGs in the space of multi-dimensional Gaussian distributions. Each entity/relation is represented by a Gaussian distribution, where the mean denotes its position and the covariance (currently with diagonal covariance) can properly represent its certainty. In addition, compared with the symmetric measures used in point-based methods, we employ the KL-divergence for scoring triplets, which is a naturally asymmetric function for effectively modeling multiple types of relations. We have conducted extensive experiments on link prediction and triplet classification with multiple benchmark datasets (WordNet and Freebase). Our experimental results demonstrate that our method can effectively model the (un)certainties of entities and relations in a KG, and it significantly outperforms state-of-the-art methods (including TransH and TransR).
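The asymmetry of the KL-divergence mentioned in this abstract can be seen in a short sketch for Gaussians with diagonal covariance. The function name `kl_diag_gaussians` and the example numbers are assumptions for illustration; how KG2E combines entity and relation distributions into a triplet score is not reproduced here.

```python
import math

def kl_diag_gaussians(mu0, var0, mu1, var1):
    """KL( N(mu0, diag(var0)) || N(mu1, diag(var1)) ) for diagonal covariances.

    All variances are assumed to be strictly positive.
    """
    total = 0.0
    for m0, v0, m1, v1 in zip(mu0, var0, mu1, var1):
        total += v0 / v1 + (m1 - m0) ** 2 / v1 - 1.0 + math.log(v1 / v0)
    return 0.5 * total

# A narrow and a wide Gaussian with the same mean (made-up numbers).
narrow_to_wide = kl_diag_gaussians([0.0], [1.0], [0.0], [4.0])
wide_to_narrow = kl_diag_gaussians([0.0], [4.0], [0.0], [1.0])
```

Swapping the arguments changes the value, which is the property the abstract relies on when it contrasts KL-divergence with symmetric point-based measures.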
TL;DR: This paper examines two splitting methods for a class of nonconvex optimization problems, the alternating direction method of multipliers and the proximal gradient algorithm, and gives simple sufficient conditions that guarantee boundedness of the generated sequence.
Abstract: We consider the problem of minimizing the sum of a smooth function $h$ with a bounded Hessian and a nonsmooth function. We assume that the latter function is a composition of a proper closed function $P$ and a surjective linear map $\mathcal{M}$, with the proximal mappings of $\tau P$, $\tau > 0$, simple to compute. This problem is nonconvex in general and encompasses many important applications in engineering and machine learning. In this paper, we examined two types of splitting methods for solving this nonconvex optimization problem: the alternating direction method of multipliers and the proximal gradient algorithm. For the direct adaptation of the alternating direction method of multipliers, we show that if the penalty parameter is chosen sufficiently large and the sequence generated has a cluster point, then it gives a stationary point of the nonconvex problem. We also establish convergence of the whole sequence under an additional assumption that the functions $h$ and $P$ are semialgebraic. Further...
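As a rough illustration of the proximal gradient iteration named in this abstract, the sketch below treats a deliberately simple convex special case: $h(x) = \tfrac12\|x - b\|^2$, $P$ equal to $\lambda$ times the $\ell_1$ norm, and $\mathcal{M}$ the identity. These choices, the step size, and all function names are assumptions made for illustration; the paper's setting is more general and nonconvex.

```python
def soft_threshold(v, t):
    """Proximal mapping of t * |.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def proximal_gradient(grad_h, x0, step, lam, iters=200):
    """Iterate x <- prox_{step * lam * ||.||_1}(x - step * grad_h(x))."""
    x = list(x0)
    for _ in range(iters):
        g = grad_h(x)
        x = [soft_threshold(xi - step * gi, step * lam) for xi, gi in zip(x, g)]
    return x

# Smooth part h(x) = 0.5 * ||x - b||^2, whose gradient is x - b (made-up data).
b = [3.0, -0.5, 1.2]
grad_h = lambda x: [xi - bi for xi, bi in zip(x, b)]

x_star = proximal_gradient(grad_h, [0.0, 0.0, 0.0], step=0.5, lam=1.0)
```

For this particular choice of h, the minimizer is known in closed form: it is the soft-thresholding of b at level λ, which gives a convenient check on the iteration.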
TL;DR: The end result is the generation of stable walking satisfying physical realizability constraints for a model of the bipedal robot AMBER2.
Abstract: This paper presents a methodology for the development of control barrier functions (CBFs) through a backstepping inspired approach. Given a set defined as the superlevel set of a function, h, the main result is a constructive means for generating control barrier functions that guarantee forward invariance of this set. In particular, if the function defining the set has relative degree n, an iterative methodology utilizing higher order derivatives of h provably results in a control barrier function that can be explicitly derived. To demonstrate these formal results, they are applied in the context of bipedal robotic walking. Physical constraints, e.g., joint limits, are represented by control barrier functions and unified with control objectives expressed through control Lyapunov functions (CLFs) via quadratic program (QP) based controllers. The end result is the generation of stable walking satisfying physical realizability constraints for a model of the bipedal robot AMBER2.
TL;DR: In this article, an optimal parabolic contour is selected on the basis of the distance and the strength of the singularities of the Laplace transform, with the aim of minimizing the computational effort and reducing the propagation of errors.
Abstract: The Mittag-Leffler (ML) function plays a fundamental role in fractional calculus but very few methods are available for its numerical evaluation. In this work we present a method for the efficient computation of the ML function based on the numerical inversion of its Laplace transform (LT): an optimal parabolic contour is selected on the basis of the distance and the strength of the singularities of the LT, with the aim of minimizing the computational effort and reducing the propagation of errors. Numerical experiments are presented to show accuracy and efficiency of the proposed approach. The application to the three parameter ML (also known as Prabhakar) function is also presented.
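For orientation, the two-parameter ML function has the power series $E_{\alpha,\beta}(z) = \sum_{k \ge 0} z^k / \Gamma(\alpha k + \beta)$. The sketch below sums a truncated version of that series for real z of moderate size, with α > 0 and β > 0 assumed. It is not the Laplace-transform inversion method of the paper, and the function name and the truncation length are assumptions.

```python
import math

def mittag_leffler_series(z, alpha, beta=1.0, terms=100):
    """Truncated power series for E_{alpha,beta}(z), real z of moderate size.

    Assumes alpha > 0 and beta > 0 so that every Gamma value is positive.
    """
    if z == 0.0:
        return 1.0 / math.gamma(beta)
    total = 0.0
    log_abs_z = math.log(abs(z))
    for k in range(terms):
        # lgamma avoids overflow of Gamma(alpha * k + beta) for large k.
        log_term = k * log_abs_z - math.lgamma(alpha * k + beta)
        sign = -1.0 if (z < 0 and k % 2 == 1) else 1.0
        total += sign * math.exp(log_term)
    return total

# Two classical special cases: E_{1,1}(z) = exp(z) and E_{2,1}(-x^2) = cos(x).
exp_case = mittag_leffler_series(1.5, alpha=1.0)
cos_case = mittag_leffler_series(-4.0, alpha=2.0)
```

The series is only practical near the origin; for large |z| the terms grow before they decay and cancellation destroys accuracy, which is one motivation for contour-based methods such as the one the abstract describes.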
TL;DR: In this paper, the authors propose a model that is based on decoding an image into a set of people detections and uses a recurrent LSTM layer for sequence generation and train their model end-to-end with a new loss function that operates on sets of detections.
Abstract: Current people detectors operate either by scanning an image in a sliding window fashion or by classifying a discrete set of proposals. We propose a model that is based on decoding an image into a set of people detections. Our system takes an image as input and directly outputs a set of distinct detection hypotheses. Because we generate predictions jointly, common post-processing steps such as non-maximum suppression are unnecessary. We use a recurrent LSTM layer for sequence generation and train our model end-to-end with a new loss function that operates on sets of detections. We demonstrate the effectiveness of our approach on the challenging task of detecting people in crowded scenes.
TL;DR: In this article, the authors proposed a model predictive control (MPC) method for modular-multilevel-converter (MMC) high-voltage direct current (HVDC) systems.
Abstract: This paper proposes a model predictive control (MPC) method for modular-multilevel-converter (MMC) high-voltage direct current (HVDC) systems. To control the MMC-HVDC system properly, the ac current, circulating current, and submodule (SM) capacitor voltage are taken into consideration. The existing MPC methods for the MMC-HVDC system utilize weighting factors to configure the cost function in combinations of the SM capacitor voltage balancing algorithm, ac current control, and circulating current control. Because all combinations of the switch states are considered in order to minimize the cost function, the number of possible combinations increases geometrically with the number of MMC levels, which is a significant disadvantage. This paper proposes a new MPC method with a reduced number of states for ac current control, circulating current control, and the SM capacitor voltage-balancing algorithm. The proposed cost functions are divided into three types according to their control purposes. Each cost function determines the minimum number of states for controlling the ac current, circulating current, and SM capacitor voltage. The efficacy of the proposed control method is verified through simulation results using PSCAD/EMTDC.
TL;DR: In this paper, a time-varying distributed convex optimization problem is studied for continuous-time multi-agent systems, and two discontinuous algorithms based on the signum function are proposed to solve the problem in each case.
Abstract: In this paper, a time-varying distributed convex optimization problem is studied for continuous-time multi-agent systems. Control algorithms are designed for the cases of single-integrator and double-integrator dynamics. Two discontinuous algorithms based on the signum function are proposed to solve the problem in each case. Then in the case of double-integrator dynamics, two continuous algorithms based on, respectively, a time-varying and a fixed boundary layer are proposed as continuous approximations of the signum function. Also, to account for inter-agent collision for physical agents, a distributed convex optimization problem with swarm tracking behavior is introduced for both single-integrator and double-integrator dynamics.
TL;DR: In this paper, the authors study the private distributed optimization problem (PDOP) with the additional requirement that the cost function of the individual agents should remain differentially private, and propose a class of iterative algorithms for solving PDOP, which achieves differential privacy and convergence to a common value.
Abstract: In distributed optimization and iterative consensus literature, a standard problem is for N agents to minimize a function f over a subset of Euclidean space, where the cost function is expressed as a sum Σ f_i. In this paper, we study the private distributed optimization problem (PDOP) with the additional requirement that the cost function of the individual agents should remain differentially private. The adversary attempts to infer information about the private cost functions from the messages that the agents exchange. Achieving differential privacy requires that any change of an individual's cost function only results in unsubstantial changes in the statistics of the messages. We propose a class of iterative algorithms for solving PDOP, which achieves differential privacy and convergence to a common value. Our analysis reveals the dependence of the achieved accuracy and the privacy levels on the parameters of the algorithm. We observe that to achieve ε-differential privacy the accuracy of the algorithm has the order of O(1/ε²).
TL;DR: It is found that, under mild conditions on the objective function, the Chebyshev scalarizing function has an almost identical effect to Pareto-dominance relations when the authors consider the probability of finding superior solutions for algorithms that follow a balanced trajectory.
TL;DR: In this article, the existence and the asymptotic behavior of non-negative solutions for a class of stationary Kirchhoff problems driven by a fractional integro-differential operator $L_K$ and involving a critical nonlinearity were analyzed.
Abstract: This paper deals with the existence and the asymptotic behavior of non-negative solutions for a class of stationary Kirchhoff problems driven by a fractional integro-differential operator $L_K$ and involving a critical nonlinearity. In particular, we consider the problem
$-M(\|u\|^2)\,L_K u = \lambda f(x,u) + |u|^{2_s^* - 2}u$ in $\Omega$, $\quad u = 0$ in $\mathbb{R}^n \setminus \Omega$,
where $\Omega \subset \mathbb{R}^n$ is a bounded domain, $2_s^*$ is the critical exponent of the fractional Sobolev space $H^s(\mathbb{R}^n)$, the function $f$ is a subcritical term and $\lambda$ is a positive parameter. The main feature, as well as the main difficulty, of the analysis is the fact that the Kirchhoff function $M$ could be zero at zero, that is the problem is degenerate. The adopted techniques are variational and the main theorems extend in several directions previous results recently appeared in the literature.
TL;DR: Informally, a statistical functional T on a set of probability measures M on the real line is elicitable if it can be defined as the minimizer of a suitable expected scoring function.
Abstract: Informally, a statistical functional T on a set of probability measures M on the real line is elicitable if it can be defined as the minimizer of a suitable expected scoring function. The simplest ...
TL;DR: In this paper, a simple Fourier transform (FT) method is presented for obtaining a Distribution Function of Relaxation Times (DFRT) for electrochemical impedance spectroscopy (EIS) data.
TL;DR: A distributed cooperative optimization problem encountered in a computational multiagent network with delay is considered, where each agent has local access to its convex cost function, and jointly minimizes the cost function over the whole network.
Abstract: In this technical correspondence, we consider a distributed cooperative optimization problem encountered in a computational multiagent network with delay, where each agent has local access to its convex cost function, and jointly minimizes the cost function over the whole network. To solve this problem, we develop an algorithm that is based on dual averaging updates and delayed subgradient information, and analyze its convergence properties for a diminishing step-size by utilizing Bregman-distance functions. Moreover, we provide sharp bounds on the convergence rates as a function of the network size and topology embodied in the inverse spectral gap. Finally, we present a numerical example to evaluate our algorithm and compare its performance with several similar algorithms.
TL;DR: In this article, a new analogue of Bernstein operators is introduced, called (p, q)-Bernstein operators, which generalize the q-Bernstein operators, and their approximation properties are studied using a Korovkin-type approximation theorem.
Abstract: In this paper, we introduce a new analogue of Bernstein operators, which we call (p, q)-Bernstein operators and which generalize the q-Bernstein operators. We also study approximation properties of the (p, q)-Bernstein operators based on a Korovkin-type approximation theorem and establish some direct theorems. Furthermore, we show comparisons and some illustrative graphics for the convergence of the operators to a function.
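As background for the operators discussed here, the sketch below evaluates the classical Bernstein operator $B_n f(x) = \sum_{k=0}^{n} f(k/n)\binom{n}{k} x^k (1-x)^{n-k}$ on $[0, 1]$. It does not implement the (p, q)-analogue from the paper; the function name and test values are assumptions for illustration. Korovkin-type arguments check convergence on the test functions 1, t and t², so the example evaluates the operator on a linear function and on t².

```python
import math

def bernstein(f, n, x):
    """Classical Bernstein operator B_n f evaluated at x in [0, 1]."""
    return sum(
        f(k / n) * math.comb(n, k) * x ** k * (1 - x) ** (n - k)
        for k in range(n + 1)
    )

# Linear functions are reproduced exactly; t^2 is reproduced up to x(1 - x)/n.
linear_value = bernstein(lambda t: 2 * t + 1, 10, 0.3)
square_value = bernstein(lambda t: t * t, 10, 0.3)
```

The error term x(1 - x)/n for t² shrinks as n grows, which is the kind of behaviour a Korovkin-type theorem turns into convergence for every continuous function.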
TL;DR: In this article, the authors consider the logarithmic negativity of a finite interval embedded in an infinite one-dimensional system at finite temperature and show that the naive approach based on the calculation of a two-point function of twist fields in a cylindrical geometry yields a wrong result.
Abstract: We consider the logarithmic negativity of a finite interval embedded in an infinite one dimensional system at finite temperature. We focus on conformal invariant systems and we show that the naive approach based on the calculation of a two-point function of twist fields in a cylindrical geometry yields a wrong result. The correct result is obtained through a four-point function of twist fields in which two auxiliary fields are inserted far away from the interval, and they are sent to infinity only after having taken the replica limit. In this way, we find a universal scaling form for the finite temperature negativity which depends on the full operator content of the theory and not only on the central charge. In the limit of low and high temperatures, the expansion of this universal form can be obtained by means of the operator product expansion. We check our results against exact numerical computations for the critical harmonic chain.
TL;DR: In this paper, the authors derive the Ward identities which relate the three point function of scalar perturbations produced during inflation to the scalar four point function, in a particular limit.
Abstract: Using symmetry considerations, we derive Ward identities which relate the three point function of scalar perturbations produced during inflation to the scalar four point function, in a particular limit. The derivation assumes approximate conformal invariance, and the conditions for the slow roll approximation, but is otherwise model independent. The Ward identities allow us to deduce that the three point function must be suppressed in general, being of the same order of magnitude as in the slow roll model. They also fix the three point function in terms of the four point function, up to one constant which we argue is generically suppressed. Our approach is based on analyzing the wave function of the universe, and the Ward identities arise by imposing the requirements of spatial and time reparametrization invariance on it.
TL;DR: A bottom-up visual saliency detection algorithm is proposed that takes both background and foreground into consideration and integrates the two resulting saliency maps with a unified function.
TL;DR: In this paper, the authors studied the problem of estimating the number of samples required to answer a sequence of adaptive queries about an unknown distribution, as a function of the type of queries and the desired level of accuracy.
Abstract: Adaptivity is an important feature of data analysis---the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model, where all questions are specified before the dataset is drawn. Recent work by Dwork et al. (STOC, 2015) and Hardt and Ullman (FOCS, 2014) initiated the formal study of this problem, and gave the first upper and lower bounds on the achievable generalization error for adaptive data analysis.
Specifically, suppose there is an unknown distribution $\mathbf{P}$ and a set of $n$ independent samples $\mathbf{x}$ is drawn from $\mathbf{P}$. We seek an algorithm that, given $\mathbf{x}$ as input, accurately answers a sequence of adaptively chosen queries about the unknown distribution $\mathbf{P}$. How many samples $n$ must we draw from the distribution, as a function of the type of queries, the number of queries, and the desired level of accuracy?
In this work we make two new contributions:
(i) We give upper bounds on the number of samples $n$ that are needed to answer statistical queries. The bounds improve and simplify the work of Dwork et al. (STOC, 2015), and have been applied in subsequent work by those authors (Science, 2015, NIPS, 2015).
(ii) We prove the first upper bounds on the number of samples required to answer more general families of queries. These include arbitrary low-sensitivity queries and an important class of optimization queries.
As in Dwork et al., our algorithms are based on a connection with algorithmic stability in the form of differential privacy. We extend their work by giving a quantitatively optimal, more general, and simpler proof of their main theorem that stability implies low generalization error. We also study weaker stability guarantees such as bounded KL divergence and total variation distance.
TL;DR: In this article, the fundamental limits on learning latent community structure in dynamic networks were studied, where nodes change their community membership over time, but where edges are generated independently at each time step.
Abstract: We study the fundamental limits on learning latent community structure in dynamic networks. Specifically, we study dynamic stochastic block models where nodes change their community membership over time, but where edges are generated independently at each time step. In this setting (which is a special case of several existing models), we are able to derive the detectability threshold exactly, as a function of the rate of change and the strength of the communities. Below this threshold, we claim that no algorithm can identify the communities better than chance. We then give two algorithms that are optimal in the sense that they succeed all the way down to this limit. The first uses belief propagation (BP), which gives asymptotically optimal accuracy, and the second is a fast spectral clustering algorithm, based on linearizing the BP equations. We verify our analytic and algorithmic results via numerical simulation, and close with a brief discussion of extensions and open questions.
TL;DR: In this article, the authors used the hexagon function bootstrap to compute the ratio function which characterizes the next-to-maximally-helicity-violating (NMHV) six-point amplitude in planar super-Yang-Mills theory at four loops.
Abstract: We use the hexagon function bootstrap to compute the ratio function which characterizes the next-to-maximally-helicity-violating (NMHV) six-point amplitude in planar $\mathcal{N} = 4$ super-Yang-Mills theory at four loops. A powerful constraint comes from dual superconformal invariance, in the form of a $\bar{Q}$ differential equation, which heavily constrains the first derivatives of the transcendental functions entering the ratio function. At four loops, it leaves only a 34-parameter space of functions. Constraints from the collinear limits, and from the multi-Regge limit at the leading-logarithmic (LL) and next-to-leading-logarithmic (NLL) order, suffice to fix these parameters and obtain a unique result. We test the result against multi-Regge predictions at NNLL and N$^3$LL, and against predictions from the operator product expansion involving one and two flux-tube excitations; all cross-checks are satisfied. We study the analytical and numerical behavior of the parity-even and parity-odd parts on various lines and surfaces traversing the three-dimensional space of cross ratios. As part of this program, we characterize all irreducible hexagon functions through weight eight in terms of their coproduct. We also provide representations of the ratio function in particular kinematic regions in terms of multiple polylogarithms.
TL;DR: A method for the problem of learning the structure of a Bayesian network using the quantum adiabatic algorithm is introduced by introducing an efficient reformulation of a standard posterior-probability scoring function on graphs as a pseudo-Boolean function, which is equivalent to a system of 2-body Ising spins.
Abstract: We introduce a method for the problem of learning the structure of a Bayesian network using the quantum adiabatic algorithm. We do so by introducing an efficient reformulation of a standard posterior-probability scoring function on graphs as a pseudo-Boolean function, which is equivalent to a system of 2-body Ising spins, as well as suitable penalty terms for enforcing the constraints necessary for the reformulation; our proposed method requires $\mathcal{O}(n^2)$ qubits for n Bayesian network variables. Furthermore, we prove lower bounds on the necessary weighting of these penalty terms. The logical structure resulting from the mapping has the appealing property that it is instance-independent for a given number of Bayesian network variables, as well as being independent of the number of data cases.
TL;DR: In this paper, the authors considered an optimal insurance design problem for an individual whose preferences are dictated by the rank-dependent expected utility theory with a concave utility function and an inverse-S shaped probability distortion function.
Abstract: We consider an optimal insurance design problem for an individual whose preferences are dictated by the rank-dependent expected utility (RDEU) theory with a concave utility function and an inverse-S shaped probability distortion function. This type of RDEU is known to describe human behavior better than the classical expected utility. By applying the technique of quantile formulation, we solve the problem explicitly. We show that the optimal contract not only insures large losses above a deductible but also insures small losses fully. This is consistent, for instance, with the demand for warranties. Finally, we compare our results, analytically and numerically, both to those in the expected utility framework and to cases in which the distortion function is convex or concave.
TL;DR: The SLP level-set method as discussed by the authors uses discretized boundary integrals to estimate function changes and the formulation of an optimization sub-problem to attain the velocity function, which is solved using sequential linear programming.
Abstract: This paper introduces an approach to level-set topology optimization that can handle multiple constraints and simultaneously optimize non-level-set design variables. The key features of the new method are discretized boundary integrals to estimate function changes and the formulation of an optimization sub-problem to attain the velocity function. The sub-problem is solved using sequential linear programming (SLP) and the new method is called the SLP level-set method. The new approach is developed in the context of the Hamilton-Jacobi type level-set method, where shape derivatives are employed to optimize a structure represented by an implicit level-set function. This approach is sometimes referred to as the conventional level-set method. The SLP level-set method is demonstrated via a range of problems that include volume, compliance, eigenvalue and displacement constraints and simultaneous optimization of non-level-set design variables.
TL;DR: An algorithm that computes the multipole coefficients of the galaxy three-point correlation function (3PCF) without explicitly considering triplets of galaxies is presented, allowing runtimes that are comparable, and 500 times faster than a naive triplet count.
Abstract: We present an algorithm that computes the multipole coefficients of the galaxy three-point correlation function (3PCF) without explicitly considering triplets of galaxies. Rather, centering on each galaxy in the survey, it expands the radially-binned density field in spherical harmonics and combines these to form the multipoles without ever requiring the relative angle between a pair about the central. This approach scales with number and number density in the same way as the two-point correlation function, allowing runtimes that are comparable, and 500 times faster than a naive triplet count. It is exact in angle and easily handles edge correction. We demonstrate the algorithm on the LasDamas SDSS-DR7 mock catalogs, computing an edge corrected 3PCF out to $90\;{\rm Mpc}/h$ in under an hour on modest computing resources. We expect this algorithm will render it possible to obtain the large-scale 3PCF for upcoming surveys such as Euclid, LSST, and DESI.
TL;DR: The Fuzzy C-means++ algorithm is introduced, which, by utilizing the seeding mechanism of the K-means++ algorithm, improves the effectiveness and speed of Fuzzy C-means.
Abstract: The paper proposes the Fuzzy C-means++ method for improving the effectiveness and speed of the Fuzzy C-means algorithm. This method works by spreading the initial cluster representatives in the data space at initialization. The proposed algorithm achieves superior results on both artificially generated and real world data sets. Fuzzy C-means has been utilized successfully in a wide range of applications, extending the clustering capability of the K-means to datasets that are uncertain, vague and otherwise hard to cluster. This paper introduces the Fuzzy C-means++ algorithm which, by utilizing the seeding mechanism of the K-means++ algorithm, improves the effectiveness and speed of Fuzzy C-means. By careful seeding that disperses the initial cluster centers through the data space, the resulting Fuzzy C-means++ approach samples starting cluster representatives during the initialization phase. The cluster representatives are well spread in the input space, resulting in both faster convergence times and higher quality solutions. Implementations in R of standard Fuzzy C-means and Fuzzy C-means++ are evaluated on various data sets. We investigate the cluster quality and iteration count as we vary the spreading factor on a series of synthetic data sets. We run the algorithm on real world data sets and, to account for the non-determinism inherent in these algorithms, we record multiple runs while choosing different k parameter values. The results show that the proposed method gives significant improvement in convergence times (the number of iterations) of up to 40 (2.1 on average) times the standard on synthetic datasets and, in general, an associated lower cost function value and Xie-Beni value. A proof sketch of the logarithmically bounded expected cost function value is given.
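The seeding idea described in this abstract can be sketched in the style of K-means++: the first representative is drawn uniformly, and each later one is drawn with probability that grows with its distance to the nearest representative chosen so far. In the sketch below the exponent `p` stands in for the spreading factor; treating the weight as distance to the power `p` is an assumption (with `p = 2` it matches the usual K-means++ rule), and the function names are hypothetical. The paper's implementations are in R; Python is used here only for illustration.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def spread_seeds(points, k, p=2.0, rng=None):
    """Pick k initial cluster representatives that are spread over the data.

    The exponent p is an assumed form of the spreading factor.
    """
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight of each point: (distance to its nearest chosen center) ** p.
        weights = [
            min(squared_distance(x, c) for c in centers) ** (p / 2.0)
            for x in points
        ]
        if sum(weights) == 0.0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

# Two well-separated locations (made-up data); the seeds land on both.
pts = [(0.0, 0.0), (0.0, 0.0), (10.0, 10.0)]
seeds = spread_seeds(pts, 2)
```

The returned seeds would then be used as the initial cluster representatives for the usual Fuzzy C-means membership and center updates, which are not shown here.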
TL;DR: In this paper, it was shown that a function f : R → R of bounded variation satisfies Var Mf ≤ C Var f, where Mf is the centered Hardy-Littlewood maximal function of f.
Abstract: We show that a function f : R → R of bounded variation satisfies Var Mf ≤ C Var f, where Mf is the centered Hardy–Littlewood maximal function of f. Consequently, the operator f ↦ (Mf)′ is bounded from W^{1,1}(R) to L^1(R). This answers a question of Hajlasz and Onninen in the one-dimensional case.