TL;DR: Most changes to the variables are an approximate solution to a trust region subproblem, using the current quadratic model, with a lower bound on the trust region radius that is reduced cautiously, in order to keep the interpolation points well separated until late in the calculation, which lessens damage from computer rounding errors.
Abstract: BOBYQA is an iterative algorithm for finding a minimum of a function F(x), x ∈ R^n, subject to bounds a ≤ x ≤ b on the variables, F being specified by a "black box" that returns the value F(x) for any feasible x. Each iteration employs a quadratic approximation Q to F that satisfies Q(y_j) = F(y_j), j = 1, 2, ..., m, the interpolation points y_j being chosen and adjusted automatically, but m is a prescribed constant, the value m = 2n+1 being typical. These conditions leave much freedom in Q, taken up when the model is updated by the highly successful technique of minimizing the Frobenius norm of the change to the second derivative matrix of Q. Thus no first derivatives of F are required explicitly. Most changes to the variables are an approximate solution to a trust region subproblem, using the current quadratic model, with a lower bound on the trust region radius that is reduced cautiously, in order to keep the interpolation points well separated until late in the calculation, which lessens damage from computer rounding errors. Some other changes to the variables are designed to improve the model without reducing F. These techniques are described. Other topics include the starting procedure that is given an initial vector of variables, the value of m and the initial trust region radius. There is also a new device called RESCUE that tries to restore normality if severe loss of accuracy occurs in the matrix calculations of the updating of the model. Numerical results are reported and discussed for two test problems, the numbers of variables being between 10 and 320.
TL;DR: Results indicate that IS-NMF correctly captures the semantics of audio and is better suited to the representation of music signals than NMF with the usual Euclidean and KL costs.
Abstract: This letter presents theoretical, algorithmic, and experimental results about nonnegative matrix factorization (NMF) with the Itakura-Saito (IS) divergence. We describe how IS-NMF is underlaid by a well-defined statistical model of superimposed gaussian components and is equivalent to maximum likelihood estimation of variance parameters. This setting can accommodate regularization constraints on the factors through Bayesian priors. In particular, inverse-gamma and gamma Markov chain priors are considered in this work. Estimation can be carried out using a space-alternating generalized expectation-maximization (SAGE) algorithm; this leads to a novel type of NMF algorithm, whose convergence to a stationary point of the IS cost function is guaranteed.
We also discuss the links between the IS divergence and other cost functions used in NMF, in particular, the Euclidean distance and the generalized Kullback-Leibler (KL) divergence. As such, we describe how IS-NMF can also be performed using a gradient multiplicative algorithm (a standard algorithm structure in NMF) whose convergence is observed in practice, though not proven.
Finally, we report a furnished experimental comparative study of Euclidean-NMF, KL-NMF, and IS-NMF algorithms applied to the power spectrogram of a short piano sequence recorded in real conditions, with various initializations and model orders. Then we show how IS-NMF can successfully be employed for denoising and upmix (mono to stereo conversion) of an original piece of early jazz music. These experiments indicate that IS-NMF correctly captures the semantics of audio and is better suited to the representation of music signals than NMF with the usual Euclidean and KL costs.
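The link between the three cost functions mentioned above can be illustrated on a single pair of scalars. The sketch below is only an illustration, not code from the letter: it assumes the common scalar forms of the Euclidean cost (with a factor 1/2), the generalized KL divergence, and the IS divergence, and it checks how each one reacts when both arguments are multiplied by the same factor. The function names and the test values are made up for the example.

```python
import math

def euclidean(x, y):
    # Squared Euclidean cost (the 1/2 factor is a convention assumed here).
    return 0.5 * (x - y) ** 2

def generalized_kl(x, y):
    # Generalized Kullback-Leibler divergence for positive scalars.
    return x * math.log(x / y) - x + y

def itakura_saito(x, y):
    # Itakura-Saito divergence for positive scalars.
    return x / y - math.log(x / y) - 1.0

x, y, scale = 4.0, 1.0, 100.0
for name, d in [("Euclidean", euclidean),
                ("KL", generalized_kl),
                ("IS", itakura_saito)]:
    # Compare the cost of (x, y) with the cost of the rescaled pair.
    print(name, d(x, y), d(scale * x, scale * y))
```

Under these definitions the IS value is unchanged by the rescaling, while the KL value grows linearly and the Euclidean value quadratically with the scale factor, which is one common argument for using IS on power spectrograms with a large dynamic range.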
TL;DR: A unified framework for establishing consistency and convergence rates for regularized M-estimators under high-dimensional scaling is provided; one main theorem is stated and shown to re-derive several existing results and to yield several new ones.
Abstract: High-dimensional statistical inference deals with models in which the number of parameters p is comparable to or larger than the sample size n. Since it is usually impossible to obtain consistent procedures unless p/n → 0, a line of recent work has studied models with various types of structure (e.g., sparse vectors; block-structured matrices; low-rank matrices; Markov assumptions). In such settings, a general approach to estimation is to solve a regularized convex program (known as a regularized M-estimator) which combines a loss function (measuring how well the model fits the data) with some regularization function that encourages the assumed structure. The goal of this paper is to provide a unified framework for establishing consistency and convergence rates for such regularized M-estimators under high-dimensional scaling. We state one main theorem and show how it can be used to re-derive several existing results, and also to obtain several new results on consistency and convergence rates. Our analysis also identifies two key properties of loss and regularization functions, referred to as restricted strong convexity and decomposability, that ensure the corresponding regularized M-estimators have fast convergence rates.
TL;DR: In this paper, the authors introduce two new related algorithms with better convergence rates than GTD: GTD2 and linear TD with gradient correction (TDC), whose update rule matches conventional TD except for an additional term that is initially zero; both can be used for off-policy TD learning.
Abstract: Sutton, Szepesvari and Maei (2009) recently introduced the first temporal-difference learning algorithm compatible with both linear function approximation and off-policy training, and whose complexity scales only linearly in the size of the function approximator. Although their gradient temporal difference (GTD) algorithm converges reliably, it can be very slow compared to conventional linear TD (on on-policy problems where TD is convergent), calling into question its practical utility. In this paper we introduce two new related algorithms with better convergence rates. The first algorithm, GTD2, is derived and proved convergent just as GTD was, but uses a different objective function and converges significantly faster (but still not as fast as conventional TD). The second new algorithm, linear TD with gradient correction, or TDC, uses the same update rule as conventional TD except for an additional term which is initially zero. In our experiments on small test problems and in a Computer Go application with a million features, the learning rate of this algorithm was comparable to that of conventional TD. This algorithm appears to extend linear TD to off-policy learning with no penalty in performance while only doubling computational requirements.
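The description of TDC as "conventional TD plus an additional term which is initially zero" can be sketched as follows. This is a minimal illustration based on the commonly stated form of the linear TDC update, not code from the paper; the function names, the second weight vector `w`, and the step sizes `alpha` and `beta` are assumptions introduced for the example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def td_update(theta, phi, reward, phi_next, alpha, gamma):
    # Conventional linear TD(0): theta += alpha * delta * phi
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
    return [t + alpha * delta * f for t, f in zip(theta, phi)]

def tdc_update(theta, w, phi, reward, phi_next, alpha, beta, gamma):
    # Same TD error as above.
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
    # Correction term uses the auxiliary weights w; it vanishes while w == 0.
    correction = dot(phi, w)
    new_theta = [t + alpha * (delta * f - gamma * correction * fn)
                 for t, f, fn in zip(theta, phi, phi_next)]
    # Auxiliary weights track the expected TD error given the features.
    new_w = [wi + beta * (delta - correction) * f for wi, f in zip(w, phi)]
    return new_theta, new_w

theta, w = [0.0, 0.0], [0.0, 0.0]
phi, phi_next = [1.0, 0.0], [0.0, 1.0]
td_theta = td_update(theta, phi, 1.0, phi_next, alpha=0.1, gamma=0.9)
tdc_theta, tdc_w = tdc_update(theta, w, phi, 1.0, phi_next,
                              alpha=0.1, beta=0.5, gamma=0.9)
print(td_theta, tdc_theta, tdc_w)
```

With `w` initialized to zero the first TDC step coincides with the TD step; the two updates diverge only once `w` has moved away from zero. Maintaining the second vector is what roughly doubles the per-step cost.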
TL;DR: This paper introduces a generic dynamic programming function for Matlab that solves discrete-time optimal-control problems using Bellman's dynamic programming algorithm.
Abstract: This paper introduces a generic dynamic programming function for Matlab. This function solves discrete-time optimal-control problems using Bellman's dynamic programming algorithm. The function is implemented such that the user only needs to provide the objective function and the model equations. The function includes several options for solving optimal-control problems. The model equations can include several state variables and input variables. Furthermore, the model equations can be time-variant and include time-variant state and input constraints. The syntax of the function is explained using two examples. The first is the well-known Lotka-Volterra fishery problem and the second is a parallel hybrid-electric vehicle optimization problem.
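The idea of a solver that only needs the objective function and the model equations can be sketched with a small backward recursion. The following is a Python illustration of Bellman's algorithm on a discrete grid, not the Matlab function from the paper; `dp_solve`, `rollout`, and the toy integrator problem are hypothetical names and data chosen for the example.

```python
def dp_solve(states, inputs, horizon, step, stage_cost, terminal_cost):
    """Backward Bellman recursion on finite state and input grids."""
    cost_to_go = [dict() for _ in range(horizon + 1)]
    policy = [dict() for _ in range(horizon)]
    for x in states:
        cost_to_go[horizon][x] = terminal_cost(x)
    for k in range(horizon - 1, -1, -1):
        for x in states:
            best_cost, best_u = float("inf"), None
            for u in inputs:
                x_next = step(x, u, k)
                if x_next not in cost_to_go[k + 1]:
                    continue  # successor leaves the grid: treat as infeasible
                c = stage_cost(x, u, k) + cost_to_go[k + 1][x_next]
                if c < best_cost:
                    best_cost, best_u = c, u
            cost_to_go[k][x] = best_cost
            policy[k][x] = best_u
    return cost_to_go, policy

def rollout(x0, policy, step, stage_cost, terminal_cost):
    """Apply the computed policy forward in time and accumulate the cost."""
    x, total = x0, 0
    for k, stage_policy in enumerate(policy):
        u = stage_policy[x]
        total += stage_cost(x, u, k)
        x = step(x, u, k)
    return total + terminal_cost(x)

# Toy problem (assumed for illustration): drive an integer state toward zero.
states = list(range(-3, 4))
inputs = (-1, 0, 1)
step = lambda x, u, k: x + u
stage_cost = lambda x, u, k: x * x + u * u
terminal_cost = lambda x: 10 * x * x

cost_to_go, policy = dp_solve(states, inputs, 4, step, stage_cost, terminal_cost)
print(cost_to_go[0][3], rollout(3, policy, step, stage_cost, terminal_cost))
```

Because `step` and `stage_cost` receive the time index `k`, time-variant models and costs fit the same interface; state constraints are expressed here simply by which values appear in `states`.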
TL;DR: A method and apparatus for displaying a view of an application on a touch-sensitive display include detecting a touch on the touch-sensitive display. When the touch is at a first force, first feedback is provided and a first function is performed. When the touch is at a second force, second feedback is provided and a second function is performed. The first function and the second function are different.
Abstract: Method and apparatus for displaying a view of an application on a touch-sensitive display include detecting a touch on the touch-sensitive display. When the touch is at a first force, first feedback is provided and a first function is performed. When the touch is at a second force, second feedback is provided and a second function is performed. The first function and the second function are different. The first force and the second force are different.
TL;DR: This chapter describes the primer of mathematics and statistics of R(t) and discusses other similar markers of transmissibility as a function of time.
Abstract: Although the basic reproduction number, R_0, is useful for understanding the transmissibility of a disease and designing various intervention strategies, the classic threshold quantity theoretically assumes that the epidemic first occurs in a fully susceptible population, and hence, R_0 is essentially a mathematically defined quantity. In many instances, it is of practical importance to evaluate time-dependent variations in the transmission potential of infectious diseases. Explanation of the time course of an epidemic can be partly achieved by estimating the effective reproduction number, R(t), defined as the actual average number of secondary cases per primary case at calendar time t (for t > 0). R(t) shows time-dependent variation due to the decline in susceptible individuals (intrinsic factors) and the implementation of control measures (extrinsic factors). If R(t) 1). This chapter describes the primer of mathematics and statistics of R(t) and discusses other similar markers of transmissibility as a function of time.
TL;DR: An optimization algorithm for minimizing a smooth function over a convex set by minimizing a diagonal plus low-rank quadratic approximation to the function, which substantially improves on state-of-the-art methods for problems such as learning the structure of Gaussian graphical models and Markov random fields.
Abstract: An optimization algorithm for minimizing a smooth function over a convex set is described. Each iteration of the method computes a descent direction by minimizing, over the original constraints, a diagonal plus low-rank quadratic approximation to the function. The quadratic approximation is constructed using a limited-memory quasi-Newton update. The method is suitable for large-scale problems where evaluation of the function is substantially more expensive than projection onto the constraint set. Numerical experiments on one-norm regularized test problems indicate that the proposed method is competitive with state-of-the-art methods such as bound-constrained L-BFGS and orthant-wise descent. We further show that the method generalizes to a wide class of problems, and substantially improves on state-of-the-art methods for problems such as learning the structure of Gaussian graphical models and Markov random fields.
TL;DR: This work presents a Bellman error objective function and two gradient-descent TD algorithms that optimize it, and proves the asymptotic almost-sure convergence of both algorithms, for any finite Markov decision process and any smooth value function approximator, to a locally optimal solution.
Abstract: We introduce the first temporal-difference learning algorithms that converge with smooth value function approximators, such as neural networks. Conventional temporal-difference (TD) methods, such as TD(λ), Q-learning and Sarsa have been used successfully with function approximation in many applications. However, it is well known that off-policy sampling, as well as nonlinear function approximation, can cause these algorithms to become unstable (i.e., the parameters of the approximator may diverge). Sutton et al. (2009a, 2009b) solved the problem of off-policy learning with linear TD algorithms by introducing a new objective function, related to the Bellman error, and algorithms that perform stochastic gradient-descent on this function. These methods can be viewed as natural generalizations to previous TD methods, as they converge to the same limit points when used with linear function approximation methods. We generalize this work to nonlinear function approximation. We present a Bellman error objective function and two gradient-descent TD algorithms that optimize it. We prove the asymptotic almost-sure convergence of both algorithms, for any finite Markov decision process and any smooth value function approximator, to a locally optimal solution. The algorithms are incremental and the computational complexity per time step scales linearly with the number of parameters of the approximator. Empirical results obtained in the game of Go demonstrate the algorithms' effectiveness.
TL;DR: In this paper, the authors consider the properties of Green's function for the nonlinear fractional differential equation boundary value problem D_{0+}^α u(t) = f(t, u(t)), 0 < t < 1, u(0) = u(1) = u′(0) = u′(1) = 0, where 3 < α ≤ 4 is a real number, and D_{0+}^α is the standard Riemann–Liouville differentiation.
Abstract: In this paper, we consider the properties of Green’s function for the nonlinear fractional differential equation boundary value problem D_{0+}^α u(t) = f(t, u(t)), 0 < t < 1, u(0) = u(1) = u′(0) = u′(1) = 0, where 3 < α ≤ 4 is a real number, and D_{0+}^α is the standard Riemann–Liouville differentiation. As an application of Green’s function, we give some multiple positive solutions for singular and nonsingular boundary value problems, and we also give the uniqueness of solution for a singular problem by means of the Leray–Schauder nonlinear alternative, a fixed-point theorem on cones and a mixed monotone method.
TL;DR: In this paper, the authors describe non-Abelian generalizations of the Kuramoto model for any classical compact Lie group and identify their main properties, including the nonlinear evolution equations maintaining the unitarity of all variables which therefore evolve on the compact manifold of U(n).
Abstract: We describe non-Abelian generalizations of the Kuramoto model for any classical compact Lie group and identify their main properties. These models may be defined on any complex network where the variable at each node is an element of the unitary group U(n), or a subgroup of U(n). The nonlinear evolution equations maintain the unitarity of all variables which therefore evolve on the compact manifold of U(n). Synchronization of trajectories with phase locking occurs as for the Kuramoto model, for values of the coupling constant larger than a critical value, and may be measured by various order and disorder parameters. Limit cycles are characterized by a frequency matrix which is independent of the node and is determined by minimizing a function which is quadratic in the variables. We perform numerical computations for n = 2, for which the SU(2) group manifold is S^3, for a range of natural frequencies and all-to-all coupling, in order to confirm synchronization properties. We also describe a second generalization of the Kuramoto model which is formulated in terms of real m-vectors confined to the (m − 1)-sphere for any positive integer m, and investigate trajectories numerically for the S^2 model. This model displays a variety of synchronization phenomena in which trajectories generally synchronize spatially but are not necessarily phase-locked, even for large values of the coupling constant.
TL;DR: In this article, the authors proposed a new constraint handling approach for general constraints that is applicable to a widely used class of constrained derivative-free optimization methods, such as lower triangular mesh adaptive direct search (LTMads) methods.
Abstract: We propose a new constraint-handling approach for general constraints that is applicable to a widely used class of constrained derivative-free optimization methods. As in many methods that allow infeasible iterates, constraint violations are aggregated into a single constraint violation function. As in filter methods, a threshold, or barrier, is imposed on the constraint violation function, and any trial point whose constraint violation function value exceeds this threshold is discarded from consideration. In the new algorithm, unlike the filter method, the amount of constraint violation subject to the barrier is progressively decreased adaptively as the iteration evolves. We test this progressive barrier (PB) approach versus the extreme barrier (EB) with the generalized pattern search (Gps) and the lower triangular mesh adaptive direct search (LTMads) methods for nonlinear derivative-free optimization. Tests are also conducted using the Gps-filter, which uses a version of the Fletcher-Leyffer filter approach. We know that Gps cannot be shown to yield KKT points with this strategy or the filter, but we use the Clarke nonsmooth calculus to prove Clarke stationarity of the sequences of feasible and infeasible trial points for LTMads-PB. Numerical experiments are conducted on three academic test problems with up to 50 variables and on a chemical engineering problem. The new LTMads-PB method generally outperforms our LTMads-EB in the case where no feasible initial points are known, and it does as well when feasible points are known, which leads us to recommend LTMads-PB. Thus the LTMads-PB is a useful practical extension of our earlier LTMads-EB algorithm, particularly in the common case for real problems where no feasible point is known. The same conclusions hold for Gps-PB versus Gps-EB.
TL;DR: Given multiple time sequences with missing values, DynaMMo is proposed which summarizes, compresses, and finds latent variables, making the algorithm able to function even when there are missing values.
Abstract: Given multiple time sequences with missing values, we propose DynaMMo which summarizes, compresses, and finds latent variables. The idea is to discover hidden variables and learn their dynamics, making our algorithm able to function even when there are missing values. We performed experiments on both real and synthetic datasets spanning several megabytes, including motion capture sequences and chlorine levels in drinking water. We show that our proposed DynaMMo method (a) can successfully learn the latent variables and their evolution; (b) can provide high compression for little loss of reconstruction accuracy; (c) can extract compact but powerful features for segmentation, interpretation, and forecasting; (d) has complexity linear on the duration of sequences.
TL;DR: It turns out that one should use the properties that most strongly determine the behavior of the amino acids, and that an appropriate metric can help in defining the groups.
TL;DR: A new general method for kinematic analysis of rigid multi body systems subject to holonomic constraints is introduced and it is shown that exact velocity and acceleration analysis can be performed by solving linear sets of equations, originating from differentiation of the Karush–Kuhn–Tucker optimality conditions.
Abstract: In this paper, we introduce a new general method for kinematic analysis of rigid multi body systems subject to holonomic constraints. The method extends the standard analysis of kinematically determinate rigid multi body systems to the over-determinate case. This is accomplished by introducing a constrained optimisation problem with the objective function given as a function of the set of system equations that are allowed to be violated while the remaining equations define the feasible set. We show that exact velocity and acceleration analysis can also be performed by solving linear sets of equations, originating from differentiation of the Karush–Kuhn–Tucker optimality conditions. The method is applied to the analysis of an 18 degrees-of-freedom gait model where the kinematical drivers are prescribed with data from a motion capture experiment. The results show that significant differences are obtained between applying standard kinematic analysis or minimising the least-square errors on the two fully equi...
TL;DR: This work proposes a novel method that allows us to directly estimate the importance from samples without going through the hard task of density estimation, and demonstrates that the proposed method is computationally more efficient than existing approaches with comparable accuracy.
Abstract: Covariate shift is a situation in supervised learning where training and test inputs follow different distributions even though the functional relation remains unchanged. A common approach to compensating for the bias caused by covariate shift is to reweight the loss function according to the importance, which is the ratio of test and training densities. We propose a novel method that allows us to directly estimate the importance from samples without going through the hard task of density estimation. An advantage of the proposed method is that the computation time is nearly independent of the number of test input samples, which is highly beneficial in recent applications with large numbers of unlabeled samples. We demonstrate through experiments that the proposed method is computationally more efficient than existing approaches with comparable accuracy. We also describe a promising result for large-scale covariate shift adaptation in a natural language processing task.
TL;DR: The main new feature is that the algorithm is a modification of Pagh’s “hash-and-displace” approach with data compression on a sequence of hash function indices, which can be used for k-perfect hashing, where at most k keys may be mapped to the same value.
Abstract: A hash function h, i.e., a function from the set U of all keys to the range [m] = {0,...,m − 1} is called a perfect hash function (PHF) for a subset S ⊆ U of size n ≤ m if h is 1-1 on S. The important performance parameters of a PHF are representation size, evaluation time and construction time. In this paper, we present an algorithm that makes it possible to obtain PHFs with expected representation size very close to optimal while retaining O(n) expected construction time and O(1) evaluation time in the worst case. For example in the case m = 1.23n we obtain a PHF that uses space 1.4 bits per key, and for m = 1.01n we obtain space 1.98 bits per key, which was not achievable with previously known methods. Our algorithm is inspired by several known algorithms; the main new feature is that we combine a modification of Pagh’s “hash-and-displace” approach with data compression on a sequence of hash function indices. Our algorithm can also be used for k-perfect hashing, where at most k keys may be mapped to the same value.
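A simplified view of the hash-and-displace idea can be sketched as follows: keys are split into buckets, buckets are processed from largest to smallest, and each bucket stores the index of the first hash function that sends all of its keys to free, distinct slots. This is an illustrative sketch only. It omits the compression of the index sequence that the abstract names as the main new feature, and the seeded BLAKE2b hash family, the function names, and the parameters are assumptions made for the example.

```python
import hashlib

def h(key, seed, modulus):
    # Seeded hash family built from BLAKE2b (an assumption for this sketch).
    digest = hashlib.blake2b(f"{seed}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % modulus

def build_phf(keys, m, num_buckets, max_tries=100000):
    """Return one hash-function index per bucket; keys must be distinct."""
    buckets = [[] for _ in range(num_buckets)]
    for key in keys:
        buckets[h(key, "bucket", num_buckets)].append(key)
    occupied = [False] * m
    index_of = [0] * num_buckets
    # Place large buckets first, while the table is still mostly empty.
    for b in sorted(range(num_buckets), key=lambda i: -len(buckets[i])):
        if not buckets[b]:
            continue
        for d in range(max_tries):
            slots = [h(key, d, m) for key in buckets[b]]
            if len(set(slots)) == len(slots) and not any(occupied[s] for s in slots):
                for s in slots:
                    occupied[s] = True
                index_of[b] = d
                break
        else:
            raise RuntimeError("no suitable hash function index found")
    return index_of

def lookup(key, index_of, m):
    # Evaluation needs two hash computations and one table read.
    b = h(key, "bucket", len(index_of))
    return h(key, index_of[b], m)

keys = [f"key{i}" for i in range(50)]
m = 100
index_of = build_phf(keys, m, num_buckets=13)
values = [lookup(k, index_of, m) for k in keys]
print(len(set(values)) == len(keys))
```

The table size `m = 2n` is deliberately generous so the search finishes quickly; the tight ratios quoted in the abstract (m = 1.23n, m = 1.01n) rely on the paper's full construction.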
TL;DR: In this article, the authors elaborate on a nonlinear explicit two-step P-stable method of fourth algebraic order and varying phase-lag order for solving one-dimensional second-order linear periodic initial value problems of ordinary differential equations.
TL;DR: A Bayes Linear approach is presented in order to identify the subset of the input space that could give rise to acceptable matches between model output and measured data, and was successful in producing a large collection of model evaluations that exhibit good fits to the observed data.
Abstract: In many scientific disciplines complex computer models are used to understand
the behaviour of large scale physical systems. An uncertainty analysis of such a
computer model known as Galform is presented. Galform models the creation and
evolution of approximately one million galaxies from the beginning of the Universe
until the current day, and is regarded as a state-of-the-art model within the cosmology
community. It requires the specification of many input parameters in order
to run the simulation, takes significant time to run, and provides various outputs
that can be compared with real world data. A Bayes Linear approach is presented
in order to identify the subset of the input space that could give rise to acceptable
matches between model output and measured data. This approach takes account
of the major sources of uncertainty in a consistent and unified manner, including
input parameter uncertainty, function uncertainty, observational error, forcing
function uncertainty and structural uncertainty. The approach is known as History
Matching, and involves the use of an iterative succession of emulators (stochastic
belief specifications detailing beliefs about the Galform function), which are used
to cut down the input parameter space. The analysis was successful in producing
a large collection of model evaluations that exhibit good fits to the observed data.
TL;DR: In this paper, a non-quadratic APN function has been constructed, which is in remarkable contrast to all recently constructed APN functions which have all been quadratic.
Abstract: Following an example in [12],
we show how to change one coordinate function of an
almost perfect nonlinear
(APN) function in order to obtain new examples. It turns out that
this is a very powerful method to construct new
APN functions. In particular, we show that our approach can
be used to construct a ''non-quadratic'' APN function.
This new example
is in remarkable contrast to all recently constructed functions which
have all been quadratic.
An equivalent function has been found
independently
by Brinkmann and Leander [8]. However, they
claimed that their function is CCZ equivalent to a quadratic one. In this
paper we give several reasons
why this new function is not equivalent to a quadratic one.
TL;DR: This study extends an efficient density-based algorithm for pairwise coverage to generate t-way interaction test suites and shows that it guarantees a logarithmic upper bound on the size of the test suites as a function of the number of factors.
TL;DR: This erratum corrects an error in the equation used to calculate the number of clusters per age and luminosity bin (Eq. (7)), which affected the predicted age distribution for a given cluster luminosity; the results and conclusions of the paper remain unchanged.
Abstract: An error was introduced in the equation used to calculate the number of clusters per age- and luminosity bin (Eq. (7)), affecting the predicted age distribution for a given cluster luminosity. Correction of this error leads to small adjustments in the relation between absolute magnitude and median age (Fig. 4) and in the model luminosity functions (Fig. 7). However, the changes are small enough that the results and conclusions of the paper remain unchanged.
TL;DR: This work develops minimax optimal risk bounds for the general learning task consisting in predicting as well as the best function in a reference set G up to the smallest possible additive term, called the convergence rate.
Abstract: We develop minimax optimal risk bounds for the general learning task consisting in predicting as well as the best function in a reference set G up to the smallest possible additive term, called the convergence rate. When the reference set is finite and when n denotes the size of the training data, we provide minimax convergence rates of the form C ([log |G|]/n)^v with tight evaluation of the positive constant C and with exact v in ]0;1], the latter value depending on the convexity of the loss function and on the level of noise in the output distribution. The risk upper bounds are based on a sequential randomized algorithm, which at each step concentrates on functions having both low risk and low variance with respect to the previous step prediction function. Our analysis puts forward the links between the probabilistic and worst-case viewpoints, and allows one to obtain risk bounds unachievable with the standard statistical learning approach. One of the key ideas of this work is to use probabilistic inequalities with respect to appropriate (Gibbs) distributions on the prediction function space instead of using them with respect to the distribution generating the data. The risk lower bounds are based on refinements of Assouad's lemma, taking particularly into account the properties of the loss function. Our key example to illustrate the upper and lower bounds is to consider the L_q-regression setting, for which an exhaustive analysis of the convergence rates is given while q describes [1;+infinity[.
TL;DR: In this article, an adaptive algorithm for computing the solution of a large system of linear ordinary differential equations (ODEs) with polynomial inhomogeneity is presented, where the action of the matrix function is computed by constructing a Krylov subspace using Arnoldi or Lanczos iteration and projecting the function on this subspace.
Abstract: We develop an algorithm for computing the solution of a large system of linear ordinary differential equations (ODEs) with polynomial inhomogeneity. This is equivalent to computing the action of a certain matrix function on the vector representing the initial condition. The matrix function is a linear combination of the matrix exponential and other functions related to the exponential (the so-called φ-functions). Such computations are the major computational burden in the implementation of exponential integrators, which can solve general ODEs. Our approach is to compute the action of the matrix function by constructing a Krylov subspace using Arnoldi or Lanczos iteration and projecting the function on this subspace. This is combined with time-stepping to prevent the Krylov subspace from growing too large. The algorithm is fully adaptive: it varies both the size of the time steps and the dimension of the Krylov subspace to reach the required accuracy with minimal effort. We implement this algorithm in the MATLAB function phipm and we give instructions on how to obtain and use this function. Various numerical experiments show that the phipm function is often significantly more efficient than the state-of-the-art.
TL;DR: A novel feature subset selection algorithm, which utilizes a genetic algorithm (GA) to optimize the output nodes of trained artificial neural network (ANN).
Abstract: This paper describes a novel feature subset selection algorithm, which utilizes a genetic algorithm (GA) to optimize the output nodes of trained artificial neural network (ANN). The new algorithm does not depend on the ANN training algorithms or modify the training results. The two groups of weights between input-hidden and hidden-output layers are extracted after training the ANN on a given database. The general formula for each output node (class) of ANN is then generated. This formula depends only on input features because the two groups of weights are constant. This dependency is represented by a non-linear exponential function. The GA is involved to find the optimal relevant features, which maximize the output function for each class. The dominant features in all classes are the features subset to be selected from the input feature group.
TL;DR: In this paper, the authors consider sample variance penalization, a learning method which takes into account the empirical variance of the loss function, and give conditions under which the method is effective.
Abstract: We give improved constants for data dependent and variance sensitive confidence bounds, called empirical Bernstein bounds, and extend these inequalities to hold uniformly over classes of functions whose growth function is polynomial in the sample size n. The bounds lead us to consider sample variance penalization, a novel learning method which takes into account the empirical variance of the loss function. We give conditions under which sample variance penalization is effective. In particular, we present a bound on the excess risk incurred by the method. Using this, we argue that there are situations in which the excess risk of our method is of order 1/n, while the excess risk of empirical risk minimization is of order 1/sqrt(n). We show some experimental results, which confirm the theory. Finally, we discuss the potential application of our results to sample compression schemes.
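Sample variance penalization can be illustrated by comparing it with plain empirical risk minimization on two candidate hypotheses. The sketch below assumes the penalized objective takes the form "empirical mean loss plus λ times the square root of (sample variance / n)"; the trade-off parameter `lam`, the function names, and the loss values are hypothetical choices for the example, not taken from the paper's experiments.

```python
import math
from statistics import mean, variance

def svp_objective(losses, lam):
    # Empirical risk plus a penalty proportional to sqrt(sample variance / n).
    n = len(losses)
    return mean(losses) + lam * math.sqrt(variance(losses) / n)

def select(hypotheses, lam):
    # lam = 0 reduces to empirical risk minimization.
    return min(hypotheses, key=lambda name: svp_objective(hypotheses[name], lam))

# Hypothetical per-sample losses of two hypotheses on the same 10 samples.
hypotheses = {
    "steady": [0.30] * 10,                # slightly higher mean, zero variance
    "erratic": [0.0] * 7 + [0.9] * 3,     # lower mean, high variance
}

erm_choice = select(hypotheses, lam=0.0)
svp_choice = select(hypotheses, lam=1.0)
print(erm_choice, svp_choice)
```

Here empirical risk minimization prefers the hypothesis with the lower average loss, while the variance penalty switches the choice to the hypothesis whose loss estimate is more reliable.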
TL;DR: The proposed approach to establishing correspondences between two sets of visual features using higher-order constraints instead of the unary or pairwise ones used in classical methods is compared to state-of-the-art algorithms on both synthetic and real data.
Abstract: This paper addresses the problem of establishing correspondences between two sets of visual features using higher-order constraints instead of the unary or pairwise ones used in classical methods. Concretely, the corresponding hypergraph matching problem is formulated as the maximization of a multilinear objective function over all permutations of the features. This function is defined by a tensor representing the affinity between feature tuples. It is maximized using a generalization of spectral techniques where a relaxed problem is first solved by a multi-dimensional power method, and the solution is then projected onto the closest assignment matrix. The proposed approach has been implemented, and it is compared to state-of-the-art algorithms on both synthetic and real data.
TL;DR: A lower-bound method is developed that does not require symmetry and proves the conjectured negative answer for a broader class of functions than those of the form F(x, y) = fn(x1 · y1, x2 · y2, ..., xn · yn), where fn is a symmetric Boolean function on n Boolean inputs, and xi, yi are the i'th bits of x and y, respectively.
Abstract: A major open problem in communication complexity is whether or not quantum protocols can be exponentially more efficient than classical ones for computing a total Boolean function in the two-party interactive model. Razborov's result (Izvestiya: Mathematics, 67(1):145-159, 2002) implies the conjectured negative answer for functions F of the following form: F(x, y) = fn(x1 · y1, x2 · y2, ..., xn · yn), where fn is a symmetric Boolean function on n Boolean inputs, and xi, yi are the i'th bits of x and y, respectively. His proof critically depends on the symmetry of fn.
We develop a lower-bound method that does not require symmetry and prove the conjecture for a broader class of functions. Each of those functions F is the "block-composition" of a "building block" gk : {0, 1}^k × {0, 1}^k → {0, 1}, and an fn : {0, 1}^n → {0, 1}, such that F(x, y) = fn(gk(x1, y1), gk(x2, y2), ..., gk(xn, yn)), where xi and yi are the i'th k-bit blocks of x, y ∈ {0, 1}^{nk}, respectively. We show that as long as gk itself is "hard" enough, its block-composition with an arbitrary fn has polynomially related quantum and classical communication complexities. For example, when gk is the Inner Product function with k = Ω(log n), the deterministic communication complexity of its block-composition with any fn is asymptotically at most the quantum complexity to the power of 7.
TL;DR: A comprehensive asymptotic theory for the estimation of a change-point in the mean function of functional observations is developed and it is shown how the limit distribution of a suitably defined change-point estimator depends on the size and location of the change.
TL;DR: A one-parameter class of fourth-order methods for solving nonlinear equations based on Steffensen’s method is derived, which agrees with the conjecture of Kung–Traub for the case n = 3.
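For context, the classical Steffensen iteration on which such methods build replaces the derivative in Newton's method by a finite difference with step f(x). The sketch below shows only this basic second-order scheme, not the one-parameter fourth-order family derived in the paper; the function name, tolerance, and test equation are assumptions chosen for the example.

```python
def steffensen(f, x0, tol=1e-12, max_iter=50):
    """Classical derivative-free Steffensen iteration for f(x) = 0."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        # Finite-difference slope using the step h = f(x).
        denom = f(x + fx) - fx
        if denom == 0.0:
            raise ZeroDivisionError("Steffensen denominator vanished")
        x = x - fx * fx / denom
    return x

# Example (assumed): positive root of x^2 - 2 starting from 1.5.
root = steffensen(lambda x: x * x - 2.0, 1.5)
print(root)
```

Each step costs two function evaluations and no derivative evaluations, which is the budget against which higher-order derivative-free variants are usually compared.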