TL;DR: In this article, the fractional nonlinear logistic system, a population growth model, is studied analytically and numerically by means of the Atangana-Baleanu fractional derivative with a non-local smooth kernel in Sobolev space.
Abstract: In this article, a class of population growth model, the fractional nonlinear logistic system, is studied analytically and numerically. This model is investigated by means of Atangana-Baleanu fractional derivative with a non-local smooth kernel in Sobolev space. Existence and uniqueness theorem for the fractional logistic equation is provided based on the fixed-point theory. In this orientation, two numerical techniques are implemented to obtain the approximate solutions; the reproducing-kernel algorithm is based on the Schmidt orthogonalization process to construct a complete normal basis, while the successive substitution algorithm is based on an appropriate iterative scheme. Convergence analysis associated with the suggested approaches is provided to demonstrate the applicability theoretically. The impact of the fractional derivative on population growth is discussed by a class of nonlinear logistical models using the derivatives of Caputo, Caputo-Fabrizio, and Atangana-Baleanu. Using specific examples, numerical simulations are presented in tables and graphs to show the effect of the fractional operator on the population curve as . The present results confirm the theoretical predictions and depict that the suggested schemes are highly convenient, quite effective and practically simplify computational time.
TL;DR: This article proposes to combine several pre-treatments through the use of sequential and orthogonalized partial least squares (SO-PLS), thus leading to a boosting method.
TL;DR: To demonstrate behavior, efficiency, and appropriateness of the present technique, two different numerical experiments are solved numerically in this paper.
TL;DR: The proposed Gram-Schmidt process can be applied to Arnoldi iteration and result in new Krylov subspace methods for solving high-dimensional systems of equations or eigenvalue problems.
Abstract: A randomized Gram-Schmidt algorithm is developed for orthonormalization of high-dimensional vectors or QR factorization. The proposed process can be less computationally expensive than the classical Gram-Schmidt process while being at least as numerically stable as the modified Gram-Schmidt process. Our approach is based on random sketching, a dimension reduction technique that estimates inner products of high-dimensional vectors by inner products of their small, efficiently computable random projections, so-called sketches. This allows the projection step in the Gram-Schmidt process to be performed on sketches rather than high-dimensional vectors, at minor computational cost. This also provides an ability to efficiently certify the output. The proposed Gram-Schmidt algorithm can provide computational cost reduction in any architecture. The benefit of random sketching can be amplified by exploiting multi-precision arithmetic. We provide a stability analysis for a multi-precision model with coarse unit roundoff for standard high-dimensional operations. Numerical stability is proven for a unit roundoff independent of the (high) dimension of the problem. The proposed Gram-Schmidt process can be applied to Arnoldi iteration and result in new Krylov subspace methods for solving high-dimensional systems of equations or eigenvalue problems. Among them, we chose the randomized GMRES method as a practical application of the methodology.
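The sketching idea in this abstract can be illustrated with a small pure-Python example. This is a simplified sketch under stated assumptions, not the authors' algorithm: it uses a dense Gaussian sketching matrix and plain sketched inner products for the projection coefficients, and the names `randomized_gram_schmidt`, `sketch`, `theta`, and `k` are hypothetical. The output vectors are exactly orthonormal only with respect to the sketched inner product; their Euclidean orthonormality is approximate.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sketch(theta, v):
    """Apply the k x n sketching matrix theta to an n-vector."""
    return [dot(row, v) for row in theta]


def randomized_gram_schmidt(vectors, k, seed=0):
    """Orthonormalize `vectors` with respect to the sketched inner product.

    Returns (Q, S, R): Q holds the full-size vectors, S their k-dimensional
    sketches, and R the coefficients with vectors[i] = sum_j R[j][i] * Q[j].
    """
    rng = random.Random(seed)
    n, m = len(vectors[0]), len(vectors)
    # Assumed sketch: Gaussian matrix scaled so that E[|theta v|^2] = |v|^2.
    theta = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(n)]
             for _ in range(k)]
    Q, S = [], []
    R = [[0.0] * m for _ in range(m)]
    for i, w in enumerate(vectors):
        p = sketch(theta, w)
        # Projection coefficients are computed from k-dimensional sketches only.
        coeffs = [dot(s, p) for s in S]
        q = list(w)
        for c, qj in zip(coeffs, Q):
            q = [a - c * b for a, b in zip(q, qj)]
        s = sketch(theta, q)
        norm = math.sqrt(dot(s, s))  # sketched norm replaces the full norm
        for j, c in enumerate(coeffs):
            R[j][i] = c
        R[i][i] = norm
        Q.append([a / norm for a in q])
        S.append([a / norm for a in s])
    return Q, S, R
```

The only operations on full-length vectors are the sketch itself and the update `q - c * qj`; every inner product and norm is taken in the `k`-dimensional sketch space.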
TL;DR: In this paper, the authors propose a computationally efficient and numerically stable orthogonalization method using Newton's iteration (ONI) to learn a layer-wise orthogonal weight matrix in DNNs.
Abstract: Orthogonality is widely used for training deep neural networks (DNNs) due to its ability to maintain all singular values of the Jacobian close to 1 and reduce redundancy in representation. This paper proposes a computationally efficient and numerically stable orthogonalization method using Newton's iteration (ONI), to learn a layer-wise orthogonal weight matrix in DNNs. ONI works by iteratively stretching the singular values of a weight matrix towards 1. This property enables it to control the orthogonality of a weight matrix by its number of iterations. We show that our method improves the performance of image classification networks by effectively controlling the orthogonality to provide an optimal tradeoff between optimization benefits and representational capacity reduction. We also show that ONI stabilizes the training of generative adversarial networks (GANs) by maintaining the Lipschitz continuity of a network, similar to spectral normalization (SN), and further outperforms SN by providing controllable orthogonality.
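The idea of stretching singular values toward 1, with the iteration count controlling how orthogonal the result is, can be sketched with the generic Newton-Schulz iteration X <- 1.5 X - 0.5 X X^T X. This is a minimal illustration under assumed names (`newton_schulz_orthogonalize`, `num_iters`), not the authors' ONI implementation, which may differ in normalization and parameterization.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def newton_schulz_orthogonalize(w, num_iters):
    """Push the singular values of a full-row-rank matrix `w` toward 1.

    Dividing by the Frobenius norm puts all singular values in (0, 1], where
    the map s -> 1.5*s - 0.5*s**3 increases monotonically toward 1.
    """
    fro = sum(v * v for row in w for v in row) ** 0.5
    x = [[v / fro for v in row] for row in w]
    for _ in range(num_iters):
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(r1, r2)]
             for r1, r2 in zip(x, xxt_x)]
    return x


# Few iterations give a partially orthogonalized matrix, many iterations
# give rows that are orthonormal up to rounding error.
weights = [[2.0, 0.0, 1.0], [1.0, 3.0, 0.0]]
partly = newton_schulz_orthogonalize(weights, num_iters=1)
nearly = newton_schulz_orthogonalize(weights, num_iters=20)
```

Stopping early leaves the singular values below 1, which is one way to read the abstract's remark that orthogonality is controlled by the number of iterations.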
TL;DR: Simulation results demonstrate that the proposed virtual angular-domain channel estimation scheme provides excellent MSE performance with much reduced pilot overhead and, consequently, enjoys much larger per-user achievable rate in comparison to the conventional schemes.
Abstract: This article proposes a virtual angular-domain channel estimation scheme for massive multiple-input multiple-output systems operating in frequency division duplex (FDD) mode. Different from the conventional scheme where orthogonal pilots are transmitted on different antennas, we propose to transfer the channel estimation problem to the virtual angular domain and utilize the channel sparsity to reduce the training and feedback overhead. An orthogonal matching pursuit with Gram-Schmidt orthogonalization algorithm is proposed to construct the unitary transformation between the spatial domain and the virtual angular domain, which achieves higher sparsity than the existing approaches. Furthermore, we propose to estimate the downlink (DL) dominant angular set, which captures most of the channel power with only a few elements, by utilizing the directional reciprocity of FDD systems, where a calibration algorithm is introduced to handle the different wavelengths of uplink and DL transmissions. Based on the estimated dominant sets, we introduce a partial orthogonal criterion for virtual angular-domain pilot design and further propose two pilot assignment algorithms which minimize pilot overhead and pilot-reuse interference, respectively. Theoretical analyses on pilot overhead and the mean square error (MSE) performance are also presented. Simulation results demonstrate that our proposed virtual angular-domain channel estimation scheme provides excellent MSE performance with much reduced pilot overhead and, consequently, enjoys much larger per-user achievable rate in comparison to the conventional schemes.
TL;DR: This paper proposes a computationally efficient and numerically stable orthogonalization method using Newton's iteration (ONI) to learn a layer-wise orthogonal weight matrix in DNNs, and improves the performance of image classification networks by effectively controlling the orthogonality.
TL;DR: In this paper, the authors apply the Gram-Schmidt orthogonalization procedure to generate displacement functions, which yields numerically stable functions.
Abstract: The objective of this study is to apply the Gram-Schmidt orthogonalization procedure for generating displacement functions. This procedure allows us to obtain numerically stable functions to be used in...
TL;DR: A novel tensor decomposition method named high-order orthogonal tensor singular value decomposition (HO-OTSVD) is proposed for knowledge discovery and outperforms the existing methods.
Abstract: The development of social networks and ubiquitous sensing promotes the network space into a new stage, which integrates the cyber network, physical network, and social network into cyber–physical–social networks (CPSN). In this paper, we propose a CPSN-based service framework. The framework firstly represents CPSN as an adjacency tensor. Then, a novel tensor decomposition method named high-order orthogonal tensor singular value decomposition (HO-OTSVD) is proposed for knowledge discovery. To cope with the dynamic CPSN, an incremental HO-OTSVD (IHO-OTSVD) is developed to update the orthogonal tensor basis and the core tensor. Furthermore, we propose high-order bidiagonal Lanczos algorithm to cope with the orthogonalization of HO-OTSVD, wherein the complexity reduces from cubic execution time to quadratic execution time. Finally, we use a recommendation system as a case study to evaluate the effectiveness and efficiency of the proposed CPSN-based framework. The results show that HO-OTSVD method outperforms the existing methods.
TL;DR: This work considers algorithms for addition, elementwise multiplication, computing norms and inner products, orthogonalization, and rounding (rank truncation) that are the kernel operations for applications such as iterative Krylov solvers that exploit the TT structure.
Abstract: We present efficient and scalable parallel algorithms for performing mathematical operations for low-rank tensors represented in the tensor train (TT) format. We consider algorithms for addition, elementwise multiplication, computing norms and inner products, orthogonalization, and rounding (rank truncation). These are the kernel operations for applications such as iterative Krylov solvers that exploit the TT structure. The parallel algorithms are designed for distributed-memory computation, and we use a data distribution and strategy that parallelizes computations for individual cores within the TT format. We analyze the computation and communication costs of the proposed algorithms to show their scalability, and we present numerical experiments that demonstrate their efficiency on both shared-memory and distributed-memory parallel systems. For example, we observe better single-core performance than the existing MATLAB TT-Toolbox in rounding a 2GB TT tensor, and our implementation achieves a $34\times$ speedup using all 40 cores of a single node. We also show nearly linear parallel scaling on larger TT tensors up to over 10,000 cores for all mathematical operations.
TL;DR: A novel orthogonal over-parameterized training (OPT) framework that can provably minimize the hyperspherical energy which characterizes the diversity of neurons on a hypersphere is proposed and reveals that learning a proper coordinate system for neurons is crucial to generalization.
Abstract: The inductive bias of a neural network is largely determined by the architecture and the training algorithm. To achieve good generalization, how to effectively train a neural network is of great importance. We propose a novel orthogonal over-parameterized training (OPT) framework that can provably minimize the hyperspherical energy which characterizes the diversity of neurons on a hypersphere. By maintaining the minimum hyperspherical energy during training, OPT can greatly improve the empirical generalization. Specifically, OPT fixes the randomly initialized weights of the neurons and learns an orthogonal transformation that applies to these neurons. We consider multiple ways to learn such an orthogonal transformation, including unrolling orthogonalization algorithms, applying orthogonal parameterization, and designing orthogonality-preserving gradient descent. For better scalability, we propose the stochastic OPT which performs orthogonal transformation stochastically for partial dimensions of neurons. Interestingly, OPT reveals that learning a proper coordinate system for neurons is crucial to generalization. We provide some insights on why OPT yields better generalization. Extensive experiments validate the superiority of OPT over the standard training.
TL;DR: A patch-based multi-manifold orthogonal neighborhood-preserving discriminant analysis algorithm, namely ONPDA, is applied to face recognition; comparisons with some state-of-the-art methods on a toy dataset and several benchmark face image databases demonstrate the effectiveness of ONPDA and KONPDA.
TL;DR: The approximate solution of the system of fuzzy Volterra integro-differential equations is obtained by the n-term intercept of the exact solution and proved to converge to the exact solutions.
TL;DR: The constructed approximations by both the schemes convert the AIEs and SIDEs into the system of algebraic equations and establish error bounds, stability and convergence analysis of the proposed schemes by considering several mild mathematical conditions.
TL;DR: In this paper, the authors explore the viability of SVD orthogonalization for 3D rotations in neural networks and present a theoretical analysis that shows SVD is the natural choice for projecting onto the rotation group.
Abstract: Symmetric orthogonalization via SVD, and closely related procedures, are well-known techniques for projecting matrices onto $O(n)$ or $SO(n)$. These tools have long been used for applications in computer vision, for example optimal 3D alignment problems solved by orthogonal Procrustes, rotation averaging, or Essential matrix decomposition. Despite its utility in different settings, SVD orthogonalization as a procedure for producing rotation matrices is typically overlooked in deep learning models, where the preferences tend toward classic representations like unit quaternions, Euler angles, and axis-angle, or more recently-introduced methods. Despite the importance of 3D rotations in computer vision and robotics, a single universally effective representation is still missing. Here, we explore the viability of SVD orthogonalization for 3D rotations in neural networks. We present a theoretical analysis that shows SVD is the natural choice for projecting onto the rotation group. Our extensive quantitative analysis shows simply replacing existing representations with the SVD orthogonalization procedure obtains state of the art performance in many deep learning applications covering both supervised and unsupervised training.
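As a hedged illustration of the projection this abstract refers to (the notation is assumed here, not taken from the paper): for a $3 \times 3$ matrix $M$ with SVD $M = U \Sigma V^\top$ and singular values in descending order, the standard nearest-rotation construction in the Frobenius norm is

```latex
\operatorname{proj}_{SO(3)}(M)
  = U \,\operatorname{diag}\bigl(1,\; 1,\; \det(U V^\top)\bigr)\, V^\top,
\qquad M = U \Sigma V^\top .
```

Dropping the determinant correction gives $U V^\top$, the corresponding projection onto $O(3)$, which may include a reflection.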
TL;DR: This work reformulates the DFT-based projection-operator diabatization method within a simple tight-binding model to generate diabats with increased localization, yielding a proper basis set convergence and improved performance for the general Hab11 benchmark set.
Abstract: We address a long-standing ambiguity in the DFT-based projection-operator diabatization method for charge transfer couplings in donor-acceptor systems. It has long been known that the original method yields diabats which are not strictly fragment-localized due to mixing arising from basis-set orthogonalization. We demonstrate that this can contribute to a severe underestimation of coupling strengths and a spurious dependence on the choice of the basis set. As a remedy, we reformulate the method within a simple tight-binding model to generate diabats with increased localization, yielding a proper basis set convergence and improved performance for the general Hab11 benchmark set. Orthogonality of diabats is ensured either through symmetric Löwdin or asymmetric Gram-Schmidt procedures, the latter of which offers to extend these improvements to asymmetric systems such as adsorbates on surfaces.
TL;DR: The experimental results when transmitting multi-mode beams show that with a limited-size aperture, the power loss and crosstalk could be reduced by ∼8 and ∼23 dB, respectively; and with misalignment, the power loss and crosstalk could be reduced by ∼15 and ∼40 dB, respectively.
Abstract: Limited-size receiver (Rx) apertures and transmitter-Rx (Tx-Rx) misalignments could induce power loss and modal crosstalk in a mode-multiplexed free-space link. We experimentally demonstrate the mitigation of these impairments in a 400 Gbit/s four-data-channel free-space optical link. To mitigate the above degradations, our approach of singular-value-decomposition-based (SVD-based) beam orthogonalization includes (1) measuring the transmission matrix H for the link given a limited-size aperture or misalignment; (2) performing SVD on the transmission matrix to find the U, Σ, and V complex matrices; (3) transmitting each data channel on a beam that is a combination of Laguerre-Gaussian modes with complex weights according to the V matrix; and (4) applying the U matrix to the channel demultiplexer at the Rx. Compared with the case of transmitting each channel on a beam using a single mode, our experimental results when transmitting multi-mode beams show that (a) with a limited-size aperture, the power loss and crosstalk could be reduced by ∼8 and ∼23 dB, respectively; and (b) with misalignment, the power loss and crosstalk could be reduced by ∼15 and ∼40 dB, respectively.
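Steps (2) to (4) can be summarized in one line, assuming the usual convention $H = U \Sigma V^{\dagger}$ and assuming the receiver applies the conjugate transpose of $U$. The symbols $x$ and $y$ for the transmitted and recovered channel vectors are introduced here for illustration only.

```latex
y = U^{\dagger} H V x
  = U^{\dagger} \bigl( U \Sigma V^{\dagger} \bigr) V x
  = \Sigma x
```

Because $\Sigma$ is diagonal, each recovered channel depends only on its own transmitted channel, so in this idealized noise-free model the modal crosstalk vanishes and the singular values set the per-channel power.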
TL;DR: In this paper, the orthogonalization process for different inner products is applied to pairwise comparisons and properties of consistent approximations of a given inconsistent pairwise comparison matrix are examined.
Abstract: In this study, the orthogonalization process for different inner products is applied to pairwise comparisons. Properties of consistent approximations of a given inconsistent pairwise comparisons matrix are examined. A method of a derivation of a priority vector induced by a pairwise comparison matrix for a given inner product has been introduced. The mathematical elegance of orthogonalization and its universal use in most applied sciences has been the motivating factor for this study. However, the finding of this study that approximations depend on the inner product assumed, is of considerable importance.
TL;DR: In this paper, the authors construct entanglement-assisted CSS codes over qudits from the parity check matrices of two classical codes, and use this construction to obtain entanglement-assisted tensor product codes from classical tensor product codes.
Abstract: We provide a procedure to construct entanglement-assisted Calderbank–Shor–Steane (CSS) codes over qudits from the parity check matrices of two classical codes over $\mathbb{F}_q$, where $q=p^k$, $p$ is prime, and $k$ is a positive integer. The construction procedure involves the proposed Euclidean Gram–Schmidt orthogonalization algorithm, followed by a procedure to extend the quantum operators to obtain stabilizers of the code. Using this construction, we provide a construction of entanglement-assisted tensor product codes from classical tensor product codes over $\mathbb{F}_q$. We further show that a nonzero rate entanglement-assisted tensor product code can be obtained from a classical tensor product code whose component codes yield zero rate entanglement-assisted CSS codes. We view this result as the coding analog of superadditivity.
TL;DR: In this paper, the authors presented a simplified version of the Strongly Stable Generalized Finite Element Method (SSGFEM) that does not require two enrichment functions at any of the nodes.
TL;DR: The proposed method can compensate IQ imbalance in the optical coherent system by two steps of phase orthogonalization and amplitude balance and it is easier to implement in terms of resources and complexity.
TL;DR: Techniques for orthogonalization and computing Rayleigh-Ritz problems are introduced to improve the stability, efficiency and scalability of the proposed generalized parallel conjugate gradient method for large scale eigenvalue problems.
Abstract: Based on the damping blocked inverse power method, a type of generalized parallel conjugate gradient method is proposed for large scale eigenvalue problems. Techniques for orthogonalization and computing Rayleigh-Ritz problems are introduced to improve the stability, efficiency and scalability. Furthermore, a computing package is built based on the proposed method. Some numerical tests are provided to validate the stability, efficiency and scalability of the method. The corresponding computing package can be downloaded from https://github.com/pase2017/GCGE-1.0.
TL;DR: The experimental results show that non-normal RNNs outperform their orthogonal counterparts in a diverse range of benchmarks, with evidence for increased non-normality and hidden chain-like feedforward motifs in trained RNNs initialized with orthogonal recurrent connectivity matrices.
Abstract: Training recurrent neural networks (RNNs) is a hard problem due to degeneracies in the optimization landscape, a problem also known as the vanishing/exploding gradients problem. Short of designing new RNN architectures, various methods that have been proposed for dealing with this problem usually boil down to orthogonalization of the recurrent dynamics, either at initialization or during the entire training period. The basic motivation behind these methods is that orthogonal transformations are isometries of the Euclidean space, hence they preserve (Euclidean) norms and effectively deal with the vanishing/exploding gradients problem. However, this idea ignores the crucial effects of non-linearity and noise. In the presence of a non-linearity, orthogonal transformations no longer preserve norms, suggesting that alternative transformations might be better suited to non-linear networks. Moreover, in the presence of noise, norm preservation itself ceases to be the ideal objective. A more sensible objective is maximizing the signal-to-noise ratio (SNR) of the propagated signal instead. Previous work has shown that in the linear case, recurrent networks that maximize the SNR display strongly non-normal, sequential dynamics and orthogonal networks are highly suboptimal by this measure. Motivated by this finding, we investigate the potential of non-normal RNNs, i.e. RNNs with a non-normal recurrent connectivity matrix, in sequential processing tasks. Our experimental results show that non-normal RNNs outperform their orthogonal counterparts in a diverse range of benchmarks. We also find evidence for increased non-normality and hidden chain-like feedforward structures in trained RNNs initialized with orthogonal recurrent connectivity matrices.
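The claim that orthogonal transformations preserve norms, but stop doing so once a non-linearity is applied, can be checked with a toy example. This is only an illustrative sketch: the 2D rotation, the ReLU non-linearity, and all names below are assumptions chosen for simplicity, not the setup used in the paper.

```python
import math


def rotate(v, angle):
    """Apply a 2D rotation, which is an orthogonal transformation."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1]]


def relu(v):
    return [max(0.0, x) for x in v]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


v = [1.0, 1.0]
rotated = rotate(v, math.pi / 2)  # approximately [-1, 1]
activated = relu(rotated)         # the negative component is zeroed out

# The rotation keeps the Euclidean norm; the ReLU that follows shrinks it.
norm_before = norm(v)
norm_after_rotation = norm(rotated)
norm_after_relu = norm(activated)
```

Here the linear step is an isometry, yet the composed map is not, which is the effect the abstract points to when it questions norm preservation as the objective for non-linear networks.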
TL;DR: In this paper, a new algorithm that combines the reproducing kernel method with the least squares method is presented to solve nonlinear impulsive differential equations.
TL;DR: It is shown that the desired orthogonality can be gradually achieved without invoking orthogonalization in each iteration and this framework fully consists of Basic Linear Algebra Subprograms (BLAS) operations and thus can be naturally parallelized.
Abstract: All-electron calculations play an important role in density functional theory, in which improving computational efficiency is one of the most needed and challenging tasks. In the model formulations, both nonlinear eigenvalue problem and total energy minimization problem pursue orthogonal solutions. Most existing algorithms for solving these two models invoke orthogonalization process either explicitly or implicitly in each iteration. Their efficiency suffers from this process in view of its cubic complexity and low parallel scalability in terms of the number of electrons for large scale systems. To break through this bottleneck, we propose an orthogonalization-free algorithm framework based on the total energy minimization problem. It is shown that the desired orthogonality can be gradually achieved without invoking orthogonalization in each iteration. Moreover, this framework fully consists of Basic Linear Algebra Subprograms (BLAS) operations and thus can be naturally parallelized. The global convergence of the proposed algorithm is established. We also present a precondition technique which can dramatically accelerate the convergence of the algorithm. The numerical experiments on all-electron calculations show the efficiency and high scalability of the proposed algorithm.
TL;DR: The finding of this study that approximations depend on the inner product assumed, is of considerable importance.
TL;DR: A novel orthogonalization-free method together with two specific algorithms are proposed to solve extreme eigenvalue problems by modifying the multi-column gradient such that earlier columns are decoupled from later ones.
Abstract: A novel orthogonalization-free method together with two specific algorithms are proposed to solve extreme eigenvalue problems. On top of gradient-based algorithms, the proposed algorithms modify the multi-column gradient such that earlier columns are decoupled from later ones. Global convergence to eigenvectors instead of eigenspace is guaranteed almost surely. Locally, algorithms converge linearly with convergence rate depending on eigengaps. Momentum acceleration, exact linesearch, and column locking are incorporated to further accelerate both algorithms and reduce their computational costs. We demonstrate the efficiency of both algorithms on several random matrices with different spectrum distribution and matrices from computational chemistry.
TL;DR: The revised Arnoldi algorithm for matrix exponential in the time domain simulation of large-scale power delivery networks (PDNs), which are formulated as semi-explicit differential-algebraic equations (DAEs), introduces a new structured orthogonalization process to construct the Krylov subspace.
Abstract: We propose a stability preserved Arnoldi algorithm for matrix exponential in the time domain simulation of large-scale power delivery networks (PDNs), which are formulated as semi-explicit differential-algebraic equations (DAEs). The matrix exponential and vector products (MEVPs) compose the solution of DAEs in multistep integration methods and can be efficiently approximated with the rational Krylov subspace. To produce stable simulation results for the ill-conditioned system from semi-explicit DAEs, the revised Arnoldi algorithm introduces a new structured orthogonalization process to construct the Krylov subspace. We demonstrate the performance of the new algorithm with theoretical proof and experiments. In the computation of MEVPs, we utilize the exponential related $\varphi$ functions to improve the numerical accuracy. We further explore the optimal ratio to confine the spectrum in the rational Krylov subspace. Finally, the transient framework is tested on a group of system-level PDNs, showing that matrix exponential-based algorithms could achieve high efficiency and accuracy.
TL;DR: It is shown that if the graph is chordal it is possible to sample uniformly from the set of correlation matrices compatible with the graph, while for general undirected graphs the authors rely on a partial orthogonalization method.