1. How can computational strategies be used to solve mixed model equations efficiently in large animal breeding programs?
Large animal breeding programs can solve mixed model equations efficiently by combining model reformulations with high-performance computing (HPC) techniques. One proposed approach is the Algorithm for Proven and Young Animals (APY), which approximates the inverse of the genomic relationship matrix (GRM) through genomic recursion on a subset of core animals. Another reformulates the original ssGBLUP model so that established numerical software can be used and the GRM never has to be constructed or inverted explicitly. Single-step GT(A)BLUP and single-step SNP BLUP models were proposed to estimate SNP effects directly, avoiding the GRM and its inverse altogether. Sparse matrix operations and iterative solvers such as the preconditioned conjugate gradient (PCG) method can improve convergence speed and reduce the computational load. Tailored algorithms for multiplying SNP matrices by real-valued matrices have been developed for CPUs and NVIDIA GPUs; they use the NVIDIA CUTLASS library and are optimized for various instruction set architectures. Together, these advances can significantly reduce computation time and memory requirements, which allows larger populations to be included in genomic evaluations.
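To make the iterative-solver idea concrete, here is a minimal sketch of a preconditioned conjugate gradient solver in plain Python. It is an illustration only: the diagonal (Jacobi) preconditioner, the function names `pcg` and `matvec`, and the small 3x3 system are assumptions chosen for the example and do not come from any of the cited programs. The point it shows is that the solver only needs a routine returning the product of the coefficient matrix with a vector, so the matrix never has to be inverted or even stored explicitly.

```python
def pcg(matvec, b, diag, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A.

    matvec(v) must return A @ v as a list; diag is the diagonal of A,
    used here as a simple Jacobi preconditioner (an assumption of this sketch).
    """
    x = [0.0] * len(b)
    r = list(b)                                   # residual b - A x for x = 0
    z = [ri / di for ri, di in zip(r, diag)]      # preconditioned residual
    p = list(z)                                   # first search direction
    rz = sum(ri * zi for ri, zi in zip(r, z))
    b_norm = sum(bi * bi for bi in b) ** 0.5 or 1.0
    for _ in range(max_iter):
        ap = matvec(p)
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if sum(ri * ri for ri in r) ** 0.5 <= tol * b_norm:
            break                                 # relative residual small enough
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Made-up SPD system standing in for a (much larger) set of mixed model equations.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]


def matvec(v):
    # Only this product is needed by the solver; A itself is never inverted.
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]


x = pcg(matvec, b, diag=[A[i][i] for i in range(len(A))])
```

In a real evaluation, `matvec` would be replaced by a routine that applies the sparse and genotype-based blocks of the coefficient matrix one after another, which is where the fast SNP matrix multiplications matter.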
2. What is the ssSNPBLUP approach and how is it represented in the equation system?
The ssSNPBLUP approach is a single-step model for estimating breeding values, introduced by Liu et al. (2014). The coefficient matrix of its equation system has the block form

$$
\begin{bmatrix}
X'R^{-1}X & X_n'R_n^{-1}W_n & X_g'R_g^{-1}W_g \\
W_n'R_n^{-1}X_n & W_n'R_n^{-1}W_n + \Sigma^{11} & \Sigma^{12} \\
W_g'R_g^{-1}X_g & \Sigma^{21} & W_g'R_g^{-1}W_g + \Sigma^{22} - \Sigma^{23}(\Sigma^{33})^{-1}\Sigma^{32}
\end{bmatrix}
$$

The matrices and vectors in this system relate records, genotypes, and effects. Solving it yields the breeding values of animals from their genotypes and phenotypes, accounting for both additive genetic effects and residual polygenic effects. Because all information is combined in a single step, the system is suited to computationally efficient evaluation of large datasets.
3. What computational bottleneck exists in equation systems?
The multiplication of Z by a matrix L of low width has been a computational bottleneck. The operation can be reformulated as $ZL = ML - 1_{n_g} p' L$, where $1_{n_g}$ is a vector of ones of length $n_g$. The correction term consists of a vector-matrix multiplication and subsequent rank-one updates and is computationally cheap, so the cost is dominated by the product ML. Because genotyping is inexpensive, M can have extremely large dimensions and capture the genomic information of millions of animals. In addition, M is usually stored in a compressed format, which rules out naive calls to BLAS routines, and decompressing the whole matrix first would be inefficient and memory-intensive. Kim et al. (2022) propose a decompress-on-the-fly approach to address this issue: tiles of submatrices of M are unpacked that are small enough for the result to fit in the L1 cache, and the matrix multiplication is performed on these tiles, taking advantage of the fast access times of the L1 cache.
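The reformulation can be checked on a small example. The sketch below assumes that Z is obtained from M by subtracting the entries of p column by column, that is, $Z = M - 1_{n_g} p'$, and it takes p to be the vector of column means; both are assumptions made for illustration, as are the function names and the toy data. It compares the direct product with the version that never forms Z.

```python
def matmul(a, b):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def centered_product(m, p, l):
    """Compute (M - 1 p') L without building the centered matrix."""
    ml = matmul(m, l)                              # the expensive product M L
    ptl = [sum(p[k] * l[k][j] for k in range(len(p)))
           for j in range(len(l[0]))]              # p' L: one short row vector
    # Subtract the same row p' L from every row of M L (the rank-one update).
    return [[ml[i][j] - ptl[j] for j in range(len(ptl))]
            for i in range(len(ml))]


# Toy data: 4 animals x 3 SNPs, genotypes coded 0/1/2 (made-up values).
M = [[0, 1, 2],
     [2, 2, 0],
     [1, 0, 1],
     [0, 0, 2]]
p = [sum(row[k] for row in M) / len(M) for k in range(3)]   # assumed: column means
L = [[0.5, -1.0],
     [1.5, 2.0],
     [-0.25, 0.75]]

Z = [[M[i][k] - p[k] for k in range(3)] for i in range(len(M))]
direct = matmul(Z, L)                  # forms Z explicitly
fast = centered_product(M, p, L)       # works on M only
```

Working on M alone matters because M holds only small integer codes and can stay in its compressed form, whereas Z would have to be stored as real numbers.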
4. How does the 5codes algorithm reduce data streams?
The 5codes algorithm is a novel approach for CPU computations that aims to reduce the data streamed through the cache hierarchy. It views the problem of storing SNP data through the lens of combinatorics: each realized vector of SNP values can be stored in one 8-bit unsigned integer while preserving the order of the SNPs. During preprocessing, the input data is converted to this more compressed format, called 5codes. At multiplication time, the algorithm loads a vector and stores all possible results of the scalar product in a hash table, so that each partial product becomes a table lookup. This reduces both the number of memory accesses and the accumulation of precision errors. The algorithm also parallelizes the computations across the available processor cores, further improving performance.
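The idea can be sketched in a few lines of Python. This is not the published implementation: it assumes that genotypes are coded 0, 1 or 2 and packed five per byte in base 3 (which fits because 3^5 = 243 is at most 256), and it uses plain lists as lookup tables where the description above mentions a hash table. All names (`encode`, `build_tables`, `dot`) are hypothetical.

```python
from itertools import product

GROUP = 5                                            # SNPs per byte, assumed from the name
ALL_GROUPS = list(product(range(3), repeat=GROUP))   # list index equals the base-3 code


def encode(genotypes):
    """Pack 0/1/2 genotypes into bytes, five per byte, keeping the SNP order."""
    padded = list(genotypes) + [0] * (-len(genotypes) % GROUP)
    codes = bytearray()
    for start in range(0, len(padded), GROUP):
        code = 0
        for g in padded[start:start + GROUP]:
            code = code * 3 + g                      # first SNP is the leading digit
        codes.append(code)
    return bytes(codes)


def build_tables(v):
    """For each block of five entries of v, tabulate all 243 possible scalar products."""
    padded = list(v) + [0.0] * (-len(v) % GROUP)
    tables = []
    for start in range(0, len(padded), GROUP):
        chunk = padded[start:start + GROUP]
        tables.append([sum(g * c for g, c in zip(combo, chunk))
                       for combo in ALL_GROUPS])
    return tables


def dot(codes, tables):
    """Scalar product of one encoded SNP row with v: one lookup per byte."""
    return sum(table[code] for code, table in zip(codes, tables))


row = [0, 1, 2, 2, 0, 1, 1]                          # made-up genotypes of one animal
v = [0.5, -1.0, 2.0, 0.25, 3.0, -0.5, 1.5]           # made-up real-valued vector

codes = encode(row)                                  # done once, during preprocessing
tables = build_tables(v)                             # done once per vector
result = dot(codes, tables)
```

The tables are built once per vector and reused for every animal, so the cost of filling them is spread over all rows of the SNP matrix, and each row contributes one lookup and one addition per five SNPs.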