TL;DR: A description is given of Phaser-2.1: software for phasing macromolecular crystal structures by molecular replacement and single-wavelength anomalous dispersion phasing.
Abstract: Phaser is a program for phasing macromolecular crystal structures by both molecular replacement and experimental phasing methods. The novel phasing algorithms implemented in Phaser have been developed using maximum likelihood and multivariate statistics. For molecular replacement, the new algorithms have proved to be significantly better than traditional methods in discriminating correct solutions from noise, and for single-wavelength anomalous dispersion experimental phasing, the new algorithms, which account for correlations between F+ and F−, give better phases (lower mean phase error with respect to the phases given by the refined structure) than those that use mean F and anomalous differences ΔF. One of the design concepts of Phaser was that it be capable of a high degree of automation. To this end, Phaser (written in C++) can be called directly from Python, although it can also be called using traditional CCP4 keyword-style input. Phaser is a platform for future development of improved phasing methods and their release, including source code, to the crystallographic community.
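The abstract above compares phase sets by their mean phase error against phases from the refined structure. A minimal sketch of that comparison follows, assuming phases in degrees and an unweighted mean. The function name and the lack of weighting are illustrative assumptions, not taken from Phaser.

```python
def mean_phase_error(phases_deg, reference_deg):
    """Unweighted mean absolute phase difference in degrees.

    Each difference is wrapped into [-180, 180) before taking the
    absolute value, so 350 vs 10 degrees counts as 20, not 340.
    Hypothetical helper for illustration; not part of Phaser.
    """
    if len(phases_deg) != len(reference_deg):
        raise ValueError("phase lists must have the same length")
    if not phases_deg:
        raise ValueError("at least one reflection is required")

    total = 0.0
    for phi, ref in zip(phases_deg, reference_deg):
        # Wrap the signed difference into [-180, 180).
        diff = (phi - ref + 180.0) % 360.0 - 180.0
        total += abs(diff)
    return total / len(phases_deg)
```

Published comparisons often weight each reflection (for example by amplitude or figure of merit); the unweighted form here only shows the angle wrapping.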
TL;DR: Four case studies in using maximum-likelihood molecular replacement, as implemented in the program Phaser, to solve structures of protein complexes are described.
Abstract: Molecular replacement (MR) generally becomes more difficult as the number of components in the asymmetric unit requiring separate MR models (i.e. the dimensionality of the search) increases. When the proportion of the total scattering contributed by each search component is small, the signal in the search for each component in isolation is weak or non-existent. Maximum-likelihood MR functions enable complex asymmetric units to be built up from individual components with a 'tree search with pruning' approach. This method, as implemented in the automated search procedure of the program Phaser, has been very successful in solving many previously intractable MR problems. However, there are a number of cases in which the automated search procedure of Phaser is suboptimal or encounters difficulties. These include cases where there are a large number of copies of the same component in the asymmetric unit or where the components of the asymmetric unit have greatly varying B factors. Two case studies are presented to illustrate how Phaser can be used to best advantage in the standard 'automated MR' mode and two case studies are used to show how to modify the automated search strategy for problematic cases.
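The 'tree search with pruning' idea in the abstract above can be sketched generically: components are placed one at a time, every partial solution is extended with each candidate pose, and only the best-scoring partial solutions survive to the next level. The names `candidate_poses`, `score`, and `beam_width` are hypothetical, and the keep-the-top-N pruning rule is an assumption made for illustration; it is not Phaser's actual selection criterion.

```python
def tree_search_with_pruning(components, candidate_poses, score, beam_width=2):
    """Build up a multi-component solution one component at a time.

    components      -- ordered list of component labels to place
    candidate_poses -- hypothetical callable(component, partial) -> iterable of poses
    score           -- hypothetical callable(partial) -> float, higher is better
    beam_width      -- number of partial solutions kept after each level (assumed rule)

    Returns the surviving complete solutions, best first. Each solution is a
    tuple of (component, pose) pairs.
    """
    partial_solutions = [()]  # start from the empty asymmetric unit
    for component in components:
        expanded = []
        for partial in partial_solutions:
            for pose in candidate_poses(component, partial):
                extended = partial + ((component, pose),)
                expanded.append((score(extended), extended))
        if not expanded:
            return []  # no placement possible for this component
        # Prune: keep only the top-scoring branches of the tree.
        expanded.sort(key=lambda item: item[0], reverse=True)
        partial_solutions = [sol for _, sol in expanded[:beam_width]]
    return partial_solutions


# Toy usage: poses are integers, and the score rewards poses near a target.
targets = {"A": 2, "B": 1}

def toy_poses(component, partial):
    return range(4)

def toy_score(partial):
    return -sum(abs(pose - targets[name]) for name, pose in partial)

solutions = tree_search_with_pruning(["A", "B"], toy_poses, toy_score)
```

Keeping more than one branch per level matters because, as the abstract notes, the signal for a single component in isolation can be weak; a correct placement may only stand out once later components are added.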
TL;DR: SHAPEIT5 is a new phasing method that quickly and accurately processes large sequencing datasets; the authors apply it to UK Biobank (UKB) whole-genome and whole-exome sequencing data.
Abstract: Phasing involves distinguishing the two parentally inherited copies of each chromosome into haplotypes. Here, we introduce SHAPEIT5, a new phasing method that quickly and accurately processes large sequencing datasets, and apply it to UK Biobank (UKB) whole-genome and whole-exome sequencing data. We demonstrate that SHAPEIT5 phases rare variants with switch error rates below 5% for variants present in just 1 sample out of 100,000. Furthermore, we outline a method for phasing singletons, which, although less precise, constitutes an important step towards future developments. We then demonstrate that the use of UKB as a reference panel improves the accuracy of genotype imputation, which is even more pronounced when phased with SHAPEIT5 compared with other methods. Finally, we screen the UKB data for loss-of-function compound heterozygous events and identify 549 genes where both gene copies are knocked out. These genes complement current knowledge of gene essentiality in the human genome.
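The switch error rate quoted above is a standard measure of phasing accuracy. A minimal sketch of one common way to compute it follows, assuming each haplotype is given as the 0/1 allele carried by the first haplotype at each heterozygous site of one sample. The function name and input encoding are illustrative assumptions and are not taken from SHAPEIT5.

```python
def switch_error_rate(true_hap, inferred_hap):
    """Fraction of adjacent heterozygous-site pairs with a phase switch.

    true_hap, inferred_hap -- sequences of 0/1 alleles on "haplotype 1"
    at the same heterozygous sites, in genomic order.

    A switch is counted whenever the inferred phase agrees with the truth
    at one site and disagrees at the next (or vice versa). A haplotype
    that is flipped everywhere therefore has zero switches, since the
    labelling of the two haplotypes is arbitrary.
    Hypothetical helper for illustration only.
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same sites")
    if len(true_hap) < 2:
        raise ValueError("need at least two heterozygous sites")

    agrees = [t == i for t, i in zip(true_hap, inferred_hap)]
    switches = sum(1 for prev, cur in zip(agrees, agrees[1:]) if prev != cur)
    return switches / (len(true_hap) - 1)
```

For example, a truth of `[0, 1, 0, 1, 1]` against an inferred `[0, 1, 1, 0, 0]` has one switch across four adjacent pairs, a rate of 0.25.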
TL;DR: The theory, implementation, and limitations of molecular replacement phasing are reviewed, covering the rotation function, positioning techniques (translation function, correlation searches, and packing analysis), and methods for improving and evaluating a solution.
Abstract: This paper will discuss the theory, implementation, and limitations of the computations used in phasing crystal structures using the molecular replacement method. It will describe the rotation function, the various positioning techniques (translation function, correlation searches, and packing analysis), and the methods used to improve upon a molecular replacement solution prior to refinement. Means for evaluating the correctness of a solution will be discussed and some illustrative examples of structures phased by molecular replacement will be given. Finally, some recent work in developing a methodology for using the rotation function to find erroneous regions in a molecular replacement probe structure will be presented.
TL;DR: The AlphaFold 2 results in the 14th edition of Critical Assessment of Structure Prediction (CASP14) showed that accurate (low root-mean-square deviation) in silico models of protein structure domains are on the horizon, whether or not the protein is related to known structures through high-coverage sequence similarity.
Abstract: The AlphaFold 2 results in the 14th edition of Critical Assessment of Structure Prediction (CASP14) showed that accurate (low root-mean-square deviation) in silico models of protein structure domains are on the horizon, whether or not the protein is related to known structures through high-coverage sequence similarity. As highly accurate models become available, generated by harnessing the power of correlated mutations and deep learning, one of the aspects of structural biology to be impacted will be methods of phasing in crystallography. Here, the data from CASP14 are used to explore the prospects for changes in phasing methods, and in particular to explore the prospects for molecular-replacement phasing using in silico models.