Personal computer

Topic Tools

Papers published on a yearly basis

1 / 2

Papers

Journal Article•10.1107/S0108767307043930•

A short history of SHELX

[...]

George M. Sheldrick¹•Institutions (1)

University of Göttingen¹

01 Jan 2008-Acta Crystallographica Section A

TL;DR: This paper could serve as a general literature citation when one or more of the open-source SH ELX programs (and the Bruker AXS version SHELXTL) are employed in the course of a crystal-structure determination.

...read moreread less

Abstract: An account is given of the development of the SHELX system of computer programs from SHELX-76 to the present day. In addition to identifying useful innovations that have come into general use through their implementation in SHELX, a critical analysis is presented of the less-successful features, missed opportunities and desirable improvements for future releases of the software. An attempt is made to understand how a program originally designed for photographic intensity data, punched cards and computers over 10000 times slower than an average modern personal computer has managed to survive for so long. SHELXL is the most widely used program for small-molecule refinement and SHELXS and SHELXD are often employed for structure solution despite the availability of objectively superior programs. SHELXL also finds a niche for the refinement of macromolecules against high-resolution or twinned data; SHELXPRO acts as an interface for macromolecular applications. SHELXC, SHELXD and SHELXE are proving useful for the experimental phasing of macromolecules, especially because they are fast and robust and so are often employed in pipelines for high-throughput phasing. This paper could serve as a general literature citation when one or more of the open-source SHELX programs (and the Bruker AXS version SHELXTL) are employed in the course of a crystal-structure determination.

...read moreread less

89,145 citations

Journal Article•10.1080/10635150390235520•

A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood.

[...]

Stéphane Guindon¹, Olivier Gascuel¹•Institutions (1)

Centre national de la recherche scientifique¹

01 Oct 2003-Systematic Biology

TL;DR: This work has used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches.

...read moreread less

Abstract: The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum- likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/. (Algorithm; computer simulations; maximum likelihood; phylogeny; rbcL; RDPII project.) The size of homologous sequence data sets has in- creased dramatically in recent years, and many of these data sets now involve several hundreds of taxa. More- over, current probabilistic sequence evolution models (Swofford et al., 1996 ; Page and Holmes, 1998 ), notably those including rate variation among sites (Uzzell and Corbin, 1971 ; Jin and Nei, 1990 ; Yang, 1996 ), require an increasing number of calculations. Therefore, the speed of phylogeny reconstruction methods is becoming a sig- nificant requirement and good compromises between speed and accuracy must be found. The maximum likelihood (ML) approach is especially accurate for building molecular phylogenies. Felsenstein (1981) brought this framework to nucleotide-based phy- logenetic inference, and it was later also applied to amino acid sequences (Kishino et al., 1990). Several vari- ants were proposed, most notably the Bayesian meth- ods (Rannala and Yang 1996; and see below), and the discrete Fourier analysis of Hendy et al. (1994), for ex- ample. Numerous computer studies (Huelsenbeck and Hillis, 1993; Kuhner and Felsenstein, 1994; Huelsenbeck, 1995; Rosenberg and Kumar, 2001; Ranwez and Gascuel, 2002) have shown that ML programs can recover the cor- rect tree from simulated data sets more frequently than other methods can. Another important advantage of the ML approach is the ability to compare different trees and evolutionary models within a statistical framework (see Whelan et al., 2001, for a review). However, like all optimality criterion-based phylogenetic reconstruction approaches, ML is hampered by computational difficul- ties, making it impossible to obtain the optimal tree with certainty from even moderate data sets (Swofford et al., 1996). Therefore, all practical methods rely on heuristics that obtain near-optimal trees in reasonable computing time. Moreover, the computation problem is especially difficult with ML, because the tree likelihood not only depends on the tree topology but also on numerical pa- rameters, including branch lengths. Even computing the optimal values of these parameters on a single tree is not an easy task, particularly because of possible local optima (Chor et al., 2000). The usual heuristic method, implemented in the pop- ular PHYLIP (Felsenstein, 1993 ) and PAUP ∗ (Swofford, 1999 ) packages, is based on hill climbing. It combines stepwise insertion of taxa in a growing tree and topolog- ical rearrangement. For each possible insertion position and rearrangement, the branch lengths of the resulting tree are optimized and the tree likelihood is computed. When the rearrangement improves the current tree or when the position insertion is the best among all pos- sible positions, the corresponding tree becomes the new current tree. Simple rearrangements are used during tree growing, namely "nearest neighbor interchanges" (see below), while more intense rearrangements can be used once all taxa have been inserted. The procedure stops when no rearrangement improves the current best tree. Despite significant decreases in computing times, no- tably in fastDNAml (Olsen et al., 1994 ), this heuristic becomes impracticable with several hundreds of taxa. This is mainly due to the two-level strategy, which sepa- rates branch lengths and tree topology optimization. In- deed, most calculations are done to optimize the branch lengths and evaluate the likelihood of trees that are finally rejected. New methods have thus been proposed. Strimmer and von Haeseler (1996) and others have assembled four- taxon (quartet) trees inferred by ML, in order to recon- struct a complete tree. However, the results of this ap- proach have not been very satisfactory to date (Ranwez and Gascuel, 2001 ). Ota and Li (2000, 2001) described

...read moreread less

17,549 citations

Journal Article•10.1002/JCC.21224•

PACKMOL: a package for building initial configurations for molecular dynamics simulations.

[...]

Leandro Martínez¹, Ricardo Andrade², Ernesto G. Birgin², José Mario Martínez¹•Institutions (2)

State University of Campinas¹, University of São Paulo²

01 Oct 2009-Journal of Computational Chemistry

TL;DR: This work has developed a code able to pack millions of atoms, grouped in arbitrarily complex molecules, inside a variety of three‐dimensional regions, which can be intersections of spheres, ellipses, cylinders, planes, or boxes.

...read moreread less

Abstract: Adequate initial configurations for molecular dynamics simulations consist of arrangements of molecules distributed in space in such a way to approximately represent the system's overall structure. In order that the simulations are not disrupted by large van der Waals repulsive interactions, atoms from different molecules must keep safe pairwise distances. Obtaining such a molecular arrangement can be considered a packing problem: Each type molecule must satisfy spatial constraints related to the geometry of the system, and the distance between atoms of different molecules must be greater than some specified tolerance. We have developed a code able to pack millions of atoms, grouped in arbitrarily complex molecules, inside a variety of three-dimensional regions. The regions may be intersections of spheres, ellipses, cylinders, planes, or boxes. The user must provide only the structure of one molecule of each type and the geometrical constraints that each type of molecule must satisfy. Building complex mixtures, interfaces, solvating biomolecules in water, other solvents, or mixtures of solvents, is straightforward. In addition, different atoms belonging to the same molecule may also be restricted to different spatial regions, in such a way that more ordered molecular arrangements can be built, as micelles, lipid double-layers, etc. The packing time for state-of-the-art molecular dynamics systems varies from a few seconds to a few minutes in a personal computer. The input files are simple and currently compatible with PDB, Tinker, Molden, or Moldy coordinate files. The package is distributed as free software and can be downloaded from http://www.ime.unicamp.br/~martinez/packmol/.

...read moreread less

7,596 citations

Journal Article•10.1038/S41592-019-0619-0•

Fast, sensitive and accurate integration of single-cell data with Harmony.

[...]

Ilya Korsunsky, Nghia Millard, Jean Fan¹, Kamil Slowikowski, Fan Zhang, Kevin Wei², Yuriy Baglaenko, Michael B. Brenner², Po-Ru Loh², Po-Ru Loh¹, Po-Ru Loh³, Soumya Raychaudhuri - Show less +8 more•Institutions (3)

Harvard University¹, Brigham and Women's Hospital², Broad Institute³

18 Nov 2019-Nature Methods

TL;DR: Harmony, for the integration of single-cell transcriptomic data, identifies broad and fine-grained populations, scales to large datasets, and can integrate sequencing- and imaging-based data.

...read moreread less

Abstract: The emerging diversity of single-cell RNA-seq datasets allows for the full transcriptional characterization of cell types across a wide variety of biological and clinical conditions. However, it is challenging to analyze them together, particularly when datasets are assayed with different technologies, because biological and technical differences are interspersed. We present Harmony ( https://github.com/immunogenomics/harmony ), an algorithm that projects cells into a shared embedding in which cells group by cell type rather than dataset-specific conditions. Harmony simultaneously accounts for multiple experimental and biological factors. In six analyses, we demonstrate the superior performance of Harmony to previously published algorithms while requiring fewer computational resources. Harmony enables the integration of ~106 cells on a personal computer. We apply Harmony to peripheral blood mononuclear cells from datasets with large experimental differences, five studies of pancreatic islet cells, mouse embryogenesis datasets and the integration of scRNA-seq with spatial transcriptomics data. Harmony, for the integration of single-cell transcriptomic data, identifies broad and fine-grained populations, scales to large datasets, and can integrate sequencing- and imaging-based data.

...read moreread less

6,282 citations

Journal Article•10.2307/2937941•

The Penn World Table (Mark 5): An Expanded Set of International Comparisons, 1950–1988

[...]

Robert Summers¹, Alan Heston¹•Institutions (1)

University of Pennsylvania¹

01 May 1991-Quarterly Journal of Economics

TL;DR: The Penn World Table as discussed by the authors is a set of national accounts economic time series covering many countries and its expenditure entries are denominated in common set of prices in a common currency so that real quantity comparisons can be made, both between countries and over time.

...read moreread less

Abstract: The Penn World Table displays a set of national accounts economic time series covering many countries. Its expenditure entries are denominated in a common set of prices in a common currency so that real quantity comparisons can be made, both between countries and over time. It also provides information about relative prices within and between countries, as well as demographic data and capital stock estimates. This updated, revised, and expanded Mark 5 version of the table includes more countries, years, and variables of interest to economic researchers. The Table is available on personal computer diskettes and through BITNET.

...read moreread less

3,642 citations

...

Expand

Year	Papers
2025	4
2024	9
2023	16
2022	31
2021	420
2020	959

Topic Tools

Papers published on a yearly basis

Papers

A short history of SHELX

A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood.

PACKMOL: a package for building initial configurations for molecular dynamics simulations.

Fast, sensitive and accurate integration of single-cell data with Harmony.

The Penn World Table (Mark 5): An Expanded Set of International Comparisons, 1950–1988

Related Topics (5)

Performance Metrics