TL;DR: This work compares OMA standalone to other methods in the context of phylogenetic tree inference, by inferring a phylogeny of Lophotrochozoa, a challenging clade within the protostomes.
Abstract: Genomes and transcriptomes are now typically sequenced by individual laboratories but analyzing them often remains challenging. One essential step in many analyses lies in identifying orthologs (corresponding genes across multiple species), but this is far from trivial. The Orthologous MAtrix (OMA) database is a leading resource for identifying orthologs among publicly available, complete genomes. Here, we describe the OMA pipeline available as a standalone program for Linux and Mac. When run on a cluster, it has native support for the LSF, SGE, PBS Pro, and Slurm job schedulers and can scale up to thousands of parallel processes. Another key feature of OMA standalone is that users can combine their own data with existing public data by exporting genomes and precomputed alignments from the OMA database, which currently contains over 2100 complete genomes. We compare OMA standalone to other methods in the context of phylogenetic tree inference, by inferring a phylogeny of Lophotrochozoa, a challenging clade within the protostomes. We also discuss other potential applications of OMA standalone, including identifying gene families having undergone duplications/losses in specific clades, and identifying potential drug targets in nonmodel organisms. OMA standalone is available under the permissive open source Mozilla Public License Version 2.0.
TL;DR: FIESTA is a fixed-workload methodology that eliminates only sample imbalance by pre-selecting program regions for equal standalone running times rather than for equal instruction counts.
Abstract: Workload construction methodologies for multiprogram experiments are more complicated than those for single-program experiments. Fixed-workload methodologies pre-select samples from each program and use these in every experiment. They enable direct comparisons between experiments, but may also yield runs of which significant portions are spent executing only the slowest program(s). Variable-workload methodologies eliminate this load imbalance by using the multi-program run to define the workload, normalizing performance to the performance of the resulting individual program regions. However, they make direct comparisons difficult and tend to produce workloads that over-estimate throughput and speedup. We propose a multi-program workload methodology called FIESTA which is based on the observation that there are two kinds of load imbalance. Sample imbalance is due to differences in standalone program running times. Schedule imbalance is due to asymmetric contention during multi-program execution. Sample imbalance is harmful because it dilutes multi-program behaviors. Schedule imbalance is a characteristic of concurrent execution that should be preserved and measured. Traditional fixed-workload methodologies admit both kinds of imbalance. Variable-workload methodologies eliminate both kinds of imbalance. FIESTA is a fixed-workload methodology that eliminates only sample imbalance. It does so by pre-selecting program regions for equal standalone running times rather than for equal instruction counts.
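The difference between selecting regions by equal instruction count and by equal standalone running time can be illustrated with a small sketch. This is not the FIESTA implementation: it assumes, purely for illustration, that each program has a single constant standalone cycles-per-instruction (CPI) value, and the program names, CPI values, and function names are all hypothetical.

```python
# Illustrative sketch only: assumes a constant standalone CPI per program,
# which real program regions do not have. All names and numbers are hypothetical.

def equal_instruction_regions(cpi, instructions):
    """Traditional fixed workload: every program gets the same instruction count."""
    return {name: instructions for name in cpi}


def equal_time_regions(cpi, target_cycles):
    """Equal-time selection: size each region so that its standalone
    running time is (approximately) target_cycles."""
    return {name: round(target_cycles / c) for name, c in cpi.items()}


def standalone_cycles(cpi, regions):
    """Standalone running time of each region, in cycles."""
    return {name: regions[name] * cpi[name] for name in cpi}


def sample_imbalance(cycles):
    """Ratio of the longest to the shortest standalone running time.
    A value of 1.0 means the samples are balanced."""
    return max(cycles.values()) / min(cycles.values())


# Hypothetical standalone CPI values for two programs.
cpi = {"fast_prog": 0.5, "slow_prog": 2.0}

by_count = equal_instruction_regions(cpi, 1_000_000)
by_time = equal_time_regions(cpi, 1_000_000)

print(sample_imbalance(standalone_cycles(cpi, by_count)))  # 4.0
print(sample_imbalance(standalone_cycles(cpi, by_time)))   # 1.0
```

With equal instruction counts, the slower program runs four times longer standalone, so much of a multi-program run would execute that program alone. Equal-time selection removes this sample imbalance before the multi-program run starts, leaving any schedule imbalance caused by contention to be measured.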
TL;DR: Eucb is a standalone program, written in GNU C++, for geometrical analysis of molecular dynamics trajectories of protein systems; it can be installed on any operating system that has a C++ compiler.
TL;DR: The objectives in the implementation described here are to remain as close to the current definition of Ada as possible, and to learn through experience what changes are necessary in future versions of the language.
Abstract: The paper describes the design and implementation of a distributed Ada system. Ada is not well defined with respect to distribution, and any implementation for distributed execution must make a number of decisions about the language. The objectives in the implementation described here are to remain as close to the current definition of Ada as possible, and to learn through experience what changes are necessary in future versions of the language. The approach taken to distributing a single program is to assign library units that compose it to nodes of the distributed system. In a formal sense the semantics of a program is independent of the distribution because the semantics is interpreted to include all possible behaviours that result from different distributions. However, the functionality of the distributed program may then depend on the distribution in the sense that program behaviour may be impacted by the time required for communication among the distributed modules, or parts of the program may continue to function in presence of failures. The implementation technique converts each distributed module into a standalone program that communicates with its correspondents; each of these may then be compiled by an existing Ada compiler. Issues discussed include the ramifications of sharing of data types, objects, subprograms, tasks, and task types. The implementation techniques used in the translator are described.
TL;DR: A method for translating clinical information into one or more standardised systems of coding or nomenclature processes received clinical information relating to a patient, including at least one free text description of the patient's clinical status.
Abstract: A method of translating clinical information into one or more standardised systems of coding or nomenclature processes received clinical information (202) relating to a patient, which includes at least one free text description of a clinical status of the patient. The free text description is analysed (208-218) to identify one or more terms relevant to the clinical status of the patient. One or more translation sets are constructed (220), each of which includes one or more sequential identified terms. Each translation set is translated (234-252) into one or more standardised health codes or terms selected from a predetermined system of classification and/or nomenclature, and the selected standardised health codes or terms are output (254). The method may be computer-implemented, either as a standalone program, or in a networked configuration supporting access from remote terminals.
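The three stages named in the abstract (identify terms, construct translation sets, translate to codes) might be sketched as follows. This is a minimal illustration under stated assumptions, not the method as claimed: it assumes that "sequential" means adjacent in the text, uses simple dictionary lookup for translation, and relies on a hypothetical vocabulary and made-up `DEMO-` codes that do not belong to any real classification system.

```python
import re

# Hypothetical vocabulary and code table, for illustration only.
# The DEMO- codes are invented and belong to no real classification.
RELEVANT_TERMS = {"acute", "chest", "pain", "fever"}
CODE_TABLE = {
    ("acute", "chest", "pain"): "DEMO-001",
    ("chest", "pain"): "DEMO-002",
    ("fever",): "DEMO-003",
}


def identify_terms(text):
    """Return (position, term) pairs for tokens found in the vocabulary."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [(i, t) for i, t in enumerate(tokens) if t in RELEVANT_TERMS]


def build_translation_sets(identified):
    """Group identified terms that are adjacent in the text into tuples.
    Assumption: 'sequential' is read here as 'at consecutive positions'."""
    sets, current, last = [], [], None
    for pos, term in identified:
        if last is not None and pos != last + 1:
            sets.append(tuple(current))
            current = []
        current.append(term)
        last = pos
    if current:
        sets.append(tuple(current))
    return sets


def translate(translation_set):
    """Return the code for the longest contiguous sub-sequence of the set
    that appears in the code table, or an empty list if none does."""
    n = len(translation_set)
    for length in range(n, 0, -1):
        for start in range(n - length + 1):
            code = CODE_TABLE.get(translation_set[start:start + length])
            if code is not None:
                return [code]
    return []


def code_free_text(text):
    """Run all three stages and return the list of output codes."""
    codes = []
    for tset in build_translation_sets(identify_terms(text)):
        codes.extend(translate(tset))
    return codes


print(code_free_text("Patient reports acute chest pain and mild fever."))
```

In this sketch "acute chest pain" forms one translation set and "fever" another, because the intervening words are not in the vocabulary. Preferring the longest matching sub-sequence is a design choice made here so that a more specific code wins over a more general one.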