TL;DR: A new framework is presented that combines tree search with Monte-Carlo evaluation without separating a min-max phase from a Monte-Carlo phase. It provides fine-grained control of tree growth at the level of individual simulations and allows efficient selectivity.
Abstract: A Monte-Carlo evaluation consists of estimating a position by averaging the outcome of several random continuations. The method can serve as an evaluation function at the leaves of a min-max tree. This paper presents a new framework that combines tree search with Monte-Carlo evaluation and does not separate a min-max phase from a Monte-Carlo phase. Instead of backing up the min-max value close to the root and the average value at some depth, a more general backup operator is defined that progressively changes from averaging to min-max as the number of simulations grows. This approach provides fine-grained control of the tree growth, at the level of individual simulations, and allows efficient selectivity. The resulting algorithm was implemented in a 9 × 9 Go-playing program, Crazy Stone, that won the 10th KGS computer-Go tournament.
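The idea of a backup operator that drifts from averaging toward min-max can be illustrated with a minimal sketch. The function name `mixed_backup`, the blending constant `k`, and the linear blend itself are assumptions made for illustration; the abstract does not specify the operator actually used in Crazy Stone.

```python
def mixed_backup(child_values, child_visits, k=50.0):
    """Blend the visit-weighted mean and the max of the child values.

    Hypothetical illustration: with few simulations the result is close to
    the average; as the total number of simulations grows it approaches the
    max (the min-max value from the point of view of the player to move).
    """
    total = sum(child_visits)
    if total == 0:
        raise ValueError("at least one simulation is required")
    # Average outcome over all simulations that went through this node.
    mean = sum(v * n for v, n in zip(child_values, child_visits)) / total
    # Value of the best child, as a pure max backup would return.
    best = max(child_values)
    # Weight of the max term: 0 with no data, tends to 1 as total grows.
    lam = total / (total + k)
    return (1.0 - lam) * mean + lam * best


if __name__ == "__main__":
    values = [0.2, 0.8]
    print(mixed_backup(values, [1, 1]))          # close to the mean
    print(mixed_backup(values, [10000, 10000]))  # close to the max
```

The constant `k` controls how many simulations are needed before the max term dominates; any monotone schedule for the weight would serve the same illustrative purpose.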
TL;DR: Both a mathematical method for computing the number of trees with a given value of topological difference from the NJ tree and a computer algorithm for identifying all the topologies are developed.
Abstract: A simple method for estimating and testing phylogenetic trees under the principle of minimum evolution (ME) is presented. The basic procedure of this method is first to obtain the neighbor-joining (NJ) tree by Saitou and Nei’s method and then to search for a tree with the minimum value of the sum (S) of branch lengths by examining all trees that are closely related to the NJ tree. Once the ME tree is identified, a statistical test is conducted for the difference in S between this tree and other closely related trees. The mathematical method required for conducting this test is developed by using the least-squares approach. Computer simulation has shown that this method identifies the correct tree with a high probability, as long as the number of nucleotides examined is sufficiently large. It has also been shown that the topology of the NJ tree is almost always identical with that of the ME tree. A method for obtaining least-squares estimates (and their standard errors) of branch lengths for a given topology is also presented. This method can be used for testing the reliability of the branching pattern of the ME tree. However, the statistical test of S values is more powerful in rejecting incorrect trees than is the branch-length test or bootstrapping. Furthermore, both a mathematical method for computing the number of trees with a given value of topological difference from the NJ tree and a computer algorithm for identifying all the topologies are developed.
TL;DR: The heuristic algorithm has a worst-case time complexity of O(|S||V|^2) on a random access computer and is guaranteed to output a tree that spans S with total distance on its edges no more than 2(1 − 1/l) times that of the optimal tree.
Abstract: Given an undirected distance graph G = (V, E, d) and a set S, where V is the set of vertices in G, E is the set of edges in G, d is a distance function which maps E into the set of nonnegative numbers, and S ⊆ V is a subset of the vertices, the Steiner tree problem is to find a tree of G that spans S with minimal total distance on its edges. In this paper, we analyze a heuristic algorithm for the Steiner tree problem. The heuristic algorithm has a worst-case time complexity of O(|S||V|^2) on a random access computer and is guaranteed to output a tree that spans S with total distance on its edges no more than 2(1 − 1/l) times that of the optimal tree, where l is the number of leaves in the optimal tree.
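The abstract does not spell out the steps of the heuristic, so the following is only a sketch of one well-known shortest-path/minimum-spanning-tree heuristic for the Steiner tree problem, assumed here for illustration. The names `dijkstra`, `mst`, and `steiner_heuristic` are hypothetical, the graph is assumed connected, and vertices are assumed to be mutually comparable (for example, strings).

```python
import heapq
from itertools import combinations


def dijkstra(adj, src):
    """Shortest distances and predecessor map from src; adj[u][v] = d(u, v)."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def mst(nodes, edges):
    """Kruskal's algorithm; edges is an iterable of (weight, u, v)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


def steiner_heuristic(adj, terminals):
    """Return a list of (weight, u, v) edges of a tree spanning the terminals."""
    terminals = list(terminals)
    paths = {t: dijkstra(adj, t) for t in terminals}
    # 1-2. Minimum spanning tree of the complete distance graph on S.
    closure = [(paths[a][0][b], a, b) for a, b in combinations(terminals, 2)]
    t1 = mst(terminals, closure)
    # 3. Replace each edge of that tree by a shortest path in G.
    sub_edges = set()
    for _, a, b in t1:
        prev, v = paths[a][1], b
        while v != a:
            u = prev[v]
            sub_edges.add((adj[u][v], min(u, v), max(u, v)))
            v = u
    sub_nodes = {x for _, u, v in sub_edges for x in (u, v)}
    # 4. Minimum spanning tree of the expanded subgraph.
    tree = mst(sub_nodes, sub_edges)
    # 5. Repeatedly remove leaves that are not terminals.
    term = set(terminals)
    while True:
        degree = {}
        for _, u, v in tree:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        leaves = {n for n, d in degree.items() if d == 1 and n not in term}
        if not leaves:
            return tree
        tree = [e for e in tree if e[1] not in leaves and e[2] not in leaves]


if __name__ == "__main__":
    # Hypothetical example graph: hub "x" joined to terminals "a", "b", "c".
    adj = {
        "a": {"x": 1, "b": 3},
        "b": {"x": 1, "a": 3},
        "c": {"x": 1},
        "x": {"a": 1, "b": 1, "c": 1},
    }
    tree = steiner_heuristic(adj, ["a", "b", "c"])
    print(sorted(tree), sum(w for w, _, _ in tree))
```

The |S| shortest-path computations dominate the running time of this sketch; no claim is made that it matches the exact procedure or constants analyzed in the paper.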
TL;DR: A direct method for calculating expected data from an evolutionary model for two-state characters is described, and it is shown that for n = 4 taxa, parsimony will always converge to the correct tree, but there are examples with n = 5 where parsimony will converge on an incorrect tree, even for equal rates of evolution.
Abstract: A direct method for calculating expected data from an evolutionary model for two-state characters is described. The method uses four vectors p, q, r and s. p and q are the probabilities of a character change on the 2n − 3 edges of a tree T (n is the number of taxa). r and s are properties of the data, are independent of any tree and have 2^(n−1) entries. For a given tree T, and with specified probabilities (p or q), we determine r, then s, the expected probabilities of each of the 2^(n−1) possible partitions of taxa. For any tree T the relationship can be inverted. This allows the probabilities of change on the tree, p and q, to be estimated directly from observed data (r or s). These relationships have been used to analyse the behaviour of tree building algorithms under conditions when there are sufficient data. (This is when the tree does not change as more data are collected, i.e., convergence to a single tree.) With equal rates of evolution (i.e., with a molecular clock), we show that for n = 4 taxa, parsimony will always converge to the correct tree, but we give examples with n = 5 where parsimony will converge on an incorrect tree, even for equal rates of evolution. A further example with n = 6 shows convergence to an incorrect tree with equal but arbitrarily small rates of change. We interpret a basic difficulty with parsimony as 'long edges attract.' If there are additional taxa that intersect long edges on the tree, then this effect can be reduced. Some distance methods may also converge to an incorrect tree. (Evolutionary trees, evolution, parsimony.) Reconstructing evolutionary trees from events that occurred many millions of years ago is a major intellectual challenge. Two recurring problems have been: the choice of optimality criteria and the estimation of
TL;DR: This paper provides an implementation of the tree projection method which is up to one order of magnitude faster than other recent techniques in the literature and has a well-structured data access pattern which provides data locality and reuse of data for multiple levels of the cache.