About: Simple matching coefficient is a research topic. Over its lifetime, 47 publications have been published within this topic, receiving 1456 citations.
TL;DR: The genotypes within the distinct PI clusters may possess useful genetic diversity that could be exploited by soybean breeders to increase yield.
Abstract: The genetic base of soybean [Glycine max (L.) Merr.] breeding in North America is very limited. The focus of this research was to assess the diversity of 18 soybean ancestors and 17 selected plant introductions (PIs) maintained in the USDA Soybean Germplasm Collection. Estimates of genetic relationships among the 35 genotypes were calculated from 281 random amplified polymorphic DNA (RAPD) markers using the simple matching coefficient (SMC) and expressed as Euclidean distances. Two forms of hierarchical and nonhierarchical cluster analysis as well as multidimensional scaling (MDS) were employed to reveal associations among the genotypes. The average genetic distance among all genotypes was 0.56. All methods of cluster analysis identified distinct groups of ancestors or PIs. Grouping of the ancestors generally agreed with known pedigree, origin, and maturity data. The four methods of clustering produced similar results, and genotypes were assigned to the same cluster 87% of the time. The MDS plots displayed relationships among the genotypes and may be a useful method of selecting genetically distinct individuals. The genotypes within the distinct PI clusters may possess useful genetic diversity that could be exploited by soybean breeders to increase yield.
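The abstract above does not say exactly how the SMC values were converted to Euclidean distances. As a minimal sketch, assuming presence/absence marker profiles coded as 0/1, the example below relies on the standard identity that the squared Euclidean distance between two binary vectors equals their number of mismatches, that is, n(1 - SMC). The function names and the two toy profiles are hypothetical and are not taken from the paper.

```python
import math


def simple_matching(a, b):
    """Fraction of marker positions where two 0/1 profiles agree."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    if not a:
        raise ValueError("profiles must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def euclidean_from_smc(a, b):
    """Euclidean distance between two 0/1 profiles, computed via SMC.

    For binary data the squared distance equals the mismatch count,
    which is n * (1 - SMC).
    """
    n = len(a)
    return math.sqrt(n * (1.0 - simple_matching(a, b)))


# Hypothetical RAPD band profiles (1 = band present, 0 = band absent).
g1 = [1, 0, 1, 1, 0, 0, 1, 0]
g2 = [1, 1, 1, 0, 0, 0, 1, 0]

smc = simple_matching(g1, g2)      # 6 of 8 positions agree
dist = euclidean_from_smc(g1, g2)  # sqrt of the 2 mismatches
print(smc, dist)
```

Dividing the squared distance by n would give a dissimilarity bounded between 0 and 1; whether the study normalised in that way is not stated in the abstract.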
TL;DR: Results indicate the high potential of microsatellites to detect genetic diversity in coconut germplasm; dwarf samples grouped separately from talls and showed less genetic diversity.
Abstract: Microsatellites or simple sequence repeats (SSRs) were isolated from coconut (Cocos nucifera) and tested for polymorphism on restricted germplasm. Sequencing of 197 clones from a cv. Tagnanan Tall-enriched genomic library showed that 75% contained a microsatellite, of which 64% were dinucleotide (GA/CT, CA/GT and GC/CG), 6% were trinucleotide, and 30% were compound repeats. Of 41 primer pairs tested on Tagnanan Tall genomic DNA, 38 gave the expected size product, two amplified two loci, and another gave a multilocus pattern. On 20 coconut samples, the 38 SSRs detected 198 alleles (average: 5.2 alleles per microsatellite). Genetic diversity (D = 1 - Σ p_i²) values ranged from 0.141 to 0.809. Heterozygotes were present at high frequencies among some dwarf samples. Analysis of similarity matrices based either on shared alleles at each locus (simple matching coefficient) or on allele bands across all loci (Jaccard coefficient) showed similar results. Dwarfs grouped separately from talls and showed less gen...
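The genetic diversity statistic quoted in this abstract, D = 1 - Σ p_i², can be illustrated with a short sketch, where p_i is taken to be the frequency of the i-th allele at one locus. The function name and the allele labels below are hypothetical; the abstract does not describe how the authors computed the value in practice.

```python
from collections import Counter


def gene_diversity(alleles):
    """Return D = 1 - sum(p_i ** 2) for the alleles observed at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one allele observation is required")
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical allele calls at a single SSR locus.
observed = ["A", "A", "B", "C"]
d_value = gene_diversity(observed)  # frequencies 0.5, 0.25, 0.25
print(d_value)
```

A locus with a single allele gives D = 0, and D approaches 1 as the number of equally frequent alleles grows.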
TL;DR: Empirically obtained results demonstrate that the Salton’s cosine index (SCI) provides better accuracy (in terms of MAE, RMSE, and precision) for large datasets, whereas the overlap coefficient (OLC) results in more accurate recommendations for small datasets.
Abstract: Jaccard index, originally proposed by Jaccard (Bull Soc Vaudoise Sci Nat 37:241–272, 1901), is a measure for examining the similarity (or dissimilarity) between two sample data objects. It is defined as the proportion of the intersection size to the union size of the two data samples. It provides a very simple and intuitive measure of similarity between data samples. This research examines the measures that are akin to the Jaccard index and may be used for modelling affinity between users (or items) in collaborative recommendations. Particularly, the measures such as simple matching coefficient (SMC), Sorensen–Dice coefficient (SDC), Salton’s cosine index (SCI), and overlap coefficient (OLC) are compared and analysed in both theoretical and empirical perspectives with respect to the Jaccard index. Since these measures apprehend only the structural similarity information (overlapping information) between the data samples, these are very useful in situations where only the associations between users and items are available such as browsing or buying behaviours of the users on an e-commerce portal (i.e. unary rating data, a special case of ratings). Furthermore, a theoretical relation among these measures has been established. We have also derived an equivalent expression for each of these measures so that it can be directly applied for binary data samples in data mining/machine learning jargon. In order to compare and validate the effectiveness of these structural similarity measures, several experiments have been conducted using standardized benchmark datasets (MovieLens, FilmTrust, Epinions, Yahoo! Movies, and Yahoo! Music). Empirically obtained results demonstrate that the Salton’s cosine index (SCI) provides better accuracy (in terms of MAE, RMSE, and precision) for large datasets, whereas the overlap coefficient (OLC) results in more accurate recommendations for small datasets.
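The five overlap-based measures named in this abstract can be sketched for unary data, where each user is represented by the set of items they interacted with. The formulas below are the commonly used set-based definitions; the function name, the `universe_size` parameter (total number of items, needed only for SMC), and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration, not the paper's implementation.

```python
import math


def overlap_measures(a, b, universe_size):
    """Compute Jaccard, SMC, Sorensen-Dice, cosine and overlap for two item sets."""
    a, b = set(a), set(b)
    inter = len(a & b)
    union = len(a | b)
    if universe_size < union:
        raise ValueError("universe_size must cover every item in both sets")

    def ratio(num, den):
        # Assumed convention: similarity is 0.0 when the denominator is zero.
        return num / den if den else 0.0

    return {
        "jaccard": ratio(inter, union),
        # SMC also counts items that neither user interacted with.
        "smc": ratio(inter + (universe_size - union), universe_size),
        "dice": ratio(2 * inter, len(a) + len(b)),
        "cosine": ratio(inter, math.sqrt(len(a) * len(b))),
        "overlap": ratio(inter, min(len(a), len(b))),
    }


# Hypothetical users in a catalogue of 10 items.
user_a = {1, 2, 3, 4}
user_b = {3, 4, 5}
scores = overlap_measures(user_a, user_b, universe_size=10)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

For this toy pair the values rise from Jaccard through Dice and cosine to overlap, while SMC is pulled upward by the five items that neither user touched.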
TL;DR: Random amplified polymorphic DNA markers were used to assess intraspecific variability and relationships in aerial yam (Dioscorea bulbifera L.) and supported previous varietal classification based on morphological characters.
Abstract: Random amplified polymorphic DNA (RAPD) markers were used to assess intraspecific variability and relationships in aerial yam (Dioscorea bulbifera L.). A total of 23 accessions from different geographic locations in Africa, Asia, and Polynesia were analyzed by 10 arbitrarily chosen GC-rich decamer primers. Using cesium chloride purified genomic template DNA, highly reproducible polymorphic fingerprints were generated by all 10 primers, resulting in a total of 375 informative characters. Only eight bands were monomorphic among all investigated accessions. A binary character matrix was generated by scoring for presence/absence of a band at a particular position, transformed into a matrix of pairwise distances using either the Jaccard or a simple matching coefficient, and analyzed by neighbour joining, UPGMA (unweighted pair group method with arithmetic averaging) cluster analysis, or split decomposition. All methods of data evaluation resulted in similar groupings that reflected the geographical origin of t...
TL;DR: Fluorescent AFLP and automated data analysis were employed to assess the genetic conformity within a breeders’ collection of evergreen azaleas and revealed the sensitivity of ordinations obtained by both similarity coefficients to the presence of weak or intensive markers and to the degree of polymorphism of the markers.
Abstract: Fluorescent AFLP and automated data analysis were employed to assess the genetic conformity within a breeders’ collection of evergreen azaleas. The study included 75 genotypes of Belgian pot azaleas (Rhododendron simsii Planch. hybrids), Kurume and Hirado azaleas and wild ancestor species from the Tsutsusi subgenus. Fluorescent detection and addition of an internal size standard to each lane enabled the automated scoring of each fragment arising from a single AFLP primer combination (PC). The use of three PCs generated an initial data set with a total of 648 fragments ranging from 70 bp to 450 bp. Different marker selection thresholds for average fluorescent signal intensity and marker frequency were used to create eight extra restricted data subsets. Pairwise plant genetic similarity was calculated for the nine data sets using the Simple Matching coefficient (symmetrical, including double zeros) and Jaccard coefficient (asymmetrical, excluding double zeros). The averages, the ranges and the correlation to one another (Mantel analysis) were compared for the obtained similarity matrices. This revealed the sensitivity of ordinations obtained by both similarity coefficients to the presence of weak or intensive markers and to the degree of polymorphism of the markers. For 34 cultivars, pedigree information (at maximum to the fifth ancestor generation) was available. Genetic similarity by descent (kinship coefficient) was turned into a genetic distance and correlated to the genetic conformity, as revealed by the different selections of AFLP markers (Mantel analysis). Use of a Simple Matching coefficient with no or moderate selection on signal intensity and excluding rare and abundant markers gave the best correlation with pedigree. Finally, the ordination of the studied genotypes by means of dendrograms and principal co-ordinate analysis was compared with known or accepted relationships based on geographical origin, parentage and morphological characters.
Genotypes could be assigned to three distinct groups: pot azaleas, Kurume azaleas and Hirado azaleas. Wild ancestor species appeared to be more related to the Japanese azaleas. Intermediate cultivars could be typified as crossings with Kurume or Hirado azaleas or with wild species.
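The distinction drawn in this abstract between a symmetrical coefficient that counts double zeros (Simple Matching) and an asymmetrical one that ignores them (Jaccard) can be made concrete with a small sketch. The function names and the two sparse marker profiles are hypothetical; they only illustrate why the two coefficients can diverge when most markers are absent in both samples.

```python
def match_counts(x, y):
    """Count shared presences, shared absences and mismatches in 0/1 profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    both = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    neither = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    mismatch = len(x) - both - neither
    return both, neither, mismatch


def smc(x, y):
    """Simple Matching: double zeros count as agreement."""
    both, neither, mismatch = match_counts(x, y)
    total = both + neither + mismatch
    return (both + neither) / total if total else 0.0


def jaccard(x, y):
    """Jaccard: double zeros are left out of the comparison."""
    both, _, mismatch = match_counts(x, y)
    denom = both + mismatch
    return both / denom if denom else 0.0


# Hypothetical AFLP profiles in which most fragments are absent from both plants.
plant_1 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
plant_2 = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

smc_value = smc(plant_1, plant_2)          # 1 shared band + 7 shared absences
jaccard_value = jaccard(plant_1, plant_2)  # 1 shared band out of 3 bands seen
print(smc_value, jaccard_value)
```

With these toy profiles the seven shared absences lift SMC to 0.8 while Jaccard stays at one third, which suggests how filtering rare markers could shift the two coefficients differently.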