Journal Article, DOI: 10.1093/BIOMET/ASM061
Bayesian Nonparametric Estimation of the Probability of Discovering New Species
TL;DR: In this article, a Bayesian nonparametric approach is used to evaluate the probability of discovering a certain number of new species in a new sample of population units, conditional on the number of species recorded in a basic sample.
Abstract: We consider the problem of evaluating the probability of discovering a certain number of new species in a new sample of population units, conditional on the number of species recorded in a basic sample. We use a Bayesian nonparametric approach. The different species proportions are assumed to be random and the observations from the population exchangeable. We provide a Bayesian estimator, under quadratic loss, for the probability of discovering new species which can be compared with well-known frequentist estimators. The results we obtain are illustrated through a numerical example and an application to a genomic dataset concerning the discovery of new genes by sequencing additional single-read sequences of cDNA fragments.
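As a rough illustration of the kind of comparison the abstract mentions, the sketch below contrasts the classical Good-Turing frequentist estimate of the probability that the next observation is a new species with the predictive probability under a two-parameter Poisson-Dirichlet (Pitman-Yor) prior. The choice of this particular prior, the function names, and the parameter values are assumptions made for illustration; the abstract does not state which prior the paper uses, and this is not the paper's estimator.

```python
from collections import Counter


def good_turing_new_species_prob(labels):
    """Good-Turing estimate: share of observations whose species was seen exactly once."""
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one observation")
    counts = Counter(labels)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n


def pitman_yor_new_species_prob(n, k, theta, sigma):
    """Predictive probability of a new species after n draws showing k species.

    Assumes a Pitman-Yor prior with discount 0 <= sigma < 1 and
    concentration theta > -sigma (sigma = 0 gives the Dirichlet process).
    """
    if not 0 <= sigma < 1 or theta <= -sigma:
        raise ValueError("require 0 <= sigma < 1 and theta > -sigma")
    return (theta + sigma * k) / (theta + n)


def expected_new_species(n, k, m, theta, sigma):
    """Expected number of new species in m further draws under the same prior.

    The predictive rule is linear in the current species count, so the
    expectation can be propagated one draw at a time.
    """
    expected_k = float(k)
    for i in range(m):
        expected_k += pitman_yor_new_species_prob(n + i, expected_k, theta, sigma)
    return expected_k - k


# Hypothetical basic sample: 7 units, 4 species, 2 of them seen once.
sample = ["a", "a", "b", "c", "c", "c", "d"]
n, k = len(sample), len(set(sample))
print(good_turing_new_species_prob(sample))
print(pitman_yor_new_species_prob(n, k, theta=1.0, sigma=0.5))
print(expected_new_species(n, k, m=10, theta=1.0, sigma=0.5))
```

The Good-Turing estimate depends only on the number of singleton species, whereas the prior-based prediction depends on the sample through n and k and on the assumed values of theta and sigma, which in practice would have to be elicited or estimated.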
Citations
Coverage‐based rarefaction and extrapolation: standardizing samples by completeness rather than size
Anne Chao, Lou Jost
TL;DR: An integrated sampling, rarefaction, and extrapolation methodology to compare species richness of a set of communities based on samples of equal completeness (as measured by sample coverage) instead of equal size is proposed.
1.8K
Models beyond the Dirichlet process
Antonio Lijoi, Igor Prünster
TL;DR: In this paper, the authors provide a review of Bayesian nonparametric models that go beyond the Dirichlet process, and show that in some cases of interest for statistical applications, the DPM is not an adequate prior choice.
Mixture Models With a Prior on the Number of Components
TL;DR: Reversible jump Markov chain Monte Carlo is the most commonly used method of inference for mixtures of finite mixtures (MFMs), but designing good reversible jump moves can be nontrivial, especially in high-dimensional spaces.
Estimating the Number of Unseen Species: How Many Words did Shakespeare Know?
Peter McCullagh
- 01 Jan 2008
TL;DR: Efron and Thisted studied the frequency distribution of words in the Shakespearean canon and found that the expected number of words that occur x ≥ 1 times in a large sample of n words is
238
Posted Content
Mixture models with a prior on the number of components
TL;DR: It turns out that many of the essential properties of DPMs are also exhibited by MFMs, and the MFM analogues are simple enough that they can be used much like the corresponding DPM properties; this simplifies the implementation of MFMs and can substantially improve mixing.
165
References
Estimating the Number of Classes via Sample Coverage
Anne Chao, Shen-Ming Lee
TL;DR: This work generalizes the result of Esty to a nonparametric approach and extends Darroch and Ratcliff to incorporate the heterogeneity of the class probabilities, which plays an important role in the recommended estimation procedures.
1.3K
Estimating the Number of Species: A Review
John Bunge, M. Fitzpatrick
TL;DR: In this paper, the problem of estimating the number of kinds in a population of animals and plants is discussed. The focus is not on estimating the relative sizes of the classes but on estimating C, the number of classes, itself.
772
Exchangeable and partially exchangeable random partitions
TL;DR: In this paper, a generalization of Ewens' partition structure, called partially exchangeable random partitions (PEBP), is presented, where a random partition of the positive integers is exchangeable iff it is partially exchangeable for a symmetric function p(n_1, ..., n_k).
685
Estimating the Number of Species in a Stochastic Abundance Model
Anne Chao, John Bunge
TL;DR: Simulation studies show that this estimator compares well with maximum likelihood estimators (i.e., empirical Bayes estimators from the Bayesian viewpoint) for which an iterative numerical procedure is needed and may be infeasible.
549
Some developments of the Blackwell-MacQueen urn scheme
Jim Pitman
- 01 Jan 1996
TL;DR: The Blackwell-MacQueen description of sampling from a Dirichlet random distribution on an abstract space is reviewed and extended to a general family of random discrete distributions in this paper, and results are obtained by application of Kingman's theory of partition structures.