On Estimation and Selection for Topic Models

Open Access • Proceedings Article

- 21 Mar 2012

- pp 1184-1193

136

TL;DR: In this paper, the authors describe posterior maximization for topic models, identifying computational and conceptual gains from inference under a non-standard parametrization, and show that fitted parameters can be used as the basis for a novel approach to marginal likelihood estimation, via block-diagonal approximation to the information matrix, that facilitates choosing the number of latent topics.

Citations

•Journal Article•10.18637/JSS.V091.I02

stm: An R Package for Structural Topic Models

Margaret E. Roberts, +3 more

- 31 Oct 2019

- Journal of Statistical Software

TL;DR: This paper demonstrates how to use the R package stm for structural topic modeling, which allows researchers to flexibly estimate a topic model that includes document-level metadata.

1.3K

Proceedings Article•10.3115/V1/W14-3110

LDAvis: A method for visualizing and interpreting topics

Carson Sievert, +1 more

- 01 Jan 2014

TL;DR: This paper presents LDAvis, a web-based interactive visualization of topics estimated using Latent Dirichlet Allocation, built using a combination of R and D3, and proposes a novel method for choosing which terms to present to a user to aid in the task of topic interpretation.

1.3K

Journal Article•10.1080/01621459.2016.1141684

A model of text for experimentation in the social sciences

Margaret E. Roberts, +2 more

- 18 Oct 2016

- Journal of the American Statistical Asso...

TL;DR: A hierarchical mixed membership model for analyzing the topical content of documents, in which mixing weights are parameterized by observed covariates, is posited, enabling researchers to introduce elements of the experimental design that informed document collection into the model, within a generally applicable framework.

652

•Journal Article•10.1038/S41592-019-0367-1

cisTopic: cis-regulatory topic modeling on single-cell ATAC-seq data

Carmen Bravo González-Blas, +8 more

- 08 Apr 2019

- Nature Methods

TL;DR: As an unsupervised Bayesian framework, cisTopic classifies regions in scATAC-seq data into regulatory topics, which are used for clustering, and provides insight into the mechanisms underlying regulatory heterogeneity in cell populations.

459

•Journal Article•10.1086/705331

CEO Behavior and Firm Performance

Oriana Bandiera, +3 more

- 19 Feb 2020

- Journal of Political Economy

TL;DR: A new method to measure CEO behavior in large samples, combining a survey that collects high-frequency, high-dimensional diary data with a machine learning algorithm that estimates behavioral types, reveals two types: “leaders,” who do multifunction, high-level meetings, and “managers,” who do individual meetings with core functions.

246

References

•Proceedings Article

On smoothing and inference for topic models

Arthur U. Asuncion, +3 more

- 18 Jun 2009

TL;DR: In this article, the authors compare the performance of topic models with collapsed Gibbs sampling, variational inference, and maximum a posteriori estimation, and find that the main differences are attributable to the amount of smoothing applied to the counts.

597

Journal Article•10.1080/01621459.1997.10474044

Practical Bayesian Density Estimation Using Mixtures of Normals

Kathryn Roeder, +1 more

- 01 Sep 1997

- Journal of the American Statistical Asso...

TL;DR: In this paper, the posterior for the number of components in a mixture of normals is not well defined, and posterior simulation does not provide a direct estimate of the posterior of the components in the mixture.

590

•Posted Content

On Smoothing and Inference for Topic Models

Arthur U. Asuncion, +3 more

- 09 May 2012

- arXiv: Learning

TL;DR: Using the insights gained from this comparative study, it is shown how accurate topic models can be learned in several seconds on text corpora with thousands of documents.

507

•Proceedings Article•10.1145/1557019.1557121

Efficient methods for topic model inference on streaming document collections

Limin Yao, +2 more

- 28 Jun 2009

TL;DR: Empirical results indicate that SparseLDA can be approximately 20 times faster than traditional LDA and provide twice the speedup of previously published fast sampling methods, while also using substantially less memory.

491

•Journal Article•10.1214/SS/1009211804

Integrated likelihood methods for eliminating nuisance parameters

James O. Berger, +2 more

- 01 Feb 1999

- Statistical Science

TL;DR: In this paper, the authors review common integrated likelihoods and discuss their strengths and weaknesses relative to other methods, especially those arising from default or non-informative priors.

373

Related Papers (5)

Latent dirichlet allocation

Probabilistic topic models

Optimizing Semantic Coherence in Topic Models

Finding scientific topics

Structural topic models for open ended survey responses