Open Access • Proceedings Article
Linear Text Segmentation Using Affinity Propagation
Anna Kazantseva, Stan Szpakowicz
- 27 Jul 2011
- pp 284-293
TL;DR: The results suggest that APS performs on par with or outperforms two very competitive state-of-the-art segmenters on topical text segmentation.
Abstract: This paper presents a new algorithm for linear text segmentation. It is an adaptation of Affinity Propagation, a state-of-the-art clustering algorithm in the framework of factor graphs. Affinity Propagation for Segmentation, or APS, receives a set of pairwise similarities between data points and produces segment boundaries and segment centres -- data points which best describe all other data points within the segment. APS iteratively passes messages in a cyclic factor graph until convergence. Each iteration works with information on all available similarities, which leads to high-quality results. APS scales linearly for realistic segmentation tasks. We derive the algorithm from the original Affinity Propagation formulation, and evaluate its performance on topical text segmentation in comparison with two state-of-the-art segmenters. The results suggest that APS performs on par with or outperforms these two very competitive baselines.
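To make the input and output described in the abstract concrete, here is a minimal sketch. It is not the APS message-passing algorithm. It is an exhaustive dynamic-programming search over contiguous segments, assuming an Affinity-Propagation-style score in which each segment contributes a per-segment `preference` plus the similarities of its members to the segment centre. The function name `segment_with_centres`, the `preference` value, and the toy similarity matrix are all illustrative assumptions and do not come from the paper.

```python
from typing import List, Tuple


def segment_with_centres(
    sim: List[List[float]], preference: float
) -> List[Tuple[int, int, int]]:
    """Return (start, end_exclusive, centre) triples for the best-scoring
    split of points 0..n-1 into contiguous segments.

    Assumed score (hypothetical, in the style of Affinity Propagation):
    a segment [a, b) with centre c scores
        preference + sum(sim[i][c] for i in [a, b) if i != c),
    and a segmentation scores the sum over its segments.
    This brute-force search takes O(n^4) time and is only meant for toy inputs.
    """
    n = len(sim)
    # best[b]: best total score for the first b points.
    # back[b]: (start, centre) of the last segment in that best solution.
    best = [float("-inf")] * (n + 1)
    best[0] = 0.0
    back = [(0, 0)] * (n + 1)
    for b in range(1, n + 1):
        for a in range(b):
            for c in range(a, b):
                score = preference + sum(
                    sim[i][c] for i in range(a, b) if i != c
                )
                if best[a] + score > best[b]:
                    best[b] = best[a] + score
                    back[b] = (a, c)
    # Walk the back-pointers from the end to recover the segments.
    segments = []
    b = n
    while b > 0:
        a, c = back[b]
        segments.append((a, b, c))
        b = a
    segments.reverse()
    return segments


# Toy data (an assumption): six "sentences" placed on a line, with
# similarity defined as the negative distance between their positions.
points = [0, 1, 2, 10, 11, 12]
sim = [[-abs(p - q) for q in points] for p in points]

segments = segment_with_centres(sim, preference=-5.0)
print(segments)  # [(0, 3, 1), (3, 6, 4)]
```

In this toy case the search places one boundary between points 2 and 3 and picks the middle point of each segment as its centre. A more negative `preference` penalises each additional segment and so favours fewer, longer segments.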
Citations
•Proceedings Article
Topic Segmentation with a Structured Topic Model
Lan Du, Wray Buntine, Mark Johnson
- 01 Jun 2013
TL;DR: Experimental results show that the model outperforms previous unsupervised segmentation methods using only lexical information on Choi’s datasets and two meeting transcripts and has performance comparable to those previous methods on two written datasets.
107
Attention-Based Neural Text Segmentation
Pinkesh Badjatiya, Litton J. Kurisinkel, Manish Gupta, Vasudeva Varma
- 26 Mar 2018
TL;DR: This article proposes an attention-based bidirectional LSTM model in which sentence embeddings are learned using CNNs and segments are predicted from contextual information; the model can automatically handle variable-sized context.
62
Fast affinity propagation clustering based on incomplete similarity matrix
TL;DR: The results demonstrate that FastAP achieves clustering performance comparable to the original AP algorithm while improving computational efficiency, with a several-fold speed-up on small data sets and a speed-up of dozens of times on larger-scale data sets.
39
Native language identification: explorations and applications
Shervin Malmasi
- 01 Jan 2016
TL;DR: This work defines a new task of finding contexts for errors that vary with the native language of the writer and proposes four graph-theoretic models for doing so, forming the basis of a useful research direction for developing methods that help SLA experts develop hypotheses using large data.
35