Structured entity identification and document categorization: two tasks with one joint model

doi:10.1145/1401890.1401899

Proceedings Article•10.1145/1401890.1401899

Indrajit Bhattacharya, +2 more

- 24 Aug 2008

- pp 25-33

25

TL;DR: A probabilistic generative model for joint entity identification and document categorization is proposed and it is shown how the parameters of the model can be estimated using an EM algorithm in an unsupervised fashion.

Citations

Journal Article•10.1145/2818378

Collective Graph Identification

Galileo Namata, +2 more

- 29 Jan 2016

- ACM Transactions on Knowledge Discovery ...

TL;DR: This article introduces the problem of graph identification, i.e., discovering the latent graph structure underlying an observed network, and presents a simple yet novel approach that addresses all three subproblems simultaneously: a collection of Coupled Collective Classifiers applied iteratively to propagate inferred information among the subproblems.

43

Proceedings Article•10.1145/2766462.2767748

Retrieval of Relevant Opinion Sentences for New Products

Dae Hoon Park, +3 more

- 09 Aug 2015

TL;DR: This work studies the novel problem of retrieving relevant opinion sentences from the reviews of other products, using the specifications of a new or unpopular product as the query, and proposes applying a popular summarization method and a modified version of it to solve the problem.

25

Proceedings Article•10.1145/1557019.1557152

Towards combining web classification and web information extraction: a case study

Ping Luo, +4 more

- 28 Jun 2009

TL;DR: This paper proposes to combine Web Classification and Web Information Extraction by using a model of Conditional Random Fields (CRFs), which can be used to simultaneously recognize the target Web pages and extract the corresponding metadata.

22

Journal Article•10.1007/S10489-014-0557-6

Web metadata extraction and semantic indexing for learning objects extraction

John Atkinson, +3 more

- 01 Sep 2014

- Applied Intelligence

TL;DR: A multi-strategy approach for semantically guided extraction, indexing and search of educational metadata is described; it combines machine learning, concept analysis, and corpus-based natural language processing techniques.

20

Book Chapter•10.1007/978-3-642-38577-3_14

Web metadata extraction and semantic indexing for learning objects extraction

John Atkinson, +3 more

- 17 Jun 2013

TL;DR: A new approach to automatic metadata extraction and semantic indexing for educational purposes is proposed to identify learning objects that may assist educators to prepare pedagogical materials from the Web.

20

References

Journal Article•10.5555/944919.944937

Latent dirichlet allocation

David M. Blei, +2 more

- 01 Mar 2003

- Journal of Machine Learning Research

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.

36.2K

Proceedings Article

Latent Dirichlet Allocation

David M. Blei, +2 more

- 03 Jan 2001

TL;DR: This paper proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).

25.5K

Book Chapter•10.1007/BFB0026683

Text Categorization with Support Vector Machines: Learning with Many Relevant Features

Thorsten Joachims

- 21 Apr 1998

TL;DR: This paper explores the use of Support Vector Machines for learning text classifiers from examples and analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task.

9.6K

Journal Article•10.1023/A:1007379606734

Multitask Learning

Rich Caruana

- 01 Jul 1997

TL;DR: Multitask Learning (MTL) is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias.

8K

Proceedings Article•10.1145/279943.279962

Combining labeled and unlabeled data with co-training

Avrim Blum, +1 more

- 24 Jul 1998

TL;DR: A PAC-style analysis is provided for a problem setting motivated by the task of learning to classify web pages, in which the description of each example can be partitioned into two distinct views, so that inexpensive unlabeled data can augment a much smaller set of labeled examples.

6.4K