Supervised Grammar Induction using Training Data with Limited Constituent Information

doi:10.3115/1034678.1034699

Open Access•Proceedings Article•10.3115/1034678.1034699

Rebecca Hwa

- 20 Jun 1999

- pp 73-79

79

TL;DR: This paper showed that the most informative linguistic constituents are the higher nodes in the parse trees, typically denoting complex noun phrases and sentential clauses, which account for only 20% of all constituents.

Citations

•Journal Article•10.1109/TPAMI.2009.57

Domain Adaptation Problems: A DASVM Classification Technique and a Circular Validation Strategy

Lorenzo Bruzzone, +1 more

- 01 May 2010

- IEEE Transactions on Pattern Analysis an...

TL;DR: Experimental results confirmed the effectiveness and the reliability of both the DASVM technique and the proposed circular validation strategy for validating the learning of domain adaptation classifiers when no true labels for the target-domain instances are available.

705

•Proceedings Article•10.3115/1220175.1220218

Reranking and Self-Training for Parser Adaptation

David McClosky, +2 more

- 17 Jul 2006

TL;DR: The reranking parser described in Charniak and Johnson (2005) improves performance of the parser on Brown to 85.2%, and use of the self-training techniques described in McClosky et al. (2006) raises this to 87.8% (an error reduction of 28%), again without any use of labeled Brown data.

302

•Dissertation

From Distributional to Semantic Similarity

James Curran

- 01 Jan 2004

TL;DR: This dissertation describes how to extract contexts from a corpus of over 2 billion words and introduces a new context-weighted approximation algorithm with bounded complexity in context vector size that significantly reduces the system runtime with only a minor performance penalty.

301

•Journal Article•10.1177/0278364910369190

Object Recognition in 3D Point Clouds Using Web Data and Domain Adaptation

Kevin Lai, +1 more

- 01 Jul 2010

- The International Journal of Robotics Re...

TL;DR: This paper shows how to significantly reduce the need for manually labeled training data by leveraging data sets available on the World Wide Web, using objects from Google’s 3D Warehouse to train an object detection system for 3D point clouds collected by robots navigating through both urban and indoor environments.

187

•Proceedings Article•10.18653/V1/N19-1114

Unsupervised Recurrent Neural Network Grammars

Yoon Kim, +5 more

- 07 Apr 2019

TL;DR: An inference network parameterized as a neural CRF constituency parser is developed to maximize the evidence lower bound and apply amortized variational inference to unsupervised learning of RNNGs.

155

References

•Report•10.21236/ADA273556

Building a large annotated corpus of English: the Penn Treebank

Mitchell Marcus, +2 more

- 01 Jun 1993

- Computational Linguistics

TL;DR: As a result of this grant, the researchers have now published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.

9.2K

•Journal Article•10.1016/S0019-9958(67)91165-5

Language identification in the limit

E. Mark Gold

- 01 May 1967

- Information & Computation

TL;DR: It was found that the class of context-sensitive languages is learnable from an informant, but that not even the class of regular languages is learnable from a text.

3.8K

•Proceedings Article•10.3115/116580.116613

The ATIS spoken language systems pilot corpus

Charles T. Hemphill, +2 more

- 24 Jun 1990

TL;DR: This pilot marks the first full-scale attempt to collect a corpus to measure progress in Spoken Language Systems that include both a speech and natural language component and provides guidelines for future efforts.

1K

An Empirical Evaluation

A. Jefferson Offutt, +1 more

- 01 Jan 1994

TL;DR: This study evaluates the impact of alternative design concepts on the performance of 30 airline pilots interacting with a cooperative system designed to support enroute flight planning and develops recommendations for guiding the design of cooperative systems.

455

•Proceedings Article

Tree-bank Grammars

Eugene Charniak

- 04 Aug 1996

TL;DR: This paper presents results on a tree-bank grammar based on the Penn Wall Street Journal tree bank that outperforms other non-word-based statistical parsers/grammars on this corpus and outperforms parsers that consider the input as a string of tags and ignore the actual words of the corpus.

345