Semantics-based Multiword Expression Extraction

doi:10.3115/1613704.1613708

Proceedings Article•10.3115/1613704.1613708

Tim Van de Cruys, +1 more

- 28 Jun 2007

- pp 25-32

80

TL;DR: A fully unsupervised and automated method for large-scale extraction of multiword expressions (MWEs) from large corpora that formalizes the intuition of non-compositionality of MWEs.

Citations

•Journal Article•10.1162/COLI.08-010-R1-07-048

Unsupervised type and token identification of idiomatic expressions

Afsaneh Fazly, +2 more

- 01 Mar 2009

- Computational Linguistics

TL;DR: This article develops statistical measures that each model a specific property of idiomatic expressions by looking at their actual usage patterns in text, and uses some of these measures in a token identification task to distinguish idiomatic from literal usages of potentially idiomatic expressions in context.

233

Mining for meaning: the extraction of lexico-semantic knowledge from text

T. Van de Cruys

- 01 Jan 2010

145

•Journal Article•10.1162/COLI_A_00177

Identification of Multi-word Expressions by Combining Multiple Linguistic Information Sources

Yulia Tsvetkov, +1 more

- 27 Jul 2011

TL;DR: This work defines various linguistically motivated classification features, introduces novel ways of computing them, manually defines interrelationships among the features, and expresses them in a Bayesian network, resulting in a powerful classifier that can identify multiword expressions of various types and multiple syntactic constructions in text corpora.

58

•Journal Article•10.1017/S1351324912000101

Extraction of multi-word expressions from small parallel corpora

Yulia Tsvetkov, +1 more

- 01 Oct 2011

- Natural Language Engineering

TL;DR: This article proposes a method for extracting multi-word expressions (MWEs) of various types, along with their translations, from small, word-aligned parallel corpora. It focuses on misalignments, which typically indicate expressions in the source language that are translated to the target language in a non-compositional way.

56

•Journal Article•10.1007/S10579-009-9094-Z

DuELME: a Dutch electronic lexicon of multiword expressions

Nicole Grégoire

- 01 Apr 2010

TL;DR: It is shown that introducing parameters to the ECM optimizes the method, and the extraction of candidate expressions from corpora and the selection criteria for the lexical entries are discussed.

53

References

Some methods for classification and analysis of multivariate observations

James B. MacQueen

- 01 Jan 1967

TL;DR: This paper describes the k-means process, a generalization of the ordinary sample mean, for partitioning an N-dimensional population into k sets on the basis of a sample; it is shown to give partitions that are reasonably efficient in the sense of within-class variance.

28.1K

•Proceedings Article•10.3115/981732.981751

Verb semantics and lexical selection

Zhibiao Wu, +1 more

- 27 Jun 1994

Abstract: This paper will focus on the semantic representation of verbs in computer systems and its impact on lexical selection problems in machine translation (MT). Two groups of English and Chinese verbs are examined to show that lexical selection must be based on interpretation of the sentences as well as selection restrictions placed on the verb arguments. A novel representation scheme is suggested, and is compared to representations with selection restrictions used in transfer-based MT. We see our approach as closely aligned with knowledge-based MT approaches (KBMT), and as a separate component that could be incorporated into existing systems. Examples and experimental results will show that, using this scheme, inexact matches can achieve correct lexical selection.

3.7K

•Posted Content

Verb Semantics and Lexical Selection

Zhibiao Wu, +1 more

- 22 Jun 1994

- arXiv: Computation and Language

TL;DR: This paper will focus on the semantic representation of verbs in computer systems and its impact on lexical selection problems in machine translation (MT), and sees the approach as closely aligned with knowledge-based MT approaches (KBMT), and as a separate component that could be incorporated into existing systems.

2.7K

Some methods for classification and analysis of multivariate observations

J. MacQueen

- 01 Jan 1967

2.3K

•Proceedings Article•10.3115/980691.980696

Automatic Retrieval and Clustering of Similar Words

Dekang Lin

- 10 Aug 1998

TL;DR: A word similarity measure based on the distributional pattern of words allows the automatically constructed thesaurus to be significantly closer to WordNet than Roget Thesaurus is.

1.8K