Anna Korhonen
Technion – Israel Institute of Technology
26 Papers
204 Citations
Anna Korhonen is an academic researcher at Technion – Israel Institute of Technology. The author has contributed to research on the topics Computer science and Word (computer architecture). The author has an h-index of 12 and has co-authored 26 publications. Previous affiliations of Anna Korhonen include the University of Cambridge and the University of Mannheim.
Papers
•Posted Content
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning
TL;DR: This work introduces Cross-lingual Choice of Plausible Alternatives (XCOPA), a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages, revealing that current methods based on multilingual pretraining and zero-shot fine-tuning transfer suffer from the curse of multilinguality and fall short of performance in monolingual settings by a large margin.
180
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning
Edoardo Maria Ponti,Goran Glavaš,Olga Majewska,Qianchu Liu,Ivan Vulić,Anna Korhonen +5 more
- 01 Nov 2020
TL;DR: The authors introduce a cross-lingual choice of plausible alternatives (XCOPA) dataset for causal commonsense reasoning in 11 languages, which includes resource-poor languages like Eastern Apurimac Quechua and Haitian Creole.
136
Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity
Anne Lauscher,Ivan Vulić,Edoardo Maria Ponti,Anna Korhonen,Goran Glavaš +4 more
- 01 Dec 2020
TL;DR: The experiments suggest that the lexically informed BERT (LIBERT), specialized for word-level semantic similarity, yields better performance than the lexically blind “vanilla” BERT on several language understanding tasks, and shows consistent gains on 3 benchmarks for lexical simplification.
74
•Posted Content
Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?
TL;DR: It is shown that fully unsupervised CLWE methods still fail for a large number of language pairs and never surpass the performance of weakly supervised methods using the same self-learning procedure in any BLI setup, and the gaps are often substantial.
55
•Posted Content
Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules
TL;DR: This work proposes a novel morph-fitting procedure which moves past the use of curated semantic lexicons for improving distributional vector spaces, and injects morphological constraints generated using simple language-specific rules to improve low-frequency word estimates and boost the semantic quality of the entire word vector collection.