Non-compositional Expression Generation Based on Curriculum Learning and Continual Learning

doi:10.18653/v1/2023.findings-emnlp.286

Journal Article•10.18653/v1/2023.findings-emnlp.286

Jianing Zhou, +3 more

pp 4320-4335

1

TL;DR: This work proposes a dynamic curriculum learning framework that presents training examples from easy to hard, so that learning is optimized step by step. Because this framework suffers from the forgetting problem, the work also applies a continual learning method within it.
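
The easy-to-hard ordering described in the TL;DR can be illustrated with a minimal sketch. It assumes sentence length as the difficulty score (one of the baseline scores named in the table captions on this page) and a simple linear pacing schedule. The function names `sentence_length_difficulty` and `curriculum_stages` and the pacing rule are hypothetical and are not taken from the paper. The sketch does not cover the paper's dynamic re-scoring or its continual learning component.

```python
from typing import List


def sentence_length_difficulty(sentence: str) -> int:
    """Difficulty proxy (assumption): number of whitespace-separated tokens."""
    return len(sentence.split())


def curriculum_stages(examples: List[str], num_stages: int) -> List[List[str]]:
    """Return the training pool for each stage, growing from easy to hard."""
    # Rank once by difficulty; a dynamic curriculum would re-rank during training.
    ranked = sorted(examples, key=sentence_length_difficulty)
    stages = []
    for stage in range(1, num_stages + 1):
        # Linear pacing (assumption): each stage exposes an equal extra share.
        cutoff = max(1, len(ranked) * stage // num_stages)
        stages.append(ranked[:cutoff])
    return stages


examples = [
    "He finally spilled the beans about the surprise party last night",
    "Break a leg",
    "She was over the moon",
]
stages = curriculum_stages(examples, num_stages=3)
for index, pool in enumerate(stages, start=1):
    print(f"stage {index}: {len(pool)} example(s)")
```

Each stage's pool here is a superset of the previous one, so earlier examples stay available in later stages. That is only one possible design, and it is not the continual learning method the paper applies.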

Figures

Table 2: Performance of different methods on the MAGPIE dataset. Competence refers to using the competence score for scheduling. SL refers to using sentence length as the difficulty score. WR refers to using word rarity as the difficulty score. The best performance is shown in bold. Models trained for 5 epochs have converged. The p-value is the result of a significance test between our method and the second-best method (Fixed SGCL).

Table 8: Results based on full data on MAGPIE dataset.

Table 9: A sample of the generated sentences on MAGPIE, highlighting the correct idioms and the wrong idioms. After the examples are ranked by difficulty level, Easy is an example randomly selected from the start of the ranking, Medium is an example randomly selected from the middle, and Hard is an example randomly selected from the end.

Table 1: Examples of input and output in our tasks. Non-compositional expressions are highlighted in bold red.

Table 4: Ablation study on MAGPIE dataset. Diff refers to our difficulty metric. Fixed means the training examples are sorted only once before training and fixed during training. Dynamic refers to our dynamic scheduling strategy.

Table 3: Performance of different methods on MERMAID dataset. Best performance is labeled in bold.

Citations

Proceedings Article•10.18653/v1/2024.naacl-long.272

No Context Needed: Contextual Quandary In Idiomatic Reasoning With Pre-Trained Language Models

K. Cheng, +1 more

TL;DR: Pre-trained language models (PTLMs) surprisingly perform worse on idiomatic reasoning tasks when provided with context, with removal of context leading to performance gains of up to 3.89%, highlighting the need for IE-aware models.

References

•Proceedings Article•10.3115/1073083.1073135

Bleu: a Method for Automatic Evaluation of Machine Translation

Kishore Papineni, +3 more

- 06 Jul 2002

TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.

28.9K

•Proceedings Article

ROUGE: A Package for Automatic Evaluation of Summaries

Chin-Yew Lin

- 25 Jul 2004

TL;DR: Four different ROUGE measures are introduced: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S, all included in the ROUGE summarization evaluation package, together with their evaluations.

14.8K

Proceedings Article•10.1145/1553374.1553380

Curriculum learning

Yoshua Bengio, +3 more

- 14 Jun 2009

TL;DR: It is hypothesized that curriculum learning has both an effect on the speed of convergence of the training process to a minimum and on the quality of the local minima obtained: curriculum learning can be seen as a particular form of continuation method (a general strategy for global optimization of non-convex functions).

6.4K

•Journal Article•10.1073/PNAS.1611835114

Overcoming catastrophic forgetting in neural networks

James Kirkpatrick, +13 more

- 28 Mar 2017

- Proceedings of the National Academy of S...

TL;DR: In this paper, the authors show that it is possible to train networks that can maintain expertise on tasks that they have not experienced for a long time by selectively slowing down learning on the weights important for those tasks.

5.3K

•Proceedings Article•10.1109/CVPR.2017.587

iCaRL: Incremental Classifier and Representation Learning

Sylvestre-Alvise Rebuffi, +3 more

- 01 Jul 2017

TL;DR: In this paper, the authors introduce a new training strategy, iCaRL, that allows learning in such a class-incremental way: only the training data for a small number of classes has to be present at the same time and new classes can be added progressively.

4.4K
