Edit-Constrained Decoding for Sentence Simplification

doi:10.48550/arxiv.2409.19247

Journal Article

Tatsuya Zetsu, +2 more

- 28 Sep 2024

TL;DR: This study proposes edit-constrained decoding for sentence simplification. The method introduces stricter decoding constraints that replicate the edit operations needed to simplify a sentence, and it outperforms previous methods on three English simplification corpora.

Figures

Figure 1: Our method constrains generation based on edit operations during beam search. Here, ‘artisans’ is replaced by ‘craftsmen’ by a substitution constraint.

Table 2: Evaluation results with oracle constraints; the scores were measured on the single references from which the constraints were extracted (‘BS’ and ‘Len’ represent BERTScore and average output length, respectively).

Table 5: Example outputs of simplification models (constraints in bold are satisfied by the proposed method).

Table 3: Percentage of satisfied constraints; ‘Comp.’ represents (Zetsu et al., 2022).
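
The substitution shown in Figure 1 can be illustrated with a toy beam search. This is a minimal sketch under stated assumptions, not the authors' algorithm: the bigram table `BIGRAMS` and the function `constrained_beam_search` are hypothetical, and a substitution is approximated here as a ban on the source word plus a requirement on its replacement. The stricter edit constraints proposed in the paper are not reproduced.

```python
import math

# Hypothetical next-token probabilities standing in for a real seq2seq model.
BIGRAMS = {
    "<s>": {"the": 0.9, "local": 0.1},
    "the": {"artisans": 0.6, "craftsmen": 0.3, "vases": 0.1},
    "local": {"artisans": 0.7, "craftsmen": 0.3},
    "artisans": {"made": 0.9, "</s>": 0.1},
    "craftsmen": {"made": 0.9, "</s>": 0.1},
    "made": {"vases": 0.8, "the": 0.2},
    "vases": {"</s>": 1.0},
}


def constrained_beam_search(bigrams, required=(), banned=(), beam_size=2, max_len=6):
    """Toy beam search with required (positive) and banned (negative) words.

    Assumption for illustration: substituting 'artisans' with 'craftsmen' is
    modelled as banned=('artisans',) and required=('craftsmen',).
    """
    beams = [(0.0, ["<s>"])]  # (log-probability, tokens)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, tokens in beams:
            for tok, prob in bigrams.get(tokens[-1], {}).items():
                if tok in banned:
                    continue  # never generate a banned word
                hyp = (score + math.log(prob), tokens + [tok])
                if tok == "</s>":
                    # Accept a finished hypothesis only if every required word appears.
                    if all(word in hyp[1] for word in required):
                        finished.append(hyp)
                else:
                    candidates.append(hyp)
        # Prefer hypotheses that already satisfy more required words, then higher score.
        candidates.sort(
            key=lambda h: (sum(word in h[1] for word in required), h[0]),
            reverse=True,
        )
        beams = candidates[:beam_size]
        if not beams:
            break
    if not finished:
        return None
    best = max(finished, key=lambda h: h[0])
    return best[1][1:-1]  # strip <s> and </s>


unconstrained = constrained_beam_search(BIGRAMS)
substituted = constrained_beam_search(
    BIGRAMS, required=("craftsmen",), banned=("artisans",)
)
print(" ".join(unconstrained))
print(" ".join(substituted))
```

With this toy table, the unconstrained search keeps 'artisans', while the constrained search produces the same sentence with 'craftsmen' in its place. Ranking candidates by the number of satisfied required words before their score is one simple way to keep constraint-satisfying hypotheses in the beam; real constrained decoders use more careful beam allocation.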
