Proceedings Article · DOI: 10.1145/3543826
Turkish Data-to-Text Generation Using Sequence-to-Sequence Neural Networks
Seniz Demir
- 08 Jul 2022
Vol. 22, Iss: 2, pp 1-27
TL;DR: It is argued that the wealth of knowledge residing in the datasets and the insights obtained from this study hold the potential to give rise to the development of new end-to-end generation approaches for Turkish and other morphologically rich languages.
Abstract: End-to-end data-driven approaches lead to the rapid development of language generation and dialogue systems. Despite the need for large amounts of well-organized data, these approaches jointly learn multiple components of the traditional generation pipeline without requiring costly human intervention. End-to-end approaches also enable the use of loosely aligned parallel datasets in system development by relaxing the degree of semantic correspondence required between training data representations and text spans. However, their potential in Turkish language generation has not yet been fully exploited. In this work, we apply sequence-to-sequence (Seq2Seq) neural models to Turkish data-to-text generation, where input data given in the form of a meaning representation is verbalized. We explore encoder-decoder architectures with an attention mechanism in unidirectional, bidirectional, and stacked recurrent neural network (RNN) models. Our models generate one-sentence biographies and dining venue descriptions using a crowdsourced dataset in which all field-value pairs that appear in the meaning representations are fully captured in the reference sentences. To support this work, we also evaluate the performance of our models on a more challenging dataset, where the content of a meaning representation is too large to fit into a single sentence, and hence content selection and surface realization need to be learned jointly. This dataset is built by pairing the introductory sentences of person-related Turkish Wikipedia articles with the infobox tables of the same articles. Our empirical experiments on both datasets demonstrate that Seq2Seq models are capable of generating coherent and fluent biographies and venue descriptions from field-value pairs. We argue that the wealth of knowledge residing in our datasets and the insights obtained from this study hold the potential to give rise to the development of new end-to-end generation approaches for Turkish and other morphologically rich languages.
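To make the input side of such a system concrete, the sketch below shows one common way a meaning representation made of field-value pairs can be flattened into a token sequence for a Seq2Seq encoder. The `<field> ... </field>` delimiter scheme, the function name `linearize_mr`, and the example venue are illustrative assumptions; the abstract does not specify the input format the models actually use.

```python
from typing import Dict, List


def linearize_mr(mr: Dict[str, str]) -> List[str]:
    """Flatten field-value pairs into one token sequence for an encoder.

    The <field> ... </field> delimiters are an assumed convention for
    illustration, not the format used in the paper.
    """
    tokens: List[str] = []
    for field, value in mr.items():
        tokens.append(f"<{field}>")        # opening marker names the field
        tokens.extend(value.split())       # whitespace-tokenized field value
        tokens.append(f"</{field}>")       # closing marker ends the field
    return tokens


# Hypothetical dining-venue meaning representation (invented for illustration).
example_mr = {"name": "Deniz Lokantası", "food": "balık", "area": "Kadıköy"}
source_tokens = linearize_mr(example_mr)
print(" ".join(source_tokens))
```

Explicit field markers let an attention-based decoder associate each generated word with the field it verbalizes. For a morphologically rich language such as Turkish, a subword tokenizer would likely replace the plain whitespace split used here.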