Journal Article, DOI: 10.48550/arxiv.2404.13397
Retrieval-Augmented Generation-based Relation Extraction
Sefika Efeoglu, Adrian Paschke
TL;DR: Retrieval-Augmented Generation-based Relation Extraction (RAG4RE) significantly enhances relation extraction performance by leveraging retrieved information and augmented generation techniques, surpassing the performance of traditional RE approaches based solely on LLMs.
Abstract: Information Extraction (IE) converts unstructured text into a structured format by means of entity and relation extraction (RE). Identifying the relation between a pair of entities plays a crucial role within this framework. Although various techniques for relation extraction exist, their efficacy relies heavily on access to labeled data and substantial computational resources. Large Language Models (LLMs) are a promising way to address these challenges; however, they may return hallucinated responses due to their own training data. To overcome these limitations, we propose Retrieval-Augmented Generation-based Relation Extraction (RAG4RE) as a way to improve the performance of relation extraction tasks. We evaluate the effectiveness of RAG4RE with different LLMs, namely Flan T5, Llama2, and Mistral, on the established benchmarks TACRED, TACREV, Re-TACRED, and SemEval RE. The results show that RAG4RE surpasses the performance of traditional RE approaches based solely on LLMs, most evidently on the TACRED dataset and its variations. Furthermore, our approach performs remarkably well compared with previous RE methodologies on both the TACRED and TACREV datasets, underscoring its efficacy and potential for advancing RE tasks in natural language processing.
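The abstract does not spell out how retrieval and prompting are implemented, so the following is only a minimal sketch of the general idea: retrieve the labeled sentence most similar to the query and add it to the prompt before asking an LLM for the relation. Everything here is an assumption made for illustration, not the authors' implementation: the bag-of-words `embed` function stands in for a real sentence encoder, the function names, the example sentences, and the prompt wording are hypothetical, and the LLM call itself is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in (assumption) for a sentence encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, examples):
    """Return the labeled example whose sentence is most similar to the query."""
    q = embed(query)
    return max(examples, key=lambda ex: cosine(q, embed(ex["sentence"])))


def build_prompt(query, head, tail, example, relation_types):
    """Augment the query with the retrieved example (hypothetical prompt layout)."""
    return (
        f"Relation types: {', '.join(relation_types)}\n"
        f"Example sentence: {example['sentence']}\n"
        f"Example relation: {example['relation']}\n"
        f"Sentence: {query}\n"
        f"What is the relation between '{head}' and '{tail}'? "
        "Answer with one relation type."
    )


# Hypothetical labeled sentences standing in for a benchmark training split.
EXAMPLES = [
    {"sentence": "Marie Curie was born in Warsaw.",
     "relation": "per:city_of_birth"},
    {"sentence": "Tim Cook is the chief executive of Apple.",
     "relation": "org:top_members/employees"},
]
RELATION_TYPES = ["per:city_of_birth", "org:top_members/employees", "no_relation"]

query = "Albert Einstein was born in Ulm."
best = retrieve(query, EXAMPLES)
prompt = build_prompt(query, "Albert Einstein", "Ulm", best, RELATION_TYPES)
print(prompt)  # this string would then be sent to an LLM of choice
```

In this sketch the retrieved example shares the "was born in" pattern with the query, so the prompt carries a relevant labeled demonstration. Swapping `embed` for a dense sentence encoder would change only the similarity computation, not the overall flow.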
Figures

Table 1 Overview of benchmark datasets. 
Table 5 Comparing RAG4RE to Relation Extraction approaches using Language Models (LLMs) 
Table 4 Comparing our best-performing results with State-of-the-Art (SoTA) Systems’ results. 
Fig. 2. RAG-based Relation Extraction pipeline. 
Fig. 6. The number of False Negatives and Positives in results of experiments conducted on different benchmark datasets. 
Fig. 4. A regenerated prompt is illustrated.
Citations
Large Language Models for Generative Information Extraction: A Survey
Derong Xu, Wei Chen, Wenjun Peng, Chao Zhang, Tong Xu, Xiangyu Zhao, Xian Wu, Yefeng Zheng, Enhong Chen
TL;DR: This study surveys recent advances in applying generative Large Language Models to IE tasks and empirically analyzes the most advanced methods to identify emerging trends in LLM-based IE.
Kastor: Fine-Tuned Small Language Models for Shape-Based Active Relation Extraction
Célian Ringwald, Fabien L. Gandon, Catherine Faron-Zucker, Franck Michel, Hanna Abi Akl
TL;DR: Kastor is a framework that fine-tunes small language models for shape-based active relation extraction, enhancing model generalization and performance by evaluating all possible property combinations and refining noisy knowledge bases through iterative learning.
Relation Extraction with Fine-Tuned Large Language Models in Retrieval Augmented Generation Frameworks
Şefika Efeoğlu, Adrian Paschke
- 20 Jun 2024
TL;DR: Fine-tuned large language models enhance relation extraction performance by addressing domain adaptation challenges and identifying implicit relations in sentences, particularly when integrated into a Retrieval-Augmented Generation (RAG) framework.
References
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers, Iryna Gurevych
- 14 Aug 2019
TL;DR: Sentence-BERT (SBERT) is presented, a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity.
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron, Louis Martin, Kevin R. Stone, Amjad Almahairi, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Daniel M. Bikel, Lukas Blecher, Cristian Canton-Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony S. Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel M. Kloumann, A. V. Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew M. Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Eric A. Smith, R. Subramanian, Xia Tan, Binh Tang, Ross Taylor, Adina Williams, Zhengxu Yan, Iliyan Radev Zarov, Yuchen Zhang, Angela Fan, Melanie Rae Kambadur, Sharan Narang, Aurélien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom
- 18 Jul 2023
TL;DR: This article developed and released Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.
•Posted Content
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Patrick S. H. Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Michael Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela
TL;DR: This work explores a general-purpose fine-tuning recipe for retrieval-augmented generation (RAG), that is, models which combine pre-trained parametric and non-parametric memory for language generation, and finds that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
Scaling Instruction-Finetuned Language Models
Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Dasha Valter, Sharan Narang, Gaurav Mishra, Adams Wei Yu, Vincent Zhao, Yanping Huang, Andrew M. Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeffrey Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason Loh Seong Wei
TL;DR: This result shows that instruction finetuning and UL2 continued pre-training are complementary, compute-efficient methods to improve the performance of language models without increasing model scale.
Position-aware Attention and Supervised Data Improve Slot Filling
Yuhao Zhang, Victor Zhong, Danqi Chen, Gabor Angeli, Christopher D. Manning
- 01 Sep 2017
TL;DR: An effective new model is proposed that combines an LSTM sequence model with a form of entity position-aware attention better suited to relation extraction. The work also builds TACRED, a large supervised relation extraction dataset obtained via crowdsourcing and targeted towards TAC KBP relations.