Journal Article, DOI: 10.48550/arxiv.2311.07314
Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models
Junpeng Li,Zixia Jia,Zilong Zheng +2 more
TL;DR: This work proposes a method that integrates a large language model (LLM) and a natural language inference (NLI) module to generate relation triples, thereby augmenting document-level relation datasets. It demonstrates the effectiveness of the approach by introducing an enhanced dataset, DocGNRE, which excels at re-annotating numerous long-tail relation types.
Abstract: Document-level Relation Extraction (DocRE), which aims to extract relations from a long context, is a critical challenge in achieving fine-grained structural comprehension and generating interpretable document representations. Inspired by recent advances in the in-context learning capabilities that emerge from large language models (LLMs), such as ChatGPT, we aim to design an automated annotation method for DocRE with minimal human effort. Unfortunately, vanilla in-context learning is infeasible for document-level relation extraction because of the large number of predefined fine-grained relation types and the uncontrolled generations of LLMs. To tackle this issue, we propose a method integrating a large language model (LLM) and a natural language inference (NLI) module to generate relation triples, thereby augmenting document-level relation datasets. We demonstrate the effectiveness of our approach by introducing an enhanced dataset known as DocGNRE, which excels in re-annotating numerous long-tail relation types. We are confident that our method holds the potential for broader applications in domain-specific relation type definitions and offers tangible benefits in advancing generalized language semantic comprehension.
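The abstract describes the pipeline only at a high level, so the following is a minimal sketch of one plausible reading of it, not the authors' implementation: an LLM proposes candidate triples, each candidate is verbalized as a hypothesis, and an NLI scorer keeps only those the document entails. The names `Triple`, `TEMPLATES`, `filter_triples`, and `toy_entail_score` are hypothetical, the relation templates are invented for illustration, and the token-overlap scorer is a stand-in for a real NLI model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str


# Hypothetical templates that turn a predefined relation type into an
# NLI hypothesis. A real relation schema would have many more entries.
TEMPLATES = {
    "country": "{head} is located in the country {tail}.",
    "author": "{tail} is the author of {head}.",
}


def filter_triples(
    document: str,
    candidates: Iterable[Triple],
    entail_score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[Triple]:
    """Keep candidate triples whose verbalized form the document entails."""
    kept = []
    for triple in candidates:
        template = TEMPLATES.get(triple.relation)
        if template is None:
            # Uncontrolled LLM output: relation is outside the predefined set.
            continue
        hypothesis = template.format(head=triple.head, tail=triple.tail)
        if entail_score(document, hypothesis) >= threshold:
            kept.append(triple)
    return kept


def toy_entail_score(premise: str, hypothesis: str) -> float:
    """Assumed stand-in for an NLI model: share of hypothesis tokens in the premise."""
    premise_tokens = set(premise.lower().replace(".", "").split())
    hypothesis_tokens = hypothesis.lower().replace(".", "").split()
    hits = sum(token in premise_tokens for token in hypothesis_tokens)
    return hits / len(hypothesis_tokens)


if __name__ == "__main__":
    doc = "Paris is the capital of France. Paris is located in the country France."
    # In the setting the abstract describes, these would come from an LLM.
    proposed = [
        Triple("Paris", "country", "France"),
        Triple("Paris", "country", "Germany"),
        Triple("Paris", "spouse", "France"),
    ]
    print(filter_triples(doc, proposed, toy_entail_score, threshold=0.9))
```

In this sketch the predefined relation set constrains the output twice: candidates with an unknown relation are dropped outright, and the remaining ones must pass the entailment threshold. Swapping `toy_entail_score` for a trained NLI classifier would leave `filter_triples` unchanged.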
Citations
Research Trends for the Interplay between Large Language Models and Knowledge Graphs
Hanieh Khorashadizadeh,Fatima Amara,Morteza Kamaladdini Ezzabady,Frédéric Ieng,Sanju Tiwari,Nandana Mihindukulasooriya,Jinghua Groppe,Soror Sahri,Farah Benamara,Sven Groppe +9 more
- 12 Jun 2024
TL;DR: The interplay between Large Language Models and Knowledge Graphs is a key area of research for advancing AI capabilities in understanding, reasoning, and language processing. The research explores areas such as KG Question Answering, ontology generation, and KG validation. It also examines the roles of LLMs in generating descriptive texts and natural language queries for KGs.
Automatically learning linguistic structures for entity relation extraction
Weizhe Yang,Yanping Chen,Jinling Xu,Yongbin Qin,Ping Chen +4 more
An adaptive confidence-based data revision framework for Document-level Relation Extraction
Chao Jiang,Jinzhi Liao,Xiang Zhao,Daojian Zeng,Jianhua Dai +4 more
Large and Small models for collaborative cross-lingual data augmentation in entity relationship extraction for low-resource languages
Longjie Bao,Shuangcheng Bai +1 more
Hierarchical symmetric cross entropy for distant supervised relation extraction
Yun Liu,Xiaoheng Jiang,Pengshuai Lv,Yang Lu,Shupan Li,Kunli Zhang,Mingliang Xu +6 more
References
•Posted Content
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
•Posted Content
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu,Myle Ott,Naman Goyal,Jingfei Du,Mandar Joshi,Danqi Chen,Omer Levy,Michael Lewis,Luke Zettlemoyer,Veselin Stoyanov +9 more
TL;DR: It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.
•Posted Content
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel,Noam Shazeer,Adam Roberts,Katherine Lee,Sharan Narang,Michael Matena,Yanqi Zhou,Wei Li,Peter J. Liu +8 more
TL;DR: This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.
GPT-3: Its Nature, Scope, Limits, and Consequences
TL;DR: The nature of reversible and irreversible questions is discussed, that is, questions that may enable one to identify the nature of the source of their answers, and GPT-3, a third-generation, autoregressive language model that uses deep learning to produce human-like texts, is introduced.
ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks
TL;DR: This paper showed that ChatGPT outperforms crowd workers for several annotation tasks, including relevance, stance, topics, and frame detection, and demonstrated the potential of large language models to drastically increase the efficiency of text classification.