Multi-Modal Knowledge Graph Construction and Application: A Survey
TL;DR: This survey covers Multi-modal Knowledge Graphs (MMKGs) constructed from text and images, which the authors present as a promising step towards human-level machine intelligence.
Abstract: Recent years have witnessed a resurgence of knowledge engineering, marked by the fast growth of knowledge graphs. However, most existing knowledge graphs are represented with pure symbols, which limits a machine's ability to understand the real world. The multi-modalization of knowledge graphs is an inevitable key step towards the realization of human-level machine intelligence. The results of this endeavor are Multi-modal Knowledge Graphs (MMKGs). In this survey on MMKGs constructed from texts and images, we first give definitions of MMKGs, followed by the preliminaries on multi-modal tasks and techniques. We then systematically review the challenges, progress, and opportunities in the construction and application of MMKGs respectively, with detailed analyses of the strengths and weaknesses of different solutions. We conclude this survey with open research problems relevant to MMKGs.
Citations
Multimodal Learning With Transformers: A Survey
TL;DR: The Transformer is a promising neural network learner that has achieved great success in various machine learning tasks. Thanks to the recent prevalence of multimodal applications and Big Data, Transformer-based multimodal learning has become a hot topic in AI research.
Unifying Large Language Models and Knowledge Graphs: A Roadmap
Shirui Pan, Linhao Luo, Yufei Wang, Chen Chen, Jiapu Wang, Xindong Wu
TL;DR: Unify large language models and knowledge graphs to enhance understanding and knowledge representation.
Knowledge Graphs: Opportunities and Challenges
TL;DR: The authors present a systematic overview of knowledge graph research and discuss major technical challenges in the field, such as knowledge graph embeddings, knowledge acquisition, knowledge graph completion, knowledge fusion, and knowledge reasoning.
Unifying Large Language Models and Knowledge Graphs: A Roadmap
14 Jun 2023
TL;DR: This paper presents a forward-looking roadmap for the unification of LLMs and KGs. The roadmap consists of three general frameworks, the first of which, KG-enhanced LLMs, incorporates KGs during the pre-training and inference phases of LLMs or uses them to improve understanding of the knowledge LLMs have learned.
A Survey of Knowledge Graph Reasoning on Graph Types: Static, Dynamic, and Multi-Modal
Ke Liang, Lingyuan Meng, Meng Li, Yue Liu, Wenxuan Tu, Siwei Wang, Sihang Zhou, Xinwang Liu, Fuchun Sun, Kunlun He
References
ImageNet: A large-scale hierarchical image database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei
20 Jun 2009
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, C. Lawrence Zitnick
06 Sep 2014
TL;DR: A new dataset that aims to advance the state of the art in object recognition by placing it in the context of the broader question of scene understanding. It gathers images of complex everyday scenes containing common objects in their natural context.
WordNet: a lexical database for English
TL;DR: WordNet is an online lexical database designed for use under program control. It provides a more effective combination of traditional lexicographic information and modern computing.
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A. Shamma, Michael S. Bernstein, Li Fei-Fei
TL;DR: The Visual Genome dataset contains over 108K images, where each image has an average of 35 objects, 26 attributes, and 21 pairwise relationships between objects.
Freebase: a collaboratively created graph database for structuring human knowledge
Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, Jamie Taylor
09 Jun 2008
TL;DR: MQL provides an easy-to-use object-oriented interface to the tuple data in Freebase and is designed to facilitate the creation of collaborative, Web-based data-oriented applications.