Awesome

KG-MM-Survey

Awesome License: MIT

Task

🙌 This repository collects papers that integrate Knowledge Graphs (KGs) and Multi-Modal Learning, organized around two principal directions: KG-driven Multi-Modal (KG4MM) learning, where KGs support multi-modal tasks, and Multi-Modal Knowledge Graph (MM4KG) research, which extends KG studies into the MMKG realm.

😎 You are welcome to recommend missing papers by opening an Issue or a Pull Request.

<details> <summary>👈 🔎 Roadmap </summary>

Roadmap

</details> <details> <summary>👈 🔔 News </summary> </details>

📜 Content


🤖🌄 KG-driven Multi-modal Learning (KG4MM)

Understanding & Reasoning Tasks

<details> <summary>👈 🔎 Pipeline </summary>

KG4MMR

</details>

Visual Question Answering

<details> <summary>👈 🔎 Benchmarks </summary>

VQA

</details>

Visual Question Generation

Visual Dialog

Classification Tasks

<details> <summary>👈 🔎 Comparison </summary>

IMGC

</details>

Image Classification

<details> <summary>👈 🔎 Benchmarks </summary> <div align="center"> <img src="figures/imgctab.jpg" width="45%" height="auto" /> </div> </details>

Fake News Detection

Movie Genre Classification

Content Generation Tasks

<details> <summary>👈 🔎 Case </summary> <div align="center"> <img src="figures/VGG.jpg" width="45%" height="auto" /> </div> </details>

Image Captioning

Visual Storytelling

Conditional Text-to-Image Generation

Scene Graph Generation

Retrieval Tasks

<details> <summary>👈 🔎 Case </summary> <div align="center"> <img src="figures/CMR.jpg" width="50%" height="auto" /> </div> </details>

Cross-Modal Retrieval

Visual Referring Expressions & Grounding

KG-aware Multi-modal Pre-training

Structure Knowledge aware Pre-training

Knowledge Graph aware Pre-training


🌄🤖 Multi-modal Knowledge Graph (MM4KG)

<details> <summary>👈 🔎 N-MMKG Ontology </summary>

MMKGOnto

</details> <details> <summary>👈 🔎 Taxonomy </summary> <div align="center"> <img src="figures/mmkgtask.jpg" width="90%" height="auto" /> </div> </details>

MMKG Resources

Public MMKGs

<details> <summary>👈 🔎 MMKG Overview </summary>

MMKG

</details>

MMKG Construction Methods

MMKG Acquisition

<details> <summary>👈 🔎 Case </summary> <div align="center"> <img src="figures/MMIE.jpg" width="50%" height="auto" /> </div> </details>

Multi-modal Named Entity Recognition

<details> <summary>👈 🔎 Benchmarks </summary> <div align="center"> <img src="figures/mnertab.jpg" width="45%" height="auto" /> </div> </details>

Multi-modal Relation Extraction

<details> <summary>👈 🔎 Benchmarks </summary> <div align="center"> <img src="figures/mmretab.jpg" width="45%" height="auto" /> </div> </details>

Multi-modal Event Extraction

<details> <summary>👈 🔎 Benchmarks </summary> <div align="center"> <img src="figures/mmeetab.jpg" width="45%" height="auto" /> </div> </details>

Image-Text:

Video-Text:

MMKG Fusion

Multi-modal Entity Alignment

<details> <summary>👈 🔎 Benchmarks </summary> <div align="center"> <img src="figures/mmeatab.jpg" width="45%" height="auto" /> </div> </details>

Multi-modal Entity Linking & Disambiguation

<details> <summary>👈 🔎 Benchmarks </summary> <div align="center"> <img src="figures/mmeltab.jpg" width="45%" height="auto" /> </div> </details>

MMKG Inference

Multi-modal Knowledge Graph Completion

<details> <summary>👈 🔎 Benchmarks </summary> <div align="center"> <img src="figures/mkgctab.jpg" width="45%" height="auto" /> </div> </details>

Multi-modal Knowledge Graph Reasoning

MMKG-driven Tasks

<details> <summary>👈 🔎 Case </summary> <div align="center"> <img src="figures/mmkg4mm.jpg" width="50%" height="auto" /> </div> </details>

Retrieval

Image Retrieval:
Cross-modal Retrieval:

Reasoning & Generation

Pre-training

Triple-level:
Graph-level:

AI for Science

Industry Application

<details> <summary>👈 🔎 Case </summary> <div align="center"> <img src="figures/mmkg4indus.jpg" width="50%" height="auto" /> </div> </details>

Contribution

👥 Contributors

<a href="https://github.com/zjukg/KG-MM-Survey/graphs/contributors"> <img src="https://contrib.rocks/image?repo=zjukg/KG-MM-Survey" /> </a>

🎉 Contributing (welcome!)

Don't worry if you get something wrong; it will be fixed for you. Feel free to contribute and promote your awesome work here! 🤩 We'll get back to you in time ~ 😉


🔖 Contact

📫 zhuo.chen@zju.edu.cn

🤝 Cite:

If this repository is helpful to you, please consider citing our paper. We would greatly appreciate it :)

@article{chen2024knowledge,
  author       = {Zhuo Chen and
                  Yichi Zhang and
                  Yin Fang and
                  Yuxia Geng and
                  Lingbing Guo and
                  Xiang Chen and
                  Qian Li and
                  Wen Zhang and
                  Jiaoyan Chen and
                  Yushan Zhu and
                  Jiaqi Li and
                  Xiaoze Liu and
                  Jeff Z. Pan and
                  Ningyu Zhang and
                  Huajun Chen},
  title        = {Knowledge Graphs Meet Multi-Modal Learning: {A} Comprehensive Survey},
  journal      = {CoRR},
  volume       = {abs/2402.05391},
  year         = {2024}
}

Star History Chart