LODEME

PyTorch | ICASSP 2023

Entity alignment (EA) for knowledge graphs (KGs) plays a critical role in knowledge engineering. Existing EA methods mostly focus on graph structures and entity attributes (including literals), but ignore the images that are common in modern multi-modal KGs. In this study we first constructed Multi-OpenEA (eight large-scale, image-equipped EA benchmarks), and then evaluated how well some existing embedding-based methods utilize images. Because visual information and logical deduction are complementary, we further developed a new multi-modal EA method named LODEME, which combines logical deduction with multi-modal KG embedding and achieves state-of-the-art performance on Multi-OpenEA and other existing multi-modal EA benchmarks.

🚀 Code for LODEME

The code is currently being organized and refined. Once it is ready, it will be made available in this repository. Thank you for your patience and understanding.

📚 Dataset (Multi-OpenEA)

We proposed a generic construction process for multi-modal EA benchmarks and used it to build new multi-modal EA benchmarks from the eight existing OpenEA benchmarks by adding multiple images to each entity.

Our Multi-OpenEA benchmarks vs. the existing multi-modal EA benchmarks. Ours have a larger scale (#Entity), more entities associated with images (Coverage), and more images per entity (Ratio).

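| Benchmark | KGs | #Entity | #Images | Ratio | Coverage | Similarity |
|---|---|---|---|---|---|---|
| FB15K-DB15K \cite{chen2020mmea} | FB15K | 14,951 | 13,444 | 0.899 | 90.0% | - |
| | DB15K | 12,842 | 12,837 | 0.999 | 99.9% | |
| DBP-WD(norm) \cite{liu2021visual} | DBP | 15,000 | 8,517 | 0.517 | 57.1% | - |
| | WD | 15,000 | 8,791 | 0.586 | 58.6% | |
| EN-FR-15K-V1 | EN15K(V1) | 15,000 | 44,657 | 2.977 | 99.7% | 0.757 |
| | FR15K(V1) | 15,000 | 42,286 | 2.819 | 94.5% | |
| EN-FR-15K-V2 | EN15K(V2) | 15,000 | 44,932 | 2.995 | 99.9% | 0.767 |
| | FR15K(V2) | 15,000 | 42,622 | 2.841 | 94.5% | |
| EN-FR-100K-V1 | EN100K(V1) | 100,000 | 296,934 | 2.969 | 99.6% | 0.751 |
| | FR100K(V1) | 100,000 | 280,288 | 2.803 | 94.1% | |
| EN-FR-100K-V2 | EN100K(V2) | 100,000 | 299,403 | 2.994 | 99.9% | 0.752 |
| | FR100K(V2) | 100,000 | 282,063 | 2.821 | 94.4% | |
| D-W-15K-V1 | DBP15K(V1) | 15,000 | 44,776 | 2.985 | 99.8% | 0.829 |
| | WD15K(V1) | 15,000 | 44,823 | 2.988 | 99.8% | |
| D-W-15K-V2 | DBP15K(V2) | 15,000 | 44,911 | 2.994 | 99.9% | 0.820 |
| | WD15K(V2) | 15,000 | 44,945 | 2.996 | 99.9% | |
| D-W-100K-V1 | DBP100K(V1) | 100,000 | 296,749 | 2.9867 | 99.5% | 0.833 |
| | WD100K(V1) | 100,000 | 297,354 | 2.974 | 99.6% | |
| D-W-100K-V2 | DBP100K(V2) | 100,000 | 299,338 | 2.993 | 99.9% | 0.832 |
| | WD100K(V2) | 100,000 | 299,607 | 2.996 | 99.9% | |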
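
The Ratio and Coverage columns can be recomputed from a mapping of entities to their images. The following is a minimal sketch, assuming that Ratio is #Images divided by #Entity (this matches the EN15K(V1) row: 44,657 / 15,000 ≈ 2.977) and that Coverage is the share of entities with at least one image. The function name `image_stats` and the input format are hypothetical and not part of the released data.

```python
from typing import Dict, List, Tuple


def image_stats(entity_images: Dict[str, List[str]],
                num_entities: int) -> Tuple[float, float]:
    """Return (ratio, coverage) for one KG.

    ASSUMPTIONS (not confirmed by the README):
      ratio    = total number of images / number of entities
      coverage = entities with at least one image / number of entities
    `entity_images` maps an entity ID to its list of image files;
    entities without images may be missing or map to an empty list.
    """
    if num_entities <= 0:
        raise ValueError("num_entities must be positive")
    num_images = sum(len(images) for images in entity_images.values())
    covered = sum(1 for images in entity_images.values() if images)
    return num_images / num_entities, covered / num_entities
```

For example, a toy KG with four entities, of which two have images (three and one respectively), gives a ratio of 1.0 and a coverage of 0.5.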

❗NOTE: The organisation of the data is consistent with OpenEA Dataset v1.1 and the text portion can be downloaded directly from OpenEA. Download the image embedding via CLIP encoding from Baidu Cloud Drive, with the pass code tuds and the raw images from Baidu Cloud Drive, with the pass code aoo1

🀝 Cite:

Please consider citing this paper if you use the code or data from our work. Thanks a lot :)

@inproceedings{li2023vision,
  title={Vision, Deduction and Alignment: An Empirical Study on Multi-Modal Knowledge Graph Alignment},
  author={Li, Yangning and Chen, Jiaoyan and Li, Yinghui and Xiang, Yuejia and Chen, Xi and Zheng, Hai-Tao},
  booktitle={ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={1--5},
  year={2023},
  organization={IEEE}
}

💡 Acknowledgement