
<p align="center"> <a href="https://github.com/zjunlp/deepke"> <img src="pics/logo.png" width="400"/></a> <p> <p align="center"> <a href="http://deepke.zjukg.cn"> <img alt="Documentation" src="https://img.shields.io/badge/demo-website-blue"> </a> <a href="https://pypi.org/project/deepke/#files"> <img alt="PyPI" src="https://img.shields.io/pypi/v/deepke"> </a> <a href="https://github.com/zjunlp/DeepKE/blob/master/LICENSE"> <img alt="GitHub" src="https://img.shields.io/github/license/zjunlp/deepke"> </a> <a href="http://zjunlp.github.io/DeepKE"> <img alt="Documentation" src="https://img.shields.io/badge/doc-website-red"> </a> <a href="https://colab.research.google.com/drive/1vS8YJhJltzw3hpJczPt24O0Azcs3ZpRi?usp=sharing"> <img alt="Open In Colab" src="https://colab.research.google.com/assets/colab-badge.svg"> </a> </p> <p align="center"> <b> English | <a href="https://github.com/zjunlp/DeepKE/blob/main/README_CN.md">简体中文</a> </b> </p> <h1 align="center"> <p>A Deep Learning Based Knowledge Extraction Toolkit<br>for Knowledge Graph Construction</p> </h1>

DeepKE is a knowledge extraction toolkit for knowledge graph construction, supporting cnSchema, low-resource, document-level and multimodal scenarios for entity, relation and attribute extraction. We provide documents, an online demo, the paper, slides and a poster for beginners.

If you encounter any issues during the installation of DeepKE and DeepKE-LLM, please check Tips or promptly submit an issue, and we will assist you with resolving the problem!


<br>

What's New

<details> <summary><b>Previous News</b></summary> </details>

Prediction Demo

Below is a demonstration of prediction. The GIF was created with Terminalizer. Get the code. <img src="pics/demo.gif" width="636" height="494" align=center>

<br>

Model Framework

<h3 align="center"> <img src="pics/architectures.png"> </h3> <br>

Quick Start

DeepKE-LLM

In the era of large models, DeepKE-LLM uses a completely separate set of environment dependencies.

conda create -n deepke-llm python=3.9
conda activate deepke-llm

cd example/llm
pip install -r requirements.txt

Please note that the requirements.txt file is located in the example/llm folder.

DeepKE

🔧Manual Environment Configuration

Step1 Download the basic code

git clone --depth 1 https://github.com/zjunlp/DeepKE.git

Step2 Create a virtual environment using Anaconda and enter it.<br>

conda create -n deepke python=3.8

conda activate deepke

1. Install DeepKE from source (recommended)

   pip install -r requirements.txt

   python setup.py install

   python setup.py develop

2. Install DeepKE with pip (NOT recommended!)

   pip install deepke

   • Please make sure that the pip version is <= 24.0
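After either installation route, a quick sanity check is to confirm that the package is importable in the active environment. A minimal sketch, assuming the package is importable under the name `deepke`:

```python
import importlib.util

def is_installed(pkg: str) -> bool:
    """Return True if `pkg` can be found by the current interpreter."""
    return importlib.util.find_spec(pkg) is not None

# After a successful source or pip install this should print True.
print("deepke installed:", is_installed("deepke"))
```

If this prints False inside the `deepke` conda environment, the install did not complete; re-run the source installation steps above.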

Step3 Enter the task directory

cd DeepKE/example/re/standard

Step4 Download the dataset, or follow the annotation instructions to obtain data

wget 120.27.214.45/Data/re/standard/data.tar.gz

tar -xzvf data.tar.gz

Many data formats are supported, and the details are given in each part.

Step5 Training (Parameters for training can be changed in the conf folder)

We support visual hyperparameter tuning via wandb.

python run.py
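The training parameters mentioned above live in YAML files under the `conf` folder, so hyperparameters can be tweaked there before launching. A hypothetical fragment for illustration only; the key names here are assumptions, not DeepKE's actual config schema, so check the real files under `conf`:

```yaml
# conf/train.yaml — illustrative keys only
epoch: 50        # number of training epochs
batch_size: 32   # samples per optimization step
lr: 3e-5         # learning rate
```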

Step6 Prediction (Parameters for prediction can be changed in the conf folder)

Modify the path of the trained model in predict.yaml. The absolute path of the model must be used, such as xxx/checkpoints/2019-12-03_17-35-30/cnn_epoch21.pth.

python predict.py
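The edit to predict.yaml might look like the following. The key name `fp` is an assumption for illustration; use whatever key the predict.yaml shipped with the example actually defines, and replace the path with your own checkpoint's absolute path:

```yaml
# conf/predict.yaml — key name and path are hypothetical
fp: /absolute/path/to/checkpoints/2019-12-03_17-35-30/cnn_epoch21.pth
```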

🐳Building With Docker Images

Step1 Install the Docker client

Install Docker and start the Docker service.

Step2 Pull the docker image and run the container

docker pull zjunlp/deepke:latest
docker run -it zjunlp/deepke:latest /bin/bash

The remaining steps are the same as Step 3 and onwards in Manual Environment Configuration.

Requirements

DeepKE

python == 3.8

<br>

Introduction of Three Functions

1. Named Entity Recognition

2. Relation Extraction

3. Attribute Extraction

<br>

4. Event Extraction

<table style="text-align:center"> <tr> <th colspan="2"> Sentence </th> <th> Event type </th> <th> Trigger </th> <th> Role </th> <th> Argument </th> </tr> <tr> <td rowspan="3" colspan="2"> 据《欧洲时报》报道,当地时间27日,法国巴黎卢浮宫博物馆员工因不满工作条件恶化而罢工,导致该博物馆也因此闭门谢客一天。 </td> <td rowspan="3"> 组织行为-罢工 </td> <td rowspan="3"> 罢工 </td> <td> 罢工人员 </td> <td> 法国巴黎卢浮宫博物馆员工 </td> </tr> <tr> <td> 时间 </td> <td> 当地时间27日 </td> </tr> <tr> <td> 所属组织 </td> <td> 法国巴黎卢浮宫博物馆 </td> </tr> <tr> <td rowspan="3" colspan="2"> 中国外运2019年上半年归母净利润增长17%:收购了少数股东股权 </td> <td rowspan="3"> 财经/交易-出售/收购 </td> <td rowspan="3"> 收购 </td> <td> 出售方 </td> <td> 少数股东 </td> </tr> <tr> <td> 收购方 </td> <td> 中国外运 </td> </tr> <tr> <td> 交易物 </td> <td> 股权 </td> </tr> <tr> <td rowspan="3" colspan="2"> 美国亚特兰大航展13日发生一起表演机坠机事故,飞行员弹射出舱并安全着陆,事故没有造成人员伤亡。 </td> <td rowspan="3"> 灾害/意外-坠机 </td> <td rowspan="3"> 坠机 </td> <td> 时间 </td> <td> 13日 </td> </tr> <tr> <td> 地点 </td> <td> 美国亚特兰 </td> </tr> </table> <br>
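The first row of the table above, rendered as a structured event record. This is a sketch of one plausible output shape; the field names are illustrative assumptions, not DeepKE's actual output schema:

```python
# One extracted event: type, trigger, and role-argument pairs.
# The Chinese strings are the spans from the first example sentence above.
event = {
    "event_type": "组织行为-罢工",  # organizational behavior - strike
    "trigger": "罢工",              # "strike"
    "arguments": [
        {"role": "罢工人员", "argument": "法国巴黎卢浮宫博物馆员工"},  # strikers
        {"role": "时间", "argument": "当地时间27日"},                  # time
        {"role": "所属组织", "argument": "法国巴黎卢浮宫博物馆"},      # organization
    ],
}

# Each role maps to exactly one argument span in this example.
roles = [a["role"] for a in event["arguments"]]
print(roles)
```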

Tips

1. Using the nearest mirror (THU's mirror in China for Anaconda, Aliyun's mirror in China for pip install) will speed up installation.

2. If you encounter ModuleNotFoundError: No module named 'past', run pip install future.

3. Downloading pretrained language models online can be slow. We recommend downloading them in advance and saving them in the pretrained folder. Read the README.md in each task directory for the specific requirements on saving pretrained models.

4. The old version of DeepKE is in the deepke-v1.0 branch. Users can switch branches to use the old version. The old version has been fully migrated to the standard relation extraction example (example/re/standard).

5. If you want to modify the source code, it's recommended to install DeepKE from source; otherwise, the modifications will not take effect. See issue

6. More related low-resource knowledge extraction works can be found in Knowledge Extraction in Low-Resource Scenarios: Survey and Perspective.

7. Make sure to install the exact package versions specified in requirements.txt.

To do

In the next version, we plan to release a stronger LLM for KE.

Meanwhile, we will provide long-term maintenance to fix bugs, resolve issues and meet new requests. If you have any problems, please submit issues to us.

Reading Materials

Data-Efficient Knowledge Graph Construction, 高效知识图谱构建 (Tutorial on CCKS 2022) [slides]

Efficient and Robust Knowledge Graph Construction (Tutorial on AACL-IJCNLP 2022) [slides]

PromptKG Family: a Gallery of Prompt Learning & KG-related Research Works, Toolkits, and Paper-list [Resources]

Knowledge Extraction in Low-Resource Scenarios: Survey and Perspective [Survey][Paper-list]

Related Toolkit

Doccano, MarkTool, LabelStudio: Data Annotation Toolkits

LambdaKG: A library and benchmark for PLM-based KG embeddings

EasyInstruct: An easy-to-use framework to instruct Large Language Models


Citation

Please cite our paper if you use DeepKE in your work:

@inproceedings{EMNLP2022_Demo_DeepKE,
  author    = {Ningyu Zhang and
               Xin Xu and
               Liankuan Tao and
               Haiyang Yu and
               Hongbin Ye and
               Shuofei Qiao and
               Xin Xie and
               Xiang Chen and
               Zhoubo Li and
               Lei Li},
  editor    = {Wanxiang Che and
               Ekaterina Shutova},
  title     = {DeepKE: {A} Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population},
  booktitle = {{EMNLP} (Demos)},
  pages     = {98--108},
  publisher = {Association for Computational Linguistics},
  year      = {2022},
  url       = {https://aclanthology.org/2022.emnlp-demos.10}
}
<br>

Contributors

Ningyu Zhang, Haofen Wang, Fei Huang, Feiyu Xiong, Liankuan Tao, Xin Xu, Honghao Gui, Zhenru Zhang, Chuanqi Tan, Qiang Chen, Xiaohan Wang, Zekun Xi, Xinrong Li, Haiyang Yu, Hongbin Ye, Shuofei Qiao, Peng Wang, Yuqi Zhu, Xin Xie, Xiang Chen, Zhoubo Li, Lei Li, Xiaozhuan Liang, Yunzhi Yao, Jing Chen, Shumin Deng, Wen Zhang, Guozhou Zheng, Huajun Chen

Community Contributors: thredreams, eltociear, Ziwen Xu, Rui Huang, Xiaolong Weng

Other Knowledge Extraction Open-Source Projects