LasUIE: Latent Adaptive Structure-aware LM for Universal Information Extraction

<a href="https://github.com/ChocoWu/LasUIE"> <img src="https://img.shields.io/badge/LasUIE-1.0-blue" alt="LasUIE 1.0"> </a> <a href="https://pytorch.org" rel="nofollow"> <img src="https://img.shields.io/badge/pytorch-1.10.0-green" alt="pytorch 1.10.0"> </a> <a href="https://huggingface.co/docs/transformers/index" rel="nofollow"> <img src="https://img.shields.io/badge/transformers-4.24.0-orange" alt="transformers 4.24.0"> </a>

The PyTorch implementation of the NeurIPS 2022 paper Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model.


🎉 Visit the project page: LasUIE



1. Methodology<a name="methodology" />

1.1 Modeling Universal Information Extraction (UIE)<a name="uie" />

UIE has been proposed to unify all information extraction (IE) tasks in the NLP community by converting the structure prediction of each IE task into sequence prediction with generative LMs.

All IE tasks essentially revolve around predicting two key elements: <mention spans> and/or their <semantic relations>. In this project, we thus reduce all IE tasks to three prototypes: span extraction, pair extraction, and hyper-pair extraction:

<p align="center"> <img src="./figures/UIE_intro.png" width="650"/> </p>

Under this scheme, mention spans are described with <Span> terms and the corresponding <Span Attribute> labels; semantic relations are straightforwardly denoted with <Relation> labels.

All IE structures are then cast into a sequential representation, the Linearized Hierarchical Expression (LHE). For example,

<p align="center"> <img src="./figures/LHE.png" width="200"/> </p>
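To make the three prototypes concrete, the sketch below models spans and relations as small Python data classes and flattens them into one bracketed string. The class names, the `linearize` function, and the bracket layout are illustrative assumptions. The exact LHE syntax used by LasUIE is given by the figure above and the paper, not by this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Span:
    text: str        # the <Span> term, i.e. the mention text
    attribute: str   # the <Span Attribute> label, e.g. an entity type


@dataclass
class Structure:
    """One head span plus zero or more (relation, tail span) links.

    Simplified view (an assumption of this sketch):
    no links -> span extraction, one link -> pair extraction,
    several links -> hyper-pair extraction.
    """
    head: Span
    links: List[Tuple[str, Span]] = field(default_factory=list)


def linearize(structures: List[Structure]) -> str:
    """Flatten structures into one bracketed sequence (hypothetical format)."""
    parts = []
    for s in structures:
        inner = f"{s.head.text} [ {s.head.attribute} ]"
        for relation, tail in s.links:
            inner += f" ( {relation} : {tail.text} [ {tail.attribute} ] )"
        parts.append(f"( {inner} )")
    return "( " + " ".join(parts) + " )"


example = [
    Structure(Span("Steve Jobs", "PER"), [("founder-of", Span("Apple", "ORG"))]),
    Structure(Span("California", "LOC")),
]
print(linearize(example))
# ( ( Steve Jobs [ PER ] ( founder-of : Apple [ ORG ] ) ) ( California [ LOC ] ) )
```

Because the target is a flat string, a generative LM can produce any of the three prototypes with the same decoder, which is the point of casting IE as sequence prediction.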

1.2 UIE with Structure-aware Generative Language Model<a name="lasuie" />

As cast above, UIE faces two key challenges that are common to IE tasks:

We thus propose addressing the two challenges by modeling both the syntactic dependency structure and the constituency structure: the constituency syntax mostly benefits the first challenge, while the dependency structure aids the second. To implement this idea, we propose learning a Latent Adaptive Structure-aware Generative Language Model for UIE, aka LasUIE.

LasUIE has a three-stage learning procedure:

<p align="center"> <img src="./figures/three-stage.png" width="650"/> </p>

1.2.1 Unsupervised structure-aware post-training

A heterogeneous structure inductor (HSI) module enriches the backbone GLM with structural knowledge in an unsupervised manner, reinforcing its awareness of linguistic syntax.

<p align="center"> <img src="./figures/struct_post_train.png" width="650"/> </p>

1.2.2 Supervised task-oriented structure fine-tuning

The syntactic attributes within the GLM are further adjusted (fine-tuned) with a stochastic policy gradient algorithm that directly takes the end-task performance as feedback, so that the learned structural features best match the needs of the end task.

<p align="center"> <img src="./figures/struct_tune.png" width="550"/> </p>
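The idea of using end-task performance as a reward can be illustrated with a generic REINFORCE update. This is a minimal sketch under stated assumptions, not the LasUIE training code: a categorical policy over three hypothetical structure choices stands in for the syntactic attributes inside the GLM, and `toy_scores` stands in for the end-task feedback.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(reward_fn, num_actions, steps=3000, lr=0.1, seed=0):
    """Tune a categorical policy with REINFORCE and a moving-average baseline."""
    rng = random.Random(seed)
    logits = [0.0] * num_actions
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        # Sample a choice from the current stochastic policy.
        action = rng.choices(range(num_actions), weights=probs)[0]
        reward = reward_fn(action)
        advantage = reward - baseline
        # Gradient of log pi(action) w.r.t. logit k is 1[k == action] - probs[k].
        for k in range(num_actions):
            grad = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * advantage * grad
        # The baseline tracks the average reward to reduce variance.
        baseline += 0.05 * (reward - baseline)
    return softmax(logits)


# Toy stand-in for end-task feedback: choice 2 gives the best score.
toy_scores = [0.2, 0.5, 0.9]
final_probs = reinforce(lambda a: toy_scores[a], num_actions=3)
print([round(p, 3) for p in final_probs])
```

The policy shifts probability mass toward the choice with the highest reward without ever differentiating through the reward itself, which is why this family of methods suits non-differentiable feedback such as an evaluation score.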

2. Code Usage<a name="code" />

2.1 Requirement Installation<a name="requirement" />

2.2 Code Structure<a name="structure" />

│---------------------------------------------------
├─config                           // configuration folder
│    ├─config.json                 // config for generic finetune
│    └─config_struct_tune.json     // config for structural finetune
│
├─data                             // data folder
│  ├─hyperpair                     // dataset for hyperpair extraction 
│  │  └─orl                        // task name
│  │      └─mpqa                   // dataset name
│  │          ├─labels.json        // template labels for hyperpair extraction 
│  │          ├─dev.json           // template dev set for hyperpair extraction 
│  │          ├─test.json          // template test set for hyperpair extraction 
│  │          └─train.json         // template train set for hyperpair extraction 
│  │  
│  ├─pair                           // dataset for pair extraction 
│  │  └─re  
│  │      └─nyt  
│  │          └─...
│  │  
│  ├─span                          // dataset for span extraction  
│  │  └─ner  
│  │      └─conll03  
│  │           └─...
│  │  
│  └─post-training                 // corpora for post-training of the GLM
│      ├─books-corpus  
│      └─wikipedia-en  
│---------------------------------------------------
├─checkpoint                       // saving model checkpoints
│    └─...
├─logs                             // saving experiment logs
│    └─...
├─test_output                      // saving testing/inference outputs
│    └─...
├─figures                          
├─requirements.txt                 
├─README.md             
├─LICENSE  
│---------------------------------------------------
├─engine                           // core codes here 
│    ├─constants.py
│    ├─cus_argument.py
│    ├─data_utils.py
│    ├─evaluating.py
│    ├─module.py
│    ├─t5_modeling.py
│    └─utils.py
│
├─run_struct_post_train.py          // entry of second phase of structural post-training
├─run_finetune.py                   // entry of third phase of generic fine-tuning
├─run_finetune_with_struct_tune.py  // entry of third phase of structural fine-tuning
├─run_inference.py                  // entry of fourth phase of inference 
└---------------------------------------------------
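The `data` folder in the tree above follows a `<prototype>/<task>/<dataset>/<split>.json` layout. A small helper like the one below can resolve those paths. The function name `dataset_path` and its arguments are illustrative and not part of the repository. The tree lists file names only for `mpqa`, so reusing the same names for the other datasets is an assumption.

```python
from pathlib import PurePosixPath

# Prototype folders shown in the tree above.
PROTOTYPES = ("span", "pair", "hyperpair")
# File stems listed under data/hyperpair/orl/mpqa.
SPLITS = ("train", "dev", "test", "labels")


def dataset_path(prototype, task, dataset, split, root="data"):
    """Return the JSON path for one split, following the tree layout."""
    if prototype not in PROTOTYPES:
        raise ValueError(f"unknown prototype: {prototype!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return PurePosixPath(root) / prototype / task / dataset / f"{split}.json"


print(dataset_path("hyperpair", "orl", "mpqa", "train"))
# data/hyperpair/orl/mpqa/train.json
```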

2.3 Running Pipeline<a name="pipeline" />

The general pipeline goes as:

Step 1         run_struct_post_train.py  
                          ↓ 
Step 2            run_finetune.py (first train, then eval)
                          ↓
Step 3       run_finetune_with_struct_tune.py
                          ↓          
Step 4             run_inference.py
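The four steps above can be chained with a small driver script. This is a sketch only: `build_commands` and `run_pipeline` are hypothetical helpers, and no command-line arguments are passed because this README does not list them. The entry scripts may well require arguments or edited config files before they run.

```python
import subprocess
import sys

# Entry scripts in the order given by the pipeline above.
PIPELINE = [
    "run_struct_post_train.py",
    "run_finetune.py",
    "run_finetune_with_struct_tune.py",
    "run_inference.py",
]


def build_commands(start_step=1, python=sys.executable):
    """Return one command (as an argument list) per remaining step."""
    if not 1 <= start_step <= len(PIPELINE):
        raise ValueError(f"start_step must be between 1 and {len(PIPELINE)}")
    return [[python, script] for script in PIPELINE[start_step - 1:]]


def run_pipeline(start_step=1, dry_run=True):
    """Print each step's command, and execute it unless dry_run is set."""
    for cmd in build_commands(start_step):
        print("step command:", " ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


# Dry run: only shows what would be executed.
run_pipeline(dry_run=True)
```

Calling `run_pipeline(start_step=3, dry_run=False)` from the repository root would resume from structural fine-tuning, assuming the earlier checkpoints already exist.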

2.3.1 Structure-aware post-training<a name="post-training" />

2.3.2 Supervised fine-tuning<a name="fine-tuning" />

A. task-oriented fine-tuning

B. structure fine-tuning

2.3.3 Inference<a name="inference" />

2.4 Dataset & Evaluating<a name="dataset-evaluating" />

2.4.1 Dataset

2.4.2 Evaluating


3. MISC<a name="misc" />

3.1 Citation

If you use this work or code, please kindly cite:

@inproceedings{fei2022lasuie,
  author = {Fei, Hao and Wu, Shengqiong and Li, Jingye and Li, Bobo and Li, Fei and Qin, Libo and Zhang, Meishan and Zhang, Min and Chua, Tat-Seng},
  booktitle = {Advances in Neural Information Processing Systems},
  title = {LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model},
  url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/63943ee9fe347f3d95892cf87d9a42e6-Paper-Conference.pdf},
  pages = {15460--15475},
  year = {2022}
}

3.2 Acknowledgement

This code partially builds on the following projects and papers: UIE, StructFormer, and Huggingface-T5.

3.3 License

The code is released under the Apache License 2.0 for noncommercial use only. Any commercial use requires formal permission from the authors in advance.

3.4 Contact

For any questions or issues, please contact @Hao Fei and @Shengqiong Wu.