# AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models
## Introduction
AdaSeq (Alibaba Damo Academy Sequence Understanding Toolkit) is an easy-to-use all-in-one library, built on ModelScope, that allows researchers and developers to train custom models for sequence understanding tasks, including part-of-speech tagging (POS Tagging), chunking, named entity recognition (NER), entity typing, relation extraction (RE), etc.
**🌟 Features:**

- **Plentiful Models**: AdaSeq provides plenty of cutting-edge models, training methods and useful toolkits for sequence understanding tasks.
- **State-of-the-Art**: We aim to provide the best implementations, which can beat many off-the-shelf frameworks on performance.
- **Easy-to-Use**: One command is all you need to obtain the best model.
- **Extensible**: It's easy to register a module, or to build a customized sequence understanding model by assembling predefined modules.
⚠️ **Notice:** This project is under rapid development; some interfaces may change in the future.
## 📢 What's New
- 2023-07: [SemEval 2023] Our U-RaNER paper won the Best Paper Award!
- 2023-03: [SemEval 2023] Our U-RaNER won 1st place in 9 tracks at SemEval 2023 Task 2: Multilingual Complex Named Entity Recognition! The model introduction and source code can be found here.
- 2022-12: [EMNLP 2022] Retrieval-augmented Multimodal Entity Understanding Model (MoRe)
- 2022-11: [EMNLP 2022] Ultra-Fine Entity Typing Model (NPCRF)
- 2022-11: [EMNLP 2022] Unsupervised Boundary-Aware Language Model (BABERT)
## ⚡ Quick Experience
You can try out our models via online demos built on ModelScope: [English NER] [Chinese NER] [CWS]
More tasks, more languages, more domains: all the model cards we released can be found on the Modelcards page.
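If you prefer code over web demos, released models can also be loaded through the ModelScope pipeline API. Below is a minimal sketch; the model ID is an assumed example, so substitute any ID from the Modelcards page:

```python
# Minimal inference sketch using the ModelScope pipeline API.
# The model ID below is an assumed example; replace it with any model
# card ID from the Modelcards page.
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

ner_pipeline = pipeline(
    Tasks.named_entity_recognition,
    'damo/nlp_raner_named-entity-recognition_chinese-base-news',
)
print(ner_pipeline('阿里巴巴达摩院位于杭州'))  # prints the recognized entity spans
```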
## 🛠️ Model Zoo

**Supported models:**

- Transformer-based CRF
- Partial CRF
- Retrieval Augmented NER
- Biaffine NER
- Global-Pointer
- Multi-label Entity Typing
- ...
## 💾 Dataset Zoo
We have collected many datasets for sequence understanding tasks, all of which are listed on the Datasets page.
📦 Installation
AdaSeq project is based on Python >= 3.7
, PyTorch >= 1.8
and ModelScope >= 1.4
. We assure that AdaSeq can run smoothly when ModelScope == 1.9.5
.
- Installation via pip:

  ```bash
  pip install adaseq
  ```

- Installation from source:

  ```bash
  git clone https://github.com/modelscope/adaseq.git
  cd adaseq
  pip install -r requirements.txt -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
  ```
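As a quick sanity check before the full verification below, you can confirm that the package imports cleanly. This is a minimal sketch: it assumes `adaseq` exposes a `__version__` attribute; if it does not, a successful bare import is sufficient.

```python
# Sanity check: the import succeeding means the installation is visible
# to your Python environment.
import adaseq

print(adaseq.__version__)  # assumption: the package exposes __version__
```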
### Verify the Installation

To verify whether AdaSeq is installed properly, we provide a demo config for training a model (the demo config will be downloaded automatically):

```bash
adaseq train -c demo.yaml
```

You will see the training logs in your terminal. Once the training is done, the results on the test set will be printed: `test: {"precision": xxx, "recall": xxx, "f1": xxx}`. A folder `experiments/toy_msra/` will be generated to save all experimental results and model checkpoints.
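For a sense of what such a config contains, here is an illustrative sketch in the spirit of the demo config. The field names and values are assumptions for illustration only; consult the Learning about Configs tutorial for the authoritative schema.

```yaml
# Illustrative experiment config (field names are assumptions, not the
# authoritative schema; see the "Learning about Configs" tutorial).
experiment:
  exp_dir: experiments/          # root folder for results and checkpoints
  exp_name: toy_msra             # run name, used as the output sub-folder

task: named-entity-recognition   # the sequence understanding task

dataset:
  name: toy_msra                 # a small built-in demo dataset

model:
  type: sequence-labeling-model
  embedder:
    model_name_or_path: bert-base-chinese  # backbone encoder
  use_crf: true                  # decode with a CRF layer

train:
  max_epochs: 5
  optimizer:
    type: AdamW
    lr: 5.0e-5
```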
## 📖 Tutorials

- Quick Start
- Basics
  - Learning about Configs
  - Customizing Dataset
  - [TODO] Common Architectures
  - [TODO] Useful Hooks
  - Hyperparameter Optimization
  - Training with Multiple GPUs
- Best Practice
  - Training a Model with Custom Dataset
  - Reproducing Results in Published Papers
  - [TODO] Uploading Saved Model to ModelScope
  - [TODO] Customizing your Model
  - [TODO] Serving with AdaLA
## 📝 Contributing

All contributions to improve AdaSeq are welcome. Please refer to CONTRIBUTING.md for the contributing guidelines.
## 📄 License
This project is licensed under the Apache License (Version 2.0).