
ERNIE ("Wenxin" large models) is Baidu's family of industrial-grade, knowledge-enhanced large models, covering both NLP and cross-modal large models. In March 2019, Baidu released ERNIE 1.0, the first open-source pre-trained model in China; since then the project has achieved a series of technical breakthroughs in language and cross-modal understanding and generation, and has open-sourced a series of models to support large-model research and industrial applications. Note: the legacy ERNIE code has been moved to the repro branch; we recommend developing with the newly upgraded ERNIE toolkit, which combines dynamic and static graphs. You are also welcome to try EasyDL and BML for richer functionality. [Learn more]

Open-Source Roadmap

[Figure: ERNIE_milestone_20210519_zh — timeline of ERNIE open-source milestones]

Environment Setup

  1. Install the environment dependencies: see the Environment Setup guide.
  2. Install the ERNIE toolkit (a dependency-install sketch follows the clone command below):
git clone https://github.com/PaddlePaddle/ERNIE.git
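Inside the cloned repo, install the Python dependencies. This is a minimal sketch, assuming a requirements.txt at the repo root; choose the PaddlePaddle build (CPU or GPU) that matches your machine:

cd ERNIE
# install the PaddlePaddle framework (CPU build shown; use a GPU build to train on GPU)
pip install paddlepaddle
# install the toolkit's remaining dependencies (assumes requirements.txt exists)
pip install -r requirements.txt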

Quick Start: Training with the ERNIE Models

Download a Model

# download the ERNIE 3.0 model
# enter the models_hub directory
cd ./applications/models_hub
# run the download script
sh download_ernie_3.0_base_ch.sh
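The script downloads and unpacks the pretrained parameters and vocabulary under models_hub; the exact directory name depends on the script:

# confirm the model directory was created (its name depends on the download script)
ls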

Prepare the Data

# enter the text classification task directory
cd ./applications/tasks/text_classification/
# list the datasets bundled with the text classification task
ls ./data
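To confirm the expected input format before training, peek at a few samples. The file path below is illustrative; use whatever ls ./data actually shows:

# inspect a few training samples (hypothetical path); classification data is
# typically one example per line, with text and label separated by a tab
head -n 5 ./data/train_data/train.txt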

Configure the Training JSON File

# view the config that fine-tunes the ERNIE 3.0 pretrained model on text classification
cat ./examples/cls_ernie_fc_ch.json
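Since the config is plain JSON, the Python standard-library pretty-printer is a convenient way to skim its reader, model, and trainer sections:

# pretty-print the training config for easier reading
python -m json.tool ./examples/cls_ernie_fc_ch.json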

Start Training

# ERNIE Chinese text classification model
# trains a preset network driven by the config file ./examples/cls_ernie_fc_ch.json
python run_trainer.py --param_path ./examples/cls_ernie_fc_ch.json
# multi-GPU training via PaddlePaddle's fleetrun launcher (replace x,y with your GPU ids)
fleetrun --gpus=x,y run_trainer.py --param_path ./examples/cls_ernie_fc_ch.json
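To pin the single-process run to one specific card, the standard CUDA device mask works; this is generic CUDA behavior rather than a toolkit-specific flag:

# run on GPU 0 only
CUDA_VISIBLE_DEVICES=0 python run_trainer.py --param_path ./examples/cls_ernie_fc_ch.json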

Configure the Prediction JSON File

For prediction, only a few fields of the task config are overridden: the input data path, the exported inference model to load, and the output file:

{
  "dataset_reader": {
    "train_reader": { "config": { "data_path": "./data/predict_data" } }
  },
  "inference": {
    "inference_model_path": "./output/cls_ernie_fc_ch/save_inference_model/inference_step_251",
    "output_path": "./output/predict_result.txt"
  }
}
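The inference_step_251 directory name comes from one particular training run; before editing the config, list which step directories your own run actually saved (path taken from the config above):

# list the exported inference models; use a step directory that exists in your run
ls ./output/cls_ernie_fc_ch/save_inference_model/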

Start Prediction

python run_infer.py --param_path ./examples/cls_ernie_fc_ch_infer.json
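Predictions are written to the output_path configured above; a quick sanity check:

# show the first few prediction lines
head ./output/predict_result.txt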

Introduction to the Pretrained Models

# enter the pretrained model download directory
cd ./applications/models_hub
# download the ERNIE 3.0 Base model
sh download_ernie_3.0_base_ch.sh
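The other pretrained models are fetched the same way. Judging from the two scripts shown above, each model ships with its own download_<model>.sh in models_hub; to list what is available:

# list all model download scripts (naming pattern inferred from the examples above)
ls download_*.sh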

Dataset Downloads

CLUE dataset

DuIE 2.0 dataset

MSRA_NER dataset

Model Evaluation

Evaluation Datasets

CLUE evaluation results:

| Config | Model | CLUEWSC2020 | IFLYTEK | TNEWS | AFQMC | CMNLI | CSL | OCNLI | Avg |
|---|---|---|---|---|---|---|---|---|---|
| 24L1024H | RoBERTa-wwm-ext-large | 90.79 | 62.02 | 59.33 | 76.00 | 83.88 | 83.67 | 78.81 | 76.36 |
| 20L1024H | ERNIE 3.0-XBase | 91.12 | 62.22 | 60.34 | 76.95 | 84.98 | 84.27 | 82.07 | 77.42 |
| 12L768H | RoBERTa-wwm-ext-base | 88.55 | 61.22 | 58.08 | 74.75 | 81.66 | 81.63 | 77.25 | 74.73 |
| 12L768H | ERNIE 3.0-Base | 88.18 | 60.72 | 58.73 | 76.53 | 83.65 | 83.30 | 80.31 | 75.63 |
| 6L768H | RBT6, Chinese | 75.00 | 59.68 | 56.62 | 73.15 | 79.26 | 80.04 | 73.15 | 70.99 |
| 6L768H | ERNIE 3.0-Medium | 79.93 | 60.14 | 57.16 | 74.56 | 80.87 | 81.23 | 77.02 | 72.99 |

Evaluation Methodology

  1. All of the tasks above were tuned with grid search. For classification tasks, the dev set is evaluated every 100 training steps, and the best dev-set result is the number reported in the tables.
  2. Grid-search ranges for classification tasks: batch_size in {16, 32, 64}; learning rate in {1e-5, 2e-5, 3e-5, 5e-5}. Because CLUEWSC2020 is a small dataset, results on it are sensitive to batch_size, so batch_size = 8 was added to its search space. Because CLUEWSC2020 and IFLYTEK are sensitive to the dropout probability, dropout_prob = 0.0 was added to the search space for those two datasets. (A loop sketch of this grid follows below.)
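As a concrete illustration of the grid above, the search can be driven by a loop that rewrites the training config once per combination. This is only a sketch: the two jq key paths are assumptions about the config layout (the real key names may differ), and jq must be installed:

# enumerate the classification grid: batch_size x learning_rate
for bsz in 16 32 64; do
  for lr in 1e-5 2e-5 3e-5 5e-5; do
    # write a per-combination config; both jq key paths are hypothetical
    jq ".dataset_reader.train_reader.config.batch_size = ${bsz} | .trainer.learning_rate = ${lr}" \
      ./examples/cls_ernie_fc_ch.json > ./examples/cls_bsz${bsz}_lr${lr}.json
    python run_trainer.py --param_path ./examples/cls_bsz${bsz}_lr${lr}.json
  done
done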

Fixed Hyperparameters for Downstream Tasks

Classification and matching tasks:

| Task | AFQMC | TNEWS | IFLYTEK | CMNLI | OCNLI | CLUEWSC2020 | CSL |
|---|---|---|---|---|---|---|---|
| epoch | 3 | 3 | 3 | 2 | 5 | 50 | 5 |
| max_seq_length | 128 | 128 | 128 | 128 | 128 | 128 | 256 |
| warmup_proportion | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |

Best Grid-Search Hyperparameters for the ERNIE Models

| Model | AFQMC | TNEWS | IFLYTEK | CMNLI | OCNLI | CLUEWSC2020 | CSL |
|---|---|---|---|---|---|---|---|
| ERNIE 3.0-Medium | bsz_32_lr_2e-05 | bsz_16_lr_3e-05 | bsz_16_lr_5e-05 | bsz_16_lr_1e-05 / bsz_64_lr_2e-05 | bsz_64_lr_2e-05 | bsz_8_lr_2e-05 | bsz_32_lr_1e-05 |
| ERNIE 3.0-Base | bsz_16_lr_2e-05 | bsz_64_lr_3e-05 | bsz_16_lr_5e-05 | bsz_16_lr_2e-05 | bsz_16_lr_2e-05 | bsz_8_lr_2e-05 (dropout 0.1) | bsz_16_lr_3e-05 |
| ERNIE 3.0-XBase | bsz_16_lr_1e-05 | bsz_16_lr_2e-05 | bsz_16_lr_3e-05 | bsz_16_lr_1e-05 | bsz_32_lr_2e-05 | bsz_8_lr_2e-05 | bsz_64_lr_1e-05 |

Application Scenarios

Text Classification

Text Matching

Sequence Labeling

Information Extraction

Text Generation

Image-Text Matching

Data Distillation

Tool Usage

Citations

ERNIE 1.0

@article{sun2019ernie,
  title={Ernie: Enhanced representation through knowledge integration},
  author={Sun, Yu and Wang, Shuohuan and Li, Yukun and Feng, Shikun and Chen, Xuyi and Zhang, Han and Tian, Xin and Zhu, Danxiang and Tian, Hao and Wu, Hua},
  journal={arXiv preprint arXiv:1904.09223},
  year={2019}
}

ERNIE 2.0

@inproceedings{sun2020ernie,
  title={Ernie 2.0: A continual pre-training framework for language understanding},
  author={Sun, Yu and Wang, Shuohuan and Li, Yukun and Feng, Shikun and Tian, Hao and Wu, Hua and Wang, Haifeng},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={05},
  pages={8968--8975},
  year={2020}
}

ERNIE-GEN

@article{xiao2020ernie,
  title={Ernie-gen: An enhanced multi-flow pre-training and fine-tuning framework for natural language generation},
  author={Xiao, Dongling and Zhang, Han and Li, Yukun and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2001.11314},
  year={2020}
}

ERNIE-ViL

@article{yu2020ernie,
  title={Ernie-vil: Knowledge enhanced vision-language representations through scene graph},
  author={Yu, Fei and Tang, Jiji and Yin, Weichong and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2006.16934},
  year={2020}
}

ERNIE-Gram

@article{xiao2020erniegram,
  title={ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding},
  author={Xiao, Dongling and Li, Yu-Kun and Zhang, Han and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2010.12148},
  year={2020}
}

ERNIE-Doc

@article{ding2020ernie,
  title={ERNIE-Doc: A retrospective long-document modeling transformer},
  author={Ding, Siyu and Shang, Junyuan and Wang, Shuohuan and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2012.15688},
  year={2020}
}

ERNIE-UNIMO

@article{li2020unimo,
  title={Unimo: Towards unified-modal understanding and generation via cross-modal contrastive learning},
  author={Li, Wei and Gao, Can and Niu, Guocheng and Xiao, Xinyan and Liu, Hao and Liu, Jiachen and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2012.15409},
  year={2020}
}

ERNIE-M

@article{ouyang2020ernie,
  title={Ernie-m: Enhanced multilingual representation by aligning cross-lingual semantics with monolingual corpora},
  author={Ouyang, Xuan and Wang, Shuohuan and Pang, Chao and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2012.15674},
  year={2020}
}