GAU-α

A Transformer model based on the Gated Attention Unit (preview release)

Introduction
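The Gated Attention Unit (GAU), introduced in "Transformer Quality in Linear Time" (Hua et al., 2022), fuses the attention layer and the gated feed-forward layer into a single block: a cheap shared projection produces queries and keys, and the attended values are element-wise gated before the output projection. The NumPy sketch below is only illustrative, not this repository's implementation: it assumes a softmax-normalized attention variant (the original paper uses squared ReLU), omits rotary position embeddings, and all names and weight shapes are made up for the example.

```python
import numpy as np

def swish(x):
    # Swish / SiLU activation used throughout GAU
    return x / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gau(x, Wu, Wv, Wz, Wo, gq, bq, gk, bk):
    """One Gated Attention Unit over a sequence x of shape (T, d).

    Illustrative parameter shapes (not the repo's actual layout):
      Wu, Wv: (d, e) expanded gate/value projections (e.g. e = 2d)
      Wz:     (d, s) shared low-dimensional projection (e.g. s = 128)
      Wo:     (e, d) output projection
      gq, bq, gk, bk: (s,) per-dimension scale/offset pairs that
        derive queries and keys from the same z, instead of two
        separate full projection matrices.
    """
    u = swish(x @ Wu)            # gate branch, (T, e)
    v = swish(x @ Wv)            # value branch, (T, e)
    z = swish(x @ Wz)            # shared representation, (T, s)
    q = z * gq + bq              # cheap query head, (T, s)
    k = z * gk + bk              # cheap key head, (T, s)
    a = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (T, T) attention
    return (u * (a @ v)) @ Wo    # gate ⊙ attended values, back to (T, d)
```

Because one GAU block does the work of both attention and FFN, a GAU-based Transformer can use a thinner per-layer attention while stacking the same overall depth.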

Evaluation

Classification results on the CLUE benchmark

| Model | iflytek | tnews | afqmc | cmnli | ocnli | wsc | csl |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BERT | 60.06 | 56.80 | 72.41 | 79.56 | 73.93 | 78.62 | 83.93 |
| RoBERTa | 60.64 | 58.06 | 74.05 | 81.24 | 76.00 | 87.50 | 84.50 |
| RoFormer | 60.91 | 57.54 | 73.52 | 80.92 | 76.07 | 86.84 | 84.63 |
| RoFormerV2<sup>*</sup> | 60.87 | 56.54 | 72.75 | 80.34 | 75.36 | 80.92 | 84.67 |
| GAU-α | 61.41 | 57.76 | 74.17 | 81.82 | 75.86 | 79.93 | 85.67 |

Reading comprehension and NER results on the CLUE benchmark

| Model | cmrc2018 | c3 | chid | cluener |
| --- | --- | --- | --- | --- |
| BERT | 56.17 | 60.54 | 85.69 | 79.45 |
| RoBERTa | 56.54 | 67.66 | 86.71 | 79.47 |
| RoFormer | 56.26 | 67.24 | 86.57 | 79.72 |
| RoFormerV2<sup>*</sup> | 57.91 | 64.62 | 85.09 | 81.08 |
| GAU-α | 58.09 | 68.24 | 87.91 | 80.01 |

Usage

Requires bert4keras>=0.11.3. Reference code:

```python
from bert4keras.models import build_transformer_model
from models import GAU_alpha  # models.py in this repository

# config_path and checkpoint_path point to the config file and
# checkpoint of the downloaded pretrained weights
gau_model = build_transformer_model(
    config_path=config_path,
    checkpoint_path=checkpoint_path,
    model=GAU_alpha,
)
```

Download

Citation

BibTeX:

@techreport{gau-alpha,
  title={GAU-α: GAU-based Transformers for NLP - ZhuiyiAI},
  author={Jianlin Su and Shengfeng Pan and Bo Wen and Yunfeng Liu},
  year={2022},
  url={https://github.com/ZhuiyiTechnology/GAU-alpha},
}

Contact