Awesome

中文 | English

<p align="center"> <br> <img src="./banner.png" width="500"/> <br> </p>

A collection of resources from the Joint Laboratory of HIT and iFLYTEK Research (HFL).

Pre-trained Language Model

| Name | Description |
| --- | --- |
| VLE | Multimodal Vision-Language Encoder |
| MiniRBT | Chinese MiniRBT models (a series of small pre-trained models) |
| LERT | Chinese LERT models (small-level, base-level, large-level) |
| PERT | Chinese and English PERT models (base-level, large-level) |
| Chinese-MobileBERT | Chinese MobileBERT (base-level, large-level) (for archival purposes only) |
| CINO | Pre-trained Language Models for Chinese Minority Languages |
| MacBERT | Chinese pre-trained MacBERT models (MacBERT-base, MacBERT-large) |
| CharBERT | English pre-trained CharBERT models |
| Chinese-ELECTRA | Chinese pre-trained ELECTRA models (ELECTRA-base, ELECTRA-small) with code support for six tasks: CMRC 2018, DRCD, XNLI, ChnSentiCorp, LCQMC, BQCorpus |
| Chinese-XLNet | Chinese pre-trained XLNet models: XLNet-mid, XLNet-base |
| Chinese-BERT-wwm | Chinese BERT with Whole Word Masking (wwm), including BERT-wwm, BERT-wwm-ext, RoBERTa-wwm-ext, RoBERTa-wwm-ext-large, RBT3, RBTL3 |

Dataset

| Name | Type | Paper |
| --- | --- | --- |
| CCTC | Text Correction | Wang et al., 2022 |
| CTC 2021 | Text Correction | Wang et al., 2022 |
| ExpMRC | Reading Comprehension | Cui et al., 2021 |
| AdvRACE | Reading Comprehension | Si et al., 2020 |
| CMRC 2019 | Reading Comprehension | Cui et al., 2020 |
| CJRC | Reading Comprehension | Duan et al., 2019 |
| CMRC 2018 | Reading Comprehension | Cui et al., 2019 |
| CMRC 2017 | Reading Comprehension | Cui et al., 2018 |
| PD&CFT | Reading Comprehension | Cui et al., 2016 |

Toolkit

| Name | Description | Paper |
| --- | --- | --- |
| TextPruner | Model Pruning for NLP | Yang et al., 2022 |
| TextBrewer | Knowledge Distillation for NLP | Yang et al., 2020 |

System Demonstration

| Name | Description | Paper |
| --- | --- | --- |
| IFlyEA | A Chinese Essay Assessment System with Automated Rating, Review Generation, and Recommendation | Gong et al., 2021 |
| iFLYChecker | A Chinese Grammar Checking System | - |
| IFlyLegal | A Chinese Legal System for Consultation & Law Searching | Wang et al., 2019 |

Evaluation Campaign

| Name | Description | Live Leaderboard |
| --- | --- | --- |
| CMRC 2022 | Explainable Reading Comprehension | |
| CTC 2021 | Chinese Text Correction | |
| CAIL 2020 | Judiciary Reading Comprehension | |
| CMRC 2019 | Sentence Cloze Reading Comprehension | |
| CAIL 2019 | Judiciary Reading Comprehension | |
| CMRC 2018 | Span-Extraction Reading Comprehension | |
| CMRC 2017 | Cloze-style Reading Comprehension | |

Paper

| Year | Paper | Author List | Published in | Note |
| --- | --- | --- | --- | --- |
| 2022 | Visualizing Attention Zones in Machine Reading Comprehension Models | Yiming Cui, Wei-Nan Zhang, Ting Liu | STAR Protocols | GitHub |
| 2022 | Multilingual Multi-Aspect Explainability Analyses on Machine Reading Comprehension Models | Yiming Cui, Wei-Nan Zhang, Wanxiang Che, Ting Liu, Zhigang Chen, Shijin Wang | iScience | GitHub |
| 2021 | ExpMRC: Explainability Evaluation for Machine Reading Comprehension | Yiming Cui, Ting Liu, Wanxiang Che, Zhigang Chen, Shijin Wang | Heliyon | GitHub |
| 2022 | Teaching Machines to Read, Answer and Explain | Yiming Cui, Ting Liu, Wanxiang Che, Zhigang Chen, Shijin Wang | IEEE/ACM TASLP | |
| 2022 | PERT: Pre-training BERT with Permuted Language Model | Yiming Cui, Ziqing Yang, Ting Liu | | GitHub |
| 2022 | A Static and Dynamic Attention Framework for Multi Turn Dialogue Generation | Wei-Nan Zhang, Yiming Cui, Kaiyan Zhang, Yifa Wang, Qingfu Zhu, Lingzhi Li, Ting Liu | ACM TOIS | |
| 2022 | Cross-Lingual Text Classification with Multilingual Distillation and Zero-Shot-Aware Training | Ziqing Yang, Yiming Cui, Zhigang Chen, Shijin Wang | | |
| 2022 | CINO: A Chinese Minority Pre-trained Language Model | Ziqing Yang, Zihang Xu, Yiming Cui, Baoxin Wang, Min Lin, Dayong Wu, Zhigang Chen | | GitHub |
| 2022 | HFL at SemEval-2022 Task 8: A Linguistics-inspired Regression Model with Data Augmentation for Multilingual News Similarity | Zihang Xu, Ziqing Yang, Yiming Cui, Zhigang Chen | SemEval 2022 | GitHub |
| 2022 | HIT at SemEval-2022 Task 2: Pre-trained Language Model for Idioms Detection | Zheng Chu, Ziqing Yang, Yiming Cui, Zhigang Chen, Ming Liu | SemEval 2022 | |
| 2022 | TextPruner: A Model Pruning Toolkit for Pre-trained Language Models | Ziqing Yang, Yiming Cui, Zhigang Chen | ACL 2022 Demo | GitHub |
| 2022 | Interactive Gated Decoder for Machine Reading Comprehension | Yiming Cui, Wanxiang Che, Ziqing Yang, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu | ACM TALLIP | |
| 2021 | IFlyEA: A Chinese Essay Assessment System with Automated Rating, Review Generation, and Recommendation | Jiefu Gong, Xiao Hu, Wei Song, Ruiji Fu, Zhichao Sheng, Bo Zhu, Shijin Wang, Ting Liu | ACL 2021 Demo | |
| 2021 | Dynamic Connected Networks for Chinese Spelling Check | Baoxin Wang, Wanxiang Che, Dayong Wu, Shijin Wang, Guoping Hu, Ting Liu | Findings of ACL 2021 | |
| 2021 | Various Legal Factors Extraction Based on Machine Reading Comprehension | Beichen Wang, Ziyue Wang, Baoxin Wang, Dayong Wu, Zhigang Chen, Shijin Wang, Guoping Hu | CCIR 2021 | |
| 2021 | Improving Automated Chinese Essay Scoring with Deep Linguistic Analysis (利用深层语言分析改进中文作文自动评分方法) | Si Wei, Jiefu Gong, Wei Song, Ziyao Song, Shijin Wang | Journal of Chinese Information Processing (中文信息学报) | |
| 2021 | Bilingual Alignment Pre-training for Zero-shot Cross-lingual Transfer | Ziqing Yang, Wentao Ma, Yiming Cui, Jiani Ye, Wanxiang Che, Shijin Wang | MRQA 2021 | |
| 2021 | Adversarial Training for Machine Reading Comprehension with Virtual Embeddings | Ziqing Yang, Yiming Cui, Chenglei Si, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu | \*SEM 2021 | |
| 2021 | Pre-Training with Whole Word Masking for Chinese BERT | Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang | IEEE/ACM TASLP | GitHub1, GitHub2 |
| 2021 | Benchmarking Robustness of Machine Reading Comprehension Models | Chenglei Si, Ziqing Yang, Yiming Cui, Wentao Ma, Ting Liu, Shijin Wang | Findings of ACL 2021 | GitHub |
| 2020 | A Sentence Cloze Dataset for Chinese Machine Reading Comprehension | Yiming Cui, Ting Liu, Ziqing Yang, Zhipeng Chen, Wentao Ma, Wanxiang Che, Shijin Wang, Guoping Hu | COLING 2020 | GitHub |
| 2020 | CharBERT: Character-aware Pre-trained Language Model | Wentao Ma, Yiming Cui, Chenglei Si, Ting Liu, Shijin Wang, Guoping Hu | COLING 2020 | GitHub |
| 2020 | Revisiting Pre-Trained Models for Chinese Natural Language Processing | Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu | Findings of EMNLP 2020 | GitHub |
| 2020 | Is Graph Structure Necessary for Multi-hop Question Answering? | Nan Shao, Yiming Cui, Ting Liu, Shijin Wang, Guoping Hu | EMNLP 2020 | - |
| 2020 | TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing | Ziqing Yang, Yiming Cui, Zhipeng Chen, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu | ACL 2020 Demo | GitHub |
| 2020 | Conversational Word Embedding for Retrieval-based Dialog System | Wentao Ma, Yiming Cui, Ting Liu, Dong Wang, Shijin Wang, Guoping Hu | ACL 2020 | GitHub |
| 2020 | Discriminative Sentence Modeling for Story Ending Prediction | Yiming Cui, Wanxiang Che, Wei-Nan Zhang, Ting Liu, Shijin Wang, Guoping Hu | AAAI 2020 | - |
| 2019 | Cross-Lingual Machine Reading Comprehension | Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu | EMNLP 2019 | GitHub |
| 2019 | A Span-Extraction Dataset for Chinese Machine Reading Comprehension | Yiming Cui, Ting Liu, Wanxiang Che, Li Xiao, Zhipeng Chen, Wentao Ma, Shijin Wang, Guoping Hu | EMNLP 2019 | GitHub |
| 2019 | IFlyLegal: A Chinese Legal System for Consultation, Law Searching, and Document Analysis | Ziyue Wang, Baoxin Wang, Xingyi Duan, Dayong Wu, Shijin Wang, Guoping Hu, Ting Liu | EMNLP 2019 Demo | - |
| 2019 | TripleNet: Triple Attention Network for Multi-Turn Response Selection in Retrieval-based Chatbots | Wentao Ma, Yiming Cui, Nan Shao, Su He, Wei-Nan Zhang, Ting Liu, Shijin Wang, Guoping Hu | CoNLL 2019 | GitHub |
| 2019 | Improving Machine Reading Comprehension via Adversarial Training | Ziqing Yang, Yiming Cui, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu | - | - |
| 2019 | Contextual Recurrent Units for Cloze-style Reading Comprehension | Yiming Cui, Wei-Nan Zhang, Wanxiang Che, Ting Liu, Zhipeng Chen, Shijin Wang, Guoping Hu | - | - |
| 2019 | CJRC: A Reliable Human-Annotated Benchmark DataSet for Chinese Judicial Reading Comprehension | Xingyi Duan, Baoxin Wang, Ziyue Wang, Wentao Ma, Yiming Cui, Dayong Wu, Shijin Wang, Ting Liu, Tianxiang Huo, Zhen Hu, Heng Wang, Zhiyuan Liu | CCL 2019 | GitHub |
| 2019 | Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions | Zhipeng Chen, Yiming Cui, Wentao Ma, Shijin Wang, Guoping Hu | AAAI 2019 | - |
| 2018 | Disconnected Recurrent Neural Networks for Text Categorization | Baoxin Wang | ACL 2018 | - |
| 2018 | HFL-RC System at SemEval-2018 Task 11: Hybrid Multi-Aspects Model for Commonsense Reading Comprehension | Zhipeng Chen, Yiming Cui\*, Wentao Ma, Shijin Wang, Ting Liu, Guoping Hu | - | - |
| 2018 | Dataset for the First Evaluation on Chinese Machine Reading Comprehension | Yiming Cui, Ting Liu, Zhipeng Chen, Wentao Ma, Shijin Wang, Guoping Hu | LREC 2018 | GitHub |
| 2018 | Chinese Grammatical Error Diagnosis using Statistical and Prior Knowledge driven Features with Probabilistic Ensemble Enhancement | Ruiji Fu, Zhengqi Pei, Jiefu Gong, Wei Song, Dechuan Teng, Wanxiang Che, Shijin Wang, Guoping Hu, Ting Liu | NLP-TEA@ACL 2018 | - |
| 2017 | Recognizing Elegant Sentences for Automated Essay Scoring (面向作文自动评分的优美句识别) | Ruiji Fu, Dong Wang, Shijin Wang, Guoping Hu, Ting Liu | Journal of Chinese Information Processing (中文信息学报) | - |
| 2017 | Attention-over-Attention Neural Networks for Reading Comprehension | Yiming Cui, Zhipeng Chen, Si Wei, Shijin Wang, Ting Liu, Guoping Hu | ACL 2017 | - |
| 2017 | Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution | Ting Liu, Yiming Cui, Qingyu Yin, Wei-Nan Zhang, Shijin Wang, Guoping Hu | ACL 2017 | - |
| 2016 | Consensus Attention-based Neural Networks for Chinese Reading Comprehension | Yiming Cui, Ting Liu, Zhipeng Chen, Shijin Wang, Guoping Hu | COLING 2016 | GitHub |
| 2016 | LSTM Neural Reordering Feature for Statistical Machine Translation | Yiming Cui, Shijin Wang, Jianfeng Li | NAACL 2016 | - |

Follow Us

Follow our official WeChat account to stay up to date with our latest technologies!