SemBleu: A Robust Metric for AMR Parsing Evaluation

This repository accompanies our ACL 2019 paper, "SemBleu: A Robust Metric for AMR Parsing Evaluation".

Usage

chmod a+x eval.sh
./eval.sh output-file-path reference-file-path

As with Smatch, the AMRs in each file are separated by a single empty line, for example:

(a / ask-01 :ARG0 (b / boy) :ARG1 (q / question))

(a / answer-01 :ARG0 (g / girl) :ARG1 (q / question))
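Because AMRs are delimited by empty lines, a quick sanity check before scoring is to confirm that both files contain the same number of AMRs. The sketch below assumes the script pairs AMRs by position, as Smatch does. `count_amrs` and `check_amr_counts` are hypothetical helpers for illustration and are not part of this repository.

```shell
# Hypothetical helper (not part of this repository):
# count the AMRs in a file, treating empty lines as separators.
count_amrs() {
  # RS="" puts awk in paragraph mode, so each block of non-empty
  # lines (one AMR, possibly spanning several lines) is one record.
  awk 'BEGIN { RS = "" } END { print NR }' "$1"
}

# Hypothetical helper: compare the AMR counts of two files.
# $1 is the output file, $2 is the reference file.
check_amr_counts() {
  out_count=$(count_amrs "$1")
  ref_count=$(count_amrs "$2")
  if [ "$out_count" -ne "$ref_count" ]; then
    echo "AMR count mismatch: $out_count (output) vs $ref_count (reference)" >&2
    return 1
  fi
  echo "Both files contain $out_count AMRs"
}
```

For instance, `check_amr_counts output-file-path reference-file-path && ./eval.sh output-file-path reference-file-path` would run the evaluation only when the counts agree. Note that awk's paragraph mode only treats truly empty lines as separators, so a line containing just spaces would not split two AMRs.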

AMR data

If you are developing a new metric and would like a point of comparison, here are the 100 AMR graphs and the corresponding system outputs.

Results

The table below lists the SemBleu scores of recent SOTA work. The numbers are obtained by running our script on their provided outputs.

Model                            SemBleu

LDC2015E86
Lyu and Titov (ACL 2018)         58.7
Groschwitz et al. (ACL 2018)     51.8
Guo and Lu (EMNLP 2018)          50.4

LDC2016E25
Lyu and Titov (ACL 2018)         60.3
van Noord and Bos (CLIN 2017)    49.5

LDC2017T10
Zhang et al. (ACL 2019)          59.9
Cai and Lam (EMNLP 2019)         56.9
Groschwitz et al. (ACL 2018)     52.5
Guo and Lu (EMNLP 2018)          52.4