Home

Awesome

am-transition-parser

A top-down transition-based AM dependency parser with quadratic run time complexity and well-typedness guarantees. Implemented are the LTF and LTL transition systems and unconstrained versions for normal dependency parsing.

Steps for setting up

Training a model

Select an appropriate configuration file (e.g. training_configs/bert/DM.jsonnet) and run the following command:

python -m allenpipeline train <your config.jsonnet> -s models/<your model name> --include-package topdown_parser

You can use other command line arguments as well, see python -m allenpipeline train --help, in particular you can select the cuda device as follows: -o '{trainer : {cuda_device : 0 } }'.

You can train an almost minimal configuration with the provided example AM dependency trees as follows:

python -m allenpipeline train configs/example_config.jsonnet -s models/example-model --include-package topdown_parser

Parsing

There are different ways to parse, depending on what you want.

Directory structure

By default, we assume the following directory structure. If you want to parse only the test set, you only need the lexicon subfolder, the lookup subfolder (only for AMR) and the test* subfolders.

data/
├── AMR
│   ├── 2015
│   │   ├── dev
│   │   │   ├── dev.amconll
│   │   │   └── goldAMR.txt
│   │   ├── gold-dev
│   │   │   └── gold-dev.amconll
│   │   ├── lexicon
│   │   │   ├── constants.txt
│   │   │   ├── edges.txt
│   │   │   ├── lex_labels.txt
│   │   │   └── types.txt
│   │   ├── lookup
│   │   │   ├── nameLookup.txt
│   │   │   ├── nameTypeLookup.txt
│   │   │   ├── README.txt
│   │   │   ├── wikiLookup.txt
│   │   │   └── words2labelsLookup.txt
│   │   ├── test
│   │   │   ├── goldAMR.txt
│   │   │   └── test.amconll
│   │   └── train
│   │       └── train.amconll
│   └── 2017
│       ├── dev
│       │   ├── dev.amconll
│       │   └── goldAMR.txt
│       ├── gold-dev
│       │   └── gold-dev.amconll
│       ├── lexicon
│       │   ├── constants.txt
│       │   ├── edges.txt
│       │   ├── lex_labels.txt
│       │   └── types.txt
│       ├── lookup
│       │   ├── nameLookup.txt
│       │   ├── nameTypeLookup.txt
│       │   ├── wikiLookup.txt
│       │   └── words2labelsLookup.txt
│       ├── test
│       │   ├── goldAMR.txt
│       │   └── test.amconll
│       └── train
│           └── train.amconll
├── EDS
│   ├── dev
│   │   ├── dev.amconll
│   │   ├── dev-gold
│   │   ├── dev-gold.amr.txt
│   │   └── dev-gold.edm
│   ├── gold-dev
│   │   └── gold-dev.amconll
│   ├── lexicon
│   │   ├── constants.txt
│   │   ├── edges.txt
│   │   ├── lex_labels.txt
│   │   └── types.txt
│   ├── README.txt
│   ├── test
│   │   ├── test
│   │   ├── test.amconll
│   │   ├── test-gold
│   │   ├── test-gold.amr.txt
│   │   └── test-gold.edm
│   └── train
│       ├── train.amconll
└── SemEval
    └── 2015
        ├── DM
        │   ├── dev
        │   │   ├── dev.amconll
        │   │   └── dev.sdp
        │   ├── gold-dev
        │   │   └── gold-dev.amconll
        │   ├── lexicon
        │   │   ├── constants.txt
        │   │   ├── edges.txt
        │   │   ├── lex_labels.txt
        │   │   └── types.txt
        │   ├── test.id
        │   │   ├── en.id.dm.sdp
        │   │   └── test.id.amconll
        │   ├── test.ood
        │   │   ├── en.ood.dm.sdp
        │   │   └── test.ood.amconll
        │   └── train
        │       └── train.amconll
        ├── PAS
        │   ├── dev
        │   │   ├── dev.amconll
        │   │   └── dev.sdp
        │   ├── gold-dev
        │   │   └── gold-dev.amconll
        │   ├── lexicon
        │   │   ├── constants.txt
        │   │   ├── edges.txt
        │   │   ├── lex_labels.txt
        │   │   └── types.txt
        │   ├── test.id
        │   │   ├── en.id.pas.sdp
        │   │   └── test.id.amconll
        │   ├── test.ood
        │   │   ├── en.ood.pas.sdp
        │   │   └── test.ood.amconll
        │   └── train
        │       └── train.amconll
        └── PSD
            ├── dev
            │   ├── dev.amconll
            │   └── dev.sdp
            ├── gold-dev
            │   ├── gold-dev.amconll
            ├── lexicon
            │   ├── constants.txt
            │   ├── edges.txt
            │   ├── lex_labels.txt
            │   └── types.txt
            ├── test.id
            │   ├── en.id.psd.sdp
            │   └── test.id.amconll
            ├── test.ood
            │   ├── en.ood.psd.sdp
            │   └── test.ood.amconll
            └── train
                └── train.amconll


Uni Saarland internal notes

The lexica and some pre-trained models can be found in /proj/irtg.shadow/EMNLP20/transition_systems/. A conda environment is prepared already and it's called pytorch1.4.