Code for "Self-generated Replay Memories for Continual Neural Machine Translation"

Requirements

Optional: wandb

A note about dictionaries: pyenchant works with different providers, so be sure to check the requirements to install additional dictionaries. For example, with the enchant provider, if the USER_CONFIG_DIR/PROVIDER_NAME/ directory is set up, you can add Hunspell dictionaries (.dic and .aff files) to it and they will be loaded automatically. Refer to this thread for more info: https://github.com/pyenchant/pyenchant/issues/167
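
For example, on Linux the enchant user configuration directory is typically ~/.config/enchant (an assumption, check your platform and provider), so a minimal sketch for adding an English Hunspell dictionary and verifying that pyenchant sees it could look like:

# Assumed location of enchant's user config dir on Linux; adjust for your platform/provider.
mkdir -p ~/.config/enchant/hunspell
cp en_US.dic en_US.aff ~/.config/enchant/hunspell/

# Check that pyenchant now lists the language.
python -c "import enchant; print(enchant.list_languages())"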

Setup

  1. Create a data directory and a model directory:

    • data directory: Where the datasets will be saved.
    • model directory: Where the trained models will be stored.
  2. Make a copy of env.example and save it as env. In env, set DATA_DIR to the data directory and MODEL_BASE_DIR to the model directory (see the sketch below).
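
A minimal sketch of these two steps, where /path/to/data and /path/to/models are placeholders for the directories you created:

mkdir -p /path/to/data /path/to/models
cp env.example env

# Then edit env so that it contains (placeholder paths):
# DATA_DIR=/path/to/data
# MODEL_BASE_DIR=/path/to/models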

Experiments

The first argument is the model subdirectory name; the second is the value of CUDA_VISIBLE_DEVICES.

Note: The scripts are set to use the iwslt2017 dataset. To use the unpc dataset, change the --dataset_name flag to unpc in the scripts and adjust the language pairs accordingly.
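
As an illustration (the model subdirectory name and flag values are placeholders for this sketch), a run on GPU 0 with the unpc dataset using generic_entrypoint.sh, described in the next section, could look like:

# First argument: model subdirectory; second: value for CUDA_VISIBLE_DEVICES;
# third: the strategy module (see "Changing default parameters" below).
./generic_entrypoint.sh my_unpc_model 0 src.strategies.ewc_cill_unified --dataset_name unpc --lang_pairs en-fr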

Changing default parameters

Below are some of the most common command-line flags.

To use them, run generic_entrypoint.sh, specifying the strategy you want to use as the third argument. Strategies are located under src/strategies.

Example:

./generic_entrypoint.sh model_directory 0 src.strategies.ewc_cill_unified --train_epochs 10 --dataset_name unpc

will run EWC training with the specified arguments.

• model_save_path: Where to save the models
• dataset_save_path: Folder of the datasets. If no data is present, it will be downloaded
• dataset_name: Name of the dataset (iwslt2017 or unpc)
• train_epochs: Number of training epochs for all tasks
• lang_pairs: List of translation pairs for the various experiences, e.g. en-fr en-ro
• replay_memory: Number of samples for the memory
• pairs_in_experience: Number of translation pairs in each experience
• metric_for_best_model: Evaluation metric used to select the best model
• batch_size: Batch size for all tasks
• save_steps: Interval (in steps) between model saves
• eval_steps: Interval (in steps) between evaluations
• fp16: Use float16 precision
• early_stopping: Patience parameter for early stopping
• ewc_lambda: Lambda parameter for the EWC strategy
• agem_sample_size: Size of the AGEM samples during optimization
• agem_pattern_per_exp: Number of examples to sample to populate the AGEM memory buffer
• logging_dir: Directory to store logs
• bidirectional: Defaults to True. If True, the reverse direction is also included, e.g. with --lang_pairs en-fr the model is also trained on fr-en

Check individual strategies for more options.
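
As a sketch of combining several of these flags in one run (the values are illustrative placeholders, not recommended settings):

./generic_entrypoint.sh model_directory 0 src.strategies.ewc_cill_unified \
    --dataset_name iwslt2017 \
    --lang_pairs en-fr en-ro \
    --train_epochs 10 \
    --batch_size 16 \
    --eval_steps 1000 \
    --save_steps 1000 \
    --ewc_lambda 0.1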

Evaluation

Models are evaluated on the validation set during training. At the end of the training phase, the best model is evaluated on the test set. During training, the BLEU score is used as the evaluation metric, as it is the most common metric for machine translation and cheaper to compute than COMET.

To evaluate a model on the test set, use the eval_models.py script in the src/utils folder and pass the same command-line arguments used to train the model. The script will evaluate the best model using BLEU and COMET.
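
A sketch of such an invocation (flag values are placeholders and should mirror the ones used for training; depending on the repository layout the script may need to be invoked as a module instead):

python src/utils/eval_models.py \
    --model_save_path model_directory \
    --dataset_name iwslt2017 \
    --lang_pairs en-fr en-ro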

Docker

We provide a template Dockerfile to build an image with all the dependencies needed to run the experiments. Check the Dockerfile for more info.
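
A minimal sketch of building and running the image (the image tag and mount targets are arbitrary placeholders; GPU access via --gpus requires the NVIDIA Container Toolkit):

# Build the image from the provided Dockerfile at the repository root.
docker build -t cnmt-replay .

# Run it, mounting the data and model directories created during setup.
docker run --rm --gpus all \
    -v /path/to/data:/workspace/data \
    -v /path/to/models:/workspace/models \
    cnmt-replay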

Citation

If you find this code useful, please consider citing our work:

@misc{resta2024selfgenerated,
      title={Self-generated Replay Memories for Continual Neural Machine Translation}, 
      author={Michele Resta and Davide Bacciu},
      year={2024},
      eprint={2403.13130},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}