Gloss Attention for Gloss-free Sign Language Translation

This is the official implementation of the GASLT paper.

Environment

git clone https://github.com/YinAoXiong/GASLT
cd GASLT
conda env create -f env.yaml
conda activate gaslt

Datasets

For the RWTH-PHOENIX-Weather 2014 T dataset, we provide processed data for download.

Since the public link expires after a period of time, please contact me via email at yinaoxiong@zju.edu.cn for a new access link if it no longer works.

We do not have permission to redistribute the other datasets, so please follow the steps below to process them yourself.

Step 1: Download the raw data:

Step 2: Extract visual features:

Step 3: Pack the dataset:

Follow the format of the slt project to package the visual features: the Python list object is first serialized with pickle and then compressed with gzip.
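The packing step above can be sketched as follows. This is a minimal illustration of the pickle-then-gzip convention, not the project's actual packing script; the sample fields and values here are illustrative assumptions modeled on the slt sample schema.

```python
import gzip
import os
import pickle
import tempfile

# Each dataset split is a Python list of per-sample dicts
# (field names follow the slt project's convention; values are toy data).
samples = [
    {
        "name": "dev/example_video_id",  # hypothetical sample name
        "signer": "Signer01",
        "gloss": "WETTER MORGEN",
        "text": "und nun die wettervorhersage fuer morgen",
        "sign": [[0.0] * 1024] * 8,  # per-frame visual feature vectors
    }
]

out_path = os.path.join(tempfile.mkdtemp(), "phoenix14t.pami0.dev")

# Serialize the list with pickle, then gzip-compress the result.
with gzip.open(out_path, "wb") as f:
    pickle.dump(samples, f)

# Loading mirrors the process: gzip-decompress, then unpickle.
with gzip.open(out_path, "rb") as f:
    loaded = pickle.load(f)
```

Reading a packed split back therefore only requires `gzip.open` followed by `pickle.load`, as in the last two lines above.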

Step 4: Calculate Similarity Labels

We use the distiluse-base-multilingual-cased-v1 model from the Sentence-Transformers project to calculate the similarity between texts.
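The similarity computation can be sketched as below. In the real pipeline the embeddings come from the Sentence-Transformers model distiluse-base-multilingual-cased-v1 (e.g. `model.encode(texts)`); here toy vectors stand in so the cosine computation itself is clear, and the output file names mirror the sim folder layout (cos_sim.pkl, name_to_video_id.json). The video names are hypothetical placeholders.

```python
import json
import os
import pickle
import tempfile

import numpy as np

# Toy sentence embeddings, one row per translation text.
# In practice: embeddings = model.encode(texts) with the
# distiluse-base-multilingual-cased-v1 Sentence-Transformers model.
embeddings = np.array(
    [
        [1.0, 0.0, 0.0],
        [0.8, 0.6, 0.0],
        [0.0, 1.0, 0.0],
    ]
)

# Pairwise cosine similarity: L2-normalize the rows, then take the
# Gram matrix of the normalized vectors.
unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
cos_sim = unit @ unit.T

out_dir = tempfile.mkdtemp()

# Save the similarity matrix and a name-to-row-index mapping
# (hypothetical video names).
with open(os.path.join(out_dir, "cos_sim.pkl"), "wb") as f:
    pickle.dump(cos_sim, f)

name_to_video_id = {"video_000": 0, "video_001": 1, "video_002": 2}
with open(os.path.join(out_dir, "name_to_video_id.json"), "w") as f:
    json.dump(name_to_video_id, f)
```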

Training and Testing

First, make sure that the data folder in the project root is structured as follows,

data
└── pht
    ├── bpe
    │   ├── de.wiki.bpe.vs25000.d300.w2v.txt
    │   ├── de.wiki.bpe.vs25000.d300.w2v.txt.pt
    │   └── de.wiki.bpe.vs25000.model
    ├── data
    │   ├── phoenix14t.pami0.dev
    │   ├── phoenix14t.pami0.test
    │   └── phoenix14t.pami0.train
    ├── sim
    │   ├── cos_sim.pkl
    │   └── name_to_video_id.json
    └── ...

Then run the following command to train the model.

python -m signjoey train configs/train_pht.yaml --gpu_id 0

Run the following command to test the model.

python -m signjoey test configs/test_pht.yaml --ckpt <path_to_ckpt> --output_path <path_to_output> --gpu_id 0

Citation

If you find this project useful, please cite our paper:

@inproceedings{yin2023gloss,
  title={Gloss attention for gloss-free sign language translation},
  author={Yin, Aoxiong and Zhong, Tianyun and Tang, Li and Jin, Weike and Jin, Tao and Zhao, Zhou},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={2551--2562},
  year={2023}
}

Acknowledgements

Our code is based on the following repos: