SignGraph: A Sign Sequence is Worth Graphs of Nodes
An implementation of the CVPR 2024 paper "SignGraph: A Sign Sequence is Worth Graphs of Nodes". [paper]
Prerequisites
- This project is implemented in PyTorch (>1.8), so please install PyTorch first.
- ctcdecode==0.4 [parlance/ctcdecode], for beam search decoding.
- For those who fail to install ctcdecode (which happens frequently), you can download ctcdecode here, unzip it, then run cd ctcdecode and pip install .
- Please follow this link to install PyTorch Geometric.
- You can install the other required modules by running pip install -r requirements.txt (a quick import check is sketched below).
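As a quick sanity check of the environment (a minimal sketch; it only verifies that the packages listed above import, nothing repo-specific), you can run:

```python
# Minimal environment check: confirm the core dependencies listed above import.
import torch
import torch_geometric
from ctcdecode import CTCBeamDecoder  # parlance/ctcdecode

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("PyTorch Geometric:", torch_geometric.__version__)
print("ctcdecode: CTCBeamDecoder imported OK")
```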
Data Preparation
- PHOENIX2014 dataset: Download the RWTH-PHOENIX-Weather 2014 Dataset [download link].
- PHOENIX2014-T dataset: Download the RWTH-PHOENIX-Weather 2014T Dataset [download link].
- CSL dataset: Request the CSL Dataset from this website [download link].

Download the datasets and extract them; no further data preprocessing is needed.
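If you want to confirm the extraction before training, a minimal check such as the one below is enough; the ./dataset/phoenix2014 path is only a hypothetical example, use whatever location you extracted to:

```python
# Hypothetical path: adjust to wherever you extracted the dataset.
from pathlib import Path

dataset_root = Path("./dataset/phoenix2014")
assert dataset_root.is_dir(), f"Dataset not found at {dataset_root}"
print("Top-level contents:", sorted(p.name for p in dataset_root.iterdir()))
```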
Weights
We have made some improvements to the code and provide the newest checkpoints, which achieve better performance.
Dataset | Backbone | Dev WER (%) | Dev Del / Ins (%) | Test WER (%) | Test Del / Ins (%) | Pretrained model |
---|---|---|---|---|---|---|
Phoenix14T | SignGraph | 17.00 | 4.99/2.32 | 19.44 | 5.14/3.38 | [Google Drive] |
Phoenix14 | SignGraph | 17.13 | 6.00/2.17 | 18.17 | 5.65/2.23 | [Google Drive] |
CSL-Daily | SignGraph | 26.38 | 9.92/2.62 | 25.84 | 9.39/2.58 | [Google Drive] |
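For reference, WER counts the substitutions, deletions, and insertions needed to turn the predicted gloss sequence into the reference, divided by the reference length. The sketch below only illustrates the metric; it is not the repo's evaluation script:

```python
# Standalone illustration of word error rate (WER) with deletion/insertion counts.
def wer_counts(ref_words, hyp_words):
    R, H = len(ref_words), len(hyp_words)
    # dp[i][j] holds (total errors, substitutions, deletions, insertions)
    dp = [[(0, 0, 0, 0) for _ in range(H + 1)] for _ in range(R + 1)]
    for i in range(1, R + 1):
        dp[i][0] = (i, 0, i, 0)          # delete all remaining reference words
    for j in range(1, H + 1):
        dp[0][j] = (j, 0, 0, j)          # insert all hypothesis words
    for i in range(1, R + 1):
        for j in range(1, H + 1):
            if ref_words[i - 1] == hyp_words[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                e, s, d, n = dp[i - 1][j - 1]
                sub = (e + 1, s + 1, d, n)
                e, s, d, n = dp[i - 1][j]
                dele = (e + 1, s, d + 1, n)
                e, s, d, n = dp[i][j - 1]
                ins = (e + 1, s, d, n + 1)
                dp[i][j] = min(sub, dele, ins)
    errors, s, d, n = dp[R][H]
    return 100.0 * errors / max(R, 1), s, d, n

# Toy gloss sequences: prints (WER %, substitutions, deletions, insertions).
print(wer_counts("MONTAG REGEN NORD".split(), "MONTAG NORD WIND".split()))
```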
To evaluate a pretrained model, first choose the dataset from phoenix2014/phoenix2014-T/CSL/CSL-Daily on line 3 of ./config/baseline.yaml, then run the command below:
python main.py --device your_device --load-weights path_to_weight.pt --phase test
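If you only want to inspect a downloaded checkpoint before running evaluation, torch.load is enough; this is a generic sketch and makes no assumptions about which keys the repo stores in the .pt file:

```python
# Peek inside a downloaded checkpoint; key names are whatever torch.load reveals.
import torch

ckpt = torch.load("path_to_weight.pt", map_location="cpu")
if isinstance(ckpt, dict):
    print("Top-level keys:", list(ckpt.keys()))
else:
    print("Loaded object of type:", type(ckpt))
```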
Training
Configuration priority is: command-line arguments > config file > argparse defaults. To train the SLR model, run the command below:
python main.py --device your_device
Note that you can choose the target dataset from phoenix2014/phoenix2014-T/CSL/CSL-Daily on line 3 of ./config/baseline.yaml.
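The precedence rule above can be pictured with a short sketch; this illustrates the general pattern (the --batch-size flag is hypothetical), not the repo's actual main.py:

```python
# Illustration of the stated precedence: command line > config file > argparse defaults.
import argparse
import yaml

parser = argparse.ArgumentParser()
parser.add_argument("--device", default="0")
parser.add_argument("--batch-size", type=int, default=2)   # hypothetical flag, for illustration
args = parser.parse_args()

# Start from argparse defaults, overlay the config file, then overlay any flags
# the user actually typed (approximated here as "values differing from the default").
settings = {k: parser.get_default(k) for k in vars(args)}           # defaults
with open("./config/baseline.yaml") as f:
    settings.update(yaml.safe_load(f) or {})                        # config file
settings.update({k: v for k, v in vars(args).items()
                 if v != parser.get_default(k)})                    # explicit CLI flags
print(settings)
```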
Thanks
This repo is based on VAC (ICCV 2021), ViG (NeurIPS 2022), and RTG-Net (ACM MM 2023)!
Citation
If you find this repo useful in your research, please consider citing:
@inproceedings{gan2024signgraph,
title={SignGraph: A Sign Sequence is Worth Graphs of Nodes},
author={Gan, Shiwei and Yin, Yafeng and Jiang, Zhiwei and Wen, Hongkai and Xie, Lei and Lu, Sanglu},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={13470--13479},
year={2024}
}
@inproceedings{gan2023towards,
title={Towards Real-Time Sign Language Recognition and Translation on Edge Devices},
author={Gan, Shiwei and Yin, Yafeng and Jiang, Zhiwei and Xie, Lei and Lu, Sanglu},
booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
pages={4502--4512},
year={2023}
}
@inproceedings{gan2023contrastive,
title={Contrastive Learning for Sign Language Recognition and Translation},
author={Gan, Shiwei and Yin, Yafeng and Jiang, Zhiwei and Xia, Kang and Xie, Lei and Lu, Sanglu},
booktitle={Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23},
pages={763--772},
year={2023}
}
@article{han2022vision,
title={Vision GNN: An Image is Worth Graph of Nodes},
author={Han, Kai and Wang, Yunhe and Guo, Jianyuan and Tang, Yehui and Wu, Enhua},
journal={Advances in Neural Information Processing Systems},
volume={35},
pages={8291--8303},
year={2022}
}