COCO-CN

COCO-CN is a bilingual image description dataset enriching MS-COCO with manually written Chinese sentences and tags. The new dataset can be used for multiple tasks including image tagging, captioning and retrieval, all in a cross-lingual setting.

| Chinese sentences | COCO-CN train | COCO-CN val | COCO-CN test |
| --- | --- | --- | --- |
| human written | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| human translation | :x: | :x: | :white_check_mark: |
| machine translation (baidu) | :white_check_mark: | :white_check_mark: | :white_check_mark: |

<img src="dataset-snapshot.png" alt="coco-cn annotation examples" width="400" />

Progress

Citation

If you find COCO-CN useful, please consider citing the following paper: