The COCO-CF Dataset from the ECCV Paper "Improving Vision and Language Concepts Understanding with Multimodal Counterfactual Samples"

COCO-CF is a dataset generated automatically from MS-COCO by injecting concepts from off-the-shelf language models and diffusion models, with the aim of reducing bag-of-objects bias.

Introduction

Structure of the "COCO-CF-Example" directory:

COCO-CF-Example/
    images/
        COCO_train2014_000000001122.jpg
        COCO_train2014_000000001122_2113.jpg
        ...
        COCO_train2014_000000001907_138726.jpg
    coco-cf.json

Structure of the "coco-cf.json" file in "COCO-CF-Example". Each entry pairs a factual image and caption with its counterfactual counterpart:

[
    {
        "factual": {
            "image": "COCO_train2014_000000001122.jpg",
            "caption": "a red and yellow trains engine pulling its cars and some tracks"
        },
        "counterfactual": {
            "image": "COCO_train2014_000000001122_2113.jpg",
            "caption": "a blue and yellow trains engine pulling its cars and some tracks"
        }
    }
]
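
The layout above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the function name `load_pairs` and the returned dictionary keys (`image_path`, `caption`) are hypothetical, and it assumes that "coco-cf.json" sits next to the "images" folder inside the dataset root, as the directory listing suggests.

```python
import json
from pathlib import Path


def load_pairs(root):
    """Read coco-cf.json under `root` and resolve image paths.

    Hypothetical helper: assumes `root` contains "coco-cf.json" and an
    "images" folder, and that every record has a "factual" and a
    "counterfactual" entry with "image" and "caption" fields.
    """
    root = Path(root)
    with open(root / "coco-cf.json", encoding="utf-8") as f:
        records = json.load(f)

    pairs = []
    for record in records:
        pair = {}
        for side in ("factual", "counterfactual"):
            entry = record[side]
            pair[side] = {
                # File names in the JSON are relative to the images folder.
                "image_path": root / "images" / entry["image"],
                "caption": entry["caption"],
            }
        pairs.append(pair)
    return pairs
```

Calling `load_pairs("COCO-CF-Example")` would then return one dictionary per factual/counterfactual pair, which can be passed to an image library of your choice to open the files.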