Birds-to-Words

This repository contains the Birds-to-Words dataset, a collection of paragraph-length descriptions of the differences between pairs of iNaturalist bird photographs.

The Birds-to-Words dataset was introduced in the paper:

Neural Naturalist: Generating Fine-Grained Image Comparisons<br/> Maxwell Forbes, Christine Kaeser-Chen, Piyush Sharma, Serge Belongie<br/> EMNLP 2019

Please see the Neural Naturalist project page for an overview of the research project and publication.

The data is provided in the file birds-to-words-v1.0.tsv in this repository.

Sample

| Animal 1 | Animal 2 |
| --- | --- |
| A photograph of a brown and white bird <br> photo: John Ratzlaff (CC BY-NC-ND 4.0) | A photograph of a multicolored bird <br> photo: Jessica (CC BY-NC 4.0) |

Comparative Descriptions (four different writers):

  1. Animal 1 is brown and white with a squatty body with a light brown head. Animal 2 is multi-colored with a light blue and black head.

  2. Animal 1 has a brown head and wings, with a pale breast. The breast also has darker brown speckles on it. Animal 2 has a bright blue area around its eye, with a black patch right along the eye. Animal 2 also has a darker brown breast and greenish wings and back of its head.

  3. Animal 1 has a brown and white face, animal 2 has a black and bright blue face. Animal 1 has a white breast with black spots, animal 2 has a brown breast. Animal 1 has brown wings, animal 2 has green wings.

  4. Animal 1 is much smaller and shorter. Animal 2 has a larger head and longer tail feathers. Animal 1 has extensive spotting on the neck, chest, and belly. Animal 2 has turquoise head patches and brown coloring on the chest and belly.

Data Format

The tsv file is tab-separated and contains the following eleven columns:

| Column | Name | Type | Description |
| --- | --- | --- | --- |
| 1 | img1ObservationURL | string | URL of the iNaturalist photo record (including metadata) corresponding to the left image in the pair |
| 2 | img1ImgURL | string | URL of the left image itself |
| 3 | img1Species | string | Scientific species name for the animal in the left image |
| 4 | img1Selection | string | How the left image was selected in the "pivot-branch" stratified sampling procedure described in the paper. Value is one of: {base, visual, sameSpecies, sameGenus, sameFamily, sameOrder, sameClass} |
| 5 | img2ObservationURL | string | URL of the iNaturalist photo record (including metadata) corresponding to the right image in the pair |
| 6 | img2ImgURL | string | URL of the right image itself |
| 7 | img2Species | string | Scientific species name for the animal in the right image |
| 8 | img2Selection | string | How the right image was selected in the "pivot-branch" stratified sampling procedure described in the paper. Value is one of: {base, visual, sameSpecies, sameGenus, sameFamily, sameOrder, sameClass} |
| 9 | split | string | Split for training models and reporting results. One of: {train, val, test} |
| 10 | annN | int | We collect up to five annotations of each image pair. This is the annotation number of this instance. Value is one of: {1,2,3,4,5} |
| 11 | description | string | A natural language paragraph describing the differences between the animals in the two photographs |
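
The column layout above can be read with Python's standard `csv` module. The following is a minimal sketch, not an official loader: the helper names `read_rows`, `group_by_pair`, and `load_pairs` are hypothetical, and the code assumes the file is UTF-8 encoded and may or may not begin with a header row naming the eleven columns (a matching header row is skipped if present). It also assumes that an image pair can be identified by its two image URLs.

```python
import csv
from collections import defaultdict

# Column names in the order documented in the table above.
COLUMNS = [
    "img1ObservationURL", "img1ImgURL", "img1Species", "img1Selection",
    "img2ObservationURL", "img2ImgURL", "img2Species", "img2Selection",
    "split", "annN", "description",
]


def read_rows(lines):
    """Yield one dict per annotation from an iterable of TSV lines."""
    # QUOTE_NONE keeps quotation marks inside descriptions as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        # Skip blank lines and an optional header row (assumption: the header,
        # if present, repeats the column names exactly).
        if not fields or fields == COLUMNS:
            continue
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        row = dict(zip(COLUMNS, fields))
        row["annN"] = int(row["annN"])  # annotation number, 1 through 5
        yield row


def group_by_pair(rows):
    """Group annotations by (left image URL, right image URL)."""
    pairs = defaultdict(list)
    for row in rows:
        pairs[(row["img1ImgURL"], row["img2ImgURL"])].append(row)
    return dict(pairs)


def load_pairs(path):
    """Read the TSV at `path` and return annotations grouped by image pair."""
    with open(path, newline="", encoding="utf-8") as f:
        return group_by_pair(read_rows(f))
```

Calling `load_pairs("birds-to-words-v1.0.tsv")` from the repository root would then return a dictionary mapping each image pair to its list of annotations, which can be filtered on the `split` field to build training, validation, and test sets.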

License

The Birds-to-Words dataset is released under the Creative Commons Attribution-ShareAlike 4.0 International License. For the full license, see LICENSE.txt.