Awesome
Functional Map of the World (fMoW) Dataset
There are two versions of the dataset: fMoW-full and fMoW-rgb. fMoW-full is in TIFF format, contains 4-band and 8-band multispectral imagery, and is quite large at ~3.5TB in size. fMoW-rgb is in JPEG format, all multispectral imagery has been converted to RGB, and it is significantly smaller in size at ~200GB.
Please see the fMoW flyer for more info about the challenge. Note that the fMoW challenge has now ended.
Reference
If you use our dataset or code, please cite our paper:
@inproceedings{fmow2018,
title={Functional Map of the World},
author={Christie, Gordon and Fendley, Neil and Wilson, James and Mukherjee, Ryan},
booktitle={CVPR},
year={2018}
}
Categories
["airport", "airport_hangar", "airport_terminal", "amusement_park", "aquaculture", "archaeological_site", "barn", "border_checkpoint", "burial_site", "car_dealership", "construction_site", "crop_field", "dam", "debris_or_rubble", "educational_institution", "electric_substation", "factory_or_powerplant", "fire_station", "flooded_road", "fountain", "gas_station", "golf_course", "ground_transportation_station", "helipad", "hospital", "impoverished_settlement", "interchange", "lake_or_pond", "lighthouse", "military_facility", "multi-unit_residential", "nuclear_powerplant", "office_building", "oil_or_gas_facility", "park", "parking_lot_or_garage", "place_of_worship", "police_station", "port", "prison", "race_track", "railway_bridge", "recreational_facility", "road_bridge", "runway", "shipyard", "shopping_mall", "single-unit_residential", "smokestack", "solar_farm", "space_facility", "stadium", "storage_tank", "surface_mine", "swimming_pool", "toll_booth", "tower", "tunnel_opening", "waste_disposal", "water_treatment_facility", "wind_farm", "zoo"]
Download
Originally, there were two official ways to download the dataset: from AWS or using BitTorrent. However, the BitTorrent method is no longer actively maintained, leaving AWS as the primary recommended method for downloading the data.
AWS
The fMoW datasets are available on AWS for free at:
- fMoW-full: s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-full
- fMoW-rgb: s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-rgb
Accessing the data through AWS is possible using tools such as the AWS CLI. For example, to get a directory listing using the AWS CLI run the following commands:
aws s3 ls s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-full/
aws s3 ls s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-rgb/
To download the manifest.json.bz2 files that list all images and metadata present in each bucket, run the following commands:
aws s3 cp s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-full/manifest.json.bz2 ./
aws s3 cp s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-rgb/manifest.json.bz2 ./
BitTorrent (Deprecated)
NOTE: This download method is no longer maintained! Using the client of your choice, you can add the following torrent files to download the corresponding subsets of the fMoW dataset:
fMoW-full train and valfMoW-full testfMoW-full val sample with false detectionsfMoW-full sequestered and ground truth datafMoW-rgb train and valfMoW-rgb testfMoW-rgb val sample with false detectionsfMoW-rgb sequestered and ground truth data
Additional details
The train and val sets were released to competitors with category labels and a rich set of metadata fields. The test and seq sets had category labels removed and a small amount of noise added to many metadata fields. Certain fields, such as GPS coordinates, were removed from all sets during the challenge. However, now that the challenge has ended, the sequestered and ground truth data has been released, which contains all raw metadata, including category labels and GPS coordinates, for every image.
Joining these ground truth metadata files with the original test and seq imagery does require a small amount of effort. In each of the ground truth archives for fMoW-full and fMoW-rgb there is a mapping JSON file. This mapping file provides the association between each test and seq image and its corresponding metadata. You can also use this mapping file to reorganize the test and seq data into category and temporal sequence folders similar to the train and val sets.
Bounding box format
Bounding boxes are provided in the format [x, y, width, height]
where the point (x, y)
corresponds to the top-left edge of the box surrounding the object of interest. In other words, these four values can also be thought of as corresponding to [left, top, width, height]
for a box surrounding the object of interest.
Non-existent country codes
Some country codes in the dataset may not be valid. Please consider re-computing country codes using the underlying geographic coordinate metadata. See this issue for more details.
License
This data is licensed under the Functional Map of the World Challenge Public License. This new license is similar to the previous license with modifications to clarify that algorithms trained on challenge data are not considered adapted material.