Awesome
Using Satellite Imagery and Deep Learning to Evaluate the Impact of Anti-Poverty Programs
Luna Yue Huang, Solomon Hsiang, Marco Gonzalez-Navarro
Figure Raw Data
Figure Identifier | Figure No. | CSV File |
---|---|---|
fig-schematic | Figure 1 | N/A |
fig-map | Figure 2 | fig-map.csv |
fig-ate | Figure 3 | fig-ate.csv |
fig-engel | Figure 4 | fig-engel-a.csv for Panel a, fig-engel-b.csv for Panel b |
fig-prcurve | ED Figure 1 | fig-prcurve.csv |
fig-chips | ED Figure 2 | N/A |
fig-colors | ED Figure 3 | N/A |
fig-engel-housing | ED Figure 4 | fig-engel-a.csv for Panel a, fig-engel-b.csv for Panel b |
fig-engel-nonhousing | ED Figure 5 | fig-engel-a.csv for Panel a, fig-engel-b.csv for Panel b |
fig-engel-consumption | ED Figure 6 | fig-engel-a.csv for Panel a, fig-engel-b.csv for Panel b |
fig-engel-tc-diff | ED Figure 7 | fig-engel-diff.csv |
fig-engel-ei-diff | ED Figure 8 | fig-engel-diff.csv |
fig-mx | ED Figure 9 | fig-mx.csv |
Instructions for Replication
To pretrain the deep learning model (required for replication of all the subsequent analyses)
python scripts/preprocess_openaitanzania.py
: Pre-process OpenAITanzania training data. (Before running the script, place raw data indata/OpenAITanzania/GeoJSON/
anddata/OpenAITanzania/GeoTIFF/
.)python scripts/pretrain_oat.py
: Pretrain Mask R-CNN on OpenAITanzania training data.python scripts/pretrain_pool.py
: Pretrain Mask R-CNN on pooled Google Static Maps images from rural Kenya, peri-urban Tanzania, and rural Mexico. (Before running the script, copyruns/run_00_PretrainOAT/best_checkpoint.pth.tar
toruns/run_01_PretrainPool/pretrained_checkpoint.pth.tar
.)
To download nightlight rasters used for comparison in subsequent analyses
- Go to Google Earth Engine, run
scripts/download_nightlight.js
: Download and preprocess nightlight rasters in 2019 for Kenya and Mexico. Save rasters indata/External/Nightlight/
.
To replicate the analysis in the main text (on the GiveDirectly (GD) randomized controlled trial in rural Kenya)
Note that the replication of some of the analyses requires field data collected in the GiveDirectly trial (placed in the data/External/GiveDirectly/
folder). These data contain sensitive geolocation information of the trial participants and thus cannot be shared without IRB approval.
python scripts/gd_prepare_download_gsm.py
: Prepare for downloading Google Static Map data.python scripts/gd_download_gsm.py
: Download Google Static Map data.python scripts/gd_sample_for_annotation.py
: Randomly sample images from the downloaded data for annotation on Supervisely. Save annotations indata/Siaya/Mask/
.python scripts/gd_train.py
: Fine tune Mask R-CNN on in-sample annotations in rural Kenya. (Before running the script, copyruns/run_01_PretrainPool/best_checkpoint.pth.tar
toruns/run_02_Siaya/pretrained_checkpoint.pth.tar
.)python scripts/gd_infer.py
: Run inference to generate predictions on all the images in Siaya.python scripts/gd_polygonize.py
: Collate the inference results into a geojson / csv file.python scripts/gd_postprocess.py
: Post-process the inference results csv file.python scripts/gd_merge.py --resolution 0.005
andpython scripts/gd_merge.py --resolution 0.001 --placebo 100 --eligible-only
: Merge the satellite observations with field survey data.
To replicate fig-colors
(roof color K-means clustering result)
python gd_fig_colors
: Generate the raw figures.
To replicate fig-prcurve
(Precision-Recall curve)
python scripts/gd_fig_prcurve_cv.py
: Conduct cross validation on in-sample annotations in rural Kenya. (Before running the script, copyruns/run_01_PretrainPool/best_checkpoint.pth.tar
toruns/run_03_SiayaCV0/pretrained_checkpoint.pth.tar
,runs/run_04_SiayaCV1/pretrained_checkpoint.pth.tar
andruns/run_05_SiayaCV2/pretrained_checkpoint.pth.tar
.)python scripts/gd_fig_prcurve.py
: Generate the raw figure.
To replicate fig-schematic
and fig-chips
(schematics and randomly sampled images/predictions)
python scripts/gd_fig_schematic.py
: Sample images and predictions.
To replicate fig-map
(map of treatment and outcome variables)
python scripts/gd_fig_map.py > logs/gd_fig_map.log
: Generate the raw figures.
To replicate fig-ate
(average treatment effect estimation)
Rscript scripts/gd_fig_ate.R > logs/gd_fig_ate.log
: Generate raw figures.
To replicate fig-engel
, fig-engel-housing
, fig-engel-nonhousing
, fig-engel-consumption
, fig-engel-tc-diff
, fig-engel-ei-diff
(Engel curves)
python scripts/gd_fig_engel.py > logs/gd_fig_engel.log
: Generate raw figures.
To replicate the analysis in the appendix (in rural Mexico)
python scripts/mx_prepare_download_gsm.py
: Prepare for downloading Google Static Map data. (Before running the script, download the raw data for the Population and Housing Census 2010 on the official INEGI website. Convert the.dbf
file to a.csv
file and place it in thedata/External/MexicoCPV/
folder.)python scripts/mx_download_gsm.py
: Download Google Static Map data.python scripts/mx_sample_for_annotation.py
: Randomly sample images from the downloaded data for annotation on Supervisely. Save annotations indata/Mexico/Mask/
.python scripts/mx_infer.py
: Run inference to generate predictions on all the images in the sample.python scripts/mx_postprocess.py
: Post-process inference results and collate into a geojson file.
To replicate fig-mx
(appendix figures showing validation results in Mexico)
python scripts/mx_fig_validate.py
: Generate raw figures.