deepfake-satellite-images
By DeepMedia
DeepMedia is setting the standard for responsible use of synthetic media technology. Here, we present the DM-AER-DeepFake-V1 dataset, which includes over 1M synthetic aerial images. Details about the dataset, its creation methods, and initial detection results are presented below.
DM-AER-DeepFake-V1 Dataset
- Contains 1M DeepFake Aerial Images
- Contains 120K Real Aerial Images
The DM-AER-DeepFake-V1 dataset is split into sample, validation, and training folders. The real images are drawn from the Million-AID dataset and were not used during the creation of the synthetic images. The synthetic images were generated by DeepMedia. The sample folder contains images from the validation set, but there is no overlap in images between the validation and training sets. The data can be downloaded from this Google Drive location:
https://drive.google.com/drive/folders/1h65vVQvfYzMsmofxTTEIOVSIWdi7zjTo?usp=sharing
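The no-overlap guarantee between the validation and training splits can be checked locally once the archive is downloaded. A minimal sketch, assuming the splits unpack into folders named `training` and `validation` (the folder names are a guess; adjust to the actual layout):

```python
from pathlib import Path

def split_overlap(root: str, a: str = "training", b: str = "validation") -> set:
    """Return the set of filenames that appear in both splits (should be empty)."""
    names_a = {p.name for p in Path(root, a).rglob("*") if p.is_file()}
    names_b = {p.name for p in Path(root, b).rglob("*") if p.is_file()}
    return names_a & names_b
```

An empty return set confirms the splits are disjoint at the filename level.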
Dataset Creation
The DM-AER-DeepFake-V1 dataset was created by training the rosinality StyleGAN2 repo on the AID dataset of aerial images. The model was trained on an NVIDIA V100 GPU for 14 days, reaching 800K steps. The following parameters were adjusted from their default values:
- architecture: swagan
- d_reg_every: 4
- added learning rate scheduler
- augmentation
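The "added learning rate scheduler" above could take many forms; the rosinality training script does not ship one by default. As a sketch only, here is one plausible choice: a cosine decay over the 800K training steps. The schedule shape, base rate, and floor are assumptions, not the repo's actual code:

```python
import math

def cosine_lr(step: int, total_steps: int = 800_000,
              base_lr: float = 2e-3, min_lr: float = 2e-4) -> float:
    """Cosine-decay the learning rate from base_lr down to min_lr
    over the course of training."""
    t = min(step, total_steps) / total_steps  # training progress in [0, 1]
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))
```

In practice this would be called once per optimizer step to set the generator and discriminator learning rates.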
Detection
Initial detection experiments were run on the first 1M fake images generated. The ViT-pytorch repo was chosen as a state-of-the-art detection method. The real AID dataset was used as the "authentic" image source. The repo's dataloader module was modified to upsample the authentic images so that the model would see an equal number of authentic and fake images during training. The network converged after 2 days of training on an NVIDIA V100 GPU. High-level results are presented below:
- accuracy: 97%
- Type I error: 4%
- Type II error: 2%
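The upsampling described above, roughly 120K authentic images against 1M fakes, can be sketched as index-level oversampling of the minority class so each epoch presents a balanced stream. This is a generic sketch, not the repo's modified dataloader:

```python
import random

def balanced_indices(n_real: int, n_fake: int, seed: int = 0) -> list:
    """Oversample the smaller class (by cycling its indices) so both classes
    contribute equally many samples per epoch.
    Returns shuffled (label, index) pairs; label 0 = authentic, 1 = fake."""
    rng = random.Random(seed)
    n = max(n_real, n_fake)
    real = [(0, i % n_real) for i in range(n)]  # cycle authentic indices up to n
    fake = [(1, i % n_fake) for i in range(n)]
    mixed = real + fake
    rng.shuffle(mixed)
    return mixed
```

The (label, index) pairs would then drive lookups into the authentic and fake image folders; in PyTorch the same effect can be had with a weighted sampler.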
Additionally, class activation maps were implemented in the ViT network and run on both authentic and fake aerial images to better understand which image features provided the most contextual information driving ViT convergence.
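The source does not say which activation-mapping technique was used; attention rollout, which multiplies the per-layer attention matrices with the residual connection folded in as an identity term, is one common choice for ViTs. A minimal sketch under that assumption:

```python
def attention_rollout(layers):
    """layers: list of per-layer attention matrices (n x n, rows sum to 1),
    e.g. already averaged over heads. Folds in the residual connection
    (0.5*A + 0.5*I), renormalizes rows, and multiplies through the layers.
    Row 0 of the result scores how much the CLS token attends to each patch."""
    n = len(layers[0])
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # mix attention with identity to account for the skip connection
        mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
                 for i in range(n)]
        norm = [[v / sum(row) for v in row] for row in mixed]
        # accumulate: result = norm @ result
        result = [[sum(norm[i][k] * result[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)]
    return result
```

Reshaping the CLS row of the result back to the patch grid gives a heatmap over the input image.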
<img width="627" alt="Screen Shot 2022-04-24 at 12 44 44 PM" src="https://user-images.githubusercontent.com/104281028/164994005-c6a3afca-34ce-4772-9bbb-5e98a928b897.png">

Future Research
These results depict an early exploration into the creation and detection of synthetic aerial images. Below, we attempt to provide a non-exhaustive list of areas for potential improvement in creation methods, as well as techniques for detection.
- Integrate the Million-AID dataset into synthetic aerial image creation
- Integrate higher resolution input images
- Integrate a more diverse set of input images
- Retrain StyleGAN2 with a hyper-parameter grid search to achieve higher-quality results
- Transition from StyleGAN2 to StyleGAN3 to achieve higher quality results
- Retrain at 1024x1024 resolution, or higher
- Integrate projection-based networks to enable targeted manipulation of aerial imagery
- Integrate higher-information images, such as 12-bit or 16-bit imagery
- Investigate and reproduce initial ViT results
- Investigate more advanced detection methods
Acknowledgements
DeepMedia hopes to set the standard for ethical and responsible use of synthetic media technology, but we can't do this alone. We would like to thank the investors, government organizations, and academic institutions that are helping make this possible.
For more information, please send a message to contact@deepmedia.ai