

Focus on the Positives: Self-Supervised Learning for Biodiversity Monitoring

This repository contains the code for reproducing the results of our ICCV 2021 paper. A summary video is also available. The following figure illustrates our context-positive approach within a self-supervised learning (SSL) framework.

Figure: Overview of the context-positive approach.

The organization of the repository is the following:


A large repository of camera trap data is available at lila.science, including the Caltech Camera Traps (CCT20), Island Conservation Camera Traps (ICCT) and Snapshot Serengeti datasets, which were used for the experiments in our main paper.

Getting started

Data preprocessing

Running self-supervised pretraining and evaluating the performance on a downstream task



The following example runs the steps above for the CCT20 dataset, using SimCLR as the base self-supervised learning approach.

Step 1

Extract the camera trap object regions from the images. The bounding boxes can either come from the annotations provided with the dataset or be obtained with MegaDetector.

python data_processing/preprocess_images.py --dataset cct20
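To make the cropping step concrete, here is a minimal sketch of turning detector output into pixel crop boxes. It is not the code in preprocess_images.py. It assumes detections in the MegaDetector batch output style, where each detection carries a `conf` score and a `bbox` given as normalized `[x_min, y_min, width, height]`. The function names, the `pad` option and the `min_conf` threshold are hypothetical.

```python
def bbox_to_crop_box(bbox, im_width, im_height, pad=0.0):
    """Convert a normalized [x_min, y_min, width, height] box (assumed format)
    into integer pixel coordinates (left, top, right, bottom)."""
    x, y, w, h = bbox
    # Optionally enlarge the box by a fraction of its own size on every side.
    x0, y0 = x - pad * w, y - pad * h
    x1, y1 = x + w + pad * w, y + h + pad * h
    # Scale to pixels and clamp to the image borders.
    left = max(0, int(round(x0 * im_width)))
    top = max(0, int(round(y0 * im_height)))
    right = min(im_width, int(round(x1 * im_width)))
    bottom = min(im_height, int(round(y1 * im_height)))
    return left, top, right, bottom


def crop_boxes(detections, im_width, im_height, min_conf=0.5):
    """Return crop boxes for all detections at or above min_conf (hypothetical threshold)."""
    return [
        bbox_to_crop_box(d["bbox"], im_width, im_height)
        for d in detections
        if d.get("conf", 1.0) >= min_conf
    ]


# Hypothetical detections for a 200x100 image.
detections = [
    {"conf": 0.92, "bbox": [0.25, 0.5, 0.5, 0.25]},
    {"conf": 0.10, "bbox": [0.0, 0.0, 0.1, 0.1]},  # dropped: below threshold
]
boxes = crop_boxes(detections, im_width=200, im_height=100)
```

The returned tuples follow the (left, upper, right, lower) order that an image library such as Pillow expects for cropping, so each box could be passed directly to a crop call.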

Step 2

Save a metadata file containing the contextual information of each image.

python data_processing/preprocess_context.py --dataset cct20 --annotation_file CaltechCameraTrapsECCV18.json
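As a rough illustration of what per-image contextual information might look like, the sketch below collects location, sequence and timestamp fields from an annotation file. It is not the logic of preprocess_context.py. It assumes the annotations follow the COCO Camera Traps layout (an `images` list with `location`, `seq_id`, `frame_num` and `date_captured` fields) and that timestamps use the `YYYY-MM-DD HH:MM:SS` format. The function name and the example values are hypothetical.

```python
import json
from datetime import datetime


def build_context_records(annotations):
    """Map each image id to its context fields (assumed COCO Camera Traps keys)."""
    records = {}
    for im in annotations["images"]:
        captured = im.get("date_captured")
        records[im["id"]] = {
            "file_name": im["file_name"],
            "location": im.get("location"),
            "seq_id": im.get("seq_id"),
            "frame_num": im.get("frame_num"),
            # Missing timestamps are kept as None rather than guessed.
            "timestamp": datetime.strptime(captured, "%Y-%m-%d %H:%M:%S")
            if captured
            else None,
        }
    return records


# Hypothetical two-image annotation snippet; a real file would be read with
# json.load(open("CaltechCameraTrapsECCV18.json")).
EXAMPLE_JSON = """
{"images": [
  {"id": "img_a", "file_name": "img_a.jpg", "location": 38,
   "seq_id": "seq_1", "frame_num": 1, "date_captured": "2013-10-04 13:31:53"},
  {"id": "img_b", "file_name": "img_b.jpg", "location": 38,
   "seq_id": "seq_1", "frame_num": 2}
]}
"""
records = build_context_records(json.loads(EXAMPLE_JSON))
```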

Step 3

Learn representations with a variety of SSL training settings and evaluate their quality on a downstream task (i.e. species classification).

The following scenarios cover standard SimCLR, SimCLR with sequence positives and SimCLR with context-informed positives:

python main.py --train_loss simclr --pos_type augment_self --backbone resnet18 --im_res 112 --dataset cct20 --exp_name "simclr standard"
python main.py --train_loss simclr --pos_type seq_positive --backbone resnet18 --im_res 112 --dataset cct20 --exp_name "simclr seq positive"  
python main.py --train_loss simclr --pos_type context_sample --backbone resnet18 --im_res 112 --dataset cct20 --exp_name "simclr context distance"   
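If it is convenient to launch the three scenarios from one script, a small wrapper along these lines could build the same commands. This is only a sketch: it uses the flags exactly as they appear in the commands above, and the `SCENARIOS` table and `build_command` helper are hypothetical names that are not part of the repository.

```python
import shlex

# Experiment name -> pos_type, mirroring the three commands above.
SCENARIOS = {
    "simclr standard": "augment_self",
    "simclr seq positive": "seq_positive",
    "simclr context distance": "context_sample",
}


def build_command(exp_name, pos_type, dataset="cct20"):
    """Return the argument list for one main.py run."""
    return [
        "python", "main.py",
        "--train_loss", "simclr",
        "--pos_type", pos_type,
        "--backbone", "resnet18",
        "--im_res", "112",
        "--dataset", dataset,
        "--exp_name", exp_name,
    ]


commands = [build_command(name, pos) for name, pos in SCENARIOS.items()]
for cmd in commands:
    # Print the shell-quoted command; pass cmd to subprocess.run(cmd, check=True)
    # to actually launch the run.
    print(shlex.join(cmd))
```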

In the commands above, the important parameters are train_loss and pos_type. The train_loss parameter can be simclr, triplet or simsiam (for SSL pretraining), or rand_init, imagenet or supervised (for supervised or transfer-learning baselines). The pos_type parameter selects the type of positives used during SSL training and can be augment_self (standard augmentation), seq_positive (sequence positives), context_sample (context positives) or oracle (oracle positives).
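The difference between the pos_type options can be pictured with a toy selection function. This is an illustrative sketch only, not the repository's implementation. The record fields, the `max_gap` window and the rule used for context_sample (same location, capture times within a fixed gap) are stand-in assumptions, and the paper's actual context distance may differ. Treating oracle positives as images sharing the ground-truth label is likewise an assumption.

```python
import random


def pick_positive(anchor_id, records, pos_type, rng, max_gap=60.0):
    """Pick the id of a positive for anchor_id under a given pos_type (toy rules)."""
    anchor = records[anchor_id]
    if pos_type == "augment_self":
        # Standard SimCLR: the positive is another augmentation of the same image.
        return anchor_id

    def others(keep):
        return [i for i, r in records.items() if i != anchor_id and keep(r)]

    if pos_type == "seq_positive":
        candidates = others(lambda r: r["seq_id"] == anchor["seq_id"])
    elif pos_type == "context_sample":
        # Assumed stand-in for a context distance: same camera, close in time.
        candidates = others(
            lambda r: r["location"] == anchor["location"]
            and abs(r["time"] - anchor["time"]) <= max_gap
        )
    elif pos_type == "oracle":
        # Assumed: oracle positives share the ground-truth species label.
        candidates = others(lambda r: r["label"] == anchor["label"])
    else:
        raise ValueError(f"unknown pos_type: {pos_type}")
    # Fall back to the anchor itself when no other image qualifies.
    return rng.choice(candidates) if candidates else anchor_id


# Hypothetical records: time is in seconds from an arbitrary origin.
RECORDS = {
    0: {"seq_id": "a", "location": 1, "time": 0.0, "label": "deer"},
    1: {"seq_id": "a", "location": 1, "time": 5.0, "label": "deer"},
    2: {"seq_id": "b", "location": 1, "time": 50.0, "label": "fox"},
    3: {"seq_id": "c", "location": 2, "time": 6.0, "label": "deer"},
}
rng = random.Random(0)
seq_pos = pick_positive(0, RECORDS, "seq_positive", rng)
ctx_pos = pick_positive(0, RECORDS, "context_sample", rng)
```

In this toy data the sequence positive for image 0 can only be image 1, while the context rule also admits image 2, which comes from the same location but a different sequence.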



If you find our work useful in your research, please consider citing our paper:

  title={Focus on the Positives: Self-Supervised Learning for Biodiversity Monitoring},
  author={Pantazis, Omiros and 
          Brostow, Gabriel and 
          Jones, Kate and 
          Mac Aodha, Oisin},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},