WS-DAN.PyTorch

A neat PyTorch implementation of WS-DAN (Weakly Supervised Data Augmentation Network) for FGVC (Fine-Grained Visual Classification). (Hu et al., "See Better Before Looking Closer: Weakly Supervised Data Augmentation Network for Fine-Grained Visual Classification", arXiv:1901.09891)

NOTICE: This is NOT an official implementation by authors of WS-DAN. The official implementation is available at tau-yihouxiang/WS_DAN (and there's another unofficial PyTorch version wvinzh/WS_DAN_PyTorch).

Innovations

  1. Data Augmentation: Attention Cropping and Attention Dropping (a code sketch follows this list)

    <div align="left"> <img src="./images/Fig1.png" height="600px" alt="Fig1" > </div>
  2. Bilinear Attention Pooling (BAP) for Feature Generation (a code sketch follows this list)

    <div align="left"> <img src="./images/Fig3.PNG" height="400px" alt="Fig3" > </div>
  3. Training Process and Testing Process (a schematic sketch follows this list)

    <div align="left"> <img src="./images/Fig2a.PNG" height="446px" alt="Fig2a" > <img src="./images/Fig2b.PNG" height="400px" alt="Fig2b" > </div>

Performance

| Dataset | Object | Categories | Train Images | Test Images | Accuracy (Paper, %) | Accuracy (This Repo, %) | Feature Net |
|---------|--------|------------|--------------|-------------|---------------------|-------------------------|-------------|
| FGVC-Aircraft | Aircraft | 100 | 6,667 | 3,333 | 93.0 | 93.28 | inception_mixed_6e |
| CUB-200-2011 | Bird | 200 | 5,994 | 5,794 | 89.4 | 88.28 | inception_mixed_6e |
| Stanford Cars | Car | 196 | 8,144 | 8,041 | 94.5 | 94.38 | inception_mixed_6e |
| Stanford Dogs | Dog | 120 | 12,000 | 8,580 | 92.2 | 89.66 | inception_mixed_7c |

Usage

WS-DAN

This repo implements WS-DAN with a choice of feature extractors: VGG19 ('vgg19', 'vgg19_bn'), ResNet-34/50/101/152 ('resnet34', 'resnet50', 'resnet101', 'resnet152'), and Inception v3 ('inception_mixed_6e', 'inception_mixed_7c'); see ./models/wsdan.py.

net = WSDAN(num_classes=num_classes, M=num_attentions, net='inception_mixed_6e', pretrained=True)
net = WSDAN(num_classes=num_classes, M=num_attentions, net='inception_mixed_7c', pretrained=True)
net = WSDAN(num_classes=num_classes, M=num_attentions, net='vgg19_bn', pretrained=True)
net = WSDAN(num_classes=num_classes, M=num_attentions, net='resnet50', pretrained=True)
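
A minimal forward pass might look as follows; the (logits, feature matrix, attention maps) return triple and the 448x448 input size are assumptions, so verify both against ./models/wsdan.py and config.py:

import torch
from models.wsdan import WSDAN

net = WSDAN(num_classes=200, M=32, net='inception_mixed_6e', pretrained=True)
net.eval()
with torch.no_grad():
    x = torch.randn(4, 3, 448, 448)                    # dummy batch; real size is set in config.py
    y_pred, feature_matrix, attention_maps = net(x)    # assumed return triple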

Dataset Directory

Run

  1. git clone this repo.

  2. Prepare the data and modify DATAPATH in datasets/<abcd>_dataset.py, where <abcd> is one of aircraft, bird, car, or dog.

  3. Set configurations in config.py (Training Config, Model Config, Dataset/Path Config); an illustrative sketch of these settings follows this list:

    tag = 'aircraft'  # 'aircraft', 'bird', 'car', or 'dog'
    
  4. Run $ nohup python3 train.py > progress.bar & to start training.

  5. Run $ tail -f progress.bar to watch training progress (the tqdm package is required; other logs are written to <config.save_dir>/train.log).

  6. Set configurations in config.py (Eval Config) and run $ python3 eval.py for evaluation and visualization.
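
For reference, the settings in step 3 might look like the sketch below. Apart from tag, every name and value here is illustrative; consult config.py for the actual options:

tag = 'bird'                  # 'aircraft', 'bird', 'car', or 'dog'
# the names below are illustrative (Training / Model / Path config):
epochs = 80                   # number of training epochs
batch_size = 16               # per-step batch size
learning_rate = 1e-3          # initial learning rate
net = 'inception_mixed_6e'    # feature extractor, see ./models/wsdan.py
num_attentions = 32           # M, the number of attention maps
save_dir = './ckpt/bird/'     # checkpoints and train.log are written here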

Attention Maps Visualization

eval.py also generates attention-map visualizations for each image: the raw image, the heat attention map, and the raw image weighted by the attention map. A sketch of how such an overlay can be produced follows the examples below.

<div align="center"> <img src="./images/007_raw.jpg" height="200px" alt="Raw" > <img src="./images/007_heat_atten.jpg" height="200px" alt="Heat" > <img src="./images/007_raw_atten.jpg" height="200px" alt="Atten" > </div>