ArtEmis: Affective Language for Visual Art

A codebase created and maintained by <a href="https://ai.stanford.edu/~optas" target="_blank">Panos Achlioptas</a>.

Introduction

This work is based on the arXiv tech report, which has been provisionally accepted to CVPR-2021 for an <b>Oral</b> presentation.

Citation

If you find this work useful in your research, please consider citing:

@article{achlioptas2021artemis,
    title={ArtEmis: Affective Language for Visual Art},
    author={Achlioptas, Panos and Ovsjanikov, Maks and Haydarov, Kilichbek and
            Elhoseiny, Mohamed and Guibas, Leonidas},
    journal = {CoRR},
    volume = {abs/2101.07396},
    year={2021}
}

Dataset

To get the most out of this repo, please download the data associated with ArtEmis by filling out this form.

Installation

This code has been tested with Python 3.6.9, PyTorch 1.3.1, and CUDA 10.0 on Ubuntu 16.04.

Assuming a (potentially virtual) environment with Python 3.x:

git clone https://github.com/optas/artemis.git
cd artemis
pip install -e .

This will install the repo with all its dependencies (listed in setup.py) and will enable you to do things like:

from artemis.models import xx

(provided the artemis repo is on your PYTHONPATH)
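
As a quick sanity check after installation, a minimal sketch like the following can confirm that the package is visible to your interpreter. The helper name `is_importable` is hypothetical and not part of this repo; it only relies on the standard library.

```python
import importlib.util


def is_importable(package_name: str) -> bool:
    """Return True if a top-level package can be located by the current interpreter."""
    # find_spec returns None when a top-level package cannot be found.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "artemis" is the package name used in the import example above.
    print("artemis importable:", is_importable("artemis"))
```

If this reports False, the editable install or your PYTHONPATH most likely needs another look.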

Playing with ArtEmis

Step-1 (important   :pushpin:)

Preprocess the provided annotations (spell-check, patch, tokenize, make train/val/test splits, etc.).

   artemis/scripts/preprocess_artemis_data.py

This script allows you to preprocess ArtEmis according to your needs. The default arguments do minimal preprocessing, so the resulting output can be used to compare ArtEmis fairly with other datasets and to derive the most faithful statistics about ArtEmis's nature. This is what we used in our analysis and what you should use in "Step-2" below. With this in mind, run:

  python artemis/scripts/preprocess_artemis_data.py -save-out-dir <ADD_YOURS> -raw-artemis-data-csv <ADD_YOURS>

If you wish to train deep nets (speakers, emotion classifiers, etc.) exactly as we did in our paper, rerun this script with a single extra optional argument ("--preprocess-for-deep-nets True"). This does more aggressive filtering, and you should use its output for "Step-3" and "Step-4" below. Use a different save-out-dir to avoid overwriting the output of previous runs.

  python artemis/scripts/preprocess_artemis_data.py -save-out-dir <ADD_YOURS> -raw-artemis-data-csv <ADD_YOURS> --preprocess-for-deep-nets True

To understand and customize the different hyper-parameters, please read the help messages provided by the script's argparse arguments.
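
If you prefer to drive both preprocessing runs from Python, the two commands above could be assembled as in the sketch below. The function name `build_preprocess_command` and the directory names in the usage note are hypothetical; only the script path and the three flags come from the commands shown above.

```python
import sys
from typing import List


def build_preprocess_command(save_out_dir: str, raw_csv: str,
                             for_deep_nets: bool = False) -> List[str]:
    """Assemble the argument list for preprocess_artemis_data.py (paths are placeholders)."""
    cmd = [
        sys.executable,
        "artemis/scripts/preprocess_artemis_data.py",
        "-save-out-dir", save_out_dir,
        "-raw-artemis-data-csv", raw_csv,
    ]
    if for_deep_nets:
        # The extra flag that enables the more aggressive filtering.
        cmd += ["--preprocess-for-deep-nets", "True"]
    return cmd
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repo root, for example once with a hypothetical `out_minimal` directory and once with `out_deep_nets` and `for_deep_nets=True`, so that the two outputs do not overwrite each other.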

Step-2

Analyze & explore the dataset. :microscope:

Using the minimally preprocessed version of ArtEmis, which includes all 454,684 collected annotations:

  1. This is a great place to start :checkered_flag:. Run this notebook to do basic linguistic, emotion & art-oriented analysis of the ArtEmis dataset.
  2. Run this notebook to analyze ArtEmis in terms of its concreteness, subjectivity, sentiment, and parts of speech. Optionally, contrast these values with other common datasets like COCO.
  3. Run this notebook to extract the emotion histograms (empirical distributions) of each artwork. This is necessary for Step-3 (1).
  4. Run this notebook to analyze the extracted emotion histograms (previous step) per art genre and style.
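
To make the notion of an emotion histogram in item 3 concrete, here is a minimal, standard-library sketch of turning (artwork, emotion) annotation pairs into per-artwork empirical distributions. It is not the notebook's code; the function name, the input format, and the toy labels are assumptions for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def emotion_histograms(annotations: Iterable[Tuple[str, str]],
                       emotions: Sequence[str]) -> Dict[str, List[float]]:
    """Return, per artwork, the fraction of its annotations assigned to each emotion.

    Assumes every annotated emotion appears in `emotions`, so each histogram sums to 1.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for artwork, emotion in annotations:
        counts[artwork][emotion] += 1

    histograms = {}
    for artwork, counter in counts.items():
        total = sum(counter.values())
        # Order the bins by the given emotion list so all histograms are comparable.
        histograms[artwork] = [counter[e] / total for e in emotions]
    return histograms


if __name__ == "__main__":
    # Toy data with made-up artwork ids and a reduced label set.
    toy = [("a", "awe"), ("a", "awe"), ("a", "fear"), ("b", "fear")]
    print(emotion_histograms(toy, ["awe", "fear"]))
```

Keeping the bin order fixed by an explicit emotion list means the resulting vectors can be used directly as training targets or compared across artworks.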

Step-3

Train and evaluate emotion-centric image & text classifiers. :hearts:

Using the version of ArtEmis preprocessed for deep nets, which includes 429,431 annotations. (Training these classifiers from scratch on a single GPU is a matter of minutes!)

  1. Run this notebook to train an image-to-emotion classifier.

  2. Run this notebook to train an LSTM-based utterance-to-emotion classifier. Or, this notebook to train a BERT-based one.

Step-4

Train & evaluate neural-speakers. :bomb:

    python artemis/scripts/train_speaker.py -log-dir <ADD_YOURS> -data-dir <ADD_YOURS> -img-dir <ADD_YOURS>

    log-dir: where to save the output of the training process, models etc.
    data-dir: directory that contains the _input_ data, i.e.,
              the directory that contains the output of preprocess_artemis_data.py: e.g.,
              the artemis_preprocessed.csv, the vocabulary.pkl
    img-dir: the top folder containing the WikiArt image dataset in its "standard" format:
                img-dir/art_style/painting_xx.jpg

Note. The default optional arguments will create the same vanilla-speaker variant we used in the CVPR21 paper.
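
Before launching training, it can help to check that your img-dir follows the img-dir/art_style/painting_xx.jpg layout described above. The sketch below is one way to do that with the standard library; `index_wikiart_images` is a hypothetical helper, not a function from this repo.

```python
from pathlib import Path
from typing import Dict, List


def index_wikiart_images(img_dir: str) -> Dict[str, List[str]]:
    """Map each art-style sub-folder of img_dir to the sorted stems of its .jpg files."""
    index: Dict[str, List[str]] = {}
    for style_dir in sorted(Path(img_dir).iterdir()):
        if style_dir.is_dir():
            # Only files directly inside the style folder are considered.
            index[style_dir.name] = sorted(p.stem for p in style_dir.glob("*.jpg"))
    return index
```

Printing the number of entries per style, for example with `{k: len(v) for k, v in index_wikiart_images(path).items()}`, makes an empty or misplaced folder easy to spot.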

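To train the emotionally-grounded variant, add "--use-emo-grounding True":

    python artemis/scripts/train_speaker.py -log-dir <ADD_YOURS> -data-dir <ADD_YOURS> -img-dir <ADD_YOURS> --use-emo-grounding True

To sample utterances from a trained speaker:

    python artemis/scripts/sample_speaker.py -arguments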

For an explanation of the arguments, see the argparse help messages. Note that to sample from an emotionally-grounded variant you need to provide a pretrained image2emotion classifier. This classifier is used to deduce the most likely emotion of an image, which is then given to the speaker as input. See Step-3 (1) for how to train such a net.
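
The idea of deducing the most likely emotion and handing it to the speaker can be illustrated with the small sketch below. It takes a list of per-emotion probabilities (as an image2emotion classifier might produce) and picks the arg-max. Encoding the choice as a one-hot vector is an assumption made for illustration, as are both function names; the repo's actual interface may differ.

```python
from typing import List, Sequence


def most_likely_emotion(probabilities: Sequence[float]) -> int:
    """Return the index of the highest-probability emotion (first index on ties)."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


def one_hot(index: int, size: int) -> List[float]:
    """Encode an emotion index as a one-hot vector (an assumed input format)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


if __name__ == "__main__":
    # Made-up classifier output over three hypothetical emotion classes.
    probs = [0.1, 0.7, 0.2]
    idx = most_likely_emotion(probs)
    print(idx, one_hot(idx, len(probs)))
```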

MISC

Pretrained Models (used in CVPR21-paper)

News

License

This code is released under the MIT License (see the LICENSE file for details). In simple words, if you copy or use parts of this code, please keep the copyright notice in place.