
Multimodal Emotion Recognition on Comics Scenes (EmoRecCom)

ICDAR 2021 competition hosted on CodaLab. The emotions of comic characters are described by the visual information, the text in speech balloons or captions, and the onomatopoeia (comic drawings of words that phonetically imitate, resemble, or suggest the sound they describe). The task is therefore a multimodal analysis task that can draw on both computer vision and natural language processing, two of the main interests of the ICDAR community.

Link to the competition: https://competitions.codalab.org/competitions/27884

Dataset link: https://drive.google.com/file/d/12fXFXw8AgxlZ7fU4_kcPogN2YDdT5rK3/view

Training/Testing Data (6,112/2,046)

Data format:

train_transcriptions.json: contains auto-transcriptions of the training comic scenes

train: contains raw images of training data

train_emotion_labels.csv: contains binary labels

additional_infor:emotion_polarity.csv: contains additional information, the polarities of the emotions in (0,1). Participants are encouraged to leverage these additional resources to achieve better performance.

test: contains raw images of testing data

test_transcriptions.json: contains auto-transcriptions of the testing comic scenes
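As a rough illustration of how the training files listed above might be read, here is a minimal sketch using only the Python standard library. It assumes that `train_transcriptions.json` and `train_emotion_labels.csv` sit directly in one directory and that the CSV has a header row. The function name `load_train_split` is hypothetical, and the sketch makes no assumption about the internal schema of either file.

```python
import csv
import json
from pathlib import Path


def load_train_split(root):
    """Load the training transcriptions and labels from `root`.

    Assumption: both files live directly under `root`, and the CSV
    starts with a header row. The schema of the files is not assumed.
    """
    root = Path(root)

    # Parse the transcription file into whatever structure it holds.
    with open(root / "train_transcriptions.json", encoding="utf-8") as f:
        transcriptions = json.load(f)

    # Read each label row as a dict keyed by the CSV header names.
    with open(root / "train_emotion_labels.csv", newline="", encoding="utf-8") as f:
        label_rows = list(csv.DictReader(f))

    return transcriptions, label_rows
```

Inspecting `label_rows[0].keys()` after loading is a quick way to see which columns the label file actually provides.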

Target labels: There are 8 emotion classes: 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral, 7=Others.
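Since a scene can carry several emotions at once, one convenient representation is a multi-hot vector with one 0/1 flag per class, in the index order listed above. The sketch below shows such an encoding and its inverse. The helper names `encode` and `decode` are hypothetical, and the exact column layout of the label CSV is an assumption, not something stated in this README.

```python
# Class names in the index order given in the target label list.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral", "Others"]


def encode(names):
    """Turn a collection of emotion names into an 8-element 0/1 vector."""
    unknown = set(names) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotion names: {sorted(unknown)}")
    return [1 if emotion in names else 0 for emotion in EMOTIONS]


def decode(vector):
    """Turn an 8-element 0/1 vector back into a list of emotion names."""
    if len(vector) != len(EMOTIONS):
        raise ValueError(f"expected {len(EMOTIONS)} flags, got {len(vector)}")
    return [name for name, flag in zip(EMOTIONS, vector) if flag]


print(encode(["Happy", "Surprise", "Neutral"]))  # [0, 0, 0, 1, 0, 1, 1, 0]
print(decode([0, 1, 0, 1, 0, 0, 1, 0]))          # ['Disgust', 'Happy', 'Neutral']
```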

Sample Predictions:

Example 1:

<img src="1697_3_3.jpg" width="224" height="224"> Prediction: [Happy, Surprise, Neutral] | Ground Truth: [Happy, Surprise, Neutral]

Example 2:

<img src="1920_15_0.jpg" width="224" height="224"> Prediction: [Disgust, Happy, Neutral] | Ground Truth: [Disgust, Happy, Neutral]

Example 3:

<img src="2260_47_8.jpg" width="224" height="224"> Prediction: [Angry, Surprise, Neutral] | Ground Truth: [Fear, Happy, Neutral]
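To make the agreement between predicted and ground-truth label sets in the three examples above concrete, the sketch below computes a simple set overlap (Jaccard index) for each one. This is only an illustrative measure chosen for this sketch; it is not stated here to be the competition's official metric, and the function name `jaccard` is hypothetical.

```python
def jaccard(predicted, truth):
    """Size of the intersection divided by size of the union of two label sets."""
    p, t = set(predicted), set(truth)
    if not p and not t:
        return 1.0  # two empty sets agree completely
    return len(p & t) / len(p | t)


# (prediction, ground truth) pairs copied from the three examples.
examples = [
    (["Happy", "Surprise", "Neutral"], ["Happy", "Surprise", "Neutral"]),
    (["Disgust", "Happy", "Neutral"], ["Disgust", "Happy", "Neutral"]),
    (["Angry", "Surprise", "Neutral"], ["Fear", "Happy", "Neutral"]),
]

for i, (pred, truth) in enumerate(examples, start=1):
    print(f"Example {i}: overlap = {jaccard(pred, truth):.2f}")
# Example 1: overlap = 1.00
# Example 2: overlap = 1.00
# Example 3: overlap = 0.20
```

Examples 1 and 2 match exactly, while Example 3 shares only Neutral out of five distinct labels.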

Leaderboard: https://competitions.codalab.org/competitions/30954#results