SMILE: A Multimodal Dataset for Understanding Laughter
This is the repository of SMILE: A Multimodal Dataset for Understanding Laughter. It comprises the SMILE dataset and the code for describing the dataset and evaluating laughter reasoning.
Installation
$ conda create -n SMILE python==3.10.11
$ conda activate SMILE
# move to FastChat/ directory
$ cd FastChat
$ pip3 install --upgrade pip
$ pip3 install -e .
$ pip3 install openai
$ pip3 install scikit-image
$ pip3 install evaluate
$ pip3 install bert-score
Download the SMILE Dataset
- We are currently updating the SMILE dataset to v.2. After the update, we will also update the laughter reasoning benchmark.
- Download SMILE dataset v.2 here.
- Unzip the dataset.
├── annotations
│   ├── data_split.json
│   ├── GT_laughter_reason.json
│   └── multimodal_textual_representation.json
└── videos
    └── SMILE_videos.zip
        ├── video_clips
        └── video_segments
- Details about each file:
  - data_split.json: key indices for the train/validation/test split
  - GT_laughter_reason.json: ground-truth laughter reason for each video clip
  - multimodal_textual_representation.json: multimodal textual representation encoded from each video clip
  - video_clips: 887 video clips from sitcom and TED. Note: sitcom key indices contain an underscore, while TED key indices do not; you can use this to split the dataset by video type (see the sketch below).
  - video_segments: 4,482 video segments trimmed from the video clips by utterance.
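As a rough illustration, the sketch below loads the annotation files and separates sitcom from TED clips by the underscore convention. The split name ("train") and the exact file layout are assumptions inferred from the descriptions above, not confirmed by the repository.

```python
import json

# A minimal sketch, assuming data_split.json holds lists of key indices per
# split and GT_laughter_reason.json maps each key index to its ground-truth
# laughter reason. The split name below is an assumption.
with open("annotations/data_split.json") as f:
    split = json.load(f)
with open("annotations/GT_laughter_reason.json") as f:
    reasons = json.load(f)

train_keys = split["train"]  # hypothetical split name

# Sitcom key indices contain an underscore; TED key indices do not.
sitcom_keys = [k for k in train_keys if "_" in k]
ted_keys = [k for k in train_keys if "_" not in k]

print(f"{len(sitcom_keys)} sitcom clips, {len(ted_keys)} TED clips")
```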
SMILE dataset v.1 for evaluation
- We provide the v.1 dataset for evaluation; download it here.
- Note that sitcom_reasoning_{train/val}.json and ted_reasoning_{train/val}.json are subsets of smile_reasoning_{train/val}.json.
SMILE_v1_evaluation
├── smile_reasoning_train.json
├── smile_reasoning_val.json
├── sitcom_reasoning_train.json
├── sitcom_reasoning_val.json
├── ted_reasoning_train.json
└── ted_reasoning_val.json
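If you want to sanity-check the subset relation, something like the following should work, assuming each reasoning file is a JSON object keyed by example index (an assumption; adjust if the files are lists instead).

```python
import json

# Hedged sketch: assumes each reasoning file is a JSON object keyed by
# example index. If the files are lists instead, collect keys accordingly.
with open("SMILE_v1_evaluation/smile_reasoning_val.json") as f:
    smile = json.load(f)
with open("SMILE_v1_evaluation/sitcom_reasoning_val.json") as f:
    sitcom = json.load(f)

assert set(sitcom) <= set(smile), "sitcom split should be a subset of SMILE"
```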
Evaluation
Laughter reasoning
We provide the inference code for the in-context and zero-shot experiments using GPT-3.
Since a fine-tuned GPT-3 model is tied to the specific OpenAI API key it was fine-tuned with, we instead provide the inference code for a fine-tuned model using LLaMA.
Please evaluate the models with the provided v.1 dataset.
In-context and Zero-shot experiment (GPT3)
Note that running GPT-3 requires your own OpenAI API key, and OpenAI charges for API usage.
Replace each { } with your own information.
$ python gpt3_inferece.py -openai_key {your openai api key} -engine {name of gpt3 model} -shot {fewshot or zeroshot} -val_data {path/for/validation_data} -train_data {path/for/train_data} -random_seed {any integer number}
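For reference, a zero-shot query might look roughly like the sketch below. It uses the legacy openai Completion API (matching the openai package installed above); the prompt template and engine name are placeholders, not the exact ones used in gpt3_inferece.py.

```python
import openai

openai.api_key = "YOUR_OPENAI_API_KEY"  # replace with your own key

# Hypothetical prompt; the actual template lives in gpt3_inferece.py.
representation = "..."  # multimodal textual representation of a video clip
prompt = (
    "The following is a multimodal textual representation of a video clip.\n"
    f"{representation}\n"
    "Question: Why does the audience laugh?\nAnswer:"
)

response = openai.Completion.create(
    engine="text-davinci-003",  # any GPT-3 engine accepted by -engine
    prompt=prompt,
    max_tokens=128,
    temperature=0.0,
)
print(response["choices"][0]["text"].strip())
```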
Fine-tuned experiment (LLaMA)
We provide the pre-trained weights of LLaMA for research purposes only.
Training data | Link |
---|---|
SMILE | SMILE_checkpoint |
SMILE_Sitcom | Sitcom_checkpoint |
SMILE_Ted | Ted_checkpoint |
├── SMILE
│   └── checkpoint
├── SMILE_SITCOM
│   └── checkpoint
└── SMILE_TED
    └── checkpoint
Replace each { } with your own information.
Point model_path to the checkpoint directory, e.g., "SMILE/checkpoint".
$ python FastChat/fastchat/serve/inference.py -model_path {path/for/fine-tuned model} -val_data {path/for/validation_data} -train_data {path/for/train_data} -random_seed {any integer number}
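If you prefer to load a checkpoint directly rather than going through the FastChat script, a minimal sketch along the following lines should work, assuming the released checkpoint is a standard Hugging Face causal-LM directory (an assumption; the FastChat script also applies its own conversation template and evaluation loop).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal sketch, assuming the checkpoint is a standard Hugging Face
# causal-LM directory. FastChat/fastchat/serve/inference.py additionally
# applies a conversation template and runs the full evaluation loop.
model_path = "SMILE/checkpoint"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

prompt = "Why does the audience laugh in this clip?\n..."  # placeholder
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```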
Acknowledgement
We are grateful for the following awesome projects that SMILE builds upon:
- GPT3: Language Models are Few-Shot Learners
- LLaMA: LLaMA: Open and Efficient Foundation Language Models
- Vicuna: Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality
- MUStARD: Towards Multimodal Sarcasm Detection (An Obviously Perfect Paper)
- UR-FUNNY: UR-FUNNY: A Multimodal Language Dataset for Understanding Humor