# Humanoid-Vision-Engine

[ECCV 2022] Contributions of Shape, Texture, and Color in Visual Recognition

Code is actively being updated.
<div align="center"> <img src="./docs/Fig-1.png" alt="Editor" width="1000"> </div>Figure: Left: Contributions of Shape, Texture, and Color may be different among different scenarios/tasks. Right: Humanoid Vision Engine takes dataset as input and summarizes how shape, texture, and color contribute to the given recognition task in a pure learning manner (E.g., In ImageNet classification, shape is the most discriminative feature and contributes most in visual recognition).
<div align="center"> <img src="./docs/Fig-2.png" alt="Editor" width="1000"> </div>Figure: Pipeline for humanoid vision engine (HVE). (a) shows how will humans’ vision system deal with an image. After humans’ eyes perceive the object, the different parts of the brain will be activated. The human brain will organize and summarize that information to get a conclusion. (b) shows how we design HVE to correspond to each part of the human’s vision system.
## Getting Started

### Installation

- Clone this repo:

```
git clone https://github.com/gyhandy/Humanoid-Vision-Engine.git
cd Humanoid-Vision-Engine
```
### Datasets

- Please download the preprocessed dataset from here, then unzip it and place it in `./data`. (Note: if the download does not start automatically, right-click the link, copy it, and open it in a new tab.)
## Analyze Dataset Bias
- Please train the feature models with:

```
# train shape model
python preprocess_dataset/1_train_resnet/main.py --data data/iLab/feature_images/shape --arch data/iLab/model/shape_resnet18/

# train texture model
python preprocess_dataset/1_train_resnet/main.py --data data/iLab/feature_images/texture --arch data/iLab/model/texture_resnet18/

# train color model
python preprocess_dataset/1_train_resnet/main.py --data data/iLab/feature_images/color --arch data/iLab/model/color_resnet18/
```
- Please train the humanoid neural network with:

```
python HNN/train_HNN.py --root_shape data/iLab/feature_images/shape --root_texture data/iLab/feature_images/texture --root_color data/iLab/feature_images/color --shape_model data/iLab/model/shape_resnet18/1.pth --texture_model data/iLab/model/texture_resnet18/1.pth --color_model data/iLab/model/color_resnet18/1.pth --save_model_dir data/iLab/model/Attention
```
- Analyze the bias of the dataset:

```
python HNN/compute_bias.py --root_shape data/iLab/feature_images/shape --root_texture data/iLab/feature_images/texture --root_color data/iLab/feature_images/color --shape_model data/iLab/model/shape_resnet18/1.pth --texture_model data/iLab/model/texture_resnet18/1.pth --color_model data/iLab/model/color_resnet18/1.pth --attention_model_dir data/iLab/model/Attention/model_ck0.pth
```
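As a rough picture of what the bias analysis can compute (a hedged sketch; `HNN/compute_bias.py` may work differently), one can average the trained network's per-branch attention weights over the dataset. The `model` below is assumed to behave like the `HVESketch` above, and the loader is assumed to yield (shape, texture, color, label) batches:

```python
import torch

@torch.no_grad()
def summarize_feature_bias(model, loader, device="cuda"):
    """Average shape/texture/color attention weights over a dataset (illustrative)."""
    model.eval().to(device)
    totals, n = torch.zeros(3, device=device), 0
    for shape_img, texture_img, color_img, _labels in loader:
        _, weights = model(shape_img.to(device),
                           texture_img.to(device),
                           color_img.to(device))  # weights: (B, 3)
        totals += weights.sum(dim=0)
        n += weights.shape[0]
    contrib = (totals / n).tolist()
    return dict(zip(("shape", "texture", "color"), contrib))
```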
## Analyze Your Customized Dataset

- We need to filter the foreground with Grad-CAM, so first train a model:

```
python preprocess_dataset/1_train_resnet/main.py --data YOUR_DATA_ROOT --arch PATH_TO_SAVE_MODEL
```
- Entity segmentation: please follow this repo: https://github.com/dvlab-research/Entity/tree/main/Entity
- Identify the foreground. Please look at the code and adapt the arguments to your customized data (a sketch of the mask-selection idea appears after this list):

```
python preprocess_dataset/2_find_foreground_with_gradcam/select_mask.py
```
- Compute texture feature images:

```
python preprocess_dataset/3_compute_feature_images/generate_texture_feature.py
```
- Run DPT to get monocular depth estimation:

```
python preprocess_dataset/3_compute_feature_images/preprocessed_shape/DPT/run_monodepth.py
```

- Compute shape feature images (see the depth-masking sketch after this list):

```
python preprocess_dataset/3_compute_feature_images/preprocessed_shape/generate_shape_feature.py
```
- Compute the images used to generate color features:

```
python preprocess_dataset/3_compute_feature_images/preprocess_color/FourierScrambled/generate_input.py
```
- Use MATLAB to run FourierScrambled. The MATLAB code is at `preprocess_dataset/3_compute_feature_images/preprocess_color/FourierScrambled/main.m` (a Python sketch of the phase-scrambling idea appears after this list).

- After getting all the feature images, you can return to the Analyze Dataset Bias section above and analyze the dataset bias with the humanoid neural network.
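For the "Identify the foreground" step above, one plausible reading (a sketch under our assumptions, not necessarily what `select_mask.py` implements) is to score each entity-segmentation mask by its mean Grad-CAM activation and keep the best one; `cam` and `masks` are assumed to be precomputed:

```python
import numpy as np

def pick_foreground_mask(cam, masks):
    """Pick the entity mask that overlaps most with the Grad-CAM heatmap.

    cam:   (H, W) float array in [0, 1], Grad-CAM for the predicted class.
    masks: list of (H, W) boolean arrays from entity segmentation.
    Illustrative selection rule only.
    """
    scores = [cam[m].mean() if m.any() else 0.0 for m in masks]
    return masks[int(np.argmax(scores))]
```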
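Similarly, since shape features build on DPT depth, a reasonable sketch of the shape feature image (again our assumption, not the released `generate_shape_feature.py`) keeps only the foreground's depth values, preserving 3D shape cues while dropping texture and color:

```python
import numpy as np

def shape_feature_image(depth, fg_mask):
    """Keep only the foreground's depth values and rescale them to a gray image.

    depth:   (H, W) float array, e.g. from DPT monocular depth.
    fg_mask: (H, W) boolean foreground mask from the previous steps.
    Illustrative only.
    """
    out = np.zeros_like(depth, dtype=np.float64)
    fg = depth[fg_mask]
    if fg.size:
        out[fg_mask] = (fg - fg.min()) / (fg.max() - fg.min() + 1e-8) * 255.0
    return out.astype(np.uint8)
```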
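Finally, for the color branch, Fourier (phase) scrambling keeps each channel's amplitude spectrum, and therefore the overall color statistics, while destroying shape and texture. Below is a NumPy sketch of the same idea as the MATLAB `main.m` (an approximation, not a line-for-line port):

```python
import numpy as np

def fourier_scramble(img, rng=None):
    """Randomize the Fourier phase of each RGB channel while keeping its amplitude.

    img: (H, W, 3) uint8 image. The output has similar color/power statistics
    but no recognizable shape or texture. Illustrative approximation only.
    """
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    # A single random phase field shared by all channels keeps them aligned,
    # so colors are preserved while spatial structure is destroyed.
    rand_phase = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(h, w)))
    out = np.empty(img.shape, dtype=np.float64)
    for c in range(3):
        f = np.fft.fft2(img[..., c].astype(np.float64))
        out[..., c] = np.fft.ifft2(f * rand_phase).real  # amplitude kept, phase shifted
    return np.clip(out, 0, 255).astype(np.uint8)
```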
## Imagination of HVE

Experiments for Section 5.2, "Cross Feature Imagination with HVE".
### 1. Train the Model

- In a terminal, run:

```
cd Imagine
```

- Edit `script.sh` and set the dataset and output paths:

```
#!/usr/bin/env bash
FEATURE=texture # choose from texture, color, shape
python main.py --cuda 0,1 \
    --mode train \
    --batch_size 16 \
    --dataset_path /lab/tmpig8d/u/yao_data/human_simulation_engine/V3_${FEATURE}_dataset \
    --output_path out/deeper/${FEATURE}
```

- Run:

```
sh script.sh
```
### 2. Run Generation

- Edit `test.sh`:
  - Set the dataset and output paths.
  - Set the test checkpoint (e.g., `--test_epoch 269`).
  - If you want to use mismatched shape, texture, and color as input, set `--mismatch`.
- Example:

```
FEATURE=texture # choose from texture, color, shape
python main.py --cuda 0 \
    --mode predict \
    --batch_size 16 \
    --dataset_path /lab/tmpig8d/u/yao_data/human_simulation_engine/V3_${FEATURE}_dataset \
    --output_path out/deeper/${FEATURE} \
    --test_epoch 269
```

- Run:

```
sh test.sh
```
### 3. Calculate the FID

- Install pytorch-fid:

```
pip install pytorch-fid
```

- Resize the ground-truth images. For example (change the paths to yours):

```
#!/usr/bin/env bash
FEATURE=texture # choose from texture, color, shape
# Ground truth images dir
dataset_path=/lab/tmpig8d/u/yao_data/human_simulation_engine/V3_${FEATURE}_dataset/ori/valid/
# Processed gt images dir
process_path=out/deeper_deeper_res_new_texture/${FEATURE}/gt
python create_dataset.py --ori_path ${dataset_path} --path ${process_path}
```

- Run the FID code. For example (change the paths to yours):

```
#!/usr/bin/env bash
FEATURE=texture # choose from texture, color, shape
# Processed gt images dir
process_path=out/deeper_deeper_res_new_texture/${FEATURE}/gt
# Generation result dir
output_path=out/deeper_deeper_res_new_texture/${FEATURE}/result_mismatch
python -m pytorch_fid ${process_path} ${output_path} --device cuda:1 --batch-size 128
```
## Zero-Shot Segmentation

### 1. Generate the Distance File

Please generate the distance file with:

```
python Zero_shot/cross_modality_two_latents.py
```

### 2. Download Vectors

Please download the word vectors from ConceptNet.

### 3. Zero-Shot Learning

Please run zero-shot learning with (a sketch of the nearest-class idea follows):

```
python Zero_shot/zero_shot.py
```
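To convey where the ConceptNet vectors fit in (a hedged illustration; `Zero_shot/zero_shot.py` may differ), zero-shot recognition can map an image to a semantic embedding and pick the unseen class whose ConceptNet word vector is closest by cosine similarity. The embedding and class dictionary below are hypothetical:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def zero_shot_classify(image_embedding, class_vectors):
    """Return the unseen class whose ConceptNet vector best matches the image.

    image_embedding: vector predicted from the image (e.g., from the HNN branches),
                     assumed to live in the ConceptNet embedding space.
    class_vectors:   dict mapping class name -> ConceptNet word vector.
    Illustrative only.
    """
    return max(class_vectors, key=lambda c: cosine(image_embedding, class_vectors[c]))
```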