<p align="center"> Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models (ECCV 2024, Oral)</p>

Project Page  |  Paper

demo.png

This is the official code for the ECCV 2024 paper "Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models".

News

Download the Dataset

demo.png

Since our pipeline for learning ComA scales with respect to the category of input objects, we collected a total of 83 object meshes from SketchFab, covering various interaction types. Each object mesh was converted to the .obj format along with its image texture files. We manually canonicalized the location, orientation, and scale of each object. The dataset can be downloaded from Google Drive.

The format of the dataset is as follows:

data
└── SketchFab
    ├── accordion # object category
    │   └── wx75e99elm1yhyfxz1efg60luadp95sl # object id
    │       ├── images # folder for texture files
    │       ├── model.obj
    │       └── model.mtl
    ├── axe
    ├── ...
    └── wine glass

In SketchFab/categories.json, you can find the list of object categories along with each object's original data URL and license information.
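
If you want to inspect the data programmatically, the snippet below is a minimal sketch: it assumes the layout shown above, that data/SketchFab/categories.json maps category names to their URL/license entries, and that the third-party trimesh package is installed (it is not a requirement of the dataset itself).

import json
import trimesh  # third-party; pip install trimesh

# List the available object categories from the bundled index.
with open("data/SketchFab/categories.json") as f:
    categories = json.load(f)
print(sorted(categories))  # e.g., 'accordion', 'axe', ..., 'wine glass'

# Load one canonicalized object mesh; model.mtl and the texture images sit alongside it.
mesh = trimesh.load(
    "data/SketchFab/accordion/wx75e99elm1yhyfxz1efg60luadp95sl/model.obj",
    force="mesh",
)
print(mesh.vertices.shape, mesh.faces.shape)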

Installation

To set up the environment for running ComA, please refer to the instructions provided <a href="INSTALL.md">here</a>.

Quick Start

2D HOI Image Generation

To generate 2D HOI images for a given 3D object (in this case, a backpack), use the following command.

bash scripts/generate_2d_hoi_images.sh --gpus 0 1 2 3 4 5 6 7 --dataset_type "BEHAVE" --supercategory "BEHAVE" --category "backpack"

3D HOI Sample Generation

To generate 3D HOI samples from the generated 2D HOI images (of the given 3D object, a backpack), use the following command.

bash scripts/generate_3d_hoi_samples.sh --gpus 0 1 2 3 4 5 6 7 --dataset_type "BEHAVE" --supercategory "BEHAVE" --category "backpack"
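
If you prefer to drive both steps from Python (for example, to queue several objects), the following is a minimal sketch that simply shells out to the two scripts above with the same flags; the GPU ids and category list are placeholders for your own setup.

import subprocess

gpus = ["0", "1", "2", "3", "4", "5", "6", "7"]  # adjust to your machine
categories = ["backpack"]                        # extend with other categories as needed

for category in categories:
    common = [
        "--gpus", *gpus,
        "--dataset_type", "BEHAVE",
        "--supercategory", "BEHAVE",
        "--category", category,
    ]
    # Step 1: generate 2D HOI images for the object.
    subprocess.run(["bash", "scripts/generate_2d_hoi_images.sh", *common], check=True)
    # Step 2: lift the 2D images to 3D HOI samples.
    subprocess.run(["bash", "scripts/generate_3d_hoi_samples.sh", *common], check=True)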

ComA Extraction

To learn ComA from the generated 3D HOI samples, use the following command.

bash scripts/learn_coma.sh --IoU_threshold_min 0.7 --inlier_num_threshold_min 10 --supercategory "BEHAVE" --category "backpack" --dataset_type "BEHAVE"

Note that IoU_threshold_min and inlier_num_threshold_min, which are keys of the dictionary in constants/coma/qual.py, are hyperparameters that can affect the results.
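
As a rough illustration of how these thresholds can be explored, the sketch below re-runs the extraction script over a few IoU_threshold_min values using the flags shown above; note that successive runs may overwrite each other's outputs under results/coma/ unless the script keeps them separate.

import subprocess

# Sweep IoU_threshold_min around the 0.7 value used above; inlier_num_threshold_min is kept fixed.
for iou_min in (0.6, 0.7, 0.8):
    subprocess.run(
        [
            "bash", "scripts/learn_coma.sh",
            "--IoU_threshold_min", str(iou_min),
            "--inlier_num_threshold_min", "10",
            "--supercategory", "BEHAVE",
            "--category", "backpack",
            "--dataset_type", "BEHAVE",
        ],
        check=True,
    )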

Visualize Results

Visualize Results via Blender

blender_visualization.png

To visualize results via Blender (human contact, object contact, human orientational tendency), use the following commands. Make sure a GUI is available on your machine.

blenderproc debug src/visualization/visualize_human.py --affordance_path [human affordance npy path] # human contact, human orientational tendency
blenderproc debug src/visualization/visualize_object.py --affordance_path [object affordance ply path] # object contact

Both [human affordance npy path] and [object affordance ply path] will be under the directory results/coma/affordance/ after executing ComA extraction.
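
Before opening Blender, you can sanity-check the affordance files from Python. The snippet below is a minimal sketch: the file names are illustrative (use the actual files under results/coma/affordance/), and it only inspects types and shapes; numpy and trimesh are assumed to be available.

import numpy as np
import trimesh

# Illustrative paths; substitute the .npy/.ply written by ComA extraction.
human_affordance = np.load("results/coma/affordance/human_contact.npy", allow_pickle=True)
print(type(human_affordance), getattr(human_affordance, "shape", None))

object_affordance = trimesh.load("results/coma/affordance/object_contact.ply")
print(object_affordance)  # prints the mesh/point-cloud summary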

Visualize Results via Mayavi

mayavi_visualization.png

To visualize results via Mayavi (human occupancy), use the following command.

python src/visualization/visualize_occupancy.py --asset_downsample_pth [downsampled object pickle path] --affordance_path [occupancy affordance npy path]

[downsampled object pickle path] will be under the directory results/coma/asset_downsample/ after executing object downsampling (included in the ComA extraction bash script). [occupancy affordance npy path] will be under the directory results/coma/affordance/ after executing ComA extraction.
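
As with the Blender inputs, both files can be inspected quickly from Python before launching Mayavi; the paths below are illustrative placeholders for the files produced by your run.

import pickle
import numpy as np

# Illustrative paths; substitute the files under results/coma/.
with open("results/coma/asset_downsample/backpack_downsampled.pickle", "rb") as f:
    downsampled_asset = pickle.load(f)
print(type(downsampled_asset))

occupancy = np.load("results/coma/affordance/occupancy.npy", allow_pickle=True)
print(type(occupancy), getattr(occupancy, "shape", None))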

Inference (Reproduce Results)

We release a pre-trained ComA for the backpack object of BEHAVE. We use different configuration settings (defined in constants/coma/qual.py) to obtain each result (e.g., human contact, object contact, human orientational tendency, human occupancy). Download the pre-trained ComA from Google Drive and place it in the main directory. The final directory structure should be as follows (a quick layout check is sketched after the tree):

coma
└── pre-trained
    └── BEHAVE 
        └── backpack
            ├── human_contact
            │   ├── behave_asset_180.pickle
            │   ├── coma_backpack_human_contact.pickle # ComA
            │   └── smplx_star_downsampled_FULL.pickle
            ├── human_occupancy
            │   ├── behave_asset_180.pickle
            │   ├── coma_backpack_human_occupancy.pickle # ComA
            │   └── smplx_star_downsampled_FULL.pickle
            ├── human_orientational_tendency
            │   ├── behave_asset_180.pickle
            │   ├── coma_backpack_human_orientation.pickle # ComA
            │   └── smplx_star_downsampled_FULL.pickle
            └── object_contact
                ├── behave_asset_1500.pickle
                ├── coma_backpack_object_contact.pickle # ComA
                └── smplx_star_downsampled_1000.pickle
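
As a quick sanity check after downloading, the sketch below verifies that the four ComA pickles are where the inference commands below expect them (paths are relative to the main directory).

from pathlib import Path

# Expected ComA pickle for each affordance type, per the tree above.
expected = {
    "human_contact": "coma_backpack_human_contact.pickle",
    "human_occupancy": "coma_backpack_human_occupancy.pickle",
    "human_orientational_tendency": "coma_backpack_human_orientation.pickle",
    "object_contact": "coma_backpack_object_contact.pickle",
}
root = Path("pre-trained/BEHAVE/backpack")
for subdir, coma_file in expected.items():
    path = root / subdir / coma_file
    print(f"{path} -> {'ok' if path.exists() else 'MISSING'}")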

With the pre-trained ComA, use the following commands to visualize each affordance.

Human Contact

python src/coma/inference.py --supercategory "BEHAVE" --category "backpack" --coma_path "pre-trained/BEHAVE/backpack/human_contact/coma_backpack_human_contact.pickle" --visualize_type "aggr-object-contact" --smplx_downsample_pth "pre-trained/BEHAVE/backpack/human_contact/smplx_star_downsampled_FULL.pickle" --asset_downsample_pth "pre-trained/BEHAVE/backpack/human_contact/behave_asset_180.pickle" --hyperparams_key "qual:backpack_human_contact"
blenderproc debug src/visualization/visualize_human.py --affordance_path "output/BEHAVE/backpack/human_contact.npy"

Object Contact

python src/coma/inference.py --supercategory "BEHAVE" --category "backpack" --coma_path "pre-trained/BEHAVE/backpack/object_contact/coma_backpack_object_contact.pickle" --visualize_type "aggr-object-contact" --smplx_downsample_pth "pre-trained/BEHAVE/backpack/object_contact/smplx_star_downsampled_1000.pickle" --asset_downsample_pth "pre-trained/BEHAVE/backpack/object_contact/behave_asset_1500.pickle" --hyperparams_key "qual:backpack_object_contact" # key name assumed; check constants/coma/qual.py
blenderproc debug src/visualization/visualize_object.py --affordance_path "output/BEHAVE/backpack/object_contact.ply"

Human Orientational Tendency

python src/coma/inference.py --supercategory "BEHAVE" --category "backpack" --coma_path "pre-trained/BEHAVE/backpack/human_orientational_tendency/coma_backpack_human_orientation.pickle" --visualize_type "orientation" --smplx_downsample_pth "pre-trained/BEHAVE/backpack/human_orientational_tendency/smplx_star_downsampled_FULL.pickle" --asset_downsample_pth "pre-trained/BEHAVE/backpack/human_orientational_tendency/behave_asset_180.pickle" --hyperparams_key "qual:backpack_orientation" 
blenderproc debug src/visualization/visualize_human.py --affordance_path "output/BEHAVE/backpack/orientational_tendency.npy"

Human Occupancy

python src/coma/inference.py --supercategory "BEHAVE" --category "backpack" --coma_path "pre-trained/BEHAVE/backpack/human_occupancy/coma_backpack_human_occupancy.pickle" --visualize_type "occupancy" --smplx_downsample_pth "pre-trained/BEHAVE/backpack/human_occupancy/smplx_star_downsampled_FULL.pickle" --asset_downsample_pth "pre-trained/BEHAVE/backpack/human_occupancy/behave_asset_180.pickle" --hyperparams_key "qual:backpack_occupancy"
python src/visualization/visualize_occupancy.py --asset_downsample_pth "pre-trained/BEHAVE/backpack/human_occupancy/behave_asset_180.pickle" --affordance_path "output/BEHAVE/backpack/occupancy.npy"

(Optional) Application

We release the code for our optimization framework for reconstructing human-object interaction. You can optimize the human via the following command.

python src/application/optimize.py  --coma_path "pre-trained/BEHAVE/backpack/human_contact/coma_backpack_human_contact.pickle" --asset_downsample_pth "pre-trained/BEHAVE/backpack/human_contact/behave_asset_180.pickle" --use_collision

Citation

If you find our work helpful or use our code, please consider citing:

@misc{coma,
      title={Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models}, 
      author={Hyeonwoo Kim and Sookwan Han and Patrick Kwon and Hanbyul Joo},
      year={2024},
      eprint={2401.12978},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2401.12978}, 
}

Acknowledgements

  1. Our codebase builds heavily on

    • <a href="https://github.com/jellyheadandrew/CHORUS">CHORUS</a>
    • <a href="https://github.com/vchoutas/smplx">SMPL/SMPL-X</a>
    • <a href="https://github.com/mks0601/Hand4Whole_RELEASE">Hand4Whole</a>
    • <a href="https://github.com/CompVis/stable-diffusion">Stable Diffusion</a>
    • <a href="https://github.com/facebookresearch/detectron2">Detectron2/PointRend</a>
    • <a href="https://github.com/nghorbani/human_body_prior">VPoser</a>
    • <a href="https://github.com/markomih/COAP">COAP</a>

    Thanks for open-sourcing!

  2. We thank <a href="https://vi.kaist.ac.kr/project/hyeon-seong-kim/">Hyeonseong Kim</a> and <a href="https://bjkim95.github.io/">Byungjun Kim</a> for valuable comments!

License

This work is licensed under a <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>. However, please note that our code depends on other libraries (e.g., <a href="https://smpl.is.tue.mpg.de/">SMPL</a>), each of which has its own license that must also be followed.