Multimodal Material Segmentation, CVPR 2022

This repository provides an implementation of our CVPR 2022 paper Multimodal Material Segmentation. If you use our code and data, please cite our paper:

@InProceedings{Liang_2022_CVPR,
    author    = {Liang, Yupeng and Wakaki, Ryosuke and Nobuhara, Shohei and Nishino, Ko},
    title     = {Multimodal Material Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {19800-19808}
}

This implementation is based on pytorch-deeplab-xception. The diff can be found in pytorch-deeplab-xception-20191218.patch.

Please note that this is research software and may contain bugs or other issues; use it at your own risk. If you experience major problems with it, you may contact us, but please note that we do not have the resources to deal with every issue.

Download data

The Multimodal Material Segmentation dataset (MCubeS dataset) is available on Google Drive (multimodal_dataset.zip, 9.27GB). Uncompress the archive and move it to /dataset/multimodal_dataset/.

The data should be organized in the following structure (a minimal loading sketch follows the listing):

multimodal_dataset
├── polL_dolp
│   ├──outscene1208_2_0000000150.npy
│   ├──outscene1208_2_0000000180.npy
│   ...
├── polL_color
│   ├──outscene1208_2_0000000150.png
│   ├──outscene1208_2_0000000180.png
│   ...
├── polL_aolp_sin
├── polL_aolp_cos
├── list_folder
│   ├──all.txt
│   ├──test.txt
│   ├──train.txt
│   └──val.txt
├── SS
├── SSGT4MS
├── NIR_warped_mask
├── NIR_warped
└── GT
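
For reference, here is a minimal sketch of how a single frame might be loaded, assuming the layout above: .npy arrays for DoLP and the sin/cos encoding of AoLP, and .png images for color. The .png extensions for NIR_warped, NIR_warped_mask, and GT, as well as the helper code itself, are assumptions, not guaranteed by the listing.

```python
import os

import numpy as np
from PIL import Image

ROOT = "/dataset/multimodal_dataset"
NAME = "outscene1208_2_0000000150"  # one frame name from the listing above

# RGB image captured by the polarization camera
rgb = np.array(Image.open(os.path.join(ROOT, "polL_color", NAME + ".png")))

# Degree of linear polarization (DoLP), stored as a per-pixel .npy array
dolp = np.load(os.path.join(ROOT, "polL_dolp", NAME + ".npy"))

# Angle of linear polarization (AoLP), stored as separate sine and cosine
# arrays; the raw angle can be recovered with arctan2 if needed
aolp_sin = np.load(os.path.join(ROOT, "polL_aolp_sin", NAME + ".npy"))
aolp_cos = np.load(os.path.join(ROOT, "polL_aolp_cos", NAME + ".npy"))
aolp = np.arctan2(aolp_sin, aolp_cos)

# Warped NIR image, its validity mask, and the per-pixel material labels
# (.png extensions assumed for these three folders)
nir = np.array(Image.open(os.path.join(ROOT, "NIR_warped", NAME + ".png")))
nir_mask = np.array(Image.open(os.path.join(ROOT, "NIR_warped_mask", NAME + ".png")))
gt = np.array(Image.open(os.path.join(ROOT, "GT", NAME + ".png")))
```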

Details of the MCubeS dataset

Overview

MCubeS captures the visual appearance of various materials found in daily outdoor scenes from a viewpoint on a road, pavement, or sidewalk. At each viewpoint, we capture images with three fundamentally different imaging modalities: RGB, polarization (represented as the angle and degree of linear polarization, AoLP and DoLP), and near-infrared (NIR). There are 500 image sets in the MCubeS dataset, and each pixel in the RGB image is labeled as one of 20 material classes.

<p align="center"> <img src="img/Fig1.png"> </p>

Material labels

We define 20 distinct materials by thoroughly examining the data we captured in the MCubeS dataset. Examples of all materials are shown below.

<p align="center"> <img src="img/Fig2.png"> </p>

Here are some annotation examples.

<p align="center"> <img src="img/Fig3.png"> </p>

Statistics

The number of pixels in each material class in the train, val, and test sets is shown below.

<p align="center"> <img src="img/Fig4.png"> </p>
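
These statistics can be recomputed from the annotations. Below is a minimal sketch, assuming GT/ stores per-pixel class indices as .png files and that each file in list_folder/ lists one frame name per line (both are assumptions here):

```python
import os

import numpy as np
from PIL import Image

ROOT = "/dataset/multimodal_dataset"
NUM_CLASSES = 20  # number of material classes

def class_pixel_counts(split):
    """Count the pixels of each material class over one split (train/val/test)."""
    counts = np.zeros(NUM_CLASSES, dtype=np.int64)
    with open(os.path.join(ROOT, "list_folder", split + ".txt")) as f:
        names = [line.strip() for line in f if line.strip()]
    for name in names:
        labels = np.array(Image.open(os.path.join(ROOT, "GT", name + ".png")))
        # clip to NUM_CLASSES bins to ignore any out-of-range values (e.g., 255)
        counts += np.bincount(labels.ravel(), minlength=NUM_CLASSES)[:NUM_CLASSES]
    return counts

for split in ("train", "val", "test"):
    print(split, class_pixel_counts(split))
```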

Pretrained model

A pretrained model is available on Google Drive (checkpoint.pth.tar, 1.8GB). Place this file at run/multimodal_dataset/MCubeSNet/experiment_0/checkpoint.pth.tar, as specified in main_test_multimodal.sh below, to run the test code.
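
To sanity-check the download, you can inspect the file before running the test script. A minimal sketch; the key names below follow the usual pytorch-deeplab-xception checkpoint format and are an assumption here:

```python
import torch

ckpt = torch.load(
    "run/multimodal_dataset/MCubeSNet/experiment_0/checkpoint.pth.tar",
    map_location="cpu",
)

# pytorch-deeplab-xception-style checkpoints are plain dicts; inspect the
# keys if the names below differ in your download
print(ckpt.keys())
print("epoch:", ckpt.get("epoch"))
print("best_pred:", ckpt.get("best_pred"))

# model.load_state_dict(ckpt["state_dict"])  # load into an instantiated model
```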

Usage

Prerequisites

We tested our code with Python 3.6.12 on Ubuntu 18.04 LTS. Our code depends on the modules listed in requirements.txt.

To install them with pip, run the following command in your (virtual) environment:

$ python3 -m pip install -r requirements.txt

Preparing the semantic segmentation guidance

Train a semantic segmentation network using the images in /dataset/multimodal_dataset/polL_color/ and the annotations in /dataset/multimodal_dataset/SSGT4MS/. Put the resulting semantic segmentation images in /dataset/multimodal_dataset/SSmask/ (a sketch of this step is shown below).
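
Below is a minimal sketch of this step, assuming a trained segmentation model seg_model that maps a normalized RGB tensor to per-class logits; the model, its preprocessing, and the output encoding (class indices saved as .png) are placeholders and assumptions, not part of this repository:

```python
import os

import numpy as np
import torch
from PIL import Image

ROOT = "/dataset/multimodal_dataset"
OUT = os.path.join(ROOT, "SSmask")
os.makedirs(OUT, exist_ok=True)

seg_model = ...  # placeholder: your trained semantic segmentation network
seg_model.eval()

with open(os.path.join(ROOT, "list_folder", "all.txt")) as f:
    names = [line.strip() for line in f if line.strip()]

with torch.no_grad():
    for name in names:
        rgb = np.array(Image.open(os.path.join(ROOT, "polL_color", name + ".png")))
        # HWC uint8 -> 1x3xHxW float; replace with the normalization your model expects
        x = torch.from_numpy(rgb).permute(2, 0, 1).float().unsqueeze(0) / 255.0
        logits = seg_model(x)  # (1, num_classes, H, W)
        pred = logits.argmax(dim=1)[0].byte().cpu().numpy()
        Image.fromarray(pred).save(os.path.join(OUT, name + ".png"))
```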

Training and testing

$ sh main_train_multimodal.sh
$ sh main_test_multimodal.sh