We propose a more realistic and general setting for multi-task dense prediction problems, called multi-task partially-supervised learning (MTPSL), in which not all task labels are available for each training image (Fig. 1(b)). This setting generalizes standard supervised learning (Fig. 1(a)), where all task labels are available. We also propose a novel, architecture-agnostic MTL model that enforces cross-task consistency between pairs of tasks in joint pairwise task-spaces, each encoding the commonalities between a pair of tasks, in a computationally efficient manner (Fig. 1(c)).

Learning Multiple Dense Prediction Tasks from Partially Annotated Data,
Wei-Hong Li, Xialei Liu, Hakan Bilen,
CVPR 2022 (arXiv 2111.14893)


Prepare dataset

We use the preprocessed NYUv2 and Cityscapes datasets provided by this repo. Download the datasets and place each dataset folder in ./data/
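As a quick sanity check before training, the following minimal sketch reports which dataset folders are missing. The folder names `nyuv2` and `cityscapes2` are taken from the `--dataroot` arguments used in the commands in this README; the helper `missing_dataset_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path


def missing_dataset_dirs(data_root="./data", names=("nyuv2", "cityscapes2")):
    """Return the expected dataset folders that do not exist under data_root.

    The default names mirror the --dataroot values used in this README;
    adjust them if your folders are named differently.
    """
    root = Path(data_root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders under ./data:", ", ".join(missing))
    else:
        print("All expected dataset folders found.")
```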


The easiest way to get started is to download our pre-trained models, learned with our proposed cross-task consistency learning, and evaluate them on the validation set. To download the pre-trained models, one can use gdown (installed by pip install gdown) and execute the following command in the root directory of this project:

gdown "https://drive.google.com/uc?id=1s9x8neT9SYR2M6C89CvbeID3XlBRJoEw" && md5sum nyuv2_pretrained.zip && unzip nyuv2_pretrained.zip -d ./results/ && rm nyuv2_pretrained.zip

This will download the pre-trained models and place them in the ./results directory.
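The command above prints the archive's MD5 checksum but does not compare it against anything, and it deletes the archive after unzipping. If you have a reference checksum from a trusted source (none is given here), a minimal sketch for checking the archive before unzipping could look like this. The helper names `md5_of_file` and `matches_checksum` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Return True if the file's MD5 equals expected_hex (case-insensitive)."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_checksum("nyuv2_pretrained.zip", expected)` would be called after the download step and before `unzip`, with `expected` set to the checksum you obtained separately.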

One can evaluate these models by:

CUDA_VISIBLE_DEVICES=<gpu_id> python nyu_eval.py --dataroot ./data/nyuv2 --ssl-type onelabel --model ./results/nyuv2/mtl_xtc_onelabel.pth.tar

Train our method

We train our method with SegNet in multi-task partially-supervised learning settings, e.g. the one-label and random-label settings. In the one-label setting, i.e. one task label per image, we learn cross-task consistency for multi-task partially-supervised learning:

CUDA_VISIBLE_DEVICES=<gpu_id> python nyu_mtl_xtc.py --out ./results/nyuv2 --ssl-type onelabel --dataroot ./data/nyuv2 
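To make the one-label setting concrete, here is a minimal, framework-free sketch of the idea: each image keeps the label of exactly one task, and the supervised loss only sums over tasks whose label is available. This is an illustration, not the repository's implementation. The task names, the random assignment with a fixed seed, and the helpers `assign_one_label` and `partial_loss` are all assumptions made for this example.

```python
import random

# Assumed task names for illustration; only "semantic" appears in this README.
TASKS = ("semantic", "depth", "normal")


def assign_one_label(num_images, tasks=TASKS, seed=0):
    """Pick, for each image, the single task whose label is kept."""
    rng = random.Random(seed)
    return [rng.choice(tasks) for _ in range(num_images)]


def partial_loss(per_task_losses, labelled_tasks):
    """Sum the supervised losses of the tasks that have a label for this image.

    Tasks without a label contribute nothing to the supervised term.
    """
    return sum(per_task_losses[task] for task in labelled_tasks)


if __name__ == "__main__":
    kept = assign_one_label(num_images=5)
    # Dummy per-task loss values for one image, just to show the masking.
    losses = {"semantic": 1.0, "depth": 2.0, "normal": 4.0}
    for i, task in enumerate(kept):
        print(f"image {i}: labelled task = {task}, "
              f"supervised loss = {partial_loss(losses, [task])}")
```

Passing all tasks as `labelled_tasks` recovers the fully supervised case, which is why the partially-supervised setting generalizes standard supervised learning.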

To train our method with cross-task consistency for multi-task learning under full supervision, use --ssl-type full:

CUDA_VISIBLE_DEVICES=<gpu_id> python nyu_mtl_xtc.py --out ./results/nyuv2 --ssl-type full --dataroot ./data/nyuv2 

Train supervised learning baselines

CUDA_VISIBLE_DEVICES=<gpu_id> python nyu_stl_sl.py --out ./results/nyuv2 --ssl-type onelabel --dataroot ./data/nyuv2 --task semantic
CUDA_VISIBLE_DEVICES=<gpu_id> python nyu_mtl_sl.py --out ./results/nyuv2 --ssl-type onelabel --dataroot ./data/nyuv2

Train on Cityscapes

Similar to experiments on NYUv2, one may train the STL, MTL, and our method on Cityscapes. For example, to train our method on Cityscapes:

CUDA_VISIBLE_DEVICES=<gpu_id> python cityscapes_mtl_xtc.py --out ./results/cityscapes2 --ssl-type onelabel --dataroot ./data/cityscapes2 

Training the provided code on Cityscapes will result in different performance than the numbers reported in the paper. However, the rankings stay the same. To compare the models from the paper, please re-run them with your preferred training strategy (learning rate, optimizer, etc.) and keep that strategy consistent across all compared methods for a fair comparison.
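One simple way to keep the settings consistent is to generate every training command from a single shared argument list. The sketch below only builds and prints the commands, and is illustrated with the NYUv2 scripts and flags named earlier in this README. The helper `build_commands` is hypothetical, and no flags beyond those shown in this README are assumed.

```python
import shlex

# Arguments shared by every compared method, taken from the commands above.
SHARED_ARGS = [
    "--out", "./results/nyuv2",
    "--ssl-type", "onelabel",
    "--dataroot", "./data/nyuv2",
]


def build_commands(scripts, shared_args=SHARED_ARGS, extra_args=None):
    """Build one command string per script, all using the same shared args.

    extra_args maps a script name to arguments only that script needs.
    """
    extra_args = extra_args or {}
    commands = []
    for script in scripts:
        argv = ["python", script, *shared_args, *extra_args.get(script, [])]
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    scripts = ["nyu_stl_sl.py", "nyu_mtl_sl.py", "nyu_mtl_xtc.py"]
    per_script = {"nyu_stl_sl.py": ["--task", "semantic"]}
    for command in build_commands(scripts, extra_args=per_script):
        print(command)
```

Changing a setting in `SHARED_ARGS` then changes it for every method at once, which avoids accidental mismatches between runs.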


We thank the authors of MTAN and Multi-Task-Learning-PyTorch for their source code.


For any questions, you can contact Wei-Hong Li.


