# MaskFormer: Per-Pixel Classification is Not All You Need for Semantic Segmentation
Bowen Cheng, Alexander G. Schwing, Alexander Kirillov
<div align="center"> <img src="https://bowenc0221.github.io/images/maskformer.png" width="100%" height="100%"/> </div><br/>

## Mask2Former
Check out Mask2Former, a universal architecture based on the MaskFormer meta-architecture that achieves state-of-the-art results on panoptic, instance, and semantic segmentation across four popular datasets (ADE20K, Cityscapes, COCO, Mapillary Vistas).
## Features
- Better results while being more efficient.
- Unified view of semantic- and instance-level segmentation tasks.
- Supports major semantic segmentation datasets: ADE20K, Cityscapes, COCO-Stuff, Mapillary Vistas.
- Supports ALL Detectron2 models.
## Installation
See installation instructions.
## Getting Started
See Preparing Datasets for MaskFormer.
See Getting Started with MaskFormer.
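As a quick orientation, below is a minimal inference sketch in Python. It assumes MaskFormer is installed alongside Detectron2, that the repo exposes the `add_mask_former_config` helper used by its demo script, and that a config/checkpoint pair has been downloaded from the Model Zoo; the file paths are placeholders, not canonical.

```python
# Minimal inference sketch (paths are placeholders; see the Model Zoo for
# actual config files and checkpoints).
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.projects.deeplab import add_deeplab_config

from mask_former import add_mask_former_config  # helper assumed from this repo's demo

cfg = get_cfg()
add_deeplab_config(cfg)       # MaskFormer configs extend the DeepLab defaults
add_mask_former_config(cfg)   # register MaskFormer-specific config keys
cfg.merge_from_file("configs/ade20k-150/maskformer_R50_bs16_160k.yaml")  # example config
cfg.MODEL.WEIGHTS = "path/to/model_checkpoint.pkl"  # placeholder checkpoint path
cfg.freeze()

predictor = DefaultPredictor(cfg)
image = cv2.imread("path/to/image.jpg")  # BGR image, as Detectron2 expects
outputs = predictor(image)

# For semantic segmentation, outputs["sem_seg"] is a (num_classes, H, W)
# tensor of per-class scores; argmax over classes gives a label map.
label_map = outputs["sem_seg"].argmax(dim=0)
```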
## Model Zoo and Baselines
We provide a large set of baseline results and trained models, available for download in the MaskFormer Model Zoo.
## License
The majority of MaskFormer is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
However, portions of the project are available under separate license terms: Swin-Transformer-Semantic-Segmentation is licensed under the MIT license.
## <a name="CitingMaskFormer"></a>Citing MaskFormer
If you use MaskFormer in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.
```BibTeX
@inproceedings{cheng2021maskformer,
  title={Per-Pixel Classification is Not All You Need for Semantic Segmentation},
  author={Bowen Cheng and Alexander G. Schwing and Alexander Kirillov},
  booktitle={NeurIPS},
  year={2021}
}
```