CellCentroidFormer
Hybrid CNN-ViT model for cell detection in biomedical microscopy images.
Model architecture
Comparisons of ViTs and CNNs in computer vision applications show that their receptive fields are fundamentally different (Raghu et al., 2021). The receptive fields of ViTs capture both local and global information in earlier as well as later layers. The receptive fields of CNNs, on the other hand, initially capture local information and gradually grow to capture global information in later layers. Therefore, we use MobileViT blocks (Mehta and Rastegari, 2022) in the neck part of our proposed model so that it captures more global information than a fully convolutional neck part would. We represent cells by their centroid, their width, and their height. Our model contains two fully convolutional heads to predict these cell properties. The first head predicts a heatmap for cell centroids, and the second head predicts the cell dimensions (width and height) at the position of the corresponding cell centroid.
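To make the centroid representation concrete, the following is a minimal NumPy sketch of how training targets for the two heads could be built and decoded again. It is an illustration under stated assumptions, not the code in this repo: the function names `make_targets` and `decode`, the `(cx, cy, w, h)` input format, the Gaussian blob with `sigma=2.0`, and the peak threshold are all hypothetical choices.

```python
import numpy as np

def make_targets(cells, height, width, sigma=2.0):
    """Build a centroid heatmap and a dimension map (assumed encoding).

    cells: iterable of (cx, cy, w, h) in pixel coordinates (hypothetical format).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    heatmap = np.zeros((height, width), dtype=np.float64)
    dims = np.zeros((height, width, 2), dtype=np.float64)
    for cx, cy, w, h in cells:
        # One Gaussian blob per cell, peaking at 1.0 on the centroid.
        blob = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
        heatmap = np.maximum(heatmap, blob)
        # Width and height are stored only at the centroid position.
        dims[int(round(cy)), int(round(cx))] = (w, h)
    return heatmap, dims

def decode(heatmap, dims, threshold=0.5):
    """Read cells back from 3x3 local maxima of the heatmap."""
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    neighborhood_max = np.max(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    peaks = (heatmap >= neighborhood_max) & (heatmap > threshold)
    return [
        (int(x), int(y), float(dims[y, x, 0]), float(dims[y, x, 1]))
        for y, x in zip(*np.nonzero(peaks))
    ]

cells = [(10, 12, 6, 8), (40, 30, 5, 5)]
heatmap, dims = make_targets(cells, height=64, width=64)
detections = decode(heatmap, dims)
```

In this sketch the first array plays the role of the centroid head's output and the second that of the dimension head's output; at inference time, the same `decode` step would be applied to the predicted maps.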
Self-supervised pre-training: Pseudo-colorize masked cells
(a) Pseudo-colorization of fluorescence microscopy images and the corresponding colormaps. (b) Masking schemes and masked fluorescence microscopy images. MAE (He et al., 2021) masks cover 75% of images, whereas our proposed padded masks contain smaller patches and cover 33%. Image areas masked by our padded masking scheme are highlighted in white here to enhance their visibility. During pre-training, these areas are set to zero. (c) Proposed pre-training objective: Pseudo-colorize masked cells.
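As a rough illustration of the padded masking idea in (b), the sketch below masks small square patches that are each surrounded by an unmasked margin and sets the masked pixels to zero. The grid layout, the function names `padded_mask` and `apply_mask`, and the values `patch=8`, `pad=2`, and `ratio=0.75` are assumptions picked so that the coverage lands near the 33% quoted above; they are not taken from the repo or the paper.

```python
import numpy as np

def padded_mask(height, width, patch=8, pad=2, ratio=0.75, seed=0):
    """Return a boolean mask (True = masked) of padded square patches.

    The image is divided into grid cells of size patch + 2 * pad. A random
    subset of cells is chosen, and only the inner patch of each is masked.
    """
    rng = np.random.default_rng(seed)
    cell = patch + 2 * pad
    rows, cols = height // cell, width // cell
    n_cells = rows * cols
    n_masked = int(round(ratio * n_cells))
    chosen = rng.choice(n_cells, size=n_masked, replace=False)
    mask = np.zeros((height, width), dtype=bool)
    for idx in chosen:
        r, c = divmod(int(idx), cols)
        y0, x0 = r * cell + pad, c * cell + pad
        mask[y0:y0 + patch, x0:x0 + patch] = True
    return mask

def apply_mask(image, mask):
    """Set masked pixels to zero, as done during pre-training."""
    out = image.copy()
    out[mask] = 0
    return out

image = np.ones((96, 96, 3), dtype=np.float32)
mask = padded_mask(96, 96)
masked_image = apply_mask(image, mask)
coverage = mask.mean()  # fraction of masked pixels
```

With these illustrative settings, 48 of 64 grid cells are selected and 64 of the 144 pixels in each selected cell are masked, so the coverage is 48 · 64 / 96² = 1/3.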
Conference Paper
CellCentroidFormer: Combining Self-attention and Convolution for Cell Detection, Wagner, Royden and Rohr, Karl, MIUA 2022; arXiv (arXiv:2206.00338)
Citation
@inproceedings{wagner2022cellcentroidformer,
title={CellCentroidFormer: Combining Self-attention and Convolution for Cell Detection},
author={Royden Wagner and Karl Rohr},
booktitle={Medical Image Understanding and Analysis},
year={2022}
}
Acknowledgements
The subclass implementation of the MobileViT block (Mehta and Rastegari, 2022) in this repo is based on the functional implementation by Sayak Paul.
The @compact_get_layers class decorator is inspired by the get method by Danijar Hafner and the nn.compact decorator in flax.
The normalized temperature-scaled cross-entropy loss for SimCLR (Chen et al., 2020) is based on the implementation by András Béres.
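For readers unfamiliar with this loss, here is a minimal NumPy sketch of the normalized temperature-scaled cross-entropy (NT-Xent) loss as formulated in the SimCLR paper. It is not the implementation used in this repo, which may differ in details such as how negatives are selected; the function name `nt_xent_loss` and the default `temperature=0.1` are assumptions for illustration.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent loss for paired embeddings, where z1[i] and z2[i] are two views."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0).astype(np.float64)
    z /= np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity via dot product
    logits = z @ z.T / temperature
    np.fill_diagonal(logits, -np.inf)  # exclude self-similarity
    # The positive for sample i is its other view: i + n, or i - n.
    targets = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), targets].mean())

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 16))
views = anchors + 0.01 * rng.normal(size=(8, 16))
loss = nt_xent_loss(anchors, views)
```

Embeddings of two views of the same image should produce a lower loss than unrelated embeddings, which is what contrastive pre-training optimizes for.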