LGRNet: Local-Global Reciprocal Network for Video Polyp Segmentation

Paper | BibTeX

Huihui Xu, Yijun Yang (📈), Angelica Aviles-Rivero, Guang Yang, Jing Qin, and Lei Zhu

📈: the UFUV dataset from the original paper may be made openly accessible in the future; please email the authors for permission.

This is the official implementation of LGRNet (MICCAI'24 Early Accept), which combines local Cyclic Neighborhood Propagation (CNP) with global Hilbert Selective Scan. Together with the notion of Frame Bottleneck Queries, LGRNet aggregates the local-global temporal context both efficiently and effectively, achieving state-of-the-art results on the public Video Polyp Segmentation (VPS) benchmark.

<div align="justify">As an example, in ultrasound video a single frame is too noisy and insufficient for accurate lesion diagnosis. In practice, doctors check neighboring frames (local) and collect all visual clues (global) across the video to predict possible lesion regions and filter out irrelevant surrounding tissues.</div>
<br/>
<div align="center" style="padding: 0 100pt"> <img src="assets/images/pipeline.png"> </div>
<br/>
<div align="justify">In CNP, each token takes the neighborhood tokens (defined by a kernel) in the cyclic frame as attention keys, aggregating the local (cyclic) temporal information into each token. In Hilbert Selective Scan, a set of frame bottleneck queries first aggregates the spatial information of each frame; a selective scan along a Hilbert curve then efficiently parses the global temporal context over these bottleneck queries, which is propagated back to the feature maps by a Distribute layer. Based on Mask2Former, the decoder outputs a set of mask predictions with corresponding confidence scores, which also facilitates comprehensive diagnosis.</div>
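The Hilbert curve ordering behind the selective scan can be illustrated with the standard index-to-coordinate conversion (a minimal sketch assuming a power-of-two grid; not the repository's implementation):

```python
def hilbert_order(n):
    """List the (x, y) cells of an n x n grid in Hilbert-curve order (n a power of two)."""
    def d2xy(d):
        x = y = 0
        t = d
        s = 1
        while s < n:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:          # rotate/reflect the current quadrant
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        return x, y
    return [d2xy(d) for d in range(n * n)]
```

Consecutive cells in this order are always grid-neighbors, so scanning tokens along it keeps spatially close positions close in the resulting 1-D sequence.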

Items

  1. Installation: Please refer to INSTALL.md for more details.

  2. Data preparation: Please refer to DATA.md for more details.

  3. Training:

Set PORT_NUM (the DDP master port) and make sure $CURRENT_TASK is 'VIS':

```shell
export CURRENT_TASK=VIS
export MASTER_ADDR=127.0.0.1
export MASTER_PORT=PORT_NUM
```
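PORT_NUM must be an unused TCP port on the machine; if you are unsure which ports are free, a small helper like this (not part of the repo) can pick one:

```python
import socket

def free_port():
    """Bind to port 0 so the OS assigns an unused TCP port, suitable for MASTER_PORT."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]
```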

Make sure $PT_PATH and $DATASET_PATH were set correctly during installation and data preparation.

Training on SUN-SEG uses two RTX 4090 (24GB) GPUs:

```shell
CUDA_VISIBLE_DEVICES=0,1 TORCH_NUM_WORKERS=8 python main.py --config_file output/VIS/sunseg/pvt/pvt.py --trainer_mode train_attmpt
```
  4. Logs, checkpoints, and predictions:
| Backbone | Dataset | Dice | mIoU | log | ckpt | predictions |
|---|---|---|---|---|---|---|
| PVTv2-B2 | SUN-SEG-Train | -- | -- | log | ckpt | -- |
| PVTv2-B2 | SUN-SEG-Hard-Testing | 0.876 | 0.805 | log | ckpt | mask predictions |
| PVTv2-B2 | SUN-SEG-Easy-Testing | 0.875 | 0.810 | log | ckpt | mask predictions |
| PVTv2-B2 | SUN-SEG-Hard-Unseen-Testing | 0.865 | 0.792 | log | ckpt | mask predictions |
| PVTv2-B2 | SUN-SEG-Easy-Unseen-Testing | 0.853 | 0.783 | log | ckpt | mask predictions |
| Res2Net-50 | SUN-SEG-Hard-Testing | 0.841 | 0.765 | log | -- | -- |
| Res2Net-50 | SUN-SEG-Easy-Testing | 0.843 | 0.774 | log | -- | -- |
| PVTv2-B2 | CVC612V | 0.933 | 0.877 | log | -- | -- |
| PVTv2-B2 | CVC300TV | 0.916 | 0.852 | log | -- | -- |
| PVTv2-B2 | CVC612T | 0.875 | 0.814 | log | -- | -- |
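For reference, the Dice and mIoU columns follow the standard overlap definitions; a minimal per-frame sketch on flattened binary masks (the repository's own evaluation code may differ in averaging details):

```python
def dice_and_iou(pred, gt, eps=1e-7):
    """Dice and IoU between two flat binary masks given as sequences of 0/1."""
    inter = sum(p & g for p, g in zip(pred, gt))
    ps, gs = sum(pred), sum(gt)
    union = ps + gs - inter
    dice = (2 * inter + eps) / (ps + gs + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou
```

For example, `dice_and_iou([1, 1, 0, 0], [1, 0, 0, 0])` gives Dice ≈ 0.667 and IoU = 0.5.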
  5. Evaluation: evaluate on SUN-SEG-Easy and SUN-SEG-Hard using one RTX 4090 (24GB) GPU (set ckpt_path to the absolute checkpoint path):
```shell
CUDA_VISIBLE_DEVICES=0 TORCH_NUM_WORKERS=8 python main.py --config_file output/VIS/sunseg/pvt/pvt.py --trainer_mode eval --eval_path ckpt_path
```

Citing

```bibtex
@article{xu2024lgrnet,
  title={LGRNet: Local-Global Reciprocal Network for Uterine Fibroid Segmentation in Ultrasound Videos},
  author={Xu, Huihui and Yang, Yijun and Aviles-Rivero, Angelica I and Yang, Guang and Qin, Jing and Zhu, Lei},
  journal={arXiv preprint arXiv:2407.05703},
  year={2024}
}
```

Acknowledgments