<p align="right">English | <a href="./README_CN.md">简体中文</a></p> <div align="center"> <img src="assets/Logo02.PNG" width="100%" height="100%"> <h3 align="center"><strong>Segment Any Event Streams via Weighted Adaptation of Pivotal Tokens [CVPR '24]</strong></h3> <p align="center"> <a>Zhiwen Chen</a><sup>1</sup> <a>Zhiyu Zhu</a><sup>2</sup> <a>Yifan Zhang</a><sup>2</sup> <a>Junhui Hou</a><sup>2</sup> <a>Guangming Shi</a><sup>1</sup> <a>Jinjian Wu</a><sup>1</sup> <br> <sup>1</sup>Xidian University <sup>2</sup>City University of Hong Kong </div>
Official code for *Segment Any Event Streams via Weighted Adaptation of Pivotal Tokens* [Paper]. This paper delves into the nuanced challenge of tailoring Segment Anything Models (SAMs) to event data, with the overarching objective of attaining robust and universal object segmentation in the event-centric domain.
## Getting Started

### Installation

Clone the repository locally:

```bash
git clone https://github.com/happychenpipi/EventSAM.git
```
Create and activate a conda environment and install the required packages:

```bash
conda create -n eventsam python=3.8
conda activate eventsam
bash install_eventsam.sh
```
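As a quick sanity check (assuming install_eventsam.sh pulls in PyTorch and the segment-anything package, and that a SAM checkpoint has been downloaded as described under Training), you can try loading the SAM backbone that the pipeline builds on:

```python
# Sanity check: load the SAM backbone used as the teacher network.
# Assumes segment-anything and PyTorch are installed by install_eventsam.sh,
# and that sam_vit_b.pth sits in ./pretrained (see the Training section).
import torch
from segment_anything import sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="./pretrained/sam_vit_b.pth")
sam.to("cuda" if torch.cuda.is_available() else "cpu")
print("image encoder parameters:", sum(p.numel() for p in sam.image_encoder.parameters()))
```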
## Data Preparation

In this work, we collect a large-scale RGB-Event dataset for event-centric segmentation, named RGBE-SEG, built from the currently available pixel-level aligned datasets VisEvent and COESOT. To explore the zero-shot performance of our method, we also show segmentation results on the MVSEC, DDD17, and DSEC datasets. In addition, we provide the corresponding groundtruth masks and prediction results for comparison. Please download the data via the links below and put it in ./data.

- Datasets (code: 1234)
- Groundtruths (code: 1234)
- Predictions (code: 1234)
- OneDrive (password: 1234)
Format of All Datasets:
```
├── RGBE_SEG dataset
│   ├── Training Subset (472 sequences)
│   │   ├── dvSave-2021_09_01_06_59_10
│   │   │   ├── event        # raw event files: [N,4] rows of (x, y, t, p); see the sketch after this tree
│   │   │   ├── rgb_image    # RGB images, the input of the teacher network
│   │   │   ├── event_image  # binary event images, used for event visualization
│   │   │   └── voxel_image  # voxel-like event images, the input of the student network
│   │   └── ...
│   ├── Testing Subset For Normal Scenes (104 sequences)   # Easy, Medium, Hard
│   │   ├── dvSave-2021_07_30_11_04_12
│   │   │   ├── event
│   │   │   ├── rgb_image
│   │   │   ├── event_image
│   │   │   └── voxel_image
│   │   └── ...
│   └── Testing Subset For Degraded Scenes (28 sequences)  # Low Light, Over Exposure, Motion Blur
│       ├── video_0078
│       │   ├── event
│       │   ├── rgb_image
│       │   ├── event_image
│       │   └── voxel_image
│       └── ...
└── MVSEC_SEG / DDD17_SEG / DSEC_SEG datasets
    └── Testing Subset
        ├── seq_name
        │   ├── event
        │   ├── rgb_image
        │   ├── event_image
        │   └── voxel_image
        └── ...
```
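For orientation, here is one plausible way to rasterize the raw [N,4] event arrays into the binary event images and voxel-like images listed above. This is a minimal sketch: the number of time bins, the polarity encoding, and the normalization are illustrative assumptions, not the repo's exact preprocessing.

```python
# Minimal sketch: rasterize raw events into image-like tensors.
# Events are [N,4] rows of (x, y, t, p); bin count and normalization are
# illustrative assumptions, not the repo's exact preprocessing.
import numpy as np

def events_to_images(events: np.ndarray, height: int, width: int, bins: int = 3):
    x = events[:, 0].astype(np.int64)
    y = events[:, 1].astype(np.int64)
    t = events[:, 2].astype(np.float64)
    p = np.where(events[:, 3] > 0, 1.0, -1.0)  # map polarity to +/-1

    # Binary event image: a pixel is "on" wherever any event fired.
    event_image = np.zeros((height, width), dtype=np.uint8)
    event_image[y, x] = 255

    # Voxel-like image: accumulate signed polarity into `bins` time slices.
    voxel = np.zeros((bins, height, width), dtype=np.float32)
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    b = np.clip((t_norm * bins).astype(np.int64), 0, bins - 1)
    np.add.at(voxel, (b, y, x), p)
    return event_image, voxel
```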
Format of Groundtruth Masks or Prediction Masks:
```
├── RGBE_SEG dataset
│   └── Testing Subset For Normal Scenes (108 sequences)  # Easy, Medium, Hard
│       ├── dvSave-2021_07_30_11_04_12
│       │   └── **.png  # groundtruth masks / prediction masks
│       └── ...
└── MVSEC_SEG / DDD17_SEG / DSEC_SEG datasets
    └── Testing Subset
        ├── seq_name
        │   └── **.png  # groundtruth masks / prediction masks
        └── ...
```
## Training

First download a pre-trained SAM checkpoint (e.g., sam_vit_b.pth) and put it in ./pretrained. This model then serves as the teacher for RGB-Event knowledge distillation:

```bash
python ./event_encoder/train.py
```
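For intuition, the sketch below shows a generic token-level RGB-to-event distillation loss. The paper's contribution is the weighted adaptation of pivotal tokens; `token_weights` here is only a placeholder for that weighting, and the actual loss lives in ./event_encoder/train.py.

```python
# Generic sketch of token-level distillation (not the exact loss in train.py).
# `token_weights` is a placeholder for the paper's pivotal-token weighting.
import torch
import torch.nn.functional as F

def distill_loss(student_tokens: torch.Tensor,  # [B, N, C] from the event encoder
                 teacher_tokens: torch.Tensor,  # [B, N, C] from the frozen SAM encoder
                 token_weights: torch.Tensor):  # [B, N] per-token importance (assumed)
    # Cosine distance between matched tokens, weighted per token.
    per_token = 1.0 - F.cosine_similarity(student_tokens, teacher_tokens, dim=-1)  # [B, N]
    return (token_weights * per_token).sum() / token_weights.sum().clamp_min(1e-9)
```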
## Pre-trained Model

Download the pre-trained EventSAM model (e.g., rgbe_encoder.pth) and put it in ./checkpoints.

- EventSAM Model (code: 1234)
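To plug the distilled encoder into SAM manually, something along these lines may work; the layout of rgbe_encoder.pth is an assumption here, and ./evaluate/predict_mask.py is the authoritative loading path.

```python
# Hedged sketch: swap the distilled event encoder weights into SAM.
# The checkpoint layout of rgbe_encoder.pth is assumed; adjust the key
# handling to whatever ./evaluate/predict_mask.py actually expects.
import torch
from segment_anything import sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="./pretrained/sam_vit_b.pth")
state = torch.load("./checkpoints/rgbe_encoder.pth", map_location="cpu")
sam.image_encoder.load_state_dict(state, strict=False)  # strict=False: layout assumed
```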
## Evaluation

Predict the segmentation masks of event images:

```bash
python ./evaluate/predict_mask.py
```

Calculate metrics of the predicted masks:

```bash
python ./evaluate/calculate_metric.py
```
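For reference, a mask-overlap metric over paired groundtruth/prediction PNGs might look like the following; the directory pairing and binarization threshold are assumptions, and ./evaluate/calculate_metric.py defines the metrics actually reported.

```python
# Minimal mean-IoU over paired groundtruth/prediction mask PNGs.
# Pairing by filename and the 127 threshold are illustrative assumptions.
import numpy as np
from pathlib import Path
from PIL import Image

def mean_mask_iou(gt_dir: str, pred_dir: str) -> float:
    ious = []
    for gt_path in sorted(Path(gt_dir).glob("*.png")):
        pred_path = Path(pred_dir) / gt_path.name
        gt = np.array(Image.open(gt_path).convert("L")) > 127
        pred = np.array(Image.open(pred_path).convert("L")) > 127
        union = np.logical_or(gt, pred).sum()
        if union:  # skip empty-vs-empty pairs
            ious.append(np.logical_and(gt, pred).sum() / union)
    return float(np.mean(ious)) if ious else 0.0
```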
## Visualization

<div align="center"> <img src="assets/Visual.PNG" width="100%" height="100%"> </div>

## EventSAM & LLM

To further validate the strong zero-shot object recognition ability of our event-adapted SAM, we integrate it with the vision-language object segmentation framework LISA. This further unlocks the rich semantics inherent in SAM for interactive, universal object segmentation with event data. Some visualizations are shown below.
<div align="center"> <img src="assets/01.gif" width="50%" height="50%" /><img src="assets/02.gif" width="50%" height="50%" /> <img src="assets/03.gif" width="50%" height="50%" /><img src="assets/04.gif" width="50%" height="50%" /> <img src="assets/05.gif" width="50%" height="50%" /><img src="assets/06.gif" width="50%" height="50%" /> </div>

## Acknowledgments

Thanks to the VisEvent, COESOT, MVSEC, DDD17, and DSEC datasets, and to the SAM and LISA projects.
## Contact

Feedback and comments are welcome! Feel free to contact us via zhiwen.chen@stu.xidian.edu.cn or zhiyuzhu2-c@my.cityu.edu.hk.

## Citing EventSAM

If you use EventSAM in your research, please use the following BibTeX entry:
```bibtex
@InProceedings{Chen_2024_CVPR,
    author    = {Chen, Zhiwen and Zhu, Zhiyu and Zhang, Yifan and Hou, Junhui and Shi, Guangming and Wu, Jinjian},
    title     = {Segment Any Event Streams via Weighted Adaptation of Pivotal Tokens},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {3890-3900}
}
```