<div align="center">

# [NeurIPS 2024] Hawk: Learning to Understand Open-World Video Anomalies

This is the official repository for Hawk.

Jiaqi Tang^, Hao Lu^, Ruizheng Wu, Xiaogang Xu, Ke Ma, Cheng Fang,
Bin Guo, Jiangbo Lu, Qifeng Chen and Ying-Cong Chen*

^: Equal contribution. *: Corresponding Author.

<img src="figs/icon.png" alt="Have eyes like a HAWK!" width="80">

</div>

## Motivation - Have eyes like a Hawk!
- Current VAD systems are often limited by their superficial semantic understanding of scenes and minimal user interaction.
- Additionally, the prevalent data scarcity in existing datasets restricts their applicability in open-world scenarios.
<div align="center"> <img src="figs/motivation1.svg" alt="Hawk"> </div>
## Updates
- Sept 26, 2024 - Hawk is accepted by NeurIPS 2024.
- July 29, 2024 - We release the dataset of Hawk. Check this Google Cloud link to DOWNLOAD it.
## Getting Started
<!-- 1. [Installation](#installation) 2. [Dataset](#dataset) 3. [Configuration](#configuration) 4. [Testing](#Testing) 5. [Training](#Training) -->

## Installation
<!-- - *Python >= 3.8.2* - *PyTorch >= 1.8.1* - *Install [Polanalyser](https://github.com/elerac/polanalyser) for processing polarization image* ``` pip install git+https://github.com/elerac/polanalyser ``` - *Install other dependencies by* ``` pip install -r requirements.txt ``` -->

## Dataset Preparation
- DOWNLOAD all video datasets from their original sources.
- Google Drive link to DOWNLOAD our annotations.
- Data structure: each folder contains one annotation file (e.g., CUHK Avenue, DoTA, etc.). The `All_Mix` directory contains all of the datasets for training and testing.
- The dataset is organized as follows:
```
data
├── All_Mix
│   ├── all_videos_all.json
│   ├── all_videos_test.json
│   └── all_videos_train.json
├── CUHK_Avenue
│   └── Avenue.json
├── DoTA
│   └── DoTA.json
├── Ped1
│   └── ...
├── ...
└── UCF_Crime
    └── ...
```
Note: the data path should be redefined to point to your local directory.
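After downloading, a quick way to confirm that your paths are set correctly is to load one of the `All_Mix` annotation files and print an example entry. The snippet below is only a minimal sketch: `DATA_ROOT` is a placeholder for your local directory, and no assumption is made about the annotation schema beyond it being valid JSON.

```python
import json
from pathlib import Path

# Placeholder local root -- redefine this to wherever you placed the videos and annotations.
DATA_ROOT = Path("data")

# File name taken from the directory tree above; we only peek at its contents
# instead of assuming specific keys.
ann_path = DATA_ROOT / "All_Mix" / "all_videos_train.json"

with ann_path.open("r", encoding="utf-8") as f:
    annotations = json.load(f)

print(f"Loaded {len(annotations)} entries from {ann_path}")

# Show one example entry so you can inspect the fields before writing a loader.
example = annotations[0] if isinstance(annotations, list) else next(iter(annotations.items()))
print("Example entry:", example)
```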
## Pretrained Model
<!-- - Google Drive Link for downloading our [Pretrained Model](https://drive.google.com/file/d/13Cn7tX5bFBxsYZG1Haw5VcqhSxWnNzMW/view?usp=sharing) in K-Ford. -->

## Configuration
<!-- - The configuration files for [`testing`](FilmRemoval/codes/options/test/test.yml) and [`training`](FilmRemoval/codes/options/train/train.yml). - The Test_K_ford option specifies the number of folds for K-fold cross-validation during testing. The data root option specifies the root directory for the dataset, which is set to Dataset. Other configuration settings include learning rate schemes, loss functions, and logger options. ``` datasets: train: name: Reconstruction mode: LQGT_condition Test_K_ford: K10 # remove from training dataroot: /remote-home/share/jiaqi2/Dataset dataroot_ratio: ./ use_shuffle: true n_workers: 0 batch_size: 1 GT_size: 0 use_flip: true use_rot: true condition: image val: name: Reconstruction mode: LQGT_condition_Val Test_K_ford: K10 # for testing dataroot: /remote-home/share/jiaqi2/Dataset dataroot_ratio: ./ condition: image ``` -->

## Testing
<!-- - Modify `dataroot`, `Test_K_ford` and `pretrain_model_G` in [`testing`](FilmRemoval/codes/options/train/test.yml) configuration, then run ``` python test.py -opt ./codes/options/test/test.yml ``` - The test results will be saved to `./results/testset_name`, including `Restored Image` and `Prior`. -->

## Training
<!-- - Modify `dataroot` and `Test_K_ford` in [`training`](FilmRemoval/codes/options/train/train.yml) configuration, then run ``` python train.py -opt ./codes/options/train/train.yml ``` - The logs, models and training states will be saved to `./experiments/name`. You can also use `tensorboard` for monitoring for the `./tb_logger/name`. - Restart Training (To add checkpoint in [`training`](FilmRemoval/codes/options/train/train.yml) configuration) ``` path: root: ./ pretrain_model_G: .../experiments/K1/models/XX.pth strict_load: false resume_state: .../experiments/K1/training_state/XX.state ``` -->

## Performance
<!-- Compared with other baselines, our model achieves state-of-the-art performance: > β **[Table 1] Quantitative evaluation in image reconstruction with 10-fold cross-validation.** > | Methods | PSNR | SSIM | > |---------|------|------| > | SHIQ| 21.58 | 0.7499 | > | Polar-HR| 22.19 | 0.7176 | > | Uformer| 31.68 | 0.9426 | > | Restormer| 34.32 | 0.9731 | > | Ours| 36.48 | 0.9824 | > β **[Figure 1] Qualitative Evaluation in image reconstruction.** > ![](fig/image-1.png) > β **[Figure 2-3] Qualitative Evaluation in Industrial Environment. (QR Reading & Text OCR)** > ![](fig/image-2.png) -->

## Citations
The following is a BibTeX reference:
```
@inproceedings{atang2024hawk,
  title     = {Hawk: Learning to Understand Open-World Video Anomalies},
  author    = {Tang, Jiaqi and Lu, Hao and Wu, Ruizheng and Xu, Xiaogang and Ma, Ke and Fang, Cheng and Guo, Bin and Lu, Jiangbo and Chen, Qifeng and Chen, Ying-Cong},
  year      = {2024},
  booktitle = {Neural Information Processing Systems (NeurIPS)}
}
```
## Connect with Us

If you have any questions, please feel free to email jtang092@connect.hkust-gz.edu.cn.