<p align="center"> <h1 align="center"><strong>[ECCV2024] Watching it in Dark: A Target-aware Representation Learning Framework for High-Level Vision Tasks in Low Illumination</strong></h1> <p align="center"> Yunan Li  Yihao Zhang  Shoude Li  Long Tian  Dou Quan  Chaoneng Li  Qiguang Miao  <br> <em>Xidian University; Xi'an Key Laboratory of Big Data and Intelligent Vision</em> <br> <br> <em><a href="https://github.com/ZhangYh994/WiiD" style="color:blue;">Code</a> | <a href="https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/09518.pdf" style="color:blue;">Paper</a> | <a href="https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/09518-supp.pdf" style="color:blue;">Supp</a></em> <br> </p> </p> <div id="top" align="center"> This is the official implementaion of paper <b><i>Watching it in Dark: A Target-aware Representation Learning Framework for High-Level Vision Tasks in Low Illumination</i></b>, which is accepted in <b><i>ECCV 2024</i></b>. In this paper, we propose a target-aware representation learning framework designed to improve high-level task performance in low-illumination environments. We achieve a bi-directional domain alignment from both image appearance and semantic features to bridge data across different illumination conditions. To concentrate more effectively on the target, we design a target highlighting strategy, incorporated with the saliency mechanism and Temporal Gaussian Mixture Model to emphasize the location and movement of task-relevant targets. We also design a mask token-based representation learning scheme to learn a more robust target-aware feature. Our framework ensures compact and effective feature representation for high-level vision tasks in low-lit settings. Extensive experiments conducted on CODaN, ExDark, and ARID datasets validate the effectiveness of our approach for a variety of image and video-based tasks, including classification, detection, and action recognition. </div> <div style="text-align: center;"> <img src="assets/principle.png" alt="Dialogue_Teaser" width=100% > </div>👀TODO
- First Release.
- Release Code of Image Classification.
  - ResNet18 on CODaN
  - ResNet50 on COCO&ExDark
- Release Code of Object Detection.
- Release Code of Action Recognition.
🌏 Pipeline of WiiD
<div style="text-align: center;"> <img src="assets/pipeline.png" width="100%"> </div>📚 Dataset
We conduct experiments on the CODaN, ExDark (together with VOC/COCO), and ARID datasets. See each task's Dataset Preparation section below for download and preprocessing details.
🐒 Model Zoo
| Object Detection |
|---|
| Pre-Trained YOLOv5m |
| CUT Darken Model |

| Image Classification |
|---|
| Our Pre-trained Model |
| ResNet-18 Baseline |
| CUT Darken Model |

| Action Recognition |
|---|
| ... |
💻 Code
🕴️Object Detection
Dataset Preparation
We utilized the VOCO dataset (a combination of parts of the VOC and COCO datasets; the exact composition can be found in the Supplementary, and we will also upload our training data in a few days). You can apply the Zero-DCE method to enhance the low-light data in the `test_night` folder. For darkening the data in the `train` folder, you may either train the CUT model yourself on unpaired normal-light and low-light data to darken the normal-light images, or directly use our pre-trained model parameters.
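If you only want a quick stand-in for the darkening step while setting up the pipeline (for example, before training or downloading the CUT darkening model), a simple gamma-and-brightness adjustment over the normal-light images can serve as a rough approximation. The sketch below is illustrative only: the folder paths are placeholders, and it does not reproduce the CUT-based darkening used in the paper.

```python
# Illustrative stand-in for the darkening step (NOT the CUT model from the paper):
# apply a gamma curve plus global dimming to every normal-light image in a folder.
import os
import numpy as np
from PIL import Image

def darken_folder(src_dir, dst_dir, gamma=2.5, scale=0.4):
    """Write darkened copies of every image in src_dir to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        if not name.lower().endswith((".jpg", ".jpeg", ".png")):
            continue
        img = np.asarray(Image.open(os.path.join(src_dir, name)).convert("RGB"),
                         dtype=np.float32) / 255.0
        dark = np.clip((img ** gamma) * scale, 0.0, 1.0)  # gamma + global dimming
        Image.fromarray((dark * 255).astype(np.uint8)).save(os.path.join(dst_dir, name))

if __name__ == "__main__":
    # Placeholder paths -- point these at your own normal-light training images.
    darken_folder("data/train", "data/train_darkened")
```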
Model Preparation
We use the pre-trained YOLOv5m model. You can download it and place it directly in the `./weights` folder.
Training
You can run the following command to train the model:
`python train_byol.py --weights weights/yolov5m.pt --cfg models/yolov5m.yaml --data data/Exdark_night.yaml --batch-size 8 --epochs 30 --imgsz 608 --hyp data/hyps/hyp.scratch-high.yaml --back_ratio 0.3 --byol_weight 0.1`
`--back_ratio` specifies the background occlusion ratio, and `--byol_weight` specifies the weight of the contrastive learning loss.
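For intuition, `--byol_weight` scales an auxiliary BYOL-style contrastive term that is added to the standard detection loss. The snippet below is only a schematic of that weighting, using placeholder tensors and function names; it is not the repository's actual training code.

```python
# Schematic of weighting a BYOL-style auxiliary loss against the detection loss
# (placeholder functions and tensors; not the repository's exact implementation).
import torch
import torch.nn.functional as F

def byol_loss(online_pred, target_proj):
    """Negative cosine similarity between online predictions and target projections."""
    online_pred = F.normalize(online_pred, dim=-1)
    target_proj = F.normalize(target_proj, dim=-1)
    return 2 - 2 * (online_pred * target_proj).sum(dim=-1).mean()

def total_loss(det_loss, online_pred, target_proj, byol_weight=0.1):
    # det_loss: the standard YOLO detection loss, computed elsewhere.
    return det_loss + byol_weight * byol_loss(online_pred, target_proj)

# Toy usage with random features.
p = torch.randn(8, 256)   # online-branch predictions
z = torch.randn(8, 256)   # target-branch projections (detached in practice)
print(total_loss(torch.tensor(1.5), p, z, byol_weight=0.1))
```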
Evaluation
You can run the following command to validate the model:
`python val.py --data data/Exdark_night.yaml --batch-size 8 --weights runs/train/exp1/weights/best.pt --imgsz 608 --task test --verbose`
🐱Image Classification
ResNet18 on CODaN
Dataset Preparation
We utilized the CODaN dataset. You can apply the Zero-DCE method to enhance the low-light data in the `test_night` folder. For darkening the data in the `train` folder, you may either train the CUT model yourself on unpaired normal-light and low-light data to darken the normal-light training data, or directly use our pre-trained model parameters.
Alternatively, you can download our preprocessed CODaN dataset directly and put it under `./classification/resnet18/data/`. In this version, the `test_night_zdce` folder contains low-light data enhanced with Zero-DCE, and the `train_day2night` folder contains the darkened training data.
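After downloading, a quick way to confirm the dataset landed in the expected place is to check for the sub-folders mentioned above. This is just a convenience check; the `train` and `test_night` names are taken from the standard CODaN layout described above.

```python
# Sanity-check that the preprocessed CODaN folders referenced above are present.
import os

data_root = "./classification/resnet18/data"  # path from the instructions above
expected = ["train", "train_day2night", "test_night", "test_night_zdce"]

for folder in expected:
    path = os.path.join(data_root, folder)
    status = "ok" if os.path.isdir(path) else "MISSING"
    print(f"{path}: {status}")
```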
Model Preparation
We use a pre-trained ResNet-18 as the baseline, which you can download and place in `./classification/resnet18/checkpoints/baseline_resnet`.
Training
Run `train.sh` in `./classification/resnet18` or use the following command to start training:
`python train.py --use_BYOL --checkpoint 'checkpoints/baseline_resnet/model_best.pt' --experiment 'your_own_folder'`
Use `--checkpoint` to specify the pre-trained model and `--experiment` to set the storage location for model checkpoints and logs.
Our training log is provided in `./classification/resnet18/checkpoints/our_train_log.txt`, which you can use as a reference.
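If you want to inspect the baseline checkpoint outside of `train.py`, it can typically be loaded into a standard torchvision ResNet-18 as sketched below. The checkpoint format (plain state dict vs. a wrapper dict) and the 10-class head are assumptions, so adjust them to match the actual file.

```python
# Illustrative loading of the ResNet-18 baseline checkpoint referenced by --checkpoint.
# The checkpoint layout and the number of classes are assumptions; adjust as needed.
import torch
from torchvision.models import resnet18

model = resnet18(num_classes=10)  # CODaN uses 10 classes; change if your setup differs
ckpt = torch.load("checkpoints/baseline_resnet/model_best.pt", map_location="cpu")
state = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
missing, unexpected = model.load_state_dict(state, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)
```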
Evaluation
Run `test.sh` in `./classification/resnet18` to evaluate the model's performance or to validate our pre-trained model:
`python test.py --checkpoint 'checkpoints/train/model_best.pt'`
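For a standalone sanity check of a trained checkpoint outside of `test.py`, a standard ImageFolder evaluation over the enhanced night split would look roughly like the following. The data path, input size, and normalization constants are assumptions; `test.py` remains the reference evaluation script.

```python
# Minimal standalone accuracy check on the Zero-DCE-enhanced night split.
# Paths, input size, and normalization are assumptions, not the repo's exact settings.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import resnet18

tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
dataset = datasets.ImageFolder("classification/resnet18/data/test_night_zdce", transform=tf)
loader = DataLoader(dataset, batch_size=64, shuffle=False)

model = resnet18(num_classes=len(dataset.classes))
ckpt = torch.load("checkpoints/train/model_best.pt", map_location="cpu")
state = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
model.load_state_dict(state, strict=False)
model.eval()

correct = total = 0
with torch.no_grad():
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
print(f"accuracy: {correct / total:.4f}")
```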
ResNet50 on COCO&ExDark
coming soon
⛹️Action Recognition
coming soon
Citation
If our work is useful for your research, please consider citing:
@inproceedings{li2025watching,
title={Watching it in Dark: A Target-Aware Representation Learning Framework for High-Level Vision Tasks in Low Illumination},
author={Li, Yunan and Zhang, Yihao and Li, Shoude and Tian, Long and Quan, Dou and Li, Chaoneng and Miao, Qiguang},
booktitle={ECCV},
year={2024}
}
Acknowledgment
This work is heavily based on CIConv, YOLOv5, ARID and Similarity Min-Max. Thanks to all the authors for their great work.