ED-DCFNet: An Unsupervised Encoder-decoder Neural Model for Event-driven Feature Extraction and Object Tracking

<p align="center"> <img src="./figures/ED-DCFNet_architecture.png" width="600"> </p> <p align="center"> <img src="./figures/framework_design.png" width="600"> </p>

Title: ED-DCFNet: An Unsupervised Encoder-decoder Neural Model for Event-driven Feature Extraction and Object Tracking

Abstract: Neuromorphic cameras feature asynchronous event-based pixel-level processing and are particularly useful for object tracking in dynamic environments. Current approaches for feature extraction and optical flow with high-performing hybrid RGB-events vision systems require large computational models and supervised learning, which impose challenges for embedded vision and require annotated datasets. In this work, we propose ED-DCFNet, a small and efficient (< 72k) unsupervised multi-domain learning framework, which extracts events-frames shared features without requiring annotations, with comparable performance. Furthermore, we introduce an open-sourced event and frame-based dataset that captures indoor scenes with various lighting and motion-type conditions in realistic scenarios, which can be used for model building and evaluation. The dataset is available at https://github.com/NBELab/UnsupervisedTracking.

Authors: Raz Ramon, Hadar Cohen-Duwek, Elishai Ezra Tsur

Publication: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 2191-2199 (link)

Table of Contents

  1. Requirements
  2. Usage
  3. Datasets
  4. Citation
  5. License

Requirements

Usage

Tracking

  1. Download the dataset from the Datasets section below, and follow the README in the data folder.

  2. Navigate to the 'track' directory in the repository:

cd ./track

  3. Run 'track_dvs_dataset.py':

python track_dvs_dataset.py --name XXXX

The test script accepts several command-line arguments:

usage: track_dvs_dataset.py [-h] --name NAME [--weights WEIGHTS] [--data-path DATA_PATH] [--annotation-path ANNOTATION_PATH] [--data-info-path DATA_INFO_PATH] [--num-bins NUM_BINS]
                            [--track-events] [--no-track-events] [--track-frames] [--no-track-frames] [--track-combined] [--no-track-combined] [--eval-only] [-p P]

Run tracker on new dataset

optional arguments:
  -h, --help            show this help message and exit
  --name NAME           Test name
  --weights WEIGHTS     Path to model weights (default: ./models/model.pth.tar)
  --data-path DATA_PATH
                        Path to test data (default: ./data/)
  --annotation-path ANNOTATION_PATH
                        Path to annotation data (default: ./data/annotations/)
  --data-info-path DATA_INFO_PATH
                        Path to data information (default: ./data)
  --num-bins NUM_BINS, -b NUM_BINS
                        Number of temporal bins, must be the same as the network (default: 5)
  --track-events, -e    Track on events (default: True)
  --no-track-events
  --track-frames, -f    Track on frames (default: True)
  --no-track-frames
  --track-combined, -c  Track on events and frames combined (default: True)
  --no-track-combined
  --eval-only           Evaluate on previous results (default: False)
  -p P                  p (as in the paper), for reduction of frame net output (default: None)

The scripts are preconfigured to run tracking on the new dataset.
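For example, assuming the default data layout under ./data/, a run that tracks on events only (with the frame and combined trackers disabled) could look like the following; the test name events_only is a hypothetical placeholder:

python track_dvs_dataset.py --name events_only --no-track-frames --no-track-combined --num-bins 5

The saved results of such a run should then be re-scorable without re-running the tracker by passing --eval-only with the same test name:

python track_dvs_dataset.py --name events_only --eval-only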

Training

  1. Create the training dataset by listing the full path to each training file (a hypothetical sketch follows this list):

full/path/to/each/training/file.h5

  2. Update the config file.

  3. Run the training script:

python train/train.py

  4. The output weights are saved as:

{datetime}_{num_bins}.pth.tar
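As an illustration, such a list is assumed to contain one absolute path to an HDF5 recording per line; the paths below are hypothetical:

/home/user/training_data/sequence_001.h5
/home/user/training_data/sequence_002.h5
/home/user/training_data/sequence_003.h5

With the default of 5 temporal bins, the resulting weights file name ends in _5.pth.tar (following the {datetime}_{num_bins}.pth.tar pattern) and can be passed to the tracker through --weights.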

Datasets

Our dataset is available at https://github.com/NBELab/UnsupervisedTracking.

OTB2015 dataset: to use it, follow these steps:

  1. Download the OTB2015 dataset (link).
  2. Install ROS and follow the ESIM usage guide (link).
  3. Convert the dataset to events using ESIM.

The training dataset can be found here (link).

Citation

If you use our work in your research, please cite it using the following BibTeX entry:

@InProceedings{Ramon_2024_CVPR,
    author    = {Ramon, Raz and Cohen-Duwek, Hadar and Tsur, Elishai Ezra},
    title     = {ED-DCFNet: An Unsupervised Encoder-decoder Neural Model for Event-driven Feature Extraction and Object Tracking},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {2191-2199}
}

License

MIT License