# Flying Guide Dog
Official implementation of the paper "Flying Guide Dog: Walkable Path Discovery for the Visually Impaired Utilizing Drones and Transformer-based Semantic Segmentation".

## Updates
- [2021.09.17] Code for the flying guide dog prototype and the Pedestrian and Vehicle Traffic Lights (PVTL) dataset are released.
Overview
config/
: Configexperiment/
: Config yaml files for different experimentsdefault.py
: Default config
drone/
: Drone initialization and controlmodels/
: Deep Learning modelssegmentation/
: Segmentation modelstraffic_light_classification/
: Traffic light classification models
utils/
: Helper functions and scripts
## Drone
The drone used in this project is the DJI Tello.

<img src="assets/DJI-Tello.png" alt="DJI-Tello" style="zoom: 50%;" />

## Requirements
Python 3.7 or later with all `requirements.txt` dependencies installed, including `torch>=1.7`.

To install, run:

```shell
pip install -r requirements.txt
```
## SegFormer
- Install `mmcv-full`

  To use SegFormer, you need to install `mmcv-full==1.2.7`. For example, to install `mmcv-full==1.2.7` with CUDA 11 and PyTorch 1.7.0, use the following command:

  ```shell
  pip install mmcv-full==1.2.7 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7.0/index.html
  ```

  To install `mmcv-full` with a different version of PyTorch and CUDA, please see: MMCV Installation.
- Use the `SegFormer` submodule
  - Initialize the submodule(s):

    ```shell
    git submodule init
    ```
  - Run the update to pull down the files:

    ```shell
    git submodule update
    ```
- Install the dependencies of `SegFormer`:

  ```shell
  pip install -e models/segmentation/SegFormer/ --user
  ```
- Copy the config file to `SegFormer/`:

  ```shell
  cp models/segmentation/segformer.b0.768x768.mapillary.160k.py models/segmentation/SegFormer/local_configs/segformer/B0
  ```
## Models
Two types of models are used: street-view semantic segmentation and traffic light classification.

### Street-view semantic segmentation
We adopt SegFormer-B0 (trained on Mapillary Vistas for 160K iterations) for street-view semantic segmentation on each frame captured by the drone.
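The actual path-following logic lives in `drone/`; purely as an illustration of the idea, here is a minimal sketch of how a per-frame segmentation mask could be turned into a steering hint (the function name and the walkable-class id below are hypothetical, not the repository's API): find the horizontal centroid of the pixels predicted as walkable and steer toward it.

```python
import numpy as np

WALKABLE_CLASS = 1  # hypothetical label id for the walkable-path class

def steering_offset(mask: np.ndarray, walkable_id: int = WALKABLE_CLASS) -> float:
    """Return a value in [-1, 1]: negative means the walkable area lies
    left of the image center, positive means right, 0.0 means centered
    (or no walkable pixels found)."""
    ys, xs = np.nonzero(mask == walkable_id)
    if xs.size == 0:
        return 0.0
    center = (mask.shape[1] - 1) / 2.0
    return float((xs.mean() - center) / center)

# Toy 4x8 mask with walkable pixels concentrated on the right half.
mask = np.zeros((4, 8), dtype=np.int64)
mask[:, 5:] = WALKABLE_CLASS
print(steering_offset(mask))  # positive -> yaw right to re-center
```

A real controller would feed this offset into the drone's yaw/lateral velocity commands each frame.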
### Traffic light classification
We create a custom traffic light dataset named the Pedestrian and Vehicle Traffic Lights (PVTL) Dataset, using traffic light images cropped from Cityscapes, Mapillary Vistas, and PedestrianLights. The PVTL dataset can be downloaded from Google Drive.

It contains 5 classes: Others, Pedestrian-red, Pedestrian-green, Vehicle-red, and Vehicle-green. Each class contains about 300 images. The train-validation split is 3:1.
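A 3:1 per-class split can be reproduced in a few lines of plain Python (illustrative only; the file naming below is invented, and the dataset may ship with its own official split):

```python
import random

def split_3_to_1(items, seed=0):
    """Shuffle one class's image paths and split them 3:1 (train:val)."""
    items = sorted(items)        # deterministic starting order
    rng = random.Random(seed)    # fixed seed for a reproducible split
    rng.shuffle(items)
    cut = (3 * len(items)) // 4  # 75% train, 25% validation
    return items[:cut], items[cut:]

# Hypothetical file names for one class of ~300 images.
paths = [f"pedestrian_green/{i:03d}.png" for i in range(300)]
train, val = split_3_to_1(paths)
print(len(train), len(val))  # 225 75
```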
<img src="assets/traffic_light_eg.png" alt="Traffic light dataset examples" style="zoom: 67%;" />

We train two models on this dataset:
- ResNet-18: We fine-tune ResNet-18 from `torchvision.models`. After 25 epochs of training, the accuracy reaches around 90%.
- Simple CNN model: We build a custom simple CNN model (5 CONV + 3 FC). After 25 epochs of training, the accuracy reaches around 83%.
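The repository defines the actual architecture; purely as a sketch of the "5 CONV + 3 FC" shape described above (channel widths and the 64x64 input size are our assumptions; only the 5-class output comes from the dataset):

```python
import torch
import torch.nn as nn

class SimpleTrafficLightCNN(nn.Module):
    """5 conv blocks followed by 3 fully connected layers (widths illustrative)."""
    def __init__(self, num_classes: int = 5):
        super().__init__()
        chans = [3, 16, 32, 64, 64, 128]  # 5 conv layers
        blocks = []
        for cin, cout in zip(chans[:-1], chans[1:]):
            blocks += [nn.Conv2d(cin, cout, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
        self.features = nn.Sequential(*blocks)  # 64x64 -> 2x2 after 5 poolings
        self.classifier = nn.Sequential(        # 3 FC layers
            nn.Flatten(),
            nn.Linear(128 * 2 * 2, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 64), nn.ReLU(inplace=True),
            nn.Linear(64, num_classes))

    def forward(self, x):
        return self.classifier(self.features(x))

model = SimpleTrafficLightCNN()
logits = model(torch.randn(2, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 5])
```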
Trained weights
-
Create
weights
folder and its subfoldersegmentation
andtraffic_light_classification
mkdir -p weights/segmentation weights/traffic_light_classification
-
Download trained weights from Google Drive and put them into corresponding folders
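The downloaded files are then loaded through PyTorch's standard state-dict mechanism. A self-contained sketch of that round-trip (the `nn.Linear` stand-in and the temp-file path are placeholders, not the project's models or paths):

```python
import os
import tempfile

import torch
import torch.nn as nn

# Tiny stand-in model; in the real project this would be the segmentation
# network or the traffic-light classifier defined under models/.
model = nn.Linear(4, 5)

# Save and reload a state dict -- the same pattern used for the weight
# files placed under weights/segmentation/ etc.
path = os.path.join(tempfile.gettempdir(), "weights_demo.pth")
torch.save(model.state_dict(), path)

restored = nn.Linear(4, 5)
restored.load_state_dict(torch.load(path, map_location="cpu"))
restored.eval()  # inference mode: disables dropout, freezes batch-norm stats
```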
## Usage
- Choose a config file in `config/experiment/`, e.g. `config/experiment/segformer-b0_720x480_fp32.yaml`. You can also create a custom config file by adjusting the default config.
- Run

  ```shell
  python main.py --cfg <config_file>
  ```

  The model specified in the config file will be loaded. For example:

  ```shell
  python main.py --cfg config/experiment/segformer-b0_720x480_fp32.yaml
  ```
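Experiment yamls of this kind typically overlay a set of defaults (here, `config/default.py`). A minimal sketch of that merge pattern, with invented keys for illustration only (the repo's actual schema may differ):

```python
import yaml

# Hypothetical defaults; the project's real ones live in config/default.py.
DEFAULTS = {"model": "segformer-b0", "width": 720, "height": 480, "fp16": False}

def load_config(yaml_text: str) -> dict:
    """Overlay an experiment yaml's keys on top of the defaults."""
    overrides = yaml.safe_load(yaml_text) or {}
    cfg = dict(DEFAULTS)
    cfg.update(overrides)
    return cfg

cfg = load_config("fp16: true\nwidth: 960\n")
print(cfg["fp16"], cfg["width"], cfg["model"])  # True 960 segformer-b0
```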
- Turn on the DJI Tello and connect to the drone's Wi-Fi.
- Run

  ```shell
  python main.py --cfg <config_file> --ready
  ```

  For example:

  ```shell
  python main.py --cfg config/experiment/segformer-b0_720x480_fp32.yaml --ready
  ```

  After a few seconds of initialization, an OpenCV window will pop up. Press `T` to take off. While flying, the drone keeps discovering walkable areas, tries to keep itself in the middle of the walkable path, and follows along it. When a pedestrian traffic light appears in the drone's FOV, the drone reacts based on the classification of the pedestrian traffic light signal.

  Other keyboard controls:
  - `L`: Land temporarily. You can press `T` to take off again.
  - `Q`: Land and exit.
  - `Esc`: Emergency stop. All motors will stop immediately.
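The key bindings above can be sketched as a simple dispatch table (illustrative only; the actual handlers live in the repo's drone-control code, and the action names here are invented):

```python
# Map OpenCV-style key codes to drone actions (27 is Esc's ASCII code).
def handle_key(key: int) -> str:
    actions = {
        ord("t"): "takeoff",
        ord("l"): "land",          # temporary landing; 't' takes off again
        ord("q"): "land_and_exit",
        27: "emergency_stop",      # Esc: stop all motors immediately
    }
    return actions.get(key, "noop")

print(handle_key(ord("t")))  # takeoff
print(handle_key(27))        # emergency_stop
```

In the real loop, the key would come from `cv2.waitKey()` on the OpenCV window each frame.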
## Demo Video

## Citation
```bibtex
@article{tan2021flying,
  title={Flying Guide Dog: Walkable Path Discovery for the Visually Impaired Utilizing Drones and Transformer-based Semantic Segmentation},
  author={Tan, Haobin and Chen, Chang and Luo, Xinyu and Zhang, Jiaming and Seibold, Constantin and Yang, Kailun and Stiefelhagen, Rainer},
  journal={arXiv preprint arXiv:2108.07007},
  year={2021}
}
```

## Acknowledgements
Great thanks to these open-source repositories:
- DJI Tello drone Python interface: DJITelloPy