
# You Only Look at Once for Real-time and Generic Multi-Task

This repository (YOLOv8 multi-task) is the official PyTorch implementation of the paper "You Only Look at Once for Real-time and Generic Multi-Task".

by Jiayuan Wang, Q. M. Jonathan Wu<sup>:email:</sup>, and Ning Zhang

(<sup>:email:</sup>) corresponding author.

IEEE Transactions on Vehicular Technology


The illustration of A-YOLOM (YOLOv8-multi-task)

## Contributions

## Results

### Parameters and speed

| Model        | Parameters | FPS (bs=1) | FPS (bs=32) |
|--------------|------------|------------|-------------|
| YOLOP        | 7.9M       | 26.0       | 134.8       |
| HybridNet    | 12.83M     | 11.7       | 26.9        |
| YOLOv8n(det) | 3.16M      | 102        | 802.9       |
| YOLOv8n(seg) | 3.26M      | 82.55      | 610.49      |
| A-YOLOM(n)   | 4.43M      | 39.9       | 172.2       |
| A-YOLOM(s)   | 13.61M     | 39.7       | 96.2        |

### Traffic Object Detection Result

| Model        | Recall (%) | mAP50 (%) |
|--------------|------------|-----------|
| MultiNet     | 81.3       | 60.2      |
| DLT-Net      | 89.4       | 68.4      |
| Faster R-CNN | 81.2       | 64.9      |
| YOLOv5s      | 86.8       | 77.2      |
| YOLOv8n(det) | 82.2       | 75.1      |
| YOLOP        | 88.6       | 76.5      |
| A-YOLOM(n)   | 85.3       | 78.0      |
| A-YOLOM(s)   | 86.9       | 81.1      |

### Drivable Area Segmentation Result

| Model        | mIoU (%) |
|--------------|----------|
| MultiNet     | 71.6     |
| DLT-Net      | 72.1     |
| PSPNet       | 89.6     |
| YOLOv8n(seg) | 78.1     |
| YOLOP        | 91.6     |
| A-YOLOM(n)   | 90.5     |
| A-YOLOM(s)   | 91.0     |

### Lane Detection Result

| Model        | Accuracy (%) | IoU (%) |
|--------------|--------------|---------|
| ENet         | N/A          | 14.64   |
| SCNN         | N/A          | 15.84   |
| ENet-SAD     | N/A          | 16.02   |
| YOLOv8n(seg) | 80.5         | 22.9    |
| YOLOP        | 84.8         | 26.5    |
| A-YOLOM(n)   | 81.3         | 28.2    |
| A-YOLOM(s)   | 84.9         | 28.8    |

### Ablation Studies 1: Adaptive concatenation module

| Training method | Recall (%) | mAP50 (%) | mIoU (%) | Accuracy (%) | IoU (%) |
|-----------------|------------|-----------|----------|--------------|---------|
| YOLOM(n)        | 85.2       | 77.7      | 90.6     | 80.8         | 26.7    |
| A-YOLOM(n)      | 85.3       | 78.0      | 90.5     | 81.3         | 28.2    |
| YOLOM(s)        | 86.9       | 81.1      | 90.9     | 83.9         | 28.2    |
| A-YOLOM(s)      | 86.9       | 81.1      | 91.0     | 84.9         | 28.8    |

### Ablation Studies 2: Results of different multi-task models and segmentation structures

| Model         | Parameters | mIoU (%) | Accuracy (%) | IoU (%) |
|---------------|------------|----------|--------------|---------|
| YOLOv8(segda) | 1004275    | 78.1     | -            | -       |
| YOLOv8(segll) | 1004275    | -        | 80.5         | 22.9    |
| YOLOv8(multi) | 2008550    | 84.2     | 81.7         | 24.3    |
| YOLOM(n)      | 15880      | 90.6     | 80.8         | 26.7    |

For YOLOv8(multi) and YOLOM(n), the listed parameter counts cover only the two segmentation heads. Both models actually have three heads; we omit the detection-head parameters because this ablation focuses on the segmentation structure.


## Visualization

Real Road


## Requirement

This codebase has been developed with Python==3.7.16 and PyTorch==1.13.1.

A 1080Ti GPU with a batch size of 16 works fine; training will simply take longer. For faster training, we recommend a 4090 or a more powerful GPU.

We strongly recommend creating a clean environment and following our instructions to build it. Otherwise you may run into issues, because YOLOv8 has several mechanisms that automatically detect the packages in your environment and may change some variable values, which in turn affects how the code runs.

```bash
cd YOLOv8-multi-task
pip install -e .
```
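
To confirm that the environment you built matches these versions, a quick check like the following may help (a minimal sketch; it only prints versions and GPU visibility):

```python
# Minimal environment sanity check: print interpreter and PyTorch versions.
import sys

import torch

print(sys.version)                # expect Python 3.7.x
print(torch.__version__)          # expect 1.13.1
print(torch.cuda.is_available())  # True if a usable GPU is visible
```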

## Data preparation and Pre-trained model

### Download

We recommend organizing the dataset directory as follows:

```
# The id in each folder name represents the task-label correspondence
├─dataset root
│ ├─images
│ │ ├─train2017
│ │ ├─val2017
│ ├─detection-object
│ │ ├─labels
│ │ │ ├─train2017
│ │ │ ├─val2017
│ ├─seg-drivable-10
│ │ ├─labels
│ │ │ ├─train2017
│ │ │ ├─val2017
│ ├─seg-lane-11
│ │ ├─labels
│ │ │ ├─train2017
│ │ │ ├─val2017
```
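
If you are assembling the dataset from scratch, a short script along these lines can create the expected skeleton (a sketch; `dataset_root` is a placeholder you should change):

```python
import os

dataset_root = "/path/to/dataset_root"  # placeholder: point this at your dataset location

# One shared images tree, plus one labels tree per task, matching the layout above.
subdirs = ["images"] + [
    os.path.join(task, "labels")
    for task in ("detection-object", "seg-drivable-10", "seg-lane-11")
]
for sub in subdirs:
    for split in ("train2017", "val2017"):
        os.makedirs(os.path.join(dataset_root, sub, split), exist_ok=True)
```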

Update your dataset path in ./ultralytics/datasets/bdd-multi.yaml.

## Training

You can set the training configuration in ./ultralytics/yolo/cfg/default.yaml.

```bash
python train.py
```

You can change the settings in train.py:

```python
# setting
import sys

sys.path.insert(0, "/home/jiayuan/ultralytics-main/ultralytics")
# Change this path to your local "ultralytics" folder.

from ultralytics import YOLO

model = YOLO('/home/jiayuan/ultralytics-main/ultralytics/models/v8/yolov8-bdd-v4-one-dropout-individual.yaml', task='multi')
# Change the model path to yours.
# The model files are saved under "./ultralytics/models/v8".
model.train(data='/home/jiayuan/ultralytics-main/ultralytics/datasets/bdd-multi-toy.yaml', batch=4, epochs=300, imgsz=(640,640), device=[4], name='v4_640', val=True, task='multi', classes=[2,3,4,9,10,11], combine_class=[2,3,4,9], single_cls=True)
```
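
Before launching a full 300-epoch run, it can be worth doing a short sanity check first. The sketch below reuses the same call with the toy dataset YAML and a two-epoch schedule; all paths are placeholders:

```python
import sys
sys.path.insert(0, "/path/to/ultralytics")  # placeholder: your local "ultralytics" folder
from ultralytics import YOLO

# Same API as train.py above, but only two epochs to verify the setup end to end.
model = YOLO('/path/to/ultralytics/models/v8/yolov8-bdd-v4-one-dropout-individual.yaml', task='multi')
model.train(data='/path/to/ultralytics/datasets/bdd-multi-toy.yaml',
            batch=2, epochs=2, imgsz=(640, 640), device=[0],
            name='sanity_check', val=True, task='multi',
            classes=[2, 3, 4, 9, 10, 11], combine_class=[2, 3, 4, 9],
            single_cls=True)
```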

## Evaluation

You can set the evaluation configuration in ./ultralytics/yolo/cfg/default.yaml.

```bash
python val.py
```

You can change the settings in val.py:

```python
# setting
import sys

sys.path.insert(0, "/home/jiayuan/yolom/ultralytics")
# As with training, change this path to yours.

from ultralytics import YOLO

model = YOLO('/home/jiayuan/ultralytics-main/ultralytics/runs/best.pt')
# Change this path to your well-trained model. You can use our provided pre-trained model,
# or your own model under "./ultralytics/runs/multi/<your project name>/weights/best.pt".
metrics = model.val(data='/home/jiayuan/ultralytics-main/ultralytics/datasets/bdd-multi.yaml', device=[3], task='multi', name='val', iou=0.6, conf=0.001, imgsz=(640,640), classes=[2,3,4,9,10,11], combine_class=[2,3,4,9], single_cls=True)
```

## Prediction

```bash
python predict.py
```

You can change the settings in predict.py:

```python
# setting
import sys

sys.path.insert(0, "/home/jiayuan/ultralytics-main/ultralytics")

from ultralytics import YOLO

number = 3  # how many tasks your model has; with 1 detection and 3 segmentation tasks, this should be 4
model = YOLO('/home/jiayuan/ultralytics-main/ultralytics/runs/best.pt')
model.predict(source='/data/jiayuan/dash_camara_dataset/daytime', imgsz=(384,672), device=[3], name='v4_daytime', save=True, conf=0.25, iou=0.45, show_labels=False)
# The prediction results are saved under the "runs" folder.
```

PS: If you want to use our provided pre-trained model, please make sure your input images are of size (720,1280) and keep "imgsz=(384,672)" to achieve the best performance. You can change the "imgsz" value, but the results may differ because it no longer matches the training size.
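
If your source images come in a different resolution, a small preprocessing pass like this can bring them to the expected (720,1280) size before running predict.py (a sketch assuming OpenCV is installed; both folder paths are placeholders):

```python
import glob
import os

import cv2

src_dir = "/path/to/your/images"      # placeholder: original images
dst_dir = "/path/to/resized_images"   # placeholder: resized output
os.makedirs(dst_dir, exist_ok=True)

for path in glob.glob(os.path.join(src_dir, "*.jpg")):
    img = cv2.imread(path)
    if img is None:  # skip unreadable files
        continue
    # cv2.resize expects (width, height), so (1280, 720) yields 720x1280 images.
    img = cv2.resize(img, (1280, 720))
    cv2.imwrite(os.path.join(dst_dir, os.path.basename(path)), img)
```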


## Citation

If you find our paper and code useful for your research, please consider giving it a star :star: and a citation :pencil::

```bibtex
@ARTICLE{wang2024you,
  author={Wang, Jiayuan and Wu, Q. M. Jonathan and Zhang, Ning},
  journal={IEEE Transactions on Vehicular Technology},
  title={You Only Look at Once for Real-Time and Generic Multi-Task},
  year={2024},
  pages={1-13},
  keywords={Multi-task learning;panoptic driving perception;object detection;drivable area segmentation;lane line segmentation},
  doi={10.1109/TVT.2024.3394350}}
```