# StabStitch++: Unsupervised Online Video Stitching with Spatiotemporal Bidirectional Warps
## Introduction
Lang Nie<sup>1</sup>, Chunyu Lin<sup>1</sup>, Kang Liao<sup>2</sup>, Yun Zhang<sup>3</sup>, Shuaicheng Liu<sup>4</sup>, Yao Zhao<sup>1</sup>
<sup>1</sup> Beijing Jiaotong University {nielang, cylin, yzhao}@bjtu.edu.cn
<sup>2</sup> Nanyang Technological University
<sup>3</sup> Communication University of Zhejiang
<sup>4</sup> University of Electronic Science and Technology of China
## Features
Compared with the conference version (StabStitch), the main contributions of StabStitch++ are as follows:
- We propose a differentiable bidirectional decomposition module that carries out bidirectional warping onto a virtual middle plane, evenly spreading the warping burden across both views. It benefits both image and video stitching, demonstrating universality and scalability (see the first sketch below this list).
- A new warp smoothing model is presented to simultaneously encourage content alignment, trajectory smoothness, and online collaboration. Unlike StabStitch, which sacrifices alignment for stabilization, the new model makes no compromise and optimizes both in online mode (see the second sketch below this list). The figure above shows the difference between StabStitch and StabStitch++.
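To make the first contribution concrete, here is a minimal sketch of decomposing a one-sided warp into two half-warps that meet on a virtual middle plane. It assumes a mesh-motion representation; the function name, tensor shapes, and the simple linear split are illustrative assumptions, not the actual module from the paper.

```python
import torch

def bidirectional_decompose(mesh_motion: torch.Tensor, beta: float = 0.5):
    """Split a full inter-view mesh motion into two half-warps.

    mesh_motion: (B, H, W, 2) displacements mapping view-2 control
    points onto view-1's plane (a one-sided warp). Returns the
    displacements that move each view toward a virtual middle plane,
    so the warping burden is shared instead of borne by one view.
    """
    # View 2 travels a fraction `beta` of the full motion ...
    motion_2_to_mid = beta * mesh_motion
    # ... while view 1 travels the remaining fraction in the opposite
    # direction; the two views meet on the middle plane.
    motion_1_to_mid = -(1.0 - beta) * mesh_motion
    return motion_1_to_mid, motion_2_to_mid
```

Because the split is a pair of differentiable tensor operations, it can sit inside the warp networks and be trained end to end.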
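For the second contribution, the sketch below is a toy objective in the spirit of the warp smoothing model: a data term keeps the smoothed warps close to the aligning warps (so alignment is not sacrificed for stabilization), and a second-order term suppresses warping shakes over a sliding window. The shapes, names, and the omission of the online collaboration term are our simplifications.

```python
import torch

def smoothing_loss(smoothed: torch.Tensor, original: torch.Tensor, lam: float = 10.0):
    """Toy warp-smoothing objective over a sliding window.

    smoothed, original: (T, N, 2) trajectories of N mesh vertices
    across T frames (assumed shapes, T >= 3).
    """
    # Data term: stay close to the aligning warps, so content
    # alignment is not traded away for stability.
    align = ((smoothed - original) ** 2).mean()
    # Smoothness term: penalize the acceleration of vertex
    # trajectories to remove warping shakes.
    accel = smoothed[2:] - 2.0 * smoothed[1:-1] + smoothed[:-2]
    return align + lam * (accel ** 2).mean()
```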
## Performance Comparison
| | Method | Alignment (PSNR/SSIM) $\uparrow$ | Stability $\downarrow$ | Distortion $\downarrow$ | Inference Speed $\uparrow$ |
| --- | --- | --- | --- | --- | --- |
| 1 | StabStitch | 29.89 / 0.890 | 48.74 | 0.674 | 35.5 fps |
| 2 | StabStitch++ | 30.88 / 0.898 | 41.70 | 0.371 | 28.3 fps |
The performance and speed are evaluated on the StabStitch-D dataset with one RTX 4090 GPU.
## Video
We have released a video of our results on YouTube.
## Changelog
- 2024.10.11: The repository of StabStitch++ is created.
- 2024.10.14: Release the video of our results.
- 2024.10.16: Release the collected traditional datasets.
- 2024.10.17: Release the inference code and pre-trained models.
- 2024.10.17: Release the training code.
- 2024.10.17: Release the inference code to stitch multiple videos.
- Release the paper of StabStitch++ (journal version of StabStitch).
## Dataset
For the StabStitch-D dataset, please refer to StabStitch.
The collected traditional datasets are available on Google Drive or Baidu Cloud (extraction code: 1234).
## Code
### Requirements
- python 3.8.5
- numpy 1.19.5
- pytorch 1.13.1+cu116
- torchvision 0.14.1+cu116
- opencv-python-headless 4.5.1.48
- scikit-image 0.15.0
- tensorboard 2.9.0
We implement this work on Ubuntu with an RTX 4090 GPU and CUDA 11. Refer to environment.yml for more details.
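As a quick sanity check that the pinned environment is active, you can print the versions of the packages listed above:

```python
# Verify the pinned environment before running the models.
import cv2
import skimage
import torch
import torchvision

print(torch.__version__)          # expect 1.13.1+cu116
print(torchvision.__version__)    # expect 0.14.1+cu116
print(cv2.__version__)            # expect 4.5.1
print(skimage.__version__)        # expect 0.15.0
print(torch.cuda.is_available())  # should be True with CUDA 11 installed
```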
### How to run it
- Inference with our pre-trained models: please refer to Full_model_inference/readme.md (a toy sketch of how the modules cooperate follows this list).
- Train the spatial warp model: please refer to SpatialWarp/readme.md.
- Train the temporal warp model: please refer to TemporalWarp/readme.md.
- Train the warp smoothing model: please refer to SmoothWarp/readme.md.
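The sketch below only illustrates how the three stages (spatial warp, temporal warp, warp smoothing) cooperate in an online loop. Every class, buffer size, and the mean-based smoother are placeholder stand-ins, not the repository's API; the real entry points are the readme files above.

```python
import torch
import torch.nn as nn

class DummyWarpNet(nn.Module):
    """Placeholder stand-in for the spatial/temporal warp networks."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(6, 2, kernel_size=3, padding=1)

    def forward(self, a, b):
        # Predict a dense 2-channel motion field from a frame pair.
        return self.conv(torch.cat([a, b], dim=1))

spatial, temporal = DummyWarpNet(), DummyWarpNet()
frames1 = torch.rand(5, 3, 64, 64)  # toy 5-frame clips for the two views
frames2 = torch.rand(5, 3, 64, 64)

window = []  # short sliding buffer of warps: online, low latency
for t in range(5):
    f1, f2 = frames1[t:t + 1], frames2[t:t + 1]
    warp = spatial(f1, f2)                            # align the current pair
    if t > 0:
        warp = warp + temporal(frames2[t - 1:t], f2)  # propagate inter-frame motion
    window.append(warp)
    window = window[-3:]
    smoothed = torch.stack(window).mean(dim=0)        # trivial smoother stand-in
```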
## Meta
If you have any questions about this project, please feel free to drop me an email.
NIE Lang -- nielang@bjtu.edu.cn
```bibtex
@inproceedings{nie2025eliminating,
  title={Eliminating Warping Shakes for Unsupervised Online Video Stitching},
  author={Nie, Lang and Lin, Chunyu and Liao, Kang and Zhang, Yun and Liu, Shuaicheng and Ai, Rui and Zhao, Yao},
  booktitle={European Conference on Computer Vision},
  pages={390--407},
  year={2025},
  organization={Springer}
}
```
## References
[1] L. Nie, C. Lin, K. Liao, Y. Zhang, S. Liu, R. Ai, and Y. Zhao. Eliminating Warping Shakes for Unsupervised Online Video Stitching. ECCV, 2024.
[2] L. Nie, C. Lin, K. Liao, S. Liu, and Y. Zhao. Parallax-Tolerant Unsupervised Deep Image Stitching. ICCV, 2023.
[3] S. Liu, P. Tan, L. Yuan, J. Sun, and B. Zeng. MeshFlow: Minimum Latency Online Video Stabilization. ECCV, 2016.