Awesome
Action Classification
What's this?
A benchmark of video action classification for common CNN architectures. Implemented in PyTorch v1.2; more recent versions should also work.
Supported networks
Network | # parameters (excluding final classifier) |
---|---|
2d3d-ResNet18 | 31.82 M |
2d3d-ResNet34 | 60.80 M |
2d3d-ResNet50 | 44.74 M |
3d-ResNet18 | 33.15 M |
3d-ResNet34 | 63.46 M |
3d-ResNet50 | 46.14 M |
I3D | 12.29 M |
S3D | 7.910 M |
S3D-G | 9.098 M |
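Parameter counts like the ones above can be reproduced with a few lines of PyTorch. The following is a minimal sketch, not the script used to produce the table: `count_parameters`, `format_millions`, and `toy_backbone` are hypothetical names, and the single `Conv3d` layer only stands in for a real backbone from this repository.

```python
import torch.nn as nn


def count_parameters(module: nn.Module) -> int:
    """Return the number of trainable parameters in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


def format_millions(n: int) -> str:
    """Format a raw parameter count in millions, e.g. 31820000 -> '31.82 M'."""
    return f"{n / 1e6:.2f} M"


# Hypothetical stand-in for a backbone: one 3D convolution without bias.
# Its weight has shape (64, 3, 3, 7, 7), i.e. 64 * 3 * 3 * 7 * 7 = 28224 values.
toy_backbone = nn.Conv3d(3, 64, kernel_size=(3, 7, 7), bias=False)

n_params = count_parameters(toy_backbone)
print(n_params, format_millions(n_params))
```

To exclude the final classifier, as the table does, pass only the backbone module to `count_parameters` rather than the full model.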
Files
- backbone/ has all backbone models.
- model.py gives an example of a classifier with an S3D backbone.
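As a rough illustration of how a classifier can sit on top of a 3D backbone, here is a minimal sketch. It is not the code in model.py: `ActionClassifier`, `feature_dim`, and `toy_backbone` are assumed names, and the small convolutional stack only stands in for S3D so the example stays self-contained.

```python
import torch
import torch.nn as nn


class ActionClassifier(nn.Module):
    """Backbone + global average pooling + linear classifier (illustrative)."""

    def __init__(self, backbone: nn.Module, feature_dim: int,
                 num_classes: int, dropout: float = 0.5):
        super().__init__()
        self.backbone = backbone
        self.pool = nn.AdaptiveAvgPool3d(1)  # average over time, height, width
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(feature_dim, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, channels, frames, height, width)
        feat = self.backbone(clip)          # (B, C, T', H', W')
        feat = self.pool(feat).flatten(1)   # (B, C)
        return self.fc(self.dropout(feat))  # (B, num_classes)


# Hypothetical stand-in backbone; a real one would come from backbone/.
toy_backbone = nn.Sequential(
    nn.Conv3d(3, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(inplace=True),
)

model = ActionClassifier(toy_backbone, feature_dim=16, num_classes=10)
model.eval()
with torch.no_grad():
    logits = model(torch.randn(2, 3, 8, 32, 32))  # two clips of 8 frames
print(logits.shape)
```

`feature_dim` must match the channel count of the backbone's output; for a real backbone, check that value in its definition rather than assuming it.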
Notes
- Benchmark results will be added soon.
Links
- ResNet-2d3d is used in SlowFast, DPC, etc.
- ResNet-3d is used in many papers, early ones like Hara et al.
- I3D is from Carreira and Zisserman.
- S3D/S3D-G are from Xie et al.