SMALify

<img src="docs/badja_result.gif">

This repository contains an implementation for performing 3D animal (quadruped) reconstruction from a monocular image or video. The system adapts the pose (limb positions) and shape (animal type/height/weight) parameters of the SMAL deformable quadruped model, together with camera parameters, until the projected SMAL model aligns with 2D keypoints and silhouette segmentations extracted from the input frame(s).
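As a rough illustration of the kind of objective being minimized (this is a sketch, not the repository's actual API; the function, argument and weight names below are invented), each optimization step combines reprojection, silhouette and prior terms into a single weighted loss:

```python
import torch

def fitting_loss(proj_joints, keypoints, kp_visibility,
                 rendered_sil, target_sil,
                 shape_prior, pose_prior, weights):
    """Illustrative weighted sum of the fitting terms; all names here are
    hypothetical and not the repository's actual interface."""
    loss = torch.tensor(0.0)
    # 2D keypoint reprojection: only visible keypoints (visibility in {0, 1}) contribute
    kp_term = (kp_visibility[:, None] * (proj_joints - keypoints) ** 2).mean()
    loss = loss + weights["kp"] * kp_term
    # 2D silhouette: compare the rendered silhouette to the input segmentation
    loss = loss + weights["sil"] * ((rendered_sil - target_sil) ** 2).mean()
    # Priors keep the shape and pose parameters plausible
    loss = loss + weights["shape_prior"] * shape_prior
    loss = loss + weights["pose_prior"] * pose_prior
    return loss
```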

The code can be thought of as a modernization of the fitting code used in the Creatures Great and SMAL paper: Chainer/ChumPy has been replaced with PyTorch, OpenDR with PyTorch3D, and so on. However, I have also included some recent innovations from Who Left the Dogs Out?, such as limb scaling parameters and an improved shape prior.

The aim of this repository is to provide demonstrative fitting code that benefits not only computer vision researchers but also those working in animal/veterinary science. In either case, I'd be delighted to hear from you!

Installation

  1. Clone the repository with submodules and enter directory

    git clone --recurse-submodules https://github.com/benjiebob/SMALify
    cd SMALify
    

    Note: If you don't clone with submodules you won't get the sample data from BADJA/StanfordExtra/SMALST.

  2. Install dependencies, in particular PyTorch (CUDA support recommended) and PyTorch3D. Check requirements.txt for the full list.

  3. Download the BADJA videos and unzip badja_extra_videos.zip.

  4. Inspect the directory paths in config.py and make sure they match your system.
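After step 4, a quick sanity check like the following can confirm that the configured directories actually resolve on your system. This is just a sketch: it assumes config.py is importable from the repository root and simply reports any string attribute that looks like a filesystem path.

```python
# Sanity-check sketch: report every string attribute in config.py that looks
# like a path, and whether it exists. Run from the repository root.
import os
import config

for name in sorted(dir(config)):
    if name.startswith("_"):
        continue
    value = getattr(config, name)
    if isinstance(value, str) and ("/" in value or os.sep in value):
        status = "OK" if os.path.exists(value) else "MISSING"
        print(f"{name:30s} {value} [{status}]")
```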

QuickStart: Running the Fitter

Running on alternative data

Alternative BADJA/StanfordExtra sequences:

Running on your own data

The first job is to generate keypoint/silhouette data for your input image(s). I recommend using LabelMe, which is fantastic software that makes annotating keypoints / silhouettes efficient.
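To give a sense of what that preprocessing might look like, here is a minimal sketch that reads a LabelMe JSON file and turns point annotations into keypoints and polygon annotations into a binary silhouette mask. The label names and output format are illustrative only; adapt them to whatever the fitter's data loading code expects.

```python
import json
import numpy as np
from PIL import Image, ImageDraw

def labelme_to_annotations(json_path):
    """Convert a LabelMe annotation file into keypoints and a silhouette mask.
    Point shapes are treated as keypoints, polygon shapes as the silhouette."""
    with open(json_path) as f:
        data = json.load(f)

    height, width = data["imageHeight"], data["imageWidth"]
    keypoints = {}
    mask = Image.new("L", (width, height), 0)

    for shape in data["shapes"]:
        if shape["shape_type"] == "point":
            # e.g. label "left_front_paw" -> (x, y)
            keypoints[shape["label"]] = tuple(shape["points"][0])
        elif shape["shape_type"] == "polygon":
            polygon = [tuple(p) for p in shape["points"]]
            ImageDraw.Draw(mask).polygon(polygon, outline=1, fill=1)

    return keypoints, np.array(mask, dtype=np.uint8)
```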

Building your own quadruped deformable model

If you want to represent a quadruped category which isn't covered by the SMAL model (e.g. perhaps you want to reconstruct rodents/squirrels), you can use the fitter_3d tool. The basic idea is to fit the existing SMAL model to a collection of 3D artist meshes (which you can download online) and thereby learn a new shape space. More information is given in the README.
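To illustrate the "learn a new shape space" step (this is a conceptual sketch, not the fitter_3d implementation), once the SMAL template has been registered to each artist mesh, a PCA basis over the fitted vertices gives the mean shape and shape directions of the new model:

```python
import numpy as np

def build_shape_space(fitted_vertices, n_components=20):
    """fitted_vertices: (n_meshes, n_verts, 3) array of SMAL-topology meshes,
    each the result of fitting the template to one artist mesh.
    Returns a mean shape and PCA basis, the usual ingredients of a shape space."""
    n_meshes = fitted_vertices.shape[0]
    flat = fitted_vertices.reshape(n_meshes, -1)            # (n_meshes, n_verts * 3)
    mean_shape = flat.mean(axis=0)
    # PCA via SVD of the centred data matrix
    _, singular_values, components = np.linalg.svd(flat - mean_shape, full_matrices=False)
    basis = components[:n_components]                       # (n_components, n_verts * 3)
    variances = (singular_values[:n_components] ** 2) / (n_meshes - 1)
    return mean_shape, basis, variances
```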

Improving performance and general tips and tricks

| Loss Component | Explanation | Tips for Loss Weight |
| --- | --- | --- |
| 2D Keypoint Reprojection | Project the SMAL model with the latest parameters and compare the projected joints to the input 2D keypoints. | If your model's limbs don't match the input keypoints well after fitting, it may be worth increasing this. |
| 3D Shape Prior | Used to constrain the 3D shapes to be 'animal-like'. Note that (unlike equivalent human approaches that use mocap etc.) only artist data is used for this. | If your reconstructed animals don't look like animals, try increasing this. |
| 3D Pose Prior | Used to constrain the 3D poses to be anatomically plausible. | If your reconstructed animals have implausible limb configurations, e.g. legs in strange places, try increasing this. |
| 2D Silhouette | Project the SMAL model with the latest parameters and compare the rendered silhouette to the input 2D silhouette. | If the shape of your reconstructed animal doesn't match well (e.g. maybe it's too thin?), try increasing this. |
| Temporal | Constrain the change in SMAL parameters between frames (videos only). | If the limbs move unnaturally between video frames, try adapting this. |

Note that to avoid poor local minima, the optimization proceeds over multiple stages, and the loss weights vary at each stage. For example, the 2D joint loss is only enabled once an approximate camera location has been found, and the 2D silhouette loss only once an approximate set of limb positions has been found.
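To make the staging concrete, here is an illustrative schedule; the stage names, iteration counts and weights are invented for explanation and are not the repository's actual settings. Each stage re-runs the optimizer with its own loss weights, switching on terms only once the quantities they depend on have roughly converged.

```python
# Illustrative multi-stage schedule; names and numbers are invented, not taken
# from config.py. Each stage enables only the terms that make sense given what
# has already been roughly solved.
STAGES = [
    # Stage 1: rough camera placement using keypoints only
    {"name": "camera",     "iters": 100,
     "weights": {"kp": 1.0, "sil": 0.0, "shape_prior": 0.0, "pose_prior": 0.0}},
    # Stage 2: refine pose/shape against keypoints, regularized by the priors
    {"name": "keypoints",  "iters": 300,
     "weights": {"kp": 1.0, "sil": 0.0, "shape_prior": 0.1, "pose_prior": 0.1}},
    # Stage 3: only now add the silhouette term to refine the body shape
    {"name": "silhouette", "iters": 300,
     "weights": {"kp": 1.0, "sil": 0.5, "shape_prior": 0.1, "pose_prior": 0.1}},
]

for stage in STAGES:
    print(f"stage '{stage['name']}': {stage['iters']} iterations, weights {stage['weights']}")
    # run_optimizer(stage["weights"], stage["iters"])  # hypothetical call into the fitter
```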

Acknowledgements

This repository owes a great deal to the following works and authors:

If you find this fitting code and/or BADJA dataset useful for your research, please consider citing the following paper:

@inproceedings{biggs2018creatures,
  title={{C}reatures great and {SMAL}: {R}ecovering the shape and motion of animals from video},
  author={Biggs, Benjamin and Roddick, Thomas and Fitzgibbon, Andrew and Cipolla, Roberto},
  booktitle={ACCV},
  year={2018}
}

If you make use of the limb scaling parameters, the Unity shape prior (on by default for the dog shape family), or the StanfordExtra dataset, please cite Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop:

@inproceedings{biggs2020wldo,
  title={{W}ho left the dogs out?: {3D} animal reconstruction with expectation maximization in the loop},
  author={Biggs, Benjamin and Boyne, Oliver and Charles, James and Fitzgibbon, Andrew and Cipolla, Roberto},
  booktitle={ECCV},
  year={2020}
}

Contribute

Please create a pull request or submit an issue if you would like to contribute.

Licensing

(c) Benjamin Biggs, Oliver Boyne, James Charles, Andrew Fitzgibbon and Roberto Cipolla. Department of Engineering, University of Cambridge 2020

As of 02-NOV-2024, this repository is now MIT licensed. Enjoy!

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.