
Eztorch

Introduction

Eztorch is a library that makes training, validation, and testing in Pytorch easy. It is designed for image and video self-supervised representation learning and for evaluating the learned representations on downstream tasks.

It was first developed to factorize code during Julien Denize's PhD thesis on self-supervised representation learning and applications to image and video analysis. The thesis led to several academic contributions:

Similarity Contrastive Estimation for Self-Supervised Soft Contrastive Learning (WACV 2023)
Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning (MVAP 2023)
COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers (WACV Workshops 2024)

To ease the use of the code, documentation has been built.

How to Install

To install this repository, you need a recent version of Pytorch (>= 2.) and all Eztorch dependencies.

You can run the following commands:

cd eztorch
conda create -y -n eztorch
conda activate eztorch
conda install -y pip
conda install -y -c conda-forge libjpeg-turbo
pip install -e .
pip uninstall -y pillow
CC="cc -mavx2" pip install -U --force-reinstall pillow-simd

The -e argument performs an editable (development) installation, so changes made in the repository take effect without installing the package again. It is optional.

If you want a lighter installation that only installs the main dependencies, replace the requirements file with requirements_lite.txt and then run the pip install.
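After installing, it can be useful to confirm that the environment matches the requirements stated above. The following is a minimal sketch, not part of Eztorch: the helper names `major_version` and `check_environment` are hypothetical, and the check only assumes that Pytorch is distributed as `torch` and that Eztorch is importable as `eztorch`.

```python
from importlib import metadata, util


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.1.0+cu121'."""
    return int(version.split(".", 1)[0])


def check_environment() -> dict:
    """Report whether torch (major >= 2) and eztorch are visible to this interpreter."""
    report = {"torch_ok": False, "eztorch_importable": False}
    try:
        report["torch_ok"] = major_version(metadata.version("torch")) >= 2
    except (metadata.PackageNotFoundError, ValueError):
        pass  # torch is missing, or its version string is not parseable
    # find_spec returns None when the top-level package cannot be found.
    report["eztorch_importable"] = util.find_spec("eztorch") is not None
    return report


if __name__ == "__main__":
    print(check_environment())
```

Run it with the eztorch conda environment activated. A False value points to the installation step that needs to be repeated.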

How to use

Read the Pytorch-Lightning and Hydra tutorials to make sure you understand those libraries.
Take a look at the Eztorch documentation.
Use the configs in eztorch/configs/run/ or make your own.
Pass your config to the running scripts in the run/ folder.
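To see which configs are available before picking one, a small helper can walk the config folder. This is only a sketch: `list_run_configs` is a hypothetical name, and it assumes the configs are stored as files with a `.yaml` extension, which the repository layout may or may not follow.

```python
from pathlib import Path


def list_run_configs(config_root: str = "eztorch/configs/run") -> list[str]:
    """Return the YAML files found under config_root, as sorted relative paths."""
    root = Path(config_root)
    if not root.is_dir():
        # Nothing to list when the folder is absent (e.g. wrong working directory).
        return []
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*.yaml")
    )


if __name__ == "__main__":
    for name in list_run_configs():
        print(name)
```

Run it from the repository root so that the default relative path resolves.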

Eztorch is a library, therefore you can import its components from anywhere as long as your Python environment has Eztorch installed.

from eztorch.models.siamese import SCEModel

model = SCEModel(...)

Dependencies

Eztorch relies on various libraries to handle different parts of the pipeline:

Why do something worse than people who know best?

Its main dependencies are:

Pytorch-lightning for easy setup of:
- Preparing data through the datamodules
- Models through the Lightning modules
- Training, validating, and testing on various device types (CPU, GPU, TPU) with or without distributed training through the trainer
Hydra to configure your various experiments:
- Write configurations in Python or Yaml
- Enjoy hierarchical configuration
- Let Hydra instantiate
- Speak the same language in Bash or Python to configure your jobs
Torchaug for efficient GPU and batched data augmentations, as a replacement for Torchvision when relevant.
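The "Let Hydra instantiate" item refers to Hydra building objects from a config whose `_target_` key names a class or function. The sketch below is a standard-library stand-in for that idea, not Hydra's implementation and not Eztorch code. The real entry point is `hydra.utils.instantiate`, which handles many more cases. The `timedelta` target is chosen only so the example runs without extra dependencies.

```python
from datetime import timedelta
from importlib import import_module


def instantiate(config: dict):
    """Build the object named by config['_target_'], passing other keys as kwargs."""
    module_name, _, attr = config["_target_"].rpartition(".")
    target = getattr(import_module(module_name), attr)
    kwargs = {}
    for key, value in config.items():
        if key == "_target_":
            continue
        # Nested dicts that carry their own _target_ are built recursively.
        if isinstance(value, dict) and "_target_" in value:
            value = instantiate(value)
        kwargs[key] = value
    return target(**kwargs)


# Illustrative config: in a real project this would come from a YAML file.
config = {"_target_": "datetime.timedelta", "days": 1, "hours": 2}
duration = instantiate(config)
assert duration == timedelta(days=1, hours=2)
```

Because the class path lives in the config, swapping one component for another only requires a config change, not a code change.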

For specific dependencies, we can cite:

Timm to instantiate image models
Pytorchvideo for video pipeline:
- Clip samplers to select one or multiple clips per video
- Datasets with decoders to read videos
- Specific transforms for videos
- Models for videos

How to contribute

To contribute follow this process:

Make an issue if you find it necessary to discuss the changes with maintainers.
Check out a new branch.
Make your modifications.
Document your changes.
Request a merge into main.
Follow the merging process with maintainers.

Issue

If you find an error, have trouble making this work, or have any questions, please open an issue to describe your problem.

License

This project is under the CeCILL license 2.1.