
<p align="center"><img src="docs/_static/img/logo.png" width=400 /></p>

Documentation | Examples

AutoDist is a distributed deep learning training engine for TensorFlow. It provides a user-friendly interface for distributing the training of a wide variety of deep learning models across many GPUs with high scalability and minimal code changes.

Introduction

Unlike specialized distributed ML systems, AutoDist is built to speed up a broad range of DL models with excellent all-round performance. It achieves this by automatically composing a distribution strategy that fits both the model and the available cluster resources.

Beyond these capabilities, AutoDist is designed to isolate the complexity of distributed systems from ML prototyping: it exposes a simple API that makes it easy for users of all levels to adopt, and to switch between, different distributed ML techniques.

<p align="center"><img src="docs/_static/img/Figure1.png" width=400 /><img src="docs/_static/img/Figure2.png" width=400 /></p>

For a closer look at AutoDist's performance, please refer to our documentation.

Using AutoDist

Installation:

pip install autodist

Modifying existing TensorFlow code to use AutoDist is easy:

import tensorflow as tf
from autodist import AutoDist

ad = AutoDist(resource_spec_file="resource_spec.yml")

with tf.Graph().as_default(), ad.scope():
    ########################################################
    # Build your (single-device) model here;
    #   AutoDist will handle the distributed training.
    ########################################################
    sess = ad.create_distributed_session()
    sess.run(...)
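
The resource_spec.yml file passed to AutoDist describes the machines and devices to train on. As a rough sketch (the exact schema is documented on the Getting Started page), a single-machine spec with two GPUs looks roughly like:

nodes:
  - address: localhost
    gpus: [0, 1]

Multi-node specs additionally mark one node as the chief and include SSH settings so AutoDist can launch workers on the remote machines.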

Ready to try? Please refer to the examples on our Getting Started page.
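
For a more concrete picture of what can go inside ad.scope(), here is a minimal sketch in the spirit of the Getting Started example; the toy linear-regression model, data, and hyperparameters are illustrative choices, not part of the AutoDist API:

import numpy as np
import tensorflow as tf
from autodist import AutoDist

ad = AutoDist(resource_spec_file="resource_spec.yml")

# Toy data: y = 3x + 2 plus noise (illustrative only).
x = np.random.randn(1024).astype(np.float32)
y = (3.0 * x + 2.0 + 0.1 * np.random.randn(1024)).astype(np.float32)

with tf.Graph().as_default(), ad.scope():
    # Build a single-device linear model; AutoDist handles the distribution.
    W = tf.Variable(0.0, name="W")
    b = tf.Variable(0.0, name="b")
    optimizer = tf.optimizers.SGD(0.1)

    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(W * x + b - y))
    grads = tape.gradient(loss, [W, b])
    train_op = optimizer.apply_gradients(zip(grads, [W, b]))

    sess = ad.create_distributed_session()
    for step in range(200):
        loss_val, _ = sess.run([loss, train_op])
    print("final loss:", loss_val)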

References & Acknowledgements

We learned from and were inspired by several open-source projects, including Horovod, Parallax, and tf.distribute.