Home

Awesome

FlexFlow

build gpu tests multinode gpu tests docker pip shell-check clang-format Documentation Status

FlexFlow is a deep learning framework that accelerates distributed DNN training by automatically searching for efficient parallelization strategies. FlexFlow provides a drop-in replacement for PyTorch and TensorFlow Keras. Running existing PyTorch and Keras programs in FlexFlow only requires a few lines of changes to the program.

Install FlexFlow

To install FlexFlow from source code, please read the instructions. If you would like to quickly try FlexFlow, we also provide pre-built Docker packages for several versions of CUDA and for the hip_rocm backend, together with Dockerfiles if you wish to build the containers manually. More info on the Docker images can be found here. You can also use conda to install the FlexFlow Python package (coming soon).

PyTorch Support

Users can also use FlexFlow to optimize the parallelization performance of existing PyTorch models in two steps. First, a PyTorch model can be exported to the FlexFlow model format using flexflow.torch.fx.torch_to_flexflow.

import torch
import flexflow.torch.fx as fx

model = MyPyTorchModule()
fx.torch_to_flexflow(model, "mymodel.ff")

Second, a FlexFlow program can directly import a previously saved PyTorch model and autotune the parallelization performance for a given parallel machine.

from flexflow.pytorch.model import PyTorchModel

def top_level_task():
  torch_model = PyTorchModel("mymodel.ff")
  output_tensor = torch_model.apply(ffmodel, input_tensor)
  ## Model compilation
  ffmodel.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
  ## Model training
  (x_train, y_train) = cifar10.load_data()
  ffmodel.fit(x_train, y_train, epochs=30)

More FlexFlow PyTorch examples: see the pytorch examples folder.

TensorFlow Keras and ONNX Support

FlexFlow prioritizes PyTorch compatibility, but also includes frontends for Tensorflow Keras and ONNX models.

C++ Interface

For users that prefer to program in C/C++. FlexFlow supports a C++ program inference that is equivalent to its Python APIs.

More FlexFlow C++ examples: see the C++ examples folder.

Command-Line Flags

In addition to setting runtime configurations in a FlexFlow Python/C++ program, the FlexFlow runtime also accepts command-line arguments for various runtime parameters:

FlexFlow training flags:

Legion runtime flags:

Performance auto-tuning flags:

Contributing

Please let us know if you encounter any bugs or have any suggestions by submitting an issue.

We welcome all contributions to FlexFlow from bug fixes to new features and extensions.

Citations

The Team

FlexFlow is developed and maintained by teams at CMU, Facebook, Los Alamos National Lab, MIT, and Stanford (alphabetically).

License

FlexFlow uses Apache License 2.0.