Harmony

This repository contains the source code implementation of the following papers:

This work was done as part of Microsoft Research's Project Fiddle. This source code is available under the MIT License.

Directory Structure

Setup

The easiest way to run Harmony is to use NVIDIA's standard PyTorch container (nvcr.io/nvidia/pytorch:20.03-py3), which satisfies most dependencies. Launch it with:

./launch.sh

Once inside the container, install the remaining dependencies by running:

./install.sh

Note:

Dataset

End-to-end Workflow

The end-to-end workflow of Harmony can be illustrated by the figure below:

<img src="Overview3.jpg" alt="drawing" width="80%"/>

For example, to run BERT-Large with Harmony, follow these steps:

Decompose model into per-layer code

cd harmony/1_decomposer/bert_thomwolf && ./run_bert_large.sh

Profile each layer

cd ../../2_profiler/bert_thomwolf && ./run_bert_large.sh

Search the best schedule

cd ../../3_scheduler && ./run_bert_large.sh

Run the best schedule

cd ../4_runtime/bert_thomwolf && ./run_bert_large.sh
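
The four steps above can also be chained in one script so that a failure in any step stops the run. The following is a minimal sketch, not part of the Harmony repository: it assumes it is started from the repository root (the directory containing harmony/) and that each step directory holds the run_bert_large.sh script used above. The names run_step, run_pipeline, and DRY_RUN are hypothetical helpers introduced here for illustration. By default the script only prints what it would run.

```shell
#!/usr/bin/env bash
# Sketch: chain the four Harmony steps for BERT-Large, stopping at the first failure.
# Assumption: started from the repository root (the directory containing harmony/).
# run_step, run_pipeline and DRY_RUN are illustrative names, not part of Harmony.
set -euo pipefail

run_step() {
    local dir="$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (the default): only show what would be executed.
        echo "would run: (cd $dir && ./run_bert_large.sh)"
    else
        # The subshell keeps the caller's working directory unchanged.
        (cd "$dir" && ./run_bert_large.sh)
    fi
}

run_pipeline() {
    run_step harmony/1_decomposer/bert_thomwolf   # decompose model into per-layer code
    run_step harmony/2_profiler/bert_thomwolf     # profile each layer
    run_step harmony/3_scheduler                  # search the best schedule
    run_step harmony/4_runtime/bert_thomwolf      # run the best schedule
}

run_pipeline
```

Set DRY_RUN=0 in the environment to execute the steps instead of printing them. Running each step in a subshell avoids the relative cd chains used in the manual commands, so every path is written relative to the repository root.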

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

Reference

If you find the code helpful, citing our papers would be appreciated : )

@article{VLDB22Harmony,
    title = {{Harmony: Overcoming the Hurdles of GPU Memory Capacity to Train Massive DNN Models on Commodity Servers}}, 
    author = {Youjie Li and Amar Phanishayee and Derek Murray and Jakub Tarnawski and Nam Sung Kim},
    journal = {The 48th International Conference on Very Large Data Bases (VLDB'22)},
    year = {2022},
    address = {Sydney, Australia},
    month = sep
}

@inproceedings{HotOS21Harmony,
    title = {{Doing More with Less: Training Large DNN Models on Commodity Servers for the Masses}},
    author = {Youjie Li and Amar Phanishayee and Derek Murray and Nam Sung Kim},
    booktitle = {Workshop on Hot Topics in Operating Systems (HotOS’21)},
    year = {2021},
    address = {Ann Arbor, MI, USA},
    month = jun
}