S3 Download Tuner

What is this?

In this example, we use the open-source parameter tuning library Syne Tune to learn a good configuration of boto3's download_file method for a given file and client. Using this code, we reduced the download time of various files (5 GiB, 10 GiB) by up to ~40% compared with the default boto3 transfer configuration. Consider this repo a one-off creative demo for Syne Tune. Note that download speed depends on many factors and may vary across attempts, even with a fixed configuration.
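
To make the tuned quantity concrete, here is a minimal sketch of timing one download under a given transfer configuration. It is not the code from this repo: the helper name timed_download and the values in candidate are hypothetical, and the parameters shown are only some of the options that boto3's TransferConfig accepts.

```python
import time


def timed_download(bucket, key, local_path, **transfer_kwargs):
    """Download one S3 object and return the elapsed wall-clock seconds."""
    # Imported lazily so this helper can be defined without boto3 installed.
    import boto3
    from boto3.s3.transfer import TransferConfig

    config = TransferConfig(**transfer_kwargs)
    client = boto3.client("s3")
    start = time.perf_counter()
    client.download_file(bucket, key, local_path, Config=config)
    return time.perf_counter() - start


# Hypothetical candidate configuration; the values are illustrative only.
candidate = {
    "max_concurrency": 20,
    "multipart_chunksize": 16 * 1024 * 1024,  # bytes
    "max_io_queue": 1000,
    "io_chunksize": 512 * 1024,  # bytes
}
```

A tuner would call something like timed_download(bucket, key, path, **candidate) for each candidate and treat the returned time as the metric to minimize. Because timings are noisy, repeating each measurement is a reasonable precaution.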

What is Syne Tune?

Syne Tune is a distributed parameter search library open-sourced by AWS AI researchers. It integrates several state-of-the-art optimization concepts:

While it has its roots in machine learning and machine learning task optimization, Syne Tune is use-case agnostic and can be used for generic complex system tuning beyond machine learning: it explores arbitrary, user-defined configurations and finds one that optimizes a user-defined metric.
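
The idea of searching a user-defined configuration space for the best value of a user-defined metric can be illustrated without any library. The sketch below is plain random search written from scratch. It does not use Syne Tune's API, and the names random_search, toy_metric, and space are hypothetical.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample integer configurations uniformly and keep the lowest-metric one."""
    rng = random.Random(seed)
    best_config, best_value = None, float("inf")
    for _ in range(n_trials):
        # Each entry of `space` maps a parameter name to an inclusive (low, high) range.
        config = {name: rng.randint(low, high) for name, (low, high) in space.items()}
        value = objective(config)
        if value < best_value:
            best_config, best_value = config, value
    return best_config, best_value


def toy_metric(config):
    # Toy stand-in for a real measurement: pretend the system is fastest
    # at 12 threads and a 32 MiB chunk size.
    return abs(config["threads"] - 12) + abs(config["chunk_mib"] - 32)


space = {"threads": (1, 64), "chunk_mib": (1, 128)}
best_config, best_value = random_search(toy_metric, space, n_trials=200)
print(best_config, best_value)
```

In a real tuning run the toy metric would be replaced by an actual measurement, such as a download time, and a library like Syne Tune would supply more sample-efficient search strategies than uniform sampling.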

Read more here:

How to run this demo?

To get started, open the Jupyter notebook Demo.ipynb and follow the instructions.

At any time after the tuner is launched, you can open and run the notebook Evaluation.ipynb to check the status of the parameter search.

To make the demo portable, we wrap the Syne Tune tuning code in a Python script that can be invoked as follows:

    python launcher.py \
        --bucket $bucket \
        --key $key \
        --file_path $file_path \
        --file_name $file_name \
        --init random \
        --max_tuning_time 10800

Our script takes the following arguments:
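
As a rough guide, a command-line parser consistent with the invocation shown above might look like the sketch below. Only the flag names come from that invocation. The types, defaults, help strings, and the build_parser helper are assumptions and may differ from the actual launcher.py.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Tune boto3 download settings")
    parser.add_argument("--bucket", required=True, help="S3 bucket holding the object")
    parser.add_argument("--key", required=True, help="S3 key of the object to download")
    # The meanings below are assumed from the flag names, not taken from the script.
    parser.add_argument("--file_path", required=True, help="local directory (assumed)")
    parser.add_argument("--file_name", required=True, help="local file name (assumed)")
    parser.add_argument("--init", default="random", help="initialization strategy (assumed)")
    parser.add_argument("--max_tuning_time", type=int, default=10800,
                        help="tuning budget, assumed to be in seconds")
    return parser


# Example values are placeholders, not real resources.
args = build_parser().parse_args(
    ["--bucket", "my-bucket", "--key", "big.bin", "--file_path", "/tmp", "--file_name", "big.bin"]
)
print(args.init, args.max_tuning_time)  # prints: random 10800
```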

Areas of improvement

Here are ideas to make the search faster; we may add these improvements in the future:

  1. Parallelism: run the experiments on remote transient workers (such as SageMaker jobs), so that several configurations can be tested in parallel
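
The dispatch pattern behind the parallelism idea can be sketched as follows. This is an illustration only: evaluate_remotely is a hypothetical stand-in that returns a toy value instead of launching a remote job, and the local threads are assumed to do nothing but wait on those jobs, since running several downloads on one machine would make them compete for bandwidth.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_remotely(config):
    """Hypothetical stand-in for running one trial on a remote worker."""
    # A real implementation would submit a job and wait for the download
    # time it reports. Here a toy value is returned instead.
    return abs(config["max_concurrency"] - 16) + 1.0


def evaluate_batch(configs, max_workers=4):
    # Each thread only waits for its (simulated) remote job to finish;
    # results come back in the same order as `configs`.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(evaluate_remotely, configs))


configs = [{"max_concurrency": c} for c in (4, 8, 16, 32)]
times = evaluate_batch(configs)
best = configs[times.index(min(times))]
print(best)  # prints: {'max_concurrency': 16}
```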

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.