Home

Awesome

Purpose

Provide a command line ability to download some, or all, of the public/authorized users files in an AWS S3 bucket as well as all of the XML that lists its contents, whether the key is public or not.

Requirements

Python3 - Thanks to dreamflasher (https://github.com/dreamflasher) for migrating from Python2 to 3!

Reason

As I was going through, looking for public AWS S3 buckets that contained PII, I realized that I wanted to be able to download the XML and a subset of data to show companies what data they had exposed. I didn't want to do this manually and I wanted to be able to have ALL of the XML (AWS paginates S3 content per 1k keys).

Use

Just get the XML, downloaded to the working directory under a the subfolder [bucket_name]<br> ./download_bucket.py -n [bucket_name] -x

Download the whole bucket to /home/foo/bar/[bucket_name]<br> ./download_bucket.py -o /home/foo/bar -n [bucket_name] -d

Download only where "test" is in the key and get all of the XML<br> ./download_bucket.py -n [bucket_name] -d -x -i test

Download where "test" is in the key but "exclude me" is not in the key<br> ./download_bucket.py -n [bucket_name] -d -i test -e "exclude me"

Download everything starting after thisfile.txt on public readable downloads, e.g. if you don't want to paginate through again<br> ./download_bucket.py -n [bucket_name] -d --last_key "thisfile.txt"

Download using an API key (e.g. for buckets that allow any authenticated user to access it) ./download_bucket.py -n [bucket_name] -d -ak "AWS_ACCESS_KEY" -sk "AWS_SECRET_KEY"

Notes