Awesome
Description
pylandsat is a Python package that allows you to search and download Landsat scenes from the public dataset hosted on Google Cloud. Additionally, it includes a set of classes and methods to access and preprocess the downloaded scenes.
Only Landsat Collection 1 is supported, i.e. level-1 data products from the following sensors and satellite missions:
- Landsat 8 OLI/TIRS
- Landsat 7 ETM+
- Landsat 4-5 TM
- Landsat 1-5 MSS
Installation
pip install pylandsat
Command-line interface
Download one or multiple scenes
Usage
Usage: pylandsat download [OPTIONS] [PRODUCTS]...
Download a Landsat product according to its identifier.
Options:
-d, --output-dir PATH Output directory.
-f, --files TEXT Comma-separated list of files to download.
--help Show this message and exit.
Examples
# Download an entire product in the current directory
pylandsat download LE07_L1TP_205050_19991104_20170216_01_T1
# Download multiple products
pylandsat download \
LE07_L1TP_205050_19991104_20170216_01_T1 \
LE07_L1TP_206050_19991111_20170216_01_T1
# Download only the blue, green and red bands
pylandsat download --files B1.TIF,B2.TIF,B3.TIF \
LE07_L1TP_205050_19991104_20170216_01_T1
# Download only quality band
pylandsat download --files BQA.TIF \
LE07_L1TP_205050_19991104_20170216_01_T1
Search for scenes
To allow large and fast queries, pylandsat works with a local dump of the Landsat catalog hosted on Google Cloud. As such, an initial sync is required :
# Sync local Landsat catalog
pylandsat sync-database
# Force update
pylandsat sync-database -f
The database is stored in a local directory that can be displayed using the following command :
pylandsat print-datadir
Once the database has been created, the local catalog can be queried.
Usage
Usage: pylandsat search [OPTIONS]
Search for scenes in the Google Landsat Public Dataset catalog.
Options:
-b, --begin TEXT Begin search date (YYYY-MM-DD).
-e, --end TEXT End search date (YYYY-MM-DD).
-g, --geojson PATH Area of interest (GeoJSON file).
-l, --latlon FLOAT... Point of interest (decimal lat/lon).
-p, --path INTEGER WRS2 path.
-r, --row INTEGER WRS2 row.
-c, --clouds FLOAT Max. cloud cover percentage.
-s, --sensors TEXT Comma-separated list of possible sensors.
-t, --tiers TEXT Comma-separated list of possible collection tiers.
--slcoff Include SLC-off LE7 scenes.
-o, --output PATH Output CSV file.
--help Show this message and exit.
At least three options must be provided: --begin
and --end
(i.e. the period of interest), and a geographic extent (--path
and --row
, --latlon
, --address
or --geojson
). By default, pylandsat lists all the product IDs matching the query. The full response can be exported to a CSV file using the --output
option. Note that is the spatial extent is provided as a GeoJSON file, only the first feature will be considered.
Examples
# If only the year is provided, date is set to January 1st
pylandsat search \
--begin 1999 --end 2000 \
--path 206 --row 50 \
--clouds 0
# Using latitude and longitude
pylandsat search \
--begin 2000 --end 2010 \
--latlon 50.85 4.34
# Using a polygon in a GeoJSON file
pylandsat search \
--begin 2000 --end 2010 \
--geojson brussels.geojson
# Using an address that will be geocoded
pylandsat search \
--begin 2000 --end 2010 \
--address 'Brussels, Belgium'
# Limit to TM and ETM sensors
pylandsat search \
--begin 1990 --end 2010 \
--address 'Brussels, Belgium' \
--sensors LT04,LT05,LE07
# Export results into a CSV file
pylandsat search \
--begin 1990 --end 2010 \
--address 'Brussels, Belgium' \
--sensors LT04,LT05,LE07 \
--output scenes.csv
# List available sensors, i.e. possible values
# for the `--sensors` option
pylandsat list-sensors
# List available files for a given sensor
pylandsat list-available-files LT05
Python API
Search the catalog
from datetime import datetime
from shapely.geometry import Point
from pylandsat import Catalog, Product
catalog = Catalog()
begin = datetime(2010, 1, 1)
end = datetime(2020, 1, 1)
geom = Point(4.34, 50.85)
# Results are returned as a list
scenes = catalog.search(
begin=begin,
end=end,
geom=geom,
sensors=['LE07', 'LC08']
)
# Get the product ID of the first scene
product_id = scenes[0].get("product_id")
# Download the scene
product = Product(product_id)
product.download(out_dir='data')
# The output of catalog.search() can be converted to a DataFrame
# for further processing. For instance:
# Get the product ID of the scene with the lowest cloud cover
import pandas as pd
df = pd.DataFrame.from_dict(scenes)
df.set_index(["product_id"], inplace=True)
df = df.sort_values(by='cloud_cover', ascending=True)
product_id = df.index[0]
Load and preprocess data
import numpy as np
import rasterio
import matplotlib.pyplot as plt
from pylandsat import Scene
# Access data
scene = Scene('data/LE07_L1TP_205050_19991104_20170216_01_T1')
print(scene.available_bands())
print(scene.product_id)
print(scene.sensor)
print(scene.date)
# Access MTL metadata
print(scene.mtl['IMAGE_ATTRIBUTES']['CLOUD_COVER_LAND'])
# Quality band
plt.imshow(scene.quality.read())
# Access band data
nir = scene.nir.read(1)
red = scene.red.read(1)
ndvi = (nir + red) / (nir - red)
# Access band metadata
print(scene.nir.bname)
print(scene.nir.fname)
print(scene.nir.profile)
print(scene.nir.width, scene.nir.height)
print(scene.nir.crs)
# Use reflectance values instead of DN
nir = scene.nir.to_reflectance()
# ..or brightness temperature
tirs = scene.tirs.to_brightness_temperature()
# Save file to disk
with rasterio.open('temperature.tif', 'w', **scene.tirs.profile) as dst:
dst.write(tirs, 1)