Awesome
Pangeo Forge Azure Bakery βοΈπ
This repository serves as the provider of an Terraform Application which deploys the necessary infrastructure to provide a pangeo-forge
Bakery on Azure
Contents
- π§βπ» Development - Requirements
- π§βπ» Development - Getting Started
- π§βπ» Development - Makefile goodness
- π Deployment - Prerequisites
- π Deployment - Deploying
- π Deployment - Destroying
- π Flows - Registering the test Recipe
Development
Requirements
To develop on this project, you should have the following installed:
- JQ
- Python 3.8.* (We recommend using Pyenv to handle Python versions)
- Poetry
- Azure CLI
- kubectl
- Terraform 0.15.1
- Docker
- Helm
If you're developing on MacOS, all of the above can be installed using homebrew
If you're developing on Windows, we'd recommend using either Git BASH or Windows Subsystem for Linux
Getting Started πββοΈ
NOTE: All make
commands should be run from the root of the repository
Azure Credential setup
To develop and deploy this project, you will first need to setup some credentials and permissions on Azure
Logging in
Run make init
. This will trigger an azure login
Setup azure credentials
Run make service-principal
to create the service principal needed in the later steps.
This script will make the service principal and inject it into the config.
.env
file
A .env
file is expected at the root of the repository to store variables used within deployment, the expected variables are:
# SET BY YOU MANUALLY:
ARM_SUBSCRIPTION_ID="<your-subscription-id>" # See [Development > Getting Started > Azure Credential setup > Getting your Subscription ID]
ARM_TENANT_ID="<your-service-principals-tenant-id>" # See [Development > Getting Started > Azure Credential setup > Creating a Service Principal]
ARM_CLIENT_ID="<your-service-principals-app-id>" # See [Development > Getting Started > Azure Credential setup > Creating a Service Principal]
ARM_CLIENT_SECRET="<your-service-principals-password>" # See [Development > Getting Started > Azure Credential setup > Creating a Service Principal]
TF_VAR_owner="<your-name>"
TF_VAR_identifier="<a-unique-value-to-tie-to-your-deployment>" # Try to keep this short (Less than 10 chars and only a-z A-Z 0-9 symbols)
TF_VAR_region="<azure-region-name-to-deploy-to>"
BAKERY_NAMESPACE="<the-name-for-your-prefect-agent-k8s-configs-namespace>"
BAKERY_IMAGE="<pangeo-forge-bakery-images-image-you-wish-to-use>" # See [Deployment > Prerequisites > Bakery Image]
PREFECT__CLOUD__AGENT__AUTH_TOKEN="<value-of-runner-token>" # See https://docs.prefect.io/orchestration/agents/overview.html#tokens - This is required for your Agent to communicate to Prefect Cloud
PREFECT__CLOUD__AUTH_TOKEN="<value-of-tenant-token>" # See https://docs.prefect.io/orchestration/concepts/tokens.html#tenant - This is used to support flow registration
PREFECT_PROJECT="<name-of-a-prefect-project>" # See https://docs.prefect.io/orchestration/concepts/projects.html#creating-a-project - This is where the bakery's test flows will be registered
PREFECT__CLOUD__AGENT__LABELS="<a-set-of-prefect-agent-labels>" # See https://docs.prefect.io/orchestration/agents/overview.html#labels - These will be registered with the deployed agent to limit which flows should be executed by the agent
# AUTOMATICALLY INSERTED/UPDATED BY MAKE COMMANDS:
TF_CLI_ARGS_init="<backend-config-values>" # See [Deployment - Prerequisites > Terraform Remote State infrastructure]
FLOW_STORAGE_CONTAINER="<a-flow-storage-container-name>" # See [Deployment - Standard Deployments > Retrieving Flow Storage Container names and Storage Connection String]
FLOW_CACHE_CONTAINER="<a-flow-storage-container-name>" # See [Deployment - Standard Deployments > Retrieving Flow Storage Container names and Storage Connection String]
FLOW_STORAGE_CONNECTION_STRING="<a-storage-account-connection-string>" # See [Deployment - Standard Deployments > Retrieving Flow Storage Container name and Storage Connection String]
An example called example.env
is available for you to copy, rename, and fill out accordingly.
Makefile goodness
A Makefile
is available in the root of the repository to abstract away commonly used commands for development:
make init
This will initialise terraform and perform an Azure login
make deploy
This deploy the bakery
make test
This uses the bakery image defined in
BAKERY_IMAGE
to register your Flow with Prefect. It deploys the oisst test recipe from the test directory
make setup-remote-state
This will run
setup_remote_state.sh
with the contents of.env
, it uses Azure CLI to provision a Resource Group, Storage Account, and Storage container for the Remote State that Terraform will use
make destroy-remote-state
This will use Azure CLI to destroy the Remote State infrastructure provisioned via
make setup-remote-state
. The command assumes that the Resource Group is named as defined in thesetup_remote_state.sh
script:<identifier>-bakery-remote-state-resource-group
make lint
This will run
terraform validate
within theterraform/
directory, showing you anything that is incorrect in your Terraform scripts. It also runs isort, black, and flake8 to highlight any linting issues inflow_test/
andscripts/
make destroy
This will run
terraform destroy
within theterraform/
directory using the contents of your.env
file. The destroy is auto-approved, so make sure you know what you're destroying first! You must have runmake setup-remote-state
beforehand
make generate-bakery-yaml
This generates a bakery definition YAML file
Deployment
Prerequisites
Terraform Remote State infrastructure
The Terraform deployment relies on Remote State within Azure, this is so that the state of your deployment is not locked to a file on your local machine.
The concept of Remote State becomes a π and π₯ situation where you need infrastructure available for Terraform to store your state, but you don't want to use Terraform for that infrastructure as that then needs remote state...
The process of provisioning the infrastructure to support Remote State is encapsulated in a bash script: setup_remote_state.sh
. To provision your Remote State infrastructure, run:
$ make setup-remote-stateΒ # Creates the infrastructure to host your Terraform Remote State
<various-azure-cli-ouput>
Found TF_CLI_ARGS_init set in `.env`, replaced with: TF_CLI_ARGS_init="-backend-config='resource_group_name=<identifier>-bakery-remote-state-resource-group' -backend-config='storage_account_name=remotestatestoreacc' -backend-config='container_name=<identifier>-bakery-remote-state-storage-container' -backend-config='access_key=<an-access-key>' -backend-config='key=<identifier>-bakery.state'"
Bakery Image
To be able to register and run Recipes as Prefect Flows, your Bakery must be running one of the pangeo-forge-bakery-images
images in both your Prefect Agent and your Flow & Dask tasks.
You can find more information on the pangeo-forge-bakery-images
here. Once you've selected which tag you wish to support, you need to add an entry into .env
under the name BAKERY_IMAGE
. See below for an example:
BAKERY_IMAGE="pangeo/pangeo-forge-bakery-images:pangeonotebook-2021.05.15_prefect-0.14.19_pangeoforgerecipes-0.3.4"
Deploying
A Standard Deployment of the Azure Bakery comprises of several steps, they are listed below; the links will take you to an explanation of each step.
Should you wish to just deploy the Bakery without diving into these steps, ensure you've met all of the prerequisites for deployment, then you can simply run:
$ make deploy # Deploys all the Azure Bakery infrastructure and prepares `.env` for further usage
Deployment steps
Destroying
Destroying steps
Removal of the Bakery comprises of two steps, they are listed below and further explained as you scroll down:
Destroying all Bakery Azure infrastructure
To destroy the Bakery infrastructure within Azure, you can run:
$ make destroy # Destroys the Bakery infrastructure - ** NOT the Remote State infrastructure **
Destroying the Remote State infrastructure
As the Remote State infrastructure is not provisioned with Terraform, we have to delete it manually with the Azure CLI.
You only need to delete this infrastructure if you're no longer deploying a Bakery; if you're destroying and re-deploying your Bakery frequently, don't delete this infrastructure
To delete the Remote State infrastructure, run:
$ make destroy-remote-state # Uses Azure CLI to destroy the Remote State infrastructure
Flows
Running the test Recipe
For quick testing of your Bakery deployment, there's a Recipe setup as a Flow within flow_test/
that you can register and run. Before you register the example Flow, you must have the values of PREFECT__CLOUD__AUTH_TOKEN
, PREFECT_PROJECT
, PREFECT__CLOUD__AGENT__LABELS
, FLOW_STORAGE_CONTAINER
, FLOW_CACHE_CONTAINER
, FLOW_STORAGE_CONNECTION_STRING
, and BAKERY_IMAGE
present and populated in .env
. You must also have run make deploy
.
When your .env
is populated and you've installed the project dependencies, you can register the Flow by running:
$ make test-flow
...
[2021-04-29 13:28:22+0100] INFO - prefect.Azure | Uploading test-noaa-flow/2021-06-03t10-07-21-944813-00-00 to <identifier>-bakery-flow-storage-container
Flow URL: https://cloud.prefect.io/<your-account>/flow/aef82344-8a31-485b-a189-bc1398755f9e
βββ ID: ca02500f-97ea-4605-9f66-1cccb457a3c0
βββ Project: <PREFECT_PROJECT>
βββ Labels: <PREFECT__CLOUD__AGENT__LABELS>
Generating Bakery YAML files
- To generate a bakery YAML file, run
make generatebakeryyaml
. - The resulting YAML can be added to the bakery definition repo here https://github.com/pangeo-forge/bakery-database/blob/main/bakeries.yaml
Debugging
- Open Lens and add your cluster (this will leverage your updated kubectl config).
- To view pods in your pangeoforge namespace click workloads and select the namespace you specified when deploying.
- Verify your Prefect agent pod is healthy.
To view Dask cluster logs via Grafana
- Get the info needed to access the Grafana instance with
make get-grafana-admin
. - Use Lens to connect to Grafana by navigating Network -> Services and click
loki-grafana
and then click the80:3000/TCP
link and use usernameadmin
and the password obtained in step 1. - Browsing logs
- Return to the main page and select the Explore icon on the left.
- Click Log Browser.
- After running a test flow via
make test-flow
usemake getinfo
to view a list of flow runs. - Select the flow run of interest and a set of Loki search terms will be provided.
- Enter the search term in the Log Browser bar and click Shift+Enter.
- To include additional search terms you can add `| "<your search term>" to the exising string.
Dask dashboard
- Once your flow is running and the Dask cluster pods have been created the Dask dashboard can be accessed at http://localhost:8787.