Ecological Monitoring and Management Application (EMMA)

This is the core repository for environmental data processing in the Ecological Monitoring and Management Application (EMMA.io).

EMMA workflow overview

The EMMA workflow consists of four modules, each in a separate GitHub repository:

  1. The Environmental Data module (https://github.com/AdamWilsonLab/emma_envdata)
  2. The Modelling and Change Detection module (https://github.com/AdamWilsonLab/emma_model)
  3. The Change Classification module (https://github.com/AdamWilsonLab/emma_change_classification)
  4. The Reporting module (https://github.com/AdamWilsonLab/emma_report)

File structure

The most important files are:

├── _targets.R (data processing workflow and dependency management)
├── R/
│   └── [data_processing_functions]
├── data/
│   ├── manual_download (files behind firewalls that must be manually downloaded)
│   ├── raw_data (raw data files downloaded by the workflow)
│   └── processed_data (data processed and stored by the workflow)
└── Readme.Rmd (this file)

Files generated by the workflow are stored in the targets-runs branch. The final output of the workflow is a set of Parquet files published to a GitHub release with the tag “current”.

Workflow structure

Workflow Notes

Runtime and frequency

GitHub places constraints on Actions runs, including memory limits and run-time limits. A run that exceeds the time limit loses all of its progress, so a few key parameters can be adjusted to keep each run short. In the _targets.R file, the argument “max_layers” sets the maximum number of layers that rgee will attempt to download in a single Actions run. When initially setting up the repo, it may be necessary to lower this value and run the targets workflow more often (by adjusting the cron parameters in targets.yaml). GitHub also rate-limits requests, so the file release_data.R includes a call to Sys.sleep whose duration can be adjusted to slow down or speed up the process of pushing data to a GitHub release.
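The two controls described above (a cap on work per run and a pause between requests) can be sketched in base R. This is only an illustration, not the code in _targets.R or release_data.R: the function `process_batch`, its `pause_seconds` and `handler` arguments, and the layer names are hypothetical. Only the “max_layers” idea and the use of Sys.sleep come from this repo.

```r
# Hypothetical sketch: handle at most `max_layers` items per run and pause
# between items to stay under a request rate limit.
process_batch <- function(pending, max_layers = 5, pause_seconds = 1,
                          handler = identity) {
  # Take only the first `max_layers` items; the rest wait for a later run.
  batch <- head(pending, max_layers)
  results <- vector("list", length(batch))
  for (i in seq_along(batch)) {
    results[[i]] <- handler(batch[[i]])
    # Pause between items (not after the last one) to throttle requests.
    if (i < length(batch)) Sys.sleep(pause_seconds)
  }
  list(
    done = batch,
    remaining = pending[seq_along(pending) > length(batch)],
    results = results
  )
}

# Example with made-up layer names; `toupper` stands in for a download or upload.
out <- process_batch(c("layer_a", "layer_b", "layer_c"),
                     max_layers = 2, pause_seconds = 0, handler = toupper)
out$remaining  # "layer_c" is left for the next scheduled run
```

Lowering `max_layers` shortens each run at the cost of needing more runs, which is why it is paired with a more frequent cron schedule.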

Data notes

* MODIS NDVI values have been transformed to save space. To restore them to the original values (between -1 and 1), divide by 100 and subtract 1.
* Untransformed NDVI = (transformed NDVI / 100) - 1
* Raw MODIS fire dates (tag: raw_fire_modis): values are either 0 (no fire) or the day of the year a fire was observed (1 through 366).
* Processed MODIS fire dates (tag: processed_fire_dates): values are either 0 (no fire) or the UNIX date (days since 1 Jan. 1970) a fire was observed.
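
The conversions in the notes above can be sketched in base R as follows. The function names `restore_ndvi` and `decode_fire_date` are hypothetical and are not part of this repo. Mapping the no-fire code 0 to NA is an assumption made for convenience.

```r
# Restore transformed MODIS NDVI to its original scale (-1 to 1),
# using the formula above: (transformed NDVI / 100) - 1.
restore_ndvi <- function(transformed_ndvi) {
  (transformed_ndvi / 100) - 1
}

# Convert processed fire dates (days since 1 Jan. 1970; 0 = no fire)
# to Date objects. Assumption: the no-fire code is returned as NA.
decode_fire_date <- function(unix_days) {
  unix_days[unix_days %in% 0] <- NA
  as.Date(unix_days, origin = "1970-01-01")
}

restore_ndvi(c(0, 100, 200))    # -1  0  1
decode_fire_date(c(0, 18262))   # NA "2020-01-01"
```

Raw fire dates (day of year) cannot be converted this way without also knowing the year of the observation.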

Data layers

Setting up the repo

* This repo requires GitHub credentials. To store those securely...
* Credentials are decrypted with the script decryp_secret.sh.

Extras

* Call `targets::tar_renv(extras = character(0))` to write a `_packages.R` file to expose hidden dependencies.
* Call `renv::init()` to initialize the `renv` lockfile `renv.lock` or `renv::snapshot()` to update it.
* Commit `renv.lock` to your Git repository.