Awesome
AI4ESS Summer School Hackathon 2020
This repository contains Jupyter notebooks for the challenge problems of the 2020 AI for Earth System Science Summer School.
Contributors
- Aaron Bansemer
- Charlie Becker
- Chih-Chieh (Jack) Chen
- David John Gagne
- Gabrielle Gantos
- Andrew Gettelman
- Matt Hayman
- Alma Hodzic
- Karthik Kashinath
- Rich Loft
- Keely Lawrence
- Taysia Peterson
- Gunther Wallach
- Siyaun Wang
- Ankur Mahesh
- Natasha Flyer
Challenge Problems
-
GOES : Predict the probability of lightning in the next hour from GOES-16 infrared imagery.
-
HOLODEC : Accelerate the processing of an airborne holographic cloud particle imager with deep learning.
-
GECKO : Emulate the GECKO-A complex organic chemistry model with a simpler machine learning approach.
-
Microphysics : Emulate the warm rain processes within the Tel Aviv University (TAU) spectral bin microphysics scheme for use within the Community Atmosphere Model.
-
El Niño : Predict the intensity of ENSO from spatial observations and climate model output.
Requirements
The hackathon notebooks require the following Python libraries and at least Python 3.7 installed:
- numpy
- pip
- scipy
- matplotlib
- pandas
- tqdm
- s3fs
- pyyaml
- netcdf4
- xarray
- h5netcdf
- dask
- pyarrow
- tensorflow
- scikit-learn
- goes16ci
- mlmicrophysics
- jupyter
- zarr
Setup
The hackathon notebooks can be run from 3 platforms: Jupyterhub, Google Colab, and locally. Jupyterhub and Google Colab run entirely within the browser, so all you will require is a modern internet browser and a strong web connection. Advanced users can install and run on their local machine, but deep learning training performance may be degraded by the lack of GPU.
Jupyterhub
If you registered for the hackathon, we will send you an email with a link to the jupyterhub website. There you can log in with the Gmail or G-Suite account you provided at registration. A progress bar will then appear, followed by a Jupyter lab instance. The notebooks will be preinstalled in the ai4ess-hackathon-2020/notebooks directory. Your instance come with 30 GB of storage that will persist over the course of the Hackathon even if you log out. Tensorflow and scikit-learn are the main ML libraries being used in the Hackathon. PyTorch is also installed on
Google Colab
If you are not part of the hackathon, you can still run the hackathon notebooks in the cloud through Google Colab. For a given hackathon problem, click the Open in Colab link above to open a challenge problem notebook in Colab. To install needed dependencies, run the ! pip install ...
cell at the beginning of the notebook. If you want to save the changes you made to a notebook, under the File menu you can save to Google Drive, Github, or your local computer.
Local Machine
You can also run the notebooks on your own computer or a remote cluster. First install Miniconda. Then install the main dependencies with
conda install -c conda-forge numpy scipy matplotlib pandas tqdm s3fs pyyaml netcdf4 xarray h5netcdf dask pyarrow scikit-learn jupyter ipython pip
.
You will need need pip to install the remaining libraries:
pip install tensorflow goes16ci mlmicrophysics
.
Next, clone this repository locally with
git clone https://github.com/NCAR/ai4ess-hackathon-2020.git
.
Finally start Jupyter Lab in your home directory
jupyter lab
Inside jupyter lab navigate to ai4ess-hackathon-2020/notebooks to access the hackathon notebooks. You will need internet access to load the data. Some of the datasets are 10-20 GB in size, so please be careful about streaming or downloading them locally if you have data caps from your internet service provider.