
<a href="https://github.com/jupyterhub/repo2docker"><img src="https://raw.githubusercontent.com/jupyterhub/repo2docker/71eb8058c790a88d223470a55f3ea5b744614dcf/docs/source/_static/images/repo2docker.png" height="40px" /></a> repo2docker GitHub Action


Trigger repo2docker to build a Jupyter enabled Docker image from your GitHub repository and push this image to a Docker registry of your choice. This will automatically attempt to build an environment from configuration files found in your repository in the manner described here.

Read the full docs on repo2docker for more information: https://repo2docker.readthedocs.io

Images generated by this Action are automatically tagged with both latest and <SHA>, where <SHA> corresponds to the relevant commit SHA on GitHub. Both tags are pushed to the Docker registry specified by the user. If an image with the latest tag already exists in your registry, this Action attempts to pull it as a cache to reduce unnecessary build steps.
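
As a sketch of that tagging scheme (the image name and SHA below are hypothetical placeholders, not produced by the action):

```python
# Illustrative only: the two tags pushed for each build.
def image_tags(image: str, sha: str) -> list[str]:
    # a floating "latest" tag plus an immutable tag pinned to the commit SHA
    return [f"{image}:latest", f"{image}:{sha}"]

print(image_tags("myuser/myimage", "0fab1ed"))
# → ['myuser/myimage:latest', 'myuser/myimage:0fab1ed']
```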

What Can I Do With This Action?

API Reference

The examples section is very helpful for understanding the inputs and outputs of this Action.

Optional Inputs

Outputs

Testing the built image

You can automatically test your built image to make sure package additions or removals do not break your code, allowing you to make changes with confidence. pytest is used to run the tests, and pytest-notebook is used to run any Jupyter Notebooks as tests.

This works with any Jupyter kernel: the action uses the kernel defined in each notebook you put in image-tests/, so you can execute and test notebooks written in any language.

[!WARNING]
If you are using a Dockerfile to configure your environment, then include COPY --chown=${NB_USER}:${NB_USER} image-tests ${REPO_DIR}/image-tests in your Dockerfile to ensure the correct ownership of this folder within the image.

To use automatic image testing, follow these steps:

  1. Create a directory named image-tests/ in your GitHub repository.
  2. Any .py files you add inside this directory will be discovered and run with pytest inside the built image after the image has successfully built.
  3. Any Jupyter Notebook (.ipynb) files inside this directory will be run with pytest-notebook, and the notebook is considered to have failed if the outputs of the code execution do not match the outputs already in the notebook. A nice diff of the outputs is shown if they differ. See the pytest-notebook docs for more information.
  4. Optionally, a requirements.txt file inside the image-tests/ directory can list additional libraries to be installed just for the tests.

For example, look at the following image environment repository structure:

my-image/
├── environment.yml
└── image-tests
    ├── mytestnotebook.ipynb
    └── mytest.py

This defines three things: the image environment (environment.yml), a notebook test (mytestnotebook.ipynb), and a Python test (mytest.py).
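
As a sketch, mytest.py might contain a smoke test like the following (the tested modules are placeholders; substitute packages your environment.yml actually installs):

```python
# image-tests/mytest.py — a minimal smoke test run by pytest inside the
# built image; replace the module names with real dependencies of your image.
import importlib

def test_core_packages_importable():
    # the build should fail if a core dependency is missing from the image
    for module in ["json", "sqlite3"]:  # placeholders: stdlib modules
        assert importlib.import_module(module) is not None
```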

Examples

mybinder.org

A very popular use case for this Action is to cache builds for mybinder.org. To cache builds for mybinder.org, you must specify the MYBINDERORG_TAG argument. Some examples of doing this are below:

Cache builds on mybinder.org

Proactively build your environment on mybinder.org for any branch. Alternatively, you can use GitHub Actions to build an image for any BinderHub instance, including mybinder.org.

name: Binder
on: [push]

jobs:
  Create-MyBinderOrg-Cache:
    runs-on: ubuntu-latest
    steps:
    - name: cache binder build on mybinder.org
      uses: jupyterhub/repo2docker-action@master
      with:
        NO_PUSH: true
        MYBINDERORG_TAG: ${{ github.event.ref }} # Builds the container on mybinder.org for the branch that was pushed.

Cache Builds On mybinder.org And Provide A Link

Same example as above, but also comment on a PR with a link to the binder environment. Commenting on the PR is optional, and is included here for informational purposes only. In this example the image will only be cached when the pull request is opened but not if the pull request is updated with subsequent commits.


name: Binder
on:
  pull_request:
    types: [opened, reopened]

jobs:
  Create-Binder-Badge:
    runs-on: ubuntu-latest
    steps:
    - name: cache binder build on mybinder.org
      uses: jupyterhub/repo2docker-action@master
      with:
        NO_PUSH: true
        MYBINDERORG_TAG: ${{ github.event.pull_request.head.ref }}

    - name: comment on PR with Binder link
      uses: actions/github-script@v1
      with:
        github-token: ${{secrets.GITHUB_TOKEN}}
        script: |
          var BRANCH_NAME = process.env.BRANCH_NAME;
          github.issues.createComment({
            issue_number: context.issue.number,
            owner: context.repo.owner,
            repo: context.repo.repo,
            body: `[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/${context.repo.owner}/${context.repo.repo}/${BRANCH_NAME}) :point_left: Launch a binder notebook on this branch`
          })
      env:
        BRANCH_NAME: ${{ github.event.pull_request.head.ref }}

Use GitHub Actions To Cache The Build For BinderHub

Instead of forcing mybinder.org to cache your builds, you can optionally build a Docker image with GitHub Actions and push it to a Docker registry, so that any BinderHub instance, including mybinder.org, only has to pull the image. This can give you more control than triggering a build directly on mybinder.org, as in the method illustrated above. In this example, you must supply the secrets DOCKER_USERNAME and DOCKER_PASSWORD so that the Action can push to DockerHub. Note that, instead of your actual password, you can use an access token, which may be a more secure option.

In this case, we set BINDER_CACHE to true to enable this option. See the documentation for the parameter BINDER_CACHE in the Optional Inputs section for more information.

name: Test
on: push

jobs:
  binder:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout Code
      uses: actions/checkout@v2
      with:
        ref: ${{ github.event.pull_request.head.sha }}

    - name: update jupyter dependencies with repo2docker
      uses: jupyterhub/repo2docker-action@master
      with:
        DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
        DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
        BINDER_CACHE: true
        PUBLIC_REGISTRY_CHECK: true

Push Repo2Docker Image To DockerHub

We recommend creating a personal access token and using it as DOCKER_PASSWORD instead of your DockerHub password.

name: Build Notebook Container
on: [push] # You may want to trigger this Action on other things than a push.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:

    - name: checkout files in repo
      uses: actions/checkout@main

    - name: update jupyter dependencies with repo2docker
      uses: jupyterhub/repo2docker-action@master
      with:
        DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
        DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}

Push Repo2Docker Image To quay.io

DockerHub now has some pretty strong rate limits, so you might want to push to a different Docker registry. quay.io is a popular choice and isn't tied to any particular cloud vendor.

  1. Login to quay.io
  2. Create a new repository. This will determine the name of your image, and you will push / pull from it. Your image name will be quay.io/<username>/<repository-name>.
  3. Go to your account settings (under your name in the top right), and select the 'Robot Accounts' option on the left menu.
  4. Click 'Create Robot account', give it a memorable name (such as <hub-name>_image_builder) and click 'Create'
  5. In the next screen, select the repository you just created in step (2), and give the robot account Write permission to the repository.
  6. Once done, click the name of the robot account again. This will give you its username and password.
  7. Create these GitHub secrets for your repository with the credentials from the robot account:
    1. QUAY_USERNAME: user name of the robot account
    2. QUAY_PASSWORD: password of the robot account
  8. Use the following config for your GitHub Action.
    name: Build container image
    
    on: [push]
    
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
    
        - name: checkout files in repo
          uses: actions/checkout@main
    
        - name: update jupyter dependencies with repo2docker
          uses: jupyterhub/repo2docker-action@master
          with: # make sure username & password/token matches your registry
            DOCKER_USERNAME: ${{ secrets.QUAY_USERNAME }}
            DOCKER_PASSWORD: ${{ secrets.QUAY_PASSWORD }}
            DOCKER_REGISTRY: "quay.io"
            IMAGE_NAME: "<quay-username>/<repository-name>"
    
    

Push Repo2Docker Image To Amazon ECR

  1. Login to Amazon AWS Console

  2. Create an individual IAM user whose access key will be used by the GitHub Actions. Make sure the user has permissions to make calls to the Amazon ECR APIs and to push/pull images to the repositories you need. Check out and follow Amazon IAM best practices for the AWS credentials used in GitHub Actions workflows.

  3. Create a new private repository. This will determine the name of your image, and you will push / pull from it. Your image name will be <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/<username>/<repository-name>.

  4. Go to the IAM dashboard, 'Users' section and click on the username created at Step 2. Click on 'Security credentials' tab, right below the 'Summary' section. In the 'Access keys' section, click on the 'Create access key' button. Once done, it will give you an 'Access key ID' and the 'Secret access key'.

  5. Create these GitHub secrets for your repository with the credentials of the IAM user:

    1. AWS_ACCESS_KEY_ID: access key id of the IAM user
    2. AWS_SECRET_ACCESS_KEY: secret access key of the IAM user
  6. Use the following config for your GitHub Action.

    name: Build container image
    
    on: [push]
    
    jobs:
      build:
        runs-on: ubuntu-latest
        env:
          DOCKER_CONFIG: $HOME/.docker
        steps:
        - name: checkout files in repo
          uses: actions/checkout@main
    
        - name: Configure AWS Credentials
          uses: aws-actions/configure-aws-credentials@v1
          with:
            aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
            aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
            aws-region: <region>
    
        - name: Login to Amazon ECR
          id: login-ecr
          uses: aws-actions/amazon-ecr-login@v1
    
    
        - name: Update jupyter dependencies with repo2docker
          uses: jupyterhub/repo2docker-action@master
          with:
            DOCKER_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
            IMAGE_NAME: "<aws-username>/<repository-name>"
    
    

Push Repo2Docker Image To Google Container Registry

  1. Login to Google Cloud Console

  2. Create (or use an existing) Google Cloud Project with billing activated. This will be the place where the registry hosting the repo2docker image will live.

  3. Make sure Container Registry API is enabled for this project.

  4. The repository will be created automatically once the first image is pushed. Your image name will be gcr.io/<gcp-project-id>/<repository-name>.

  5. Create a Service Account to authenticate the calls made by GitHub Actions to our GCP project:

    • In the Cloud Console, go to the Service Accounts page.
    • Make sure the right project is selected in the drop-down menu above.
    • Click on Create Service Account
    • Enter a memorable service account name (such as <hub-name>_image_builder).
    • Grant this service account access to the project. As a best practice, grant it only the minimum permissions: Cloud Run Admin, Service Account User, and Storage Admin.
  6. Click on the service account's name you just created and select the Keys tab. Click on the ADD KEY button, select Create new key, then create a JSON key type. The private key will be saved to your computer. Make sure to store it somewhere secure!

  7. Create these GitHub secrets for your repository with the credentials of the service account:

    1. GCP_SA_KEY: the private key of the service account created in the previous step
    2. GCP_PROJECT_ID: the id of the Google Cloud Project
  8. Use the following config for your GitHub Action.

    name: Build container image
    
    on: [push]
    
    jobs:
      build:
        runs-on: ubuntu-latest
        env:
          DOCKER_CONFIG: $HOME/.docker
    
        steps:
        - name: checkout files in repo
          uses: actions/checkout@main
    
        - name: Login to GCR
          uses: docker/login-action@v1
          with:
            registry: gcr.io
            username: _json_key
            password: ${{ secrets.GCP_SA_KEY }}
    
        - name: Update jupyter dependencies with repo2docker
          uses: jupyterhub/repo2docker-action@master
          with:
            DOCKER_REGISTRY: gcr.io
            IMAGE_NAME: ${{ secrets.GCP_PROJECT_ID }}/<repository-name>
    

Push Repo2Docker Image To Google Artifact Registry

  1. Login to Google Cloud Console

  2. Create (or use an existing) Google Cloud Project with billing activated. This will be the place where the registry hosting the repo2docker image will live.

  3. Make sure Artifact Registry API is enabled for this project.

  4. Create a new artifact repository. This will determine the name and location of your image. Your image name will be <location>-docker.pkg.dev/<gcp-project-id>/<repository-name>.

  5. Create a Service Account to authenticate the calls made by GitHub Actions to our GCP project:

    • In the Cloud Console, go to the Service Accounts page.
    • Make sure the right project is selected in the drop-down menu above.
    • Click on Create Service Account
    • Enter a memorable service account name (such as <hub-name>_image_builder).
    • Grant this service account access to the project. As a best practice, grant it only the minimum permissions: Cloud Run Admin, Service Account User, Storage Admin, and Artifact Registry Repository Administrator.
  6. Click on the service account's name you just created and select the Keys tab. Click on the ADD KEY button, select Create new key, then create a JSON key type. The private key will be saved to your computer. Make sure to store it somewhere secure!

  7. Create these GitHub secrets for your repository with the credentials of the service account:

    1. GCP_SA_KEY: the private key of the service account created in the previous step
    2. GCP_PROJECT_ID: the id of the Google Cloud Project
  8. Use the following config for your GitHub Action.

    name: Build container image
    
    on: [push]
    
    jobs:
      build:
        runs-on: ubuntu-latest
        env:
          DOCKER_CONFIG: $HOME/.docker
    
        steps:
        - name: checkout files in repo
          uses: actions/checkout@main
    
        - name: Login to GAR
          uses: docker/login-action@v1
          with:
            registry: <location>-docker.pkg.dev
            username: _json_key
            password: ${{ secrets.GCP_SA_KEY }}
    
        - name: Update jupyter dependencies with repo2docker
          uses: jupyterhub/repo2docker-action@master
          with:
            DOCKER_REGISTRY: <location>-docker.pkg.dev
            IMAGE_NAME: ${{ secrets.GCP_PROJECT_ID }}/<repository-name>
    
    

Push Repo2Docker Image To Azure Container Registry

  1. Login to Azure Portal

  2. Create a new container registry. This will determine the name of your image, and you will push / pull from it. Your image name will be <container-registry-name>.azurecr.io/<repository-name>.

  3. Go to Access Keys option on the left menu.

  4. Enable the Admin user so you can use the registry name as the username and the admin user's access key as the password for docker login to your container registry.

  5. Create these GitHub secrets for your repository with the credentials of the admin user:

    1. ACR_USERNAME: the registry name
    2. ACR_PASSWORD: the access key of the admin user
  6. Use the following config for your GitHub Action.

    name: Build container image
    
    on: [push]
    
    jobs:
      build:
        runs-on: ubuntu-latest
    
        steps:
        - name: checkout files in repo
          uses: actions/checkout@main
    
        - name: Update jupyter dependencies with repo2docker
          uses: jupyterhub/repo2docker-action@master
          with:
            DOCKER_USERNAME: ${{ secrets.ACR_USERNAME }}
            DOCKER_PASSWORD: ${{ secrets.ACR_PASSWORD }}
            DOCKER_REGISTRY: <container-registry-name>.azurecr.io
            IMAGE_NAME: <repository-name>
    
    

Push Repo2Docker Image To Other Registries

If the Docker registry accepts credentials passed as username and password strings, you can do it like this:

name: Build Notebook Container
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:

    - name: checkout files in repo
      uses: actions/checkout@main

    - name: update jupyter dependencies with repo2docker
      uses: jupyterhub/repo2docker-action@master
      with: # make sure username & password/token matches your registry
        DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
        DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
        DOCKER_REGISTRY: "gcr.io"

If the Docker registry doesn't accept credentials passed as username and password strings, or if you want to authenticate in another way, you can configure credentials for the registry ahead of time instead. Below is an incomplete example of doing that.

name: Build Notebook Container
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:

    - name: checkout files in repo
      uses: actions/checkout@main

    # TODO: add a step here to setup credentials to push to your
    #       docker registry before running the repo2docker-action

    - name: update jupyter dependencies with repo2docker
      uses: jupyterhub/repo2docker-action@master
      with:
        DOCKER_REGISTRY: your-registry.example.org
        IMAGE_NAME: your-image-name

Change Image Name

When you do not provide an image name, it defaults to DOCKER_USERNAME/GITHUB_REPOSITORY_NAME. For example, if the user hamelsmu ran this Action from this repo, the image would be named hamelsmu/repo2docker-action. However, sometimes you may want a different image name; you can accomplish this by providing the IMAGE_NAME parameter, as illustrated below:

name: Build Notebook Container
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:

    - name: checkout files in repo
      uses: actions/checkout@main

    - name: update jupyter dependencies with repo2docker
      uses: jupyterhub/repo2docker-action@master
      with:
        DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
        DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
        IMAGE_NAME: "hamelsmu/my-awesome-image" # this overrides the image name
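
The default naming rule can be sketched as follows (this is an illustration inferred from the description above, not the action's actual implementation):

```python
# A sketch of the default image-name rule: DOCKER_USERNAME plus the
# repository name (GITHUB_REPOSITORY is "owner/name" on GitHub Actions).
def default_image_name(docker_username: str, github_repository: str) -> str:
    # drop the "owner/" prefix, keeping only the repository name
    repo_name = github_repository.split("/", 1)[1]
    return f"{docker_username}/{repo_name}"

print(default_image_name("hamelsmu", "hamelsmu/repo2docker-action"))
# → hamelsmu/repo2docker-action
```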

Test Image Build

You might want to only test the image build without pushing to a registry, for example to test a pull request. You can do this by setting the NO_PUSH parameter to 'true':

name: Build Notebook Container
on: [pull_request]
jobs:
  build-image-without-pushing:
    runs-on: ubuntu-latest
    steps:  
    - name: Checkout PR
      uses: actions/checkout@v2
      with:
        ref: ${{ github.event.pull_request.head.sha }}

    - name: test build
      uses: jupyterhub/repo2docker-action@master
      with:
        NO_PUSH: 'true'
        IMAGE_NAME: "hamelsmu/repo2docker-test"

When you specify a value for the NO_PUSH parameter, you can omit the otherwise mandatory parameters DOCKER_USERNAME and DOCKER_PASSWORD.

Contributing To repo2docker-action

See the Contributing Guide.