KUDO Spark Operator

Developing

Prerequisites

Required software:

- GNU Make
- Docker
- kubectl
- kubens (optional; used in the examples below)

For test cluster provisioning and Stub Universe artifact uploads, valid AWS access credentials are required:
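For example, credentials can be exported via the standard AWS environment variables (a minimal sketch; the exact variables your setup expects may differ):

# standard AWS credential environment variables
export AWS_ACCESS_KEY_ID=<access-key-id>
export AWS_SECRET_ACCESS_KEY=<secret-access-key>
# required only for temporary (STS) credentials
export AWS_SESSION_TOKEN=<session-token>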

For pulling private repos, a GitHub token is required:
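For example (a sketch; GITHUB_TOKEN is a common convention, but check the Makefile for the variable name the build actually reads):

# personal access token with access to the private repos
export GITHUB_TOKEN=<github-token>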

Build steps

GNU Make is used as the main build tool; the Makefile includes the following main targets:

- make clean-all: remove artifacts from previous builds
- make cluster-create: provision a test cluster
- make docker-push: build and push Docker images
- make test: run the test suite
- make cluster-destroy: tear down the test cluster
- make install: install the operator (see "Installing and using Spark Operator" below)

A typical workflow looks as follows:

make clean-all        # remove artifacts from previous builds
make cluster-create   # provision a test cluster
make docker-push      # build and push Docker images
make test             # run the test suite
make cluster-destroy  # tear down the test cluster

To run tests on a pre-existing cluster with specific operator and Spark images, set the KUBECONFIG, SPARK_IMAGE_FULL_NAME, and OPERATOR_IMAGE_FULL_NAME variables:

make test KUBECONFIG=$HOME/.kube/config \
    SPARK_IMAGE_FULL_NAME=mesosphere/spark:spark-3.0.0-hadoop-2.9-k8s \
    OPERATOR_IMAGE_FULL_NAME=mesosphere/kudo-spark-operator:3.0.0-1.1.0

Package and Release

The release process is semi-automated and based on GitHub Actions. To make a new release, create an annotated tag:

git tag -a v3.0.0-1.1.0 -m "KUDO Spark Operator 3.0.0-1.1.0 release"

Pushing the new tag triggers the release workflow, which builds the operator package with KUDO and creates a new GitHub release draft with the package attached.
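For example, assuming origin points at the main repository:

git push origin v3.0.0-1.1.0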

Installing and using Spark Operator

Prerequisites

Installation

To install the KUDO Spark Operator, run:

make install

This make target runs the install_operator.sh script, which installs the Spark Operator and creates the Spark Driver roles defined in specs/spark-driver-rbac.yaml. By default, the Operator and Driver roles are created and configured to run in the spark-operator namespace. To use a different namespace, pass the NAMESPACE parameter to make:

make install NAMESPACE=test-namespace
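To verify the installation, check that the operator pod is up in the target namespace (output shape is illustrative):

# the operator pod should be in Running state
kubectl get pods -n test-namespace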

Submitting Spark Application

To submit a Spark Application and check its status, run:

# switch to the operator namespace, e.g.
kubens spark-operator

# create Spark application
kubectl create -f specs/spark-application.yaml

# list applications
kubectl get sparkapplication

# check application status
kubectl describe sparkapplication mock-task-runner
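To inspect the application output, stream the driver pod logs. The pod name below assumes the standard Spark-on-Kubernetes convention of <application-name>-driver; verify the actual name with kubectl get pods:

# follow the driver logs
kubectl logs -f mock-task-runner-driver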

To get started with application monitoring, see the monitoring documentation.