Kubeflow common for operators
This repo contains the libraries for writing custom job operators such as tf-operator and pytorch-operator. To write a custom operator, users need to follow these steps:
- Generate an operator skeleton using kubebuilder or operator-sdk.
- Define the job CRD and reuse the common API. Check test_job for a full example.
import (
	commonv1 "github.com/kubeflow/common/pkg/apis/common/v1"
)

// reuse commonv1 api in your type.go
RunPolicy *commonv1.RunPolicy `json:"runPolicy,omitempty"`
TestReplicaSpecs map[TestReplicaType]*commonv1.ReplicaSpec `json:"testReplicaSpecs"`
- Write a custom controller that implements the controller interface, such as the TestJobController, and instantiate a testJobController object.
testJobController := TestJobController{
	...
}
- Instantiate a JobController struct object and pass in the custom controller written in the previous step as a parameter.
import "github.com/kubeflow/common/pkg/controller.v1/common"

jobController := common.JobController{
	Controller: testJobController,
	Config:     v1.JobControllerConfiguration{EnableGangScheduling: false},
	Recorder:   recorder,
}
- Within your main reconcile loop, call the JobController.ReconcileJobs method.
reconcile(...) {
	// Your main reconcile loop.
	...
	jobController.ReconcileJobs(...)
	...
}
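
The wiring described in the steps above can be sketched in a self-contained way. This is a minimal illustration, not the library's code: `ReplicaSpec`, `TestJob`, the two-method `ControllerInterface`, and the one-field `JobController` below are simplified stand-ins invented for this example, and the real types in this repo have more fields and methods with different signatures. The point is only the shape: the shared `JobController` owns the reconcile entrypoint and calls back into the custom controller for job-specific behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// ReplicaSpec is a simplified stand-in for commonv1.ReplicaSpec (hypothetical field).
type ReplicaSpec struct {
	Replicas int
}

// TestJob is a simplified stand-in for a custom job type.
type TestJob struct {
	Name             string
	TestReplicaSpecs map[string]*ReplicaSpec
}

// ControllerInterface is a hypothetical, reduced interface for this sketch;
// the interface in this repo is larger and uses different signatures.
type ControllerInterface interface {
	ControllerName() string
	UpdateJobStatus(job *TestJob, totalReplicas int) error
}

// TestJobController is the custom, job-specific controller.
type TestJobController struct {
	statuses map[string]string
}

func (c *TestJobController) ControllerName() string { return "test-operator" }

func (c *TestJobController) UpdateJobStatus(job *TestJob, totalReplicas int) error {
	c.statuses[job.Name] = fmt.Sprintf("%d replicas requested", totalReplicas)
	return nil
}

// JobController holds the shared logic and delegates job-specific
// decisions to the custom controller it was given.
type JobController struct {
	Controller ControllerInterface
}

// ReconcileJobs is the single entrypoint called from the reconcile loop.
func (jc *JobController) ReconcileJobs(job *TestJob) error {
	if job == nil {
		return errors.New("nil job")
	}
	total := 0
	for _, spec := range job.TestReplicaSpecs {
		if spec != nil {
			total += spec.Replicas
		}
	}
	return jc.Controller.UpdateJobStatus(job, total)
}

func main() {
	testJobController := &TestJobController{statuses: map[string]string{}}
	jobController := JobController{Controller: testJobController}

	job := &TestJob{
		Name: "demo",
		TestReplicaSpecs: map[string]*ReplicaSpec{
			"Master": {Replicas: 1},
			"Worker": {Replicas: 2},
		},
	}
	// The body of your main reconcile loop would make this call.
	if err := jobController.ReconcileJobs(job); err != nil {
		fmt.Println("reconcile failed:", err)
		return
	}
	fmt.Println(testJobController.ControllerName(), "->", testJobController.statuses["demo"])
}
```

Keeping the shared controller behind an interface like this means each operator only has to supply the job-specific pieces, while the reconcile entrypoint stays in one place.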
Note that this repo is still under construction; API compatibility is not guaranteed at this point.
API Reference
Please refer to the API documentation.
The API files are located under pkg/apis/common/v1:
- constants.go: the constants such as label keys.
- interface.go: the interfaces to be implemented by custom controllers.
- controller.go: the main JobController that contains the ReconcileJobs API method to be invoked by the user. This is the entrypoint of the JobController logic. The rest of the code under the job_controller/ folder contains the core logic for the JobController to work, such as creating and managing worker pods, services, etc.
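
To give a feel for the kind of bookkeeping that "creating and managing worker pods" involves, here is a small sketch. It is not taken from this repo: `missingIndices` and `extraIndices` are hypothetical helpers written for this example, and they assume that each replica pod carries an integer index in the range 0 to replicas-1. Given the desired replica count and the indices of pods that already exist, they compute which pods a reconcile pass would need to create and which it would need to delete.

```go
package main

import (
	"fmt"
	"sort"
)

// missingIndices returns the replica indices in [0, replicas) that have
// no existing pod, in ascending order.
func missingIndices(replicas int, existing []int) []int {
	seen := make(map[int]bool, len(existing))
	for _, idx := range existing {
		seen[idx] = true
	}
	var missing []int
	for i := 0; i < replicas; i++ {
		if !seen[i] {
			missing = append(missing, i)
		}
	}
	return missing
}

// extraIndices returns the existing pod indices that fall outside
// [0, replicas), sorted in ascending order.
func extraIndices(replicas int, existing []int) []int {
	var extra []int
	for _, idx := range existing {
		if idx < 0 || idx >= replicas {
			extra = append(extra, idx)
		}
	}
	sort.Ints(extra)
	return extra
}

func main() {
	existing := []int{0, 2, 5} // indices of pods that currently exist
	fmt.Println("create:", missingIndices(4, existing)) // create: [1 3]
	fmt.Println("delete:", extraIndices(4, existing))   // delete: [5]
}
```

Computing the difference between desired and observed state on every pass, rather than tracking individual events, keeps the reconcile step idempotent: running it twice against the same state yields the same actions.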