Kubeflow common for operators


This repo contains the libraries for writing custom job operators such as tf-operator and pytorch-operator. To write a custom operator, follow these steps:

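  • Reuse the commonv1 API types in your job type definitions

import (
    commonv1 "github.com/kubeflow/common/pkg/apis/common/v1"
)

// reuse commonv1 api in your type.go: these are fields of your job spec struct
RunPolicy        *commonv1.RunPolicy                       `json:"runPolicy,omitempty"`
TestReplicaSpecs map[TestReplicaType]*commonv1.ReplicaSpec `json:"testReplicaSpecs"`

  • Write a custom controller that implements the interfaces in interface.go, and instantiate it

testJobController := TestJobController {
    ...
}
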
  • Instantiate a JobController struct and pass in the custom controller you wrote earlier as its Controller field

import "github.com/kubeflow/common/pkg/controller.v1/common"

jobController := common.JobController {
    Controller: testJobController,
    Config:     v1.JobControllerConfiguration{EnableGangScheduling: false},
    Recorder:   recorder,
}

  • In your main reconcile loop, call jobController.ReconcileJobs

reconcile(...) {
    // Your main reconcile loop.
    ...
    jobController.ReconcileJobs(...)
    ...
}
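
To make the effect of the struct tags on the reused API fields concrete, here is a minimal sketch. It does not import this repo: RunPolicy, ReplicaSpec, TestJobSpec and the encodeSpec helper are simplified, hypothetical stand-ins, and only the two field declarations and their JSON tags are taken from the snippet above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RunPolicy and ReplicaSpec are simplified stand-ins (assumptions) for the
// commonv1 types; the real types carry more fields.
type RunPolicy struct {
	TTLSecondsAfterFinished *int32 `json:"ttlSecondsAfterFinished,omitempty"`
}

type ReplicaSpec struct {
	Replicas *int32 `json:"replicas,omitempty"`
}

// TestReplicaType names a replica role, for example "Worker".
type TestReplicaType string

// TestJobSpec is a hypothetical wrapper struct holding the two fields
// shown in the README snippet, with the same JSON tags.
type TestJobSpec struct {
	RunPolicy        *RunPolicy                       `json:"runPolicy,omitempty"`
	TestReplicaSpecs map[TestReplicaType]*ReplicaSpec `json:"testReplicaSpecs"`
}

// encodeSpec marshals a spec to its JSON form.
func encodeSpec(spec TestJobSpec) (string, error) {
	b, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	two := int32(2)
	spec := TestJobSpec{
		TestReplicaSpecs: map[TestReplicaType]*ReplicaSpec{
			"Worker": {Replicas: &two},
		},
	}
	out, err := encodeSpec(spec)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// RunPolicy is nil and tagged omitempty, so it is left out of the output.
	fmt.Println(out) // {"testReplicaSpecs":{"Worker":{"replicas":2}}}
}
```

Because runPolicy is a pointer tagged omitempty, a job that sets no run policy serializes without that key, while testReplicaSpecs is always emitted.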

Note that this repo is still under construction; API compatibility is not guaranteed at this point.

API Reference

The API files are located under pkg/apis/common/v1:

  • constants.go: the constants such as label keys.
  • interface.go: the interfaces to be implemented by custom controllers.
  • controller.go: the main JobController, which contains the ReconcileJobs API method to be invoked by the user. This is the entry point of the JobController logic. The rest of the code under the job_controller/ folder contains the core logic the JobController needs to work, such as creating and managing worker pods, services, etc.
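
The relationship between the interfaces a custom controller implements and the ReconcileJobs entry point can be sketched as follows. This is a self-contained illustration of the delegation pattern only, not the library's code: ControllerInterface here is a heavily reduced, hypothetical interface, and the method signatures (including the arguments to ReconcileJobs) are assumptions made so the example compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// ControllerInterface is a hypothetical, reduced stand-in for the
// interfaces a custom controller is expected to implement.
type ControllerInterface interface {
	ControllerName() string
	ReconcilePods(job string, replicas int) error
}

// TestJobController plays the role of the custom controller.
type TestJobController struct {
	reconciled []string // records what was reconciled, for illustration
}

func (c *TestJobController) ControllerName() string { return "test-job-controller" }

func (c *TestJobController) ReconcilePods(job string, replicas int) error {
	if replicas < 0 {
		return errors.New("replicas must be non-negative")
	}
	c.reconciled = append(c.reconciled, fmt.Sprintf("%s/%d", job, replicas))
	return nil
}

// JobController is a stand-in for the shared controller: it owns the
// generic flow and delegates job-specific work to the custom controller.
type JobController struct {
	Controller ControllerInterface
}

// ReconcileJobs is the single entry point the caller's reconcile loop invokes.
func (jc *JobController) ReconcileJobs(job string, replicas int) error {
	if jc.Controller == nil {
		return errors.New("no custom controller configured")
	}
	if err := jc.Controller.ReconcilePods(job, replicas); err != nil {
		return fmt.Errorf("%s: reconciling %q: %w", jc.Controller.ControllerName(), job, err)
	}
	return nil
}

func main() {
	custom := &TestJobController{}
	jc := JobController{Controller: custom}
	if err := jc.ReconcileJobs("demo", 2); err != nil {
		fmt.Println("reconcile failed:", err)
		return
	}
	fmt.Println(custom.reconciled) // [demo/2]
}
```

The design point is that the caller only ever invokes ReconcileJobs; everything job-specific is reached through the interface, so a different operator swaps in its own controller without changing the shared flow.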