- You have `kubectl` configured and pointing to the target Kubernetes cluster.
- You have access to a Databricks workspace and are able to generate a PAT token. To generate a token, see the Databricks documentation on generating a personal access token.
This will deploy the operator in the namespace `azure-databricks-operator-system`. If you want to customise the namespace, you can either search-and-replace the namespace in the manifests, or use `kustomize` by following the next section.
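The search-and-replace approach can be sketched as follows. The sample manifest below is a hypothetical stand-in for the release manifests, and `my-operator-ns` is an example replacement namespace; in practice you would run the same `sed` over every file in the release manifests:

```shell
# Hypothetical sample manifest standing in for the release manifests
cat > /tmp/sample_manifest.yaml <<'EOF'
metadata:
  namespace: azure-databricks-operator-system
EOF

# Replace the default namespace everywhere it appears
sed -i.bak 's/azure-databricks-operator-system/my-operator-ns/g' /tmp/sample_manifest.yaml
cat /tmp/sample_manifest.yaml
```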
- Download the latest release manifests:

```shell
wget https://github.com/microsoft/azure-databricks-operator/releases/latest/download/release.zip
unzip release.zip
```
- (Optional) You can configure the maximum number of run reconcilers; see the `MAX_CONCURRENT_RUN_RECONCILES` step below.
- Create the `azure-databricks-operator-system` namespace:

```shell
kubectl create namespace azure-databricks-operator-system
```
- Create Kubernetes secrets with values for `DATABRICKS_HOST` and `DATABRICKS_TOKEN`:

```shell
kubectl --namespace azure-databricks-operator-system \
    create secret generic dbrickssettings \
    --from-literal=DatabricksHost="https://xxxx.azuredatabricks.net" \
    --from-literal=DatabricksToken="xxxxx"
```
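Before creating the secret, it can help to sanity-check the values. A minimal sketch, assuming the placeholder host and token shown above are replaced with your own:

```shell
# Placeholder values; substitute your workspace URL and PAT token
DATABRICKS_HOST="https://xxxx.azuredatabricks.net"
DATABRICKS_TOKEN="xxxxx"

# The host should be an Azure Databricks workspace URL
case "$DATABRICKS_HOST" in
  https://*.azuredatabricks.net) echo "host OK" ;;
  *) echo "host looks wrong: $DATABRICKS_HOST" >&2; exit 1 ;;
esac

# The token must be non-empty
[ -n "$DATABRICKS_TOKEN" ] && echo "token set"
```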
- Apply the manifests for the operator and CRDs in `release/config`:

```shell
kubectl apply -f release/config
```
- Change the `MAX_CONCURRENT_RUN_RECONCILES` value in `config/default/manager_image_patch.yaml`, under the `env` section, to the desired number of reconcilers:

```yaml
- name: MAX_CONCURRENT_RUN_RECONCILES
  value: "1"
```

By default, `MAX_CONCURRENT_RUN_RECONCILES` is set to `1`.
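If you prefer to script the change, the edit can be sketched as below. The here-document is a stand-in for the relevant `env` entry of `config/default/manager_image_patch.yaml`, and the target value of `"4"` is just an example:

```shell
# Stand-in copy of the env entry from manager_image_patch.yaml
cat > /tmp/manager_image_patch.yaml <<'EOF'
- name: MAX_CONCURRENT_RUN_RECONCILES
  value: "1"
EOF

# Bump the reconciler count on the line that follows the env name;
# run the same command against config/default/manager_image_patch.yaml
sed -i.bak '/MAX_CONCURRENT_RUN_RECONCILES/{n;s/value: "1"/value: "4"/;}' /tmp/manager_image_patch.yaml
cat /tmp/manager_image_patch.yaml
```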
- Clone the source code:

```shell
git clone git@github.com:microsoft/azure-databricks-operator.git
```

- Edit the `config/default/kustomization.yaml` file to change your preferences
- Use `kustomize` to generate the final manifests and deploy:

```shell
kustomize build config/default | kubectl apply -f -
```
- Deploy the CRDs:

```shell
kubectl apply -f config/crd/bases
```
- Deploy a sample job; this will create a job in the default namespace:

```shell
curl https://raw.githubusercontent.com/microsoft/azure-databricks-operator/master/config/samples/databricks_v1alpha1_djob.yaml | kubectl apply -f -
```
- Check the job in Kubernetes:

```shell
kubectl get djob
```
- Check that the job was created successfully in Databricks.
If you encounter any issues, you can check the operator's logs by pulling them from Kubernetes:

```shell
# get the pod name of your operator
kubectl --namespace azure-databricks-operator-system get pods

# pull the logs
kubectl --namespace azure-databricks-operator-system logs -f [name_of_the_operator_pod]
```
To further aid debugging, diagnostic metrics are produced by the operator. Please review the metrics page for further information.