Azure Databricks operator
Introduction
The Azure Databricks operator contains two projects: a Go application, which is a Kubernetes controller that watches custom resources (CRDs) defining a Databricks job, and a Python Flask app, which sends commands to Databricks.
The project was built using
Prerequisites And Assumptions
- You have Minikube, Kind, or Docker for Desktop installed on your local computer with RBAC enabled.
- You have a Kubernetes cluster running.
- You have the kubectl command line (kubectl CLI) installed.
- You have Helm and Tiller installed.
- Configure a Kubernetes cluster in your machine
Make sure a kubeconfig file is configured. If you opt for AKS, you can use:
az aks get-credentials --resource-group $RG_NAME --name $Cluster_NAME
Basic commands to check your cluster
kubectl config get-contexts
kubectl cluster-info
kubectl version
kubectl get pods -n kube-system
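Before running the cluster checks above, it can help to confirm that the required command-line tools are on your PATH. The following is a minimal sketch, not part of the project: the function name require_tools is hypothetical, and the tool list in the usage comment is an assumption drawn from the prerequisites and the later kustomize step.

```shell
#!/bin/sh
# Hypothetical helper: report any required tool that is missing from PATH.
# Returns 0 when every tool is found, 1 otherwise.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage (tool list is an assumption; adjust to your setup):
#   require_tools kubectl helm kustomize && kubectl cluster-info
```

The function only checks that each tool exists; it does not verify versions or cluster connectivity.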
Kubernetes on WSL
On the Windows command line, run kubectl config view
to find the values of <windows-user-name>, <minikubeip>, and <port>
mkdir ~/.kube \
&& cp /mnt/c/Users/<windows-user-name>/.kube/config ~/.kube
kubectl config set-cluster minikube --server=https://<minikubeip>:<port> --certificate-authority=/mnt/c/Users/<windows-user-name>/.minikube/ca.crt
kubectl config set-credentials minikube --client-certificate=/mnt/c/Users/<windows-user-name>/.minikube/client.crt --client-key=/mnt/c/Users/<windows-user-name>/.minikube/client.key
kubectl config set-context minikube --cluster=minikube --user=minikube
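The placeholder substitution in the WSL commands above can be scripted. This is a sketch under stated assumptions: the function name print_minikube_config is hypothetical, it assumes the Windows user profile lives under /mnt/c/Users, and it only prints the commands so you can review them before running anything.

```shell
#!/bin/sh
# Hypothetical helper: print the kubectl config commands for WSL with the
# Windows user name, Minikube IP, and port filled in. Nothing is executed.
print_minikube_config() {
    win_user="$1"
    ip="$2"
    port="$3"
    if [ -z "$win_user" ] || [ -z "$ip" ] || [ -z "$port" ]; then
        echo "usage: print_minikube_config <windows-user-name> <minikubeip> <port>" >&2
        return 2
    fi
    # Assumption: the Minikube certificates sit in the Windows user profile.
    base="/mnt/c/Users/$win_user/.minikube"
    echo "kubectl config set-cluster minikube --server=https://$ip:$port --certificate-authority=$base/ca.crt"
    echo "kubectl config set-credentials minikube --client-certificate=$base/client.crt --client-key=$base/client.key"
    echo "kubectl config set-context minikube --cluster=minikube --user=minikube"
}

# Example usage (values are placeholders):
#   print_minikube_config alice 192.168.99.100 8443
```

User names that contain spaces would need quoting before the printed commands are run.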
More info:
- https://devkimchi.com/2018/06/05/running-kubernetes-on-wsl/
- https://www.jamessturtevant.com/posts/Running-Kubernetes-Minikube-on-Windows-10-with-WSL/
How to use operator
Docs are work in progress
-
Create a secret that sets the values of DATABRICKS_HOST and DATABRICKS_TOKEN:
kubectl create secret generic testdatabricks --from-literal=DatabricksHost="https://xxxx.azuredatabricks.net" --from-literal=DatabricksToken="xxxxx"
Make sure your secret name is set correctly in
databricks-operator/config/default/azure_databricks_api_image_patch.yaml
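A quick sanity check on the two values before creating the secret can catch an empty token or a host without a scheme. The sketch below is illustrative only: validate_databricks_settings is a hypothetical name, and the rule that the host must start with https:// is an assumption based on the example URL above.

```shell
#!/bin/sh
# Hypothetical helper: validate the Databricks host and token before
# they are written into a Kubernetes secret.
validate_databricks_settings() {
    host="$1"
    token="$2"
    # Assumption: the workspace URL always uses https.
    case "$host" in
        https://*) ;;
        *) echo "DATABRICKS_HOST must start with https://" >&2; return 1 ;;
    esac
    if [ -z "$token" ]; then
        echo "DATABRICKS_TOKEN must not be empty" >&2
        return 1
    fi
    return 0
}

# Example usage (reads the values from environment variables):
#   validate_databricks_settings "$DATABRICKS_HOST" "$DATABRICKS_TOKEN" && \
#   kubectl create secret generic testdatabricks \
#     --from-literal=DatabricksHost="$DATABRICKS_HOST" \
#     --from-literal=DatabricksToken="$DATABRICKS_TOKEN"
```

Reading the token from an environment variable keeps it out of the script itself, although it will still appear in the process arguments of kubectl.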
-
To install the NotebookJob CRD in the Kubernetes cluster configured in ~/.kube/config, run
kubectl apply -f databricks-operator/config/crds
or make install -C databricks-operator
-
To deploy the controller in the Kubernetes cluster configured in ~/.kube/config, run
kustomize build databricks-operator/config | kubectl apply -f -
-
Change the NotebookJob name from sample1run1 to your desired name, set the Databricks notebook path, and update the values in microsoft_v1beta2_notebookjob.yaml
kubectl apply -f databricks-operator/config/samples/microsoft_v1beta2_notebookjob.yaml
-
Basic commands to check the new NotebookJob
kubectl get crd
kubectl -n databricks-operator-system get svc
kubectl -n databricks-operator-system get pod
kubectl -n databricks-operator-system describe pod databricks-operator-controller-manager-0
kubectl -n databricks-operator-system logs databricks-operator-controller-manager-0 -c dbricks -f
kubectl get notebookjob
kubectl describe notebookjob sample1run1
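The controller pod and the NotebookJob may take a moment to appear, so the check commands can fail on the first try. One way to handle that is a small retry wrapper. This is a generic sketch, not something the operator provides: the function name retry and the attempt and delay values in the usage comment are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between attempts. Returns 0 on the first success,
# 1 if every attempt fails.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only between attempts, not after the last one.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example usage (10 attempts, 5 seconds apart; values are arbitrary):
#   retry 10 5 kubectl get notebookjob sample1run1
```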
How to extend the operator and build your own images
Updating the Databricks operator:
This repo is generated by Kubebuilder.
To extend the operator databricks-operator:
-
Run
dep ensure
to download dependencies. It doesn't show a progress bar and takes a while to download all of the dependencies.
-
Update
pkg\apis\microsoft\v1beta1\notebookjob_types.go
-
Regenerate CRD
make manifests
-
Install updated CRD
make install
-
Generate code
make generate
-
Update operator
pkg\controller\notebookjob\notebookjob_controller.go
-
Update tests and run
make test
-
Build
make build
-
Deploy
make docker-build IMG=azadehkhojandi/databricks-operator
make docker-push IMG=azadehkhojandi/databricks-operator
make deploy
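The three deploy commands above can be wrapped so that they run in order and stop at the first failure. The following is a sketch under stated assumptions: release_operator is a hypothetical name, the dry-run switch is an addition for illustration, and the image name in the usage comment is a placeholder for your own registry path.

```shell
#!/bin/sh
# Hypothetical helper: print, and optionally run, the documented deploy
# targets for a given image. The second argument defaults to 1 (dry run),
# which prints the commands without executing them.
release_operator() {
    img="$1"
    dry_run="${2:-1}"
    if [ -z "$img" ]; then
        echo "usage: release_operator <image> [dry_run]" >&2
        return 2
    fi
    for step in "make docker-build IMG=$img" "make docker-push IMG=$img" "make deploy"; do
        echo "+ $step"
        if [ "$dry_run" != "1" ]; then
            # Word splitting is intentional: each step is a plain command line.
            $step || return 1
        fi
    done
}

# Example usage (image name is a placeholder):
#   release_operator myregistry/databricks-operator      # dry run
#   release_operator myregistry/databricks-operator 0    # run for real
```

Because the steps rely on word splitting, image names containing spaces are not supported by this sketch.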
Main Contributors
- Jordan Knight Github, Linkedin
- Paul Bouwer Github, Linkedin
- Lace Lofranco Github, Linkedin
- Allan Targino Github, Linkedin
- Rian Finnegan Github, Linkedin
- Jason Goodselli Github, Linkedin
- Craig Rodger Github, Linkedin
- Justin Chizer Github, Linkedin
- Azadeh Khojandi Github, Linkedin
Resources
Build pipelines
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.