Kubernetes Operator for Databricks

Eliise 265a0a8546 Merge pull request #173 from EliiseS/es/contribute-load-testing-and-mock-api Contribute load testing and mock api		2020-03-17 12:14:48 +00:00
.devcontainer	Fix issue with ginko unable to find package	2020-03-12 11:27:11 +00:00
.vscode	Adding Debugging Documentation (#103 )	2019-11-13 14:02:34 +11:00
api/v1alpha1	update all instances of license header to be MIT	2020-02-06 12:39:10 +00:00
config	Add running load tests in pipeline	2020-03-04 20:08:42 +00:00
controllers	Sets Run to terminal state if it has been deleted from Databricks first (#158 )	2020-02-19 20:50:46 +11:00
docs	Use correct port number	2020-03-13 18:04:13 +00:00
hack	Add timeouts for load test	2020-03-16 14:16:27 +00:00
locust	Reduce space used by load tests	2020-03-13 18:05:52 +00:00
mockapi	Scale down mockAPI	2020-03-16 12:26:22 +00:00
.dockerignore	Add .git to dockerignore	2020-03-17 11:16:00 +00:00
.gitattributes	Remove bin and lfs and add line endings	2020-03-04 20:08:12 +00:00
.gitignore	Remove bin and lfs and add line endings	2020-03-04 20:08:12 +00:00
.golangci.yaml	Issue/142 (#143 )	2020-01-09 11:26:30 +11:00
Dockerfile	Reduce space used by load tests	2020-03-13 18:05:52 +00:00
LICENSE	adding License (#85 )	2019-10-18 12:58:06 +11:00
Makefile	Add timeouts for load test	2020-03-16 14:16:27 +00:00
PROJECT	change group API version from beta1 to alpha1 (#78 )	2019-10-14 09:45:28 +11:00
README.md	Issue/142 (#143 )	2020-01-09 11:26:30 +11:00
azure-pipelines.yaml	Remove redudant docker build	2020-03-06 12:30:07 +00:00
go.mod	Add running load tests in pipeline	2020-03-04 20:08:42 +00:00
go.sum	Add running load tests in pipeline	2020-03-04 20:08:42 +00:00
main.go	update all instances of license header to be MIT	2020-02-06 12:39:10 +00:00

README.md

Azure Databricks operator (for Kubernetes)

This project is experimental. Expect the API to change. It is not recommended for production environments.

Introduction

Kubernetes offers the facility of extending its API through the concept of Operators. This repository contains the resources and code to deploy an Azure Databricks Operator for Kubernetes.

The Databricks operator is useful when Kubernetes-hosted applications need to launch and use Databricks data engineering and machine learning tasks.

Key benefits of using Azure Databricks operator

Easy to use: Azure Databricks operations can be performed with kubectl; there is no need to learn or install the Databricks utilities command line and its Python dependency.
Security: There is no need to distribute and use the Databricks token; the token is used by the operator.
Version control: All YAML manifests or Helm charts that define Azure Databricks operations (clusters, jobs, …) can be tracked.
Automation: Replicate Azure Databricks operations on any Databricks workspace by applying the same manifests or Helm charts.

The project was built using

How to use Azure Databricks operator

Download the latest release manifests:

wget https://github.com/microsoft/azure-databricks-operator/releases/latest/download/release.zip
unzip release.zip

Create the azure-databricks-operator-system namespace:

kubectl create namespace azure-databricks-operator-system

Create a Kubernetes secret holding the values for DATABRICKS_HOST and DATABRICKS_TOKEN:

kubectl --namespace azure-databricks-operator-system \
    create secret generic dbrickssettings \
    --from-literal=DatabricksHost="https://xxxx.azuredatabricks.net" \
    --from-literal=DatabricksToken="xxxxx"
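
The secret step above can also be wrapped in a small script that rejects obviously bad values before calling kubectl. This is a minimal sketch and not part of the project: the function names `check_settings` and `create_secret` are hypothetical, and the requirement that the host start with `https://` is an assumption based on the example value shown above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the operator.
NAMESPACE="azure-databricks-operator-system"
SECRET_NAME="dbrickssettings"

# check_settings HOST TOKEN
# Returns non-zero if the host is not an https URL or the token is empty.
check_settings() {
  host="$1"
  token="$2"
  case "$host" in
    https://*) ;;
    *) echo "host must start with https://" >&2; return 1 ;;
  esac
  if [ -z "$token" ]; then
    echo "token must not be empty" >&2
    return 1
  fi
  return 0
}

# create_secret HOST TOKEN
# Validates the values, then creates the secret with the same keys
# used in the command above (DatabricksHost, DatabricksToken).
create_secret() {
  check_settings "$1" "$2" || return 1
  kubectl --namespace "$NAMESPACE" \
    create secret generic "$SECRET_NAME" \
    --from-literal=DatabricksHost="$1" \
    --from-literal=DatabricksToken="$2"
}
```

After sourcing the script, `create_secret "$DATABRICKS_HOST" "$DATABRICKS_TOKEN"` would create the secret from environment variables instead of values typed inline.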

Apply the manifests for the Operator and CRDs in release/config:

kubectl apply -f release/config
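
After applying the manifests, it can help to wait until the operator pod is running before creating any resources. The following is a rough sketch under stated assumptions: the helpers `retry` and `operator_pods_running` are hypothetical names, and the check simply looks for a pod in the `Running` phase in the namespace created earlier, which is a simplification.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the operator.
NAMESPACE="azure-databricks-operator-system"

# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds between tries.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Succeeds once at least one pod in the namespace reports Running.
operator_pods_running() {
  kubectl --namespace "$NAMESPACE" get pods --no-headers 2>/dev/null \
    | grep -q "Running"
}
```

For example, `retry 12 5 operator_pods_running && echo "operator is up"` polls for roughly a minute before giving up.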

For detailed deployment guides, please see deploy.md

Samples

Create a Spark cluster on demand and run a Databricks notebook.

Create an interactive Spark cluster and run a Databricks job on an existing cluster.

Create an Azure Databricks secret scope by using Kubernetes secrets.

For samples and simple use cases on how to use the operator, please see samples.md

Quick start

One-click start using VS Code

For more details please see contributing.md

Roadmap

Check roadmap.md for what has been supported and what's coming.

Resources

A few topics are discussed in resources.md:

Dev container
Build pipelines
Operator metrics
Kubernetes on WSL

Contributing

For instructions about setting up your environment to develop and extend the operator, please see contributing.md

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
