General-Purpose Kubernetes Pod Controller


FrameworkController

FrameworkController is built to orchestrate all kinds of applications on Kubernetes with a single controller.

These kinds of applications include, but are not limited to:

  • Stateless and Stateful Service (Nginx, TensorFlow Serving, HBase, Kafka, etc.)
  • Stateless and Stateful Batch (KD-Tree Building, Batch Data Processing, etc.)
  • Any combination of the above applications (Distributed TensorFlow Training, Stream Data Processing, etc.)

Why Need It

Problem

The open source community offers many specialized Kubernetes Pod controllers, each built for a specific kind of application, such as the Kubernetes StatefulSet Controller, the Kubernetes Job Controller, and the KubeFlow TFJob Operator. However, none of them is built for all kinds of applications, and even a combination of the existing controllers cannot support some kinds of applications. As a result, we have to learn, use, develop, deploy, and maintain many different Pod controllers.

Solution

Build a General-Purpose Kubernetes Pod Controller: FrameworkController.

We then get the following benefits from it:

Architecture

(Figure: FrameworkController architecture diagram)

Feature

Framework Feature

A Framework represents an application with a set of Tasks:

  1. Executed by Kubernetes Pods
  2. Partitioned into different heterogeneous TaskRoles which share the same lifecycle
  3. Ordered within the same homogeneous TaskRole by TaskIndex
  4. With a consistent identity {FrameworkName}-{TaskRoleName}-{TaskIndex} as the PodName
  5. With a fine-grained RetryPolicy for each Task and for the whole Framework
  6. With a fine-grained FrameworkAttemptCompletionPolicy for each TaskRole
  7. With a guarantee that at most one instance of a specific Task is running at any point in time
  8. With a guarantee that at most one instance of a specific Framework is running at any point in time
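
The naming pattern in item 4 can be illustrated with a small sketch. This is not FrameworkController code: the helper names, the example Framework name `tf-train`, the TaskRole names, and the assumption that TaskIndex starts at 0 are all hypothetical and used only to show how a stable Pod name could be derived from the three identity fields.

```python
def pod_name(framework_name: str, task_role_name: str, task_index: int) -> str:
    """Build a Pod name as {FrameworkName}-{TaskRoleName}-{TaskIndex}."""
    if task_index < 0:
        raise ValueError("task_index must be non-negative")
    return f"{framework_name}-{task_role_name}-{task_index}"


def task_pod_names(framework_name: str, task_roles: dict) -> list:
    """List the Pod names of every Task in a Framework.

    task_roles maps a TaskRole name to its number of Tasks
    (assumption: TaskIndex runs from 0 to count - 1).
    """
    return [
        pod_name(framework_name, role, index)
        for role, count in task_roles.items()
        for index in range(count)
    ]


# Hypothetical Framework with one "ps" Task and two "worker" Tasks.
print(task_pod_names("tf-train", {"ps": 1, "worker": 2}))
# prints ['tf-train-ps-0', 'tf-train-worker-0', 'tf-train-worker-1']
```

Because the name depends only on the Framework, the TaskRole, and the index, a replacement Pod for the same Task would get the same name, which is what makes the identity consistent.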

Controller Feature

  1. Highly generalized, as it is built for all kinds of applications
  2. Lightweight, as it is only responsible for Pod orchestration
  3. Tolerates unexpected Pod/ConfigMap deletion and Node/Network/FrameworkController/Kubernetes failure
  4. Well-defined Framework consistency, state machine and failure model
  5. Idiomatic with official Kubernetes controllers, such as the Pod Spec
  6. Compatible with other Kubernetes features, such as Kubernetes Service, GPU Scheduling, Volume, Logging
  7. Aligned with Kubernetes Controller Design Guidelines and API Conventions

Prerequisite

  1. A Kubernetes cluster, v1.10 or above, in the cloud or on-premises.
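
As a rough sketch of checking this prerequisite, the snippet below compares a cluster version string against the v1.10 minimum. It assumes you already have the server version as a string such as `v1.13.2` (for example, as reported by `kubectl version`); the function name `meets_minimum` is hypothetical. The comparison is done on integers because comparing the strings directly would wrongly rank `1.9` above `1.10`.

```python
import re

# Minimum Kubernetes version stated in the prerequisite: v1.10.
MINIMUM = (1, 10)


def meets_minimum(version: str, minimum: tuple = MINIMUM) -> bool:
    """Return True if a version string like 'v1.13.2' is at least `minimum`."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuples compare element by element, so (1, 9) < (1, 10) as intended.
    return (major, minor) >= minimum


print(meets_minimum("v1.13.2"))  # prints True
print(meets_minimum("v1.9.6"))   # prints False
```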

Quick Start

  1. Run Controller
  2. Submit Framework

Doc

  1. User Manual
  2. Known Issue and Upcoming Feature
  3. FAQ
  4. Release Note

Official Image

Third Party Controller Wrapper

A specialized wrapper can be built on top of FrameworkController to optimize for a specific kind of application:

Similar Offering On Other Cluster Manager

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.