From c944b67d9eb9587c7978a86cf58873b4cf510093 Mon Sep 17 00:00:00 2001 From: Jason Hansen Date: Fri, 21 Jul 2017 15:22:52 -0700 Subject: [PATCH] ref(docs): reorganize documentation --- README.md | 70 +---- README_zh-CN.md | 23 -- docs/acsengine.md | 224 +++++++------- docs/acsengine.zh-CN.md | 4 +- docs/dcos.md | 2 +- docs/kubernetes.md | 273 +----------------- docs/kubernetes/deploy.md | 132 +++++++++ docs/kubernetes/features.md | 114 ++++++++ docs/{kubernetes.gpu.md => kubernetes/gpu.md} | 2 +- docs/kubernetes/troubleshooting.md | 24 ++ docs/kubernetes/walkthrough.md | 102 +++++++ .../windows.md} | 20 +- docs/serviceprincipal.md | 4 - docs/swarm.md | 2 +- docs/swarmmode.md | 2 +- 15 files changed, 523 insertions(+), 475 deletions(-) create mode 100644 docs/kubernetes/deploy.md create mode 100644 docs/kubernetes/features.md rename docs/{kubernetes.gpu.md => kubernetes/gpu.md} (94%) create mode 100644 docs/kubernetes/troubleshooting.md create mode 100644 docs/kubernetes/walkthrough.md rename docs/{kubernetes.windows.md => kubernetes/windows.md} (90%) diff --git a/README.md b/README.md index ca7797737..e77d26a82 100644 --- a/README.md +++ b/README.md @@ -40,7 +40,9 @@ Please follow these instructions before submitting a PR: should deploy the relevant example cluster definitions to ensure you're not introducing any sort of regression. -## Usage (Template Generation) +## Usage + +### Generate Templates Usage is best demonstrated with an example: @@ -57,72 +59,6 @@ This produces a new directory inside `_output/` that contains an ARM template for deploying Kubernetes into Azure. (In the case of Kubernetes, some additional needed assets are generated and placed in the output directory.) -## Deployment Usage - -Generated templates can be deployed using -[the Azure XPlat CLI (v0.10**.0** only)](https://github.com/Azure/azure-xplat-cli/releases/tag/v0.10.0-May2016), -[the Azure CLI 2.0](https://github.com/Azure/azure-cli) or -[Powershell](https://github.com/Azure/azure-powershell). - -### Deploying with Azure XPlat CLI - -**NOTE:** Some deployments will fail if certain versions of the Azure XPlat CLI are used. It's recommended that you use [Azure XPlat CLI 0.10**.0**](https://github.com/Azure/azure-xplat-cli/releases/tag/v0.10.0-May2016) until a new point release of `0.10.x` is available with the fix. - -```bash -$ azure login - -$ azure account set "" - -$ azure config mode arm - -$ azure group create \ - --name="" \ - --location="" - -$ azure group deployment create \ - --name="" \ - --resource-group="" \ - --template-file="./_output//azuredeploy.json" \ - --parameters-file="./_output//azuredeploy.parameters.json" -``` - -### Deploying with Azure CLI 2.0 -Azure CLI 2.0 is actively improved, so please see [the Azure CLI 2.0 GitHub Repo](https://github.com/Azure/azure-cli) for the latest release and documentation. 
- -```bash -$ az login - -$ az account set --subscription "" - -$ az group create \ - --name "" \ - --location "" - -$ az group deployment create \ - --name "" \ - --resource-group "" \ - --template-file "./_output//azuredeploy.json" \ - --parameters "./_output//azuredeploy.parameters.json" -``` - -### Deploying with Powershell - -```powershell -Add-AzureRmAccount - -Select-AzureRmSubscription -SubscriptionID - -New-AzureRmResourceGroup ` - -Name ` - -Location - -New-AzureRmResourceGroupDeployment ` - -Name ` - -ResourceGroupName ` - -TemplateFile _output\\azuredeploy.json ` - -TemplateParameterFile _output\\azuredeploy.parameters.json -``` - ## Code of conduct This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. diff --git a/README_zh-CN.md b/README_zh-CN.md index 0d39f8108..d83e22d5a 100644 --- a/README_zh-CN.md +++ b/README_zh-CN.md @@ -56,32 +56,9 @@ needed assets are generated and placed in the output directory.) ## 部署方法 可以使用如下几种方式来部署ARM模板: -[the Azure XPlat CLI (v0.10**.0** only)](https://github.com/Azure/azure-xplat-cli/releases/tag/v0.10.0-May2016), [the Azure CLI 2.0](https://github.com/Azure/azure-cli), [Powershell](https://github.com/Azure/azure-powershell). -### 使用Azure XPlat CLI部署 - -**注意:** 建议使用[Azure XPlat CLI 0.10**.0**](https://github.com/Azure/azure-xplat-cli/releases/tag/v0.10.0-May2016)来部署,使用其他版本的话可能会出现兼容性问题,这些问题会在`0.10.x`版本有更新后修复。 - -```bash -$ azure login (登录中国版Azure需要指定-e AzureChinaCloud参数) - -$ azure account set "" - -$ azure config mode arm - -$ azure group create \ - --name="" \ - --location="" - -$ azure group deployment create \ - --name="" \ - --resource-group="" \ - --template-file="./_output//azuredeploy.json" \ - --parameters-file="./_output//azuredeploy.parameters.json" -``` - ### 使用Azure CLI 2.0部署 **NOTE:** Azure CLI 2.0目前任处于测试阶段,中国地区尚且无法使用。如果部署到国际版的Azure的话可以使用以下流程: diff --git a/docs/acsengine.md b/docs/acsengine.md index 153c04ea6..519474a32 100644 --- a/docs/acsengine.md +++ b/docs/acsengine.md @@ -1,35 +1,111 @@ # Microsoft Azure Container Service Engine -The Azure Container Service Engine (`acs-engine`) generates ARM (Azure Resource Manager) templates for Docker enabled clusters on Microsoft Azure with your choice of DCOS, Kubernetes, or Swarm orchestrators. The input to the tool is a cluster definition. The cluster definition is very similar to (in many cases the same as) the ARM template syntax used to deploy a Microsoft Azure Container Service cluster. +The Azure Container Service Engine (`acs-engine`) generates ARM (Azure Resource Manager) templates for Docker enabled clusters on Microsoft Azure with your choice of DCOS, Kubernetes, or Swarm orchestrators. The input to acs-engine is a cluster definition file which describes the desired cluster, including orchestrator, features, and agents. The structure of the input files is very similar to the public API for Azure Container Service. -# Development in Docker + -The easiest way to get started developing on `acs-engine` is to use Docker. If you already have Docker or "Docker for {Windows,Mac}" then you can get started without needing to install anything extra. 
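+The cluster definition itself is a small JSON file. Below is a trimmed sketch of the Kubernetes variant (the values shown are placeholders, not working settings; see [examples/kubernetes.json](../examples/kubernetes.json) for a complete definition):
+
+```json
+{
+  "apiVersion": "vlabs",
+  "properties": {
+    "orchestratorProfile": {
+      "orchestratorType": "Kubernetes"
+    },
+    "masterProfile": {
+      "count": 1,
+      "dnsPrefix": "",
+      "vmSize": "Standard_D2_v2"
+    },
+    "agentPoolProfiles": [
+      {
+        "name": "agentpool1",
+        "count": 3,
+        "vmSize": "Standard_D2_v2"
+      }
+    ],
+    "linuxProfile": {
+      "adminUsername": "azureuser",
+      "ssh": {
+        "publicKeys": [
+          {
+            "keyData": ""
+          }
+        ]
+      }
+    },
+    "servicePrincipalProfile": {
+      "servicePrincipalClientID": "",
+      "servicePrincipalClientSecret": ""
+    }
+  }
+}
+```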
+<a name="install-acs-engine"></a>
+## Install
+
+Binary downloads for the latest version of acs-engine are available here:
+
+* [OSX](https://github.com/Azure/acs-engine/releases/download/v0.4.0/acs-engine-v0.4.0-darwin-amd64.tar.gz)
+* [Linux 64bit](https://github.com/Azure/acs-engine/releases/download/v0.4.0/acs-engine-v0.4.0-linux-amd64.tar.gz)
+* [Windows 64bit](https://github.com/Azure/acs-engine/releases/download/v0.4.0/acs-engine-v0.4.0-windows-amd64.zip)
+
+Download `acs-engine` for your operating system. Extract the binary and copy it to your `$PATH`.
+
+If you would prefer to build `acs-engine` from source, or if you are interested in contributing to `acs-engine`, see [building from source](#build-from-source) below.
+
+## Usage
+
+`acs-engine` reads a JSON [cluster definition](../clusterdefinition.md) and generates a number of files that may be submitted to Azure Resource Manager (ARM). The generated files include:
+
+1. **apimodel.json**: an expanded version of the cluster definition provided to the generate command. All default or computed values will be expanded during the generate phase.
+2. **azuredeploy.json**: a complete description of all Azure resources required to fulfill the cluster definition from `apimodel.json`.
+3. **azuredeploy.parameters.json**: the parameters file holds a series of custom variables which are used in various locations throughout `azuredeploy.json`.
+4. **certificate and access config files**: orchestrators like Kubernetes require certificates and additional configuration files (e.g. Kubernetes apiserver certificates and kubeconfig).
+
+### Generate Templates
+
+Here is an example of how to generate a new deployment. This example assumes you are using [examples/kubernetes.json](../examples/kubernetes.json).
+
+1. Before starting, ensure you have generated a valid [SSH Public/Private key pair](ssh.md#ssh-key-generation).
+2. Edit [examples/kubernetes.json](../examples/kubernetes.json) and fill in the blanks.
+3. Run `./bin/acs-engine generate examples/kubernetes.json` to generate the templates in the _output/Kubernetes-UNIQUEID directory. The UNIQUEID is a hash of your master's FQDN prefix.
+4. Now you can use the `azuredeploy.json` and `azuredeploy.parameters.json` for deployment as described in [deployment usage](#deployment-usage).
+
+**Note:** If you wish to customize cluster configuration after the `generate` step, make sure to modify `apimodel.json` in the `_output` directory. This ensures that any computed settings and generated certificates are maintained. For example, if you want to add a second agent pool, edit `apimodel.json` and then run `acs-engine` against that file to generate an updated ARM template. This ensures that during the deploy steps, existing resources remain untouched and new agent pools are created.
+
+<a name="deployment-usage"></a>
+
+### Deploy Templates
+
+Generated templates can be deployed using [the Azure CLI 2.0](https://github.com/Azure/azure-cli) or [Powershell](https://github.com/Azure/azure-powershell).
+
+#### Deploying with Azure CLI 2.0
+
+Azure CLI 2.0 is the latest CLI maintained and supported by Microsoft. For installation instructions, see [the Azure CLI GitHub repository](https://github.com/Azure/azure-cli#installation) for the latest release.
+
+```bash
+$ az login
+
+$ az account set --subscription "<SUBSCRIPTION NAME OR ID>"
+
+$ az group create \
+    --name "<RESOURCE_GROUP_NAME>" \
+    --location "<LOCATION>"
+
+$ az group deployment create \
+    --name "<DEPLOYMENT NAME>" \
+    --resource-group "<RESOURCE_GROUP_NAME>" \
+    --template-file "./_output/<INSTANCE>/azuredeploy.json" \
+    --parameters "./_output/<INSTANCE>/azuredeploy.parameters.json"
+```
+
+#### Deploying with Powershell
+
+```powershell
+Add-AzureRmAccount
+
+Select-AzureRmSubscription -SubscriptionID <SUBSCRIPTION ID>
+
+New-AzureRmResourceGroup `
+    -Name <RESOURCE_GROUP_NAME> `
+    -Location <LOCATION>
+
+New-AzureRmResourceGroupDeployment `
+    -Name <DEPLOYMENT NAME> `
+    -ResourceGroupName <RESOURCE_GROUP_NAME> `
+    -TemplateFile _output\<INSTANCE>\azuredeploy.json `
+    -TemplateParameterFile _output\<INSTANCE>\azuredeploy.parameters.json
+```
+
+<a name="build-from-source"></a>
+
+## Build ACS Engine from Source
+
+### Docker Development Environment
+
+The easiest way to start hacking on `acs-engine` is to use a Docker-based environment. If you already have Docker installed then you can get started with a few commands.

 * Windows (PowerShell): `.\scripts\devenv.ps1`
 * Linux/OSX (bash): `./scripts/devenv.sh`

-This setup mounts the `acs-engine` source directory as a volume into the Docker container.
-This means that you can edit your source code normally in your favorite editor on your
-machine, while still being able to compile and test inside of the Docker container (the
-same environment used in our Continuous Integration system).
+This script mounts the `acs-engine` source directory as a volume into the Docker container, which means you can edit your source code in your favorite editor on your machine, while still being able to compile and test inside of the Docker container. This environment mirrors the environment used in the acs-engine continuous integration (CI) system.

-When the execution of `devenv.{ps1,sh}` completes, you should find the console logged into the container.
+When the script `devenv.ps1` or `devenv.sh` completes, you will be left at a command prompt.

-Now we need to do a one-time call to setup the prerequisites.
-
-```
-make bootstrap
-```
-
-As a final step, in order to get the `acs-engine` tool ready, you should build the sources with:
+Run the following commands to pull the latest dependencies and build the `acs-engine` tool.

 ```
+# install and download build dependencies
+make prereqs
+# build the `acs-engine` binary
 make build
 ```

-When the build process completes, verify that `acs-engine` is available, invoking the command without parameters.
-You should see something like this:
+The build process leaves the compiled `acs-engine` binary in the `bin` directory. Make sure everything completed successfully by running `bin/acs-engine` without any arguments:

 ```
 # ./bin/acs-engine
@@ -39,6 +115,7 @@ Usage:
   acs-engine [command]

 Available Commands:
+  deploy        deploy an Azure Resource Manager template
   generate      Generate an Azure Resource Manager template
   help          Help about any command
   version       Print the version of ACS-Engine
@@ -52,105 +129,50 @@ Use "acs-engine [command] --help" for more information about a command.

 [Here's a quick demo video showing the dev/build/test cycle with this setup.](https://www.youtube.com/watch?v=lc6UZmqxQMs)

-# Downloading and Building ACS Engine Locally
+## Building on Windows, OSX, and Linux

-ACS Engine can also be built and run natively on Windows, OS X, and Linux. Instructions below:
+Building ACS Engine from source has a few requirements on each of the platforms. Download and install the prerequisites for your platform (Windows, Linux, or Mac):

-## Windows
+1. Go version 1.8 [installation instructions](https://golang.org/doc/install)
+2. Git Version Control [installation instructions](https://git-scm.com/download/)

-Requirements:
-- Git for Windows. Download and install [here](https://git-scm.com/download/win)
-- Go for Windows. Download and install [here](https://golang.org/dl/), accept all defaults.
-- Powershell
+### Windows

-Build Steps:
+Setup steps:

-1. Setup your go workspace. This example assumes you are using `c:\gopath` as your workspace:
- 1. Windows key-R to open the run prompt
- 2. `rundll32 sysdm.cpl,EditEnvironmentVariables` to open the system variables
- 3. add `c:\go\bin` to your PATH variables
- 4. click "new" and add new environment variable GOPATH and set to `c:\gopath`
+1. Setup your go workspace. This guide assumes you are using `c:\gopath` as your Go workspace:
+   1. Type Windows key-R to open the run prompt
+   2. Type `rundll32 sysdm.cpl,EditEnvironmentVariables` to open the system variables
+   3. Add `c:\go\bin` to your PATH variables
+   4. Click "new" and add a new environment variable named `GOPATH` and set the value to `c:\gopath`

 Build acs-engine:
- 1. Windows key-R to open the run prompt
- 2. `cmd` to open command prompt
- 3. mkdir %GOPATH%
- 4. cd %GOPATH%
- 5. type `go get github.com/Azure/acs-engine` to get the acs-engine Github project
- 6. type `go get all` to get the supporting components
- 7. `cd %GOPATH%\src\github.com\Azure\acs-engine`
- 8. `go build` to build the project
-3. `acs-engine` to see the command line parameters
+   1. Type Windows key-R to open the run prompt
+   2. Type `cmd` to open a command prompt
+   3. Type `mkdir %GOPATH%` to create your gopath
+   4. Type `cd %GOPATH%`
+   5. Type `go get github.com/Azure/acs-engine` to download acs-engine from GitHub
+   6. Type `go get all` to get the supporting components
+   7. Type `cd %GOPATH%\src\github.com\Azure\acs-engine`
+   8. Type `go build` to build the project
+   9. Run `bin\acs-engine` to see the command line parameters

-## OS X
+### OS X and Linux

-Requirements:
-- Go for OS X. Download and install [here](https://golang.org/dl/)
-
-Build Steps:
+Setup steps:

 1. Open a command prompt to setup your gopath:
- 2. `mkdir $HOME/gopath`
+ 2. `mkdir $HOME/go`
 3. edit `$HOME/.bash_profile` and add the following lines to setup your go path
 ```
 export PATH=$PATH:/usr/local/go/bin
- export GOPATH=$HOME/gopath
+ export GOPATH=$HOME/go
 ```
 4. `source $HOME/.bash_profile`

-Build acs-engine:
- 1. type `go get github.com/Azure/acs-engine` to get the acs-engine Github project
- 2. type `go get all` to get the supporting components
- 3. `cd $GOPATH/src/github.com/Azure/acs-engine`
- 4. `go build` to build the project
- 5. `./acs-engine` to see the command line parameters
-
-## Linux
-
-Requirements:
-- Go for Linux
-  - Download the appropriate archive for your system [here](https://golang.org/dl/)
-  - sudo tar -C /usr/local -xzf go$VERSION.$OS-$ARCH.tar.gz (replace with your downloaded archive)
-- `git`
-
-Build Steps:
-
- 1. Setup Go path:
- 2. `mkdir $HOME/gopath`
- 3. edit `$HOME/.profile` and add the following lines to setup your go path
- ```
- export PATH=$PATH:/usr/local/go/bin
- export GOPATH=$HOME/gopath
- ```
- 4. `source $HOME/.profile`

 Build acs-engine:
- 1. type `go get github.com/Azure/acs-engine` to get the acs-engine Github project
- 2. type `go get all` to get the supporting components
- 3. `cd $GOPATH/src/github.com/Azure/acs-engine`
- 4. `go build` to build the project
- 5. `./acs-engine` to see the command line parameters
-
-
-# Template Generation
-
-The `acs-engine` takes a json [cluster definition file](clusterdefinition.md) as a parameter and generates 3 or more of the following files:
-
-1. **apimodel.json** - this is the cluster configuration file used for generation
-2. **azuredeploy.json** - this is the main ARM (Azure Resource Model) template used to deploy a full Docker enabled cluster
-3. **azuredeploy.parameters.json** - this is the parameters file used along with azurdeploy.json during deployment and contains configurable parameters
-4. **certificate and access config files** - some orchestrators like Kubernetes require certificate generation, and these generated files and access files like the kube config files are stored along side the model and ARM template files.
-
-As a rule of thumb you should always work with the `apimodel.json` when modifying an existing running deployment. This ensures that all the same settings and certificates are correctly preserved. For example, if you want to add a second agent pool, you would edit `apimodel.json` and then run `acs-engine` against that file to generate the new ARM templates. Then during deployment all existing deployments remain untouched, and only the new agent pools resources are created.
-
-# Generating a template
-
-Here is an example of how to generate a new deployment. This example assumes you are using [examples/kubernetes.json](../examples/kubernetes.json).
-
-1. Before starting ensure you have generated a valid [SSH Public/Private key pair](ssh.md#ssh-key-generation).
-2. edit [examples/kubernetes.json](../examples/kubernetes.json) and fill in the blanks.
-3. run `./bin/acs-engine generate examples/kubernetes.json` to generate the templates in the _output/Kubernetes-UNIQUEID directory. The UNIQUEID is a hash of your master's FQDN prefix.
-4. now you can use the `azuredeploy.json` and `azuredeploy.parameters.json` for deployment as described in [deployment usage](../README.md#deployment-usage).
-
-# Deploying templates
-
-For deployment see [deployment usage](../README.md#deployment-usage).
+   1. Type `go get github.com/Azure/acs-engine` to get the acs-engine Github project
+   2. Type `cd $GOPATH/src/github.com/Azure/acs-engine` to change to the source directory
+   3. Type `make prereqs` to install supporting components
+   4. Type `make build` to build the project
+   5. Type `./bin/acs-engine` to see the command line parameters
diff --git a/docs/acsengine.zh-CN.md b/docs/acsengine.zh-CN.md
index a7531678d..7480908d1 100644
--- a/docs/acsengine.zh-CN.md
+++ b/docs/acsengine.zh-CN.md
@@ -139,8 +139,8 @@ ACS引擎使用json格式的[集群定义文件](clusterdefinition.md)作为输
 1. 首先需要准备一个[SSH 公钥私钥对](ssh.md#ssh-key-generation).
 2. 编辑[examples/kubernetes.json](../examples/kubernetes.json)将其需要的参数配置好.
 3. 运行`./bin/acs-engine generate examples/kubernetes.json`命令在_output/Kubernetes-UNIQUEID目录中生成对应的模板。(UNIQUEID是master节点的FQDN前缀的hash值)
-4. 按照README中指定的方式使用`azuredeploy.json`和`azuredeploy.parameters.json`部署容器集群 [deployment usage](../README.md#deployment-usage).
+4. 按照README中指定的方式使用`azuredeploy.json`和`azuredeploy.parameters.json`部署容器集群 [deployment usage](../acsengine.md#deployment-usage).

 # 部署方法

-[部署方式请参考这里](../README.md#deployment-usage).
+[部署方式请参考这里](../acsengine.md#deployment-usage).
diff --git a/docs/dcos.md b/docs/dcos.md
index c42537498..54c6e9a2c 100644
--- a/docs/dcos.md
+++ b/docs/dcos.md
@@ -8,7 +8,7 @@ Here are the steps to deploy a simple DC/OS cluster:

 2. [generate your ssh key](ssh.md#ssh-key-generation)
 3. edit the [DC/OS example](../examples/dcos.json) and fill in the blank strings
 4. [generate the template](acsengine.md#generating-a-template)
-5. [deploy the output azuredeploy.json and azuredeploy.parameters.json](../README.md#deployment-usage)
+5. [deploy the output azuredeploy.json and azuredeploy.parameters.json](../acsengine.md#deployment-usage)

 ## Walkthrough

diff --git a/docs/kubernetes.md b/docs/kubernetes.md
index e21adebbb..734f45569 100644
--- a/docs/kubernetes.md
+++ b/docs/kubernetes.md
@@ -1,267 +1,12 @@
-# Microsoft Azure Container Service Engine - Kubernetes Walkthrough
+# Microsoft Azure Container Service Engine - Kubernetes

-* [Kubernetes Windows Walkthrough](kubernetes.windows.md) - shows how to create a Kubernetes cluster on Windows.
-* [Kubernetes with GPU support Walkthrough](kubernetes.gpu.md) - shows how to create a Kubernetes cluster with GPU support.
+* [Create a Kubernetes Cluster](kubernetes/deploy.md) - Create your first Kubernetes cluster. Start here!
+* [Kubernetes Next Steps](kubernetes/walkthrough.md) - You have successfully deployed a Kubernetes cluster, now what?

-## Deployment
-
-Here are the steps to deploy a simple Kubernetes cluster:
-
-1. [install acs-engine](acsengine.md#downloading-and-building-acs-engine)
-2. [generate your ssh key](ssh.md#ssh-key-generation)
-3. [generate your service principal](serviceprincipal.md)
-4. edit the [Kubernetes example](../examples/kubernetes.json) and fill in the blank strings
-5. [generate the template](acsengine.md#generating-a-template)
-6. [deploy the output azuredeploy.json and azuredeploy.parameters.json](../README.md#deployment-usage)
-   * To enable the optional network policy enforcement using calico, you have to
-     set the parameter during this step according to this [guide](#optional-enable-network-policy-enforcement-using-calico)
-7. Temporary workaround when deploying a cluster in a custom VNET with
-   Kubernetes 1.6.0:
-   1. After a cluster has been created in step 6 get id of the route table resource from Microsoft.Network provider in your resource group.
-      The route table resource id is of the format:
-      `/subscriptions/SUBSCRIPTIONID/resourceGroups/RESOURCEGROUPNAME/providers/Microsoft.Network/routeTables/ROUTETABLENAME`
-   2. Update properties of all subnets in the newly created VNET that are used by Kubernetes cluster to refer to the route table resource by appending the following to subnet properties:
-      ```shell
-      "routeTable": {
-        "id": "/subscriptions//resourceGroups//providers/Microsoft.Network/routeTables/"
-      }
-      ```
-
-      E.g.:
-      ```shell
-      "subnets": [
-        {
-          "name": "subnetname",
-          "id": "/subscriptions//resourceGroups//providers/Microsoft.Network/virtualNetworks//subnets/",
-          "properties": {
-            "provisioningState": "Succeeded",
-            "addressPrefix": "10.240.0.0/16",
-            "routeTable": {
-              "id": "/subscriptions//resourceGroups//providers/Microsoft.Network/routeTables/"
-            }
-            ....
-          }
-          ....
-        }
-      ]
-      ```
-
-## Walkthrough
-
-Once your Kubernetes cluster has been created you will have a resource group containing:
-
-1. 1 master accessible by SSH on port 22 or kubectl on port 443
-
-2. a set of nodes in an availability set. The nodes can be accessed through a master. See [agent forwarding](ssh.md#key-management-and-agent-forwarding-with-windows-pageant) for an example of how to do this.
- -The following image shows the architecture of a container service cluster with 1 master, and 2 agents: - -![Image of Kubernetes cluster on azure](images/kubernetes.png) - -In the image above, you can see the following parts: - -1. **Master Components** - The master runs the Kubernetes scheduler, api server, and controller manager. Port 443 is exposed for remote management with the kubectl cli. -2. **Nodes** - the Kubernetes nodes run in an availability set. Azure load balancers are dynamically added to the cluster depending on exposed services. -3. **Common Components** - All VMs run a kubelet, Docker, and a Proxy. -4. **Networking** - All VMs are assigned an ip address in the 10.240.0.0/16 network. Each VM is assigned a /24 subnet for their pod CIDR enabling IP per pod. The proxy running on each VM implements the service network 10.0.0.0/16. - -All VMs are in the same private VNET and are fully accessible to each other. - - -## Optional: Enable network policy enforcement using Calico - -Using the default configuration, Kubernetes allows communication between all -Pods within a cluster. To ensure that Pods can only be accessed by authorized -Pods, a policy enforcement is needed. To enable policy enforcement using Calico refer to the [cluster definition](https://github.com/Azure/acs-engine/blob/master/docs/clusterdefinition.md#kubernetesconfig) document under networkPolicy. There is also a reference cluster definition available [here](https://github.com/Azure/acs-engine/blob/master/examples/networkpolicy/kubernetes-calico.json). - -This will deploy a Calico node controller to every instance of the cluster -using a Kubernetes DaemonSet. After a successful deployment you should be able -to see these Pods running in your cluster: - -``` -kubectl get pods --namespace kube-system -l k8s-app=calico-node -o wide -NAME READY STATUS RESTARTS AGE IP NODE -calico-node-034zh 2/2 Running 0 2h 10.240.255.5 k8s-master-30179930-0 -calico-node-qmr7n 2/2 Running 0 2h 10.240.0.4 k8s-agentpool1-30179930-1 -calico-node-z3p02 2/2 Running 0 2h 10.240.0.5 k8s-agentpool1-30179930-0 -``` - -Per default Calico still allows all communication within the cluster. Using Kubernetes' NetworkPolicy API, you can define stricter policies. Good resources to get information about that are: - -* [NetworkPolicy User Guide](https://kubernetes.io/docs/user-guide/networkpolicies/) -* [NetworkPolicy Example Walkthrough](https://kubernetes.io/docs/getting-started-guides/network-policy/walkthrough/) -* [Calico Kubernetes](http://docs.projectcalico.org/v2.0/getting-started/kubernetes/) - -## Managed Disks - -[Managed disks](../examples/disks-managed/README.md) are supported for both node OS disks and Kubernetes persistent volumes. - -Related [upstream PR](https://github.com/kubernetes/kubernetes/pull/46360) for details. - -### Using Kubernetes Persistent Volumes - -By default, each ACS-Engine cluster is bootstrapped with several StorageClass resources. This bootstrapping is handled by the addon-manager pod that creates resources defined under /etc/kubernetes/addons directory on master VMs. - -#### Non-managed Disks - -The default storage class has been set via the Kubernetes admission controller `DefaultStorageClass`. - -The default storage class will be used if persistent volume resources don't specify a storage class as part of the resource definition. 
- -The default storage class uses non-managed blob storage and will provision the blob within an existing storage account present in the resource group or provision a new storage account. - -Non-managed persistent volume types are available on all VM sizes. - -#### Managed Disks - -As part of cluster bootstrapping, two storage classes will be created to provide access to create Kubernetes persistent volumes using Azure managed disks. - -Nodes will be labelled as follows if they support managed disks: - -``` -storageprofile=managed -storagetier= -``` - -They are managed-premium and managed-standard and map to Standard_LRS and Premium_LRS managed disk types respectively. - -In order to use these storage classes the following conditions must be met. - -* The cluster must be running Kubernetes version 1.7.2 or greater. Refer to this [example](../examples/kubernetesversions/kubernetes1.7.1.json) for how to provision a Kubernetes cluster of a specific version. -* The node must support managed disks. See this [example](../examples/disks-managed/kubernetes-vmas.json) to provision nodes with managed disks. You can also confirm if a node has managed disks using kubectl. - -```console -kubectl get nodes -l storageprofile=managed -NAME STATUS AGE VERSION -k8s-agent1-23731866-0 Ready 24m v1.7.2 -``` - -* The VM size must support the type of managed disk type requested. For example, Premium VM sizes with managed OS disks support both managed-standard and managed-premium storage classes whereas Standard VM sizes with managed OS disks only support managed-standard storage class. - -* If you have mixed node cluster (both non-managed and managed disk types). You must use [affinity or nodeSelectors](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/) on your resource definitions in order to ensure that workloads are scheduled to VMs that support the underlying disk requirements. - -For example -``` -spec: - affinity: - nodeAffinity: - requiredDuringSchedulingIgnoredDuringExecution: - nodeSelectorTerms: - - matchExpressions: - - key: storageprofile - operator: In - values: - - managed -```` - -## Create your First Kubernetes Service - -After completing this walkthrough you will know how to: - * access Kubernetes cluster via SSH, - * deploy a simple Docker application and expose to the world, - * the location of the Kube config file and how to access the Kubernetes cluster remotely, - * use `kubectl exec` to run commands in a container, - * and finally access the Kubernetes dashboard. - -1. After successfully deploying the template write down the master FQDNs (Fully Qualified Domain Name). - 1. If using Powershell or CLI, the output parameter is in the OutputsString section named 'masterFQDN' - 2. If using Portal, to get the output you need to: - 1. navigate to "resource group" - 2. click on the resource group you just created - 3. then click on "Succeeded" under *last deployment* - 4. then click on the "Microsoft.Template" - 5. now you can copy the output FQDNs and sample SSH commands - ![Image of docker scaling](images/portal-kubernetes-outputs.png) - -2. SSH to the master FQDN obtained in step 1. - -3. Explore your nodes and running pods: - 1. to see a list of your nodes type `kubectl get nodes`. If you want full detail of the nodes, add `-o yaml` to become `kubectl get nodes -o yaml`. - 2. to see a list of running pods type `kubectl get pods --all-namespaces`. - -4. Start your first Docker image by typing `kubectl run nginx --image nginx`. 
This will start the nginx Docker container in a pod on one of the nodes. - -5. Type `kubectl get pods -o yaml` to see the full details of the nginx deployment. You can see the host IP and the podIP. The pod IP is assigned from the pod CIDR on the host. Run curl to the pod ip to see the nginx output, eg. `curl 10.244.1.4` - - ![Image of curl to podIP](images/kubernetes-nginx1.png) - -6. The next step is to expose the nginx deployment as a Kubernetes service on the private service network 10.0.0.0/16: - 1. expose the service with command `kubectl expose deployment nginx --port=80`. - 2. get the service IP `kubectl get service` - 3. run curl to the IP, eg. `curl 10.0.105.199` - - ![Image of curl to service IP](images/kubernetes-nginx2.png) - -7. The final step is to expose the service to the world. This is done by changing the service type from `ClusterIP` to `LoadBalancer`: - 1. edit the service: `kubectl edit svc/nginx` - 2. change `type` from `ClusterIP` to `LoadBalancer` and save it. This will now cause Kubernetes to create an Azure Load Balancer with a public IP. - 3. the change will take about 2-3 minutes. To watch the service change from "pending" to an external ip type `watch 'kubectl get svc'` - - ![Image of watching the transition from pending to external ip](images/kubernetes-nginx3.png) - - 4. once you see the external IP, you can browse to it in your browser: - - ![Image of browsing to nginx](images/kubernetes-nginx4.png) - -8. The next step in this walkthrough is to show you how to remotely manage your Kubernetes cluster. First download Kubectl to your machine and put it in your path: - * [Windows Kubectl](https://storage.googleapis.com/kubernetes-release/release/v1.6.0/bin/windows/amd64/kubectl.exe) - * [OSX Kubectl](https://storage.googleapis.com/kubernetes-release/release/v1.6.0/bin/darwin/amd64/kubectl) - * [Linux](https://storage.googleapis.com/kubernetes-release/release/v1.6.0/bin/linux/amd64/kubectl) - -9. The Kubernetes master contains the kube config file for remote access under the home directory ~/.kube/config. Download this file to your machine, set the KUBECONFIG environment variable, and run kubectl to verify you can connect to cluster: - * Windows to use pscp from [putty](http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html). Ensure you have your certificate exposed through [pageant](ssh.md#key-management-and-agent-forwarding-with-windows-pageant): - ``` - # MASTERFQDN is obtained in step1 - pscp -P 22 azureuser@MASTERFQDN:.kube/config . - SET KUBECONFIG=%CD%\config - kubectl get nodes - ``` - * OS X or Linux: - ``` - # MASTERFQDN is obtained in step1 - scp azureuser@MASTERFQDN:.kube/config . - export KUBECONFIG=`pwd`/config - kubectl get nodes - ``` -10. The next step is to show you how to remotely run commands in a remote Docker container: - 1. Run `kubectl get pods` to show the name of your nginx pod - 2. using your pod name, you can run a remote command on your pod. eg. `kubectl exec nginx-701339712-retbj date` - 3. try running a remote bash session. eg. `kubectl exec nginx-701339712-retbj -it bash`. The following screen shot shows these commands: - - ![Image of curl to podIP](images/kubernetes-remote.png) - -11. The final step of this tutorial is to show you the dashboard: - 1. run `kubectl proxy` to directly connect to the proxy - 2. in your browser browse to the [dashboard](http://127.0.0.1:8001/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=_all) - 3. browse around and explore your pods and services. 
-   ![Image of Kubernetes dashboard](images/kubernetes-dashboard.png)
-
-## Troubleshooting
-
-### Scaling up or down
-
-Scaling your cluster up or down requires different parameters and template than the create. More details here [Scale up](../examples/scale-up/README.md)
-
-If your cluster is not reachable, you can run the following command to check for common failures.
-
-### Misconfigured Service Principal
-
-If your Service Principal is misconfigured, none of the Kubernetes components will come up in a healthy manner.
-You can check to see if this the problem:
-
-```shell
-ssh -i ~/.ssh/id_rsa USER@MASTERFQDN sudo journalctl -u kubelet | grep --text autorest
-```
-
-If you see output that looks like the following, then you have **not** configured the Service Principal correctly.
-You may need to check to ensure the credentials were provided accurately, and that the configured Service Principal has
-read and **write** permissions to the target Subscription.
-
-`Nov 10 16:35:22 k8s-master-43D6F832-0 docker[3177]: E1110 16:35:22.840688 3201 kubelet_node_status.go:69] Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: autorest#WithErrorUnlessStatusCode: POST https://login.microsoftonline.com/72f988bf-86f1-41af-91ab-2d7cd011db47/oauth2/token?api-version=1.0 failed with 400 Bad Request: StatusCode=400`
-
-3. [Link](serviceprincipal.md) to documentation on how to create/configure a service principal for an ACS-Engine Kubernetes cluster.
-
-## Known issues and mitigations
+* [Troubleshooting](kubernetes/troubleshooting.md) - Running into issues? Start here to troubleshoot Kubernetes.
+* [Features](kubernetes/features.md) - Guide to alpha, beta, and stable functionality in acs-engine.
+
+## Known Issues

 ### Node "NotReady" due to lost TCP connection

@@ -270,10 +15,10 @@
 This is a known upstream kubernetes [issue #41916](https://github.com/kubernetes/kubernetes/issues/41916).

 ACS-Engine partially mitigates this issue on Linux by detecting dead TCP connections more quickly via **net.ipv4.tcp_retries2=8**.

-## Learning More
+## Additional Kubernetes Resources

 Here are recommended links to learn more about Kubernetes:

 1. [Kubernetes Bootcamp](https://kubernetesbootcamp.github.io/kubernetes-bootcamp/index.html) - shows you how to deploy, scale, update, and debug containerized applications.
 2. [Kubernetes Userguide](http://kubernetes.io/docs/user-guide/) - provides information on running programs in an existing Kubernetes cluster.
-3. [Kubernetes Examples](https://github.com/kubernetes/kubernetes/tree/master/examples) - provides a number of examples on how to run real applications with Kubernetes.
+3. [Kubernetes Examples](https://github.com/kubernetes/kubernetes/tree/master/examples) - provides a number of examples on how to run real applications with Kubernetes.
\ No newline at end of file
diff --git a/docs/kubernetes/deploy.md b/docs/kubernetes/deploy.md
new file mode 100644
index 000000000..8f69f9b1f
--- /dev/null
+++ b/docs/kubernetes/deploy.md
@@ -0,0 +1,132 @@
+# Deploy a Kubernetes Cluster
+
+## Install Pre-requisites
+
+All the commands in this guide require both the Azure CLI and `acs-engine`. Follow the [installation instructions to download acs-engine before continuing](../acsengine.md#install-acs-engine) or [compile from source](../acsengine.md#build-from-source).
+
+For Azure CLI installation instructions, see [the Azure CLI GitHub repository](https://github.com/Azure/azure-cli#installation) for the latest release.
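+A quick sanity check (assuming both binaries are on your `PATH`) confirms the tools are ready before continuing:
+
+```bash
+# confirm the acs-engine binary runs
+$ acs-engine version
+
+# log in and confirm the Azure CLI can see your subscription
+$ az login
+$ az account list -o table
+```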
+
+## Overview
+
+`acs-engine` reads a cluster definition which describes the size, shape, and configuration of your cluster. This guide takes the default configuration of one master and two Linux agents. If you would like to change the configuration, edit `examples/kubernetes.json` before continuing.
+
+The `acs-engine deploy` command automates creation of a Service Principal, Resource Group and SSH key for your cluster. If operators need more control, or are interested in the individual steps, see the ["Long Way" section below](#the-long-way).
+
+## Gather Information
+
+* The subscription in which you would like to provision the cluster. This is a uuid which can be found with `az account list -o table`.
+* A `dnsPrefix` which forms part of the hostname for your cluster (e.g. staging, prodwest, blueberry). The DNS prefix must be unique, so pick a random name.
+* A location to provision the cluster, e.g. `westus2`.
+
+```
+$ az account list -o table
+Name                                             CloudName    SubscriptionId                        State    IsDefault
+-----------------------------------------------  -----------  ------------------------------------  -------  -----------
+Contoso Subscription                             AzureCloud   51ac25de-afdg-9201-d923-8d8e8e8e8e8e  Enabled  True
+```
+
+## Deploy
+
+For this example, the subscription id is `51ac25de-afdg-9201-d923-8d8e8e8e8e8e`, the DNS prefix is `contoso-apple`, and the location is `westus2`.
+
+Run `acs-engine deploy` with the appropriate arguments:
+
+```
+$ acs-engine deploy --subscription-id 51ac25de-afdg-9201-d923-8d8e8e8e8e8e --dns-prefix contoso-apple --location westus2 --auto-suffix --api-model examples/kubernetes.json
+WARN[0005] apimodel: missing masterProfile.dnsPrefix will use "contoso-apple-59769a59"
+WARN[0005] --resource-group was not specified. Using the DNS prefix from the apimodel as the resource group name: contoso-apple-59769a59
+WARN[0008] apimodel: ServicePrincipalProfile was empty, creating application...
+WARN[0017] created application with applicationID (7e2d433f-d039-48b8-87dc-83fa4dfa38d4) and servicePrincipalObjectID (db6167e1-aeed-407a-b218-086589759442).
+WARN[0017] apimodel: ServicePrincipalProfile was empty, assigning role to application...
+WARN[0017] Failed to create role assignment (will retry): "authorization.RoleAssignmentsClient#Create: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"PrincipalNotFound\" Message=\"Principal db6167e1aeed407ab218086589759442 does not exist in the directory 72f988bf-86f1-41af-91ab-2d7cd011db47.\""
+WARN[0020] Failed to create role assignment (will retry): "authorization.RoleAssignmentsClient#Create: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"PrincipalNotFound\" Message=\"Principal db6167e1aeed407ab218086589759442 does not exist in the directory 72f988bf-86f1-41af-91ab-2d7cd011db47.\""
+INFO[0034] Starting ARM Deployment (contoso-apple-59769a59-1423145182). This will take some time...
+INFO[0393] Finished ARM Deployment (contoso-apple-59769a59-1423145182).
+```
+
+`acs-engine` will output the generated Azure Resource Manager (ARM) templates, ssh keys and kubeconfig files in the `_output/contoso-apple-59769a59` directory:
+
+* `_output/contoso-apple-59769a59/azureuser_rsa`
+* `_output/contoso-apple-59769a59/kubeconfig/kubeconfig.westus2.json`
+
+Access your cluster using `kubectl`:
+
+```
+$ KUBECONFIG=_output/contoso-apple-59769a59/kubeconfig/kubeconfig.westus2.json kubectl cluster-info
+Kubernetes master is running at https://contoso-apple-59769a59.westus2.cloudapp.azure.com
+Heapster is running at https://contoso-apple-59769a59.westus2.cloudapp.azure.com/api/v1/proxy/namespaces/kube-system/services/heapster
+KubeDNS is running at https://contoso-apple-59769a59.westus2.cloudapp.azure.com/api/v1/proxy/namespaces/kube-system/services/kube-dns
+kubernetes-dashboard is running at https://contoso-apple-59769a59.westus2.cloudapp.azure.com/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
+
+To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
+```
+
+<a name="the-long-way"></a>
+
+## ACS Engine the Long Way
+
+### Step 1: Generate an SSH Key
+
+In addition to using Kubernetes APIs to interact with the clusters, cluster operators may access the master and agent machines using SSH.
+
+If you don't have an SSH key, [cluster operators may generate a new one](../ssh.md#ssh-key-generation).
+
+### Step 2: Create a Service Principal
+
+Kubernetes clusters have integrated support for various cloud providers as core functionality. On Azure, acs-engine uses a Service Principal to interact with Azure Resource Manager (ARM). Follow the instructions to [create a new service principal](../serviceprincipal.md).
+
+### Step 3: Edit your Cluster Definition
+
+ACS Engine consumes a cluster definition which outlines the desired shape, size, and configuration of Kubernetes. There are a number of features that can be enabled through the cluster definition; check the `examples` directory for a number of... examples.
+
+Edit the [simple Kubernetes cluster definition](../../examples/kubernetes.json) and fill out the required values:
+
+* `dnsPrefix`: must be a globally unique name and will form part of the hostname (e.g. myprod1, staging, leapinglama), be unique!
+* `keyData`: must contain the public portion of an SSH key; this will be associated with the `adminUsername` value found in the same section of the cluster definition (e.g. 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABA....')
+* `servicePrincipalClientID`: this is the appId uuid or name from step 2
+* `servicePrincipalClientSecret`: this is the password or randomly-generated password from step 2
+
+### Step 4: Generate the Templates
+
+The generate command takes a cluster definition and outputs a number of templates which describe your Kubernetes cluster. By default, `generate` will create a new directory named after your cluster nested in the `_output` directory. If your dnsPrefix was `larry`, your cluster templates would be found in `_output/larry-`.
+
+Run `acs-engine generate examples/kubernetes.json`.
+
+### Step 5: Submit your Templates to Azure Resource Manager (ARM)
+
+[Deploy the output azuredeploy.json and azuredeploy.parameters.json](../acsengine.md#deployment-usage).
+* To enable the optional network policy enforcement using calico, you have to set the parameter during this step according to this [guide](features.md#feat-calico).
+
+### Step 6: If using an existing VNET
+Temporary workaround when deploying a cluster in a custom VNET with Kubernetes 1.6.0:
+
+1. After a cluster has been created in step 5, get the id of the route table resource from the Microsoft.Network provider in your resource group.
+   The route table resource id is of the format:
+   `/subscriptions/SUBSCRIPTIONID/resourceGroups/RESOURCEGROUPNAME/providers/Microsoft.Network/routeTables/ROUTETABLENAME`
+2. Update properties of all subnets in the newly created VNET that are used by the Kubernetes cluster to refer to the route table resource by appending the following to subnet properties:
+
+   ```shell
+   "routeTable": {
+     "id": "/subscriptions/<SubscriptionId>/resourceGroups/<ResourceGroupName>/providers/Microsoft.Network/routeTables/<RouteTableName>"
+   }
+   ```
+
+   E.g.:
+
+   ```shell
+   "subnets": [
+     {
+       "name": "subnetname",
+       "id": "/subscriptions/<SubscriptionId>/resourceGroups/<ResourceGroupName>/providers/Microsoft.Network/virtualNetworks/<VNetName>/subnets/<SubnetName>",
+       "properties": {
+         "provisioningState": "Succeeded",
+         "addressPrefix": "10.240.0.0/16",
+         "routeTable": {
+           "id": "/subscriptions/<SubscriptionId>/resourceGroups/<ResourceGroupName>/providers/Microsoft.Network/routeTables/<RouteTableName>"
+         }
+         ....
+       }
+       ....
+     }
+   ]
+   ```
diff --git a/docs/kubernetes/features.md b/docs/kubernetes/features.md
new file mode 100644
index 000000000..7072c7e70
--- /dev/null
+++ b/docs/kubernetes/features.md
@@ -0,0 +1,114 @@
+# Features
+
+|Feature|Status|API Version|Example|Description|
+|---|---|---|---|---|
+|Managed Disks|Beta|`vlabs`|[kubernetes-vmss.json](../../examples/disks-managed/kubernetes-vmss.json)|[Description](#feat-managed-disks)|
+|Managed Identity|Alpha|`vlabs`|[kubernetes-msi.json](../../examples/managed-identity/kubernetes-msi.json)|[Description](#feat-kubernetes-msi)|
+|Calico Network Policy|Alpha|`vlabs`|[kubernetes-calico.json](../../examples/networkpolicy/kubernetes-calico.json)|[Description](#feat-calico)|
+
+<a name="feat-kubernetes-msi"></a>
+## Managed Identity
+
+Enabling Managed Identity configures acs-engine to include and use MSI identities for all interactions with the Azure Resource Manager (ARM) API.
+
+Instead of using a static service principal written to `/etc/kubernetes/azure.json`, Kubernetes will use a dynamic, time-limited token fetched from the MSI extension running on master and agent nodes. This support is currently alpha and requires Kubernetes v1.7.2 or newer.
+
+Enable Managed Identity by adding `useManagedIdentity` in `kubernetesConfig`.
+
+```json
+"kubernetesConfig": {
+  "useManagedIdentity": true,
+  "customHyperkubeImage": "docker.io/colemickens/hyperkube-amd64:3b15e8a446fa09d68a2056e2a5e650c90ae849ed"
+}
+```
+
+<a name="feat-managed-disks"></a>
+## Managed Disks
+
+[Managed disks](../../examples/disks-managed/README.md) are supported for both node OS disks and Kubernetes persistent volumes.
+
+See the related [upstream PR](https://github.com/kubernetes/kubernetes/pull/46360) for details.
+
+### Using Kubernetes Persistent Volumes
+
+By default, each ACS-Engine cluster is bootstrapped with several StorageClass resources. This bootstrapping is handled by the addon-manager pod that creates resources defined under the /etc/kubernetes/addons directory on master VMs.
+
+#### Non-managed Disks
+
+The default storage class has been set via the Kubernetes admission controller `DefaultStorageClass`.
+
+The default storage class will be used if persistent volume resources don't specify a storage class as part of the resource definition.
+
+The default storage class uses non-managed blob storage and will provision the blob within an existing storage account present in the resource group or provision a new storage account.
+
+Non-managed persistent volume types are available on all VM sizes.
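+As a minimal sketch (the claim name is illustrative), a persistent volume claim that omits a storage class binds to this default class:
+
+```yaml
+kind: PersistentVolumeClaim
+apiVersion: v1
+metadata:
+  # illustrative name; no storageClassName is specified,
+  # so the default (non-managed) storage class is used
+  name: example-claim
+spec:
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 5Gi
+```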
+
+#### Managed Disks
+
+As part of cluster bootstrapping, two storage classes will be created to provide access to create Kubernetes persistent volumes using Azure managed disks.
+
+Nodes will be labelled as follows if they support managed disks:
+
+```
+storageprofile=managed
+storagetier=<Standard_LRS|Premium_LRS>
+```
+
+The two storage classes are managed-premium and managed-standard, and they map to the Premium_LRS and Standard_LRS managed disk types respectively.
+
+In order to use these storage classes the following conditions must be met:
+
+* The cluster must be running Kubernetes version 1.7.2 or greater. Refer to this [example](../../examples/kubernetesversions/kubernetes1.7.1.json) for how to provision a Kubernetes cluster of a specific version.
+* The node must support managed disks. See this [example](../../examples/disks-managed/kubernetes-vmas.json) to provision nodes with managed disks. You can also confirm if a node has managed disks using kubectl.
+
+```console
+kubectl get nodes -l storageprofile=managed
+NAME                    STATUS    AGE       VERSION
+k8s-agent1-23731866-0   Ready     24m       v1.7.2
+```
+
+* The VM size must support the type of managed disk requested. For example, Premium VM sizes with managed OS disks support both managed-standard and managed-premium storage classes, whereas Standard VM sizes with managed OS disks only support the managed-standard storage class.
+
+* If you have a mixed node cluster (both non-managed and managed disk types), you must use [affinity or nodeSelectors](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/) on your resource definitions in order to ensure that workloads are scheduled to VMs that support the underlying disk requirements.
+
+For example:
+```
+spec:
+  affinity:
+    nodeAffinity:
+      requiredDuringSchedulingIgnoredDuringExecution:
+        nodeSelectorTerms:
+        - matchExpressions:
+          - key: storageprofile
+            operator: In
+            values:
+            - managed
+```
+
+<a name="feat-calico"></a>
+## Network Policy Enforcement with Calico
+
+Using the default configuration, Kubernetes allows communication between all
+Pods within a cluster. To ensure that Pods can only be accessed by authorized
+Pods, a policy enforcement is needed. To enable policy enforcement using Calico refer to the [cluster definition](https://github.com/Azure/acs-engine/blob/master/docs/clusterdefinition.md#kubernetesconfig) document under networkPolicy. There is also a reference cluster definition available [here](https://github.com/Azure/acs-engine/blob/master/examples/networkpolicy/kubernetes-calico.json).
+
+This will deploy a Calico node controller to every instance of the cluster
+using a Kubernetes DaemonSet. After a successful deployment you should be able
+to see these Pods running in your cluster:
+
+```
+kubectl get pods --namespace kube-system -l k8s-app=calico-node -o wide
+NAME                READY     STATUS    RESTARTS   AGE       IP             NODE
+calico-node-034zh   2/2       Running   0          2h        10.240.255.5   k8s-master-30179930-0
+calico-node-qmr7n   2/2       Running   0          2h        10.240.0.4     k8s-agentpool1-30179930-1
+calico-node-z3p02   2/2       Running   0          2h        10.240.0.5     k8s-agentpool1-30179930-0
+```
+
+By default, Calico still allows all communication within the cluster. Using Kubernetes' NetworkPolicy API, you can define stricter policies.
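+As an illustrative sketch (the name and labels here are hypothetical), a policy like the following allows Pods labelled `app: db` to receive traffic only from Pods labelled `app: api` (it uses the `networking.k8s.io/v1` API, which requires Kubernetes 1.7+):
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: db-allow-api
+spec:
+  # the Pods this policy protects
+  podSelector:
+    matchLabels:
+      app: db
+  ingress:
+    # only Pods with the label app=api may connect
+    - from:
+        - podSelector:
+            matchLabels:
+              app: api
+```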
+Good resources to get information about that are:
+
+* [NetworkPolicy User Guide](https://kubernetes.io/docs/user-guide/networkpolicies/)
+* [NetworkPolicy Example Walkthrough](https://kubernetes.io/docs/getting-started-guides/network-policy/walkthrough/)
+* [Calico Kubernetes](http://docs.projectcalico.org/v2.0/getting-started/kubernetes/)
diff --git a/docs/kubernetes.gpu.md b/docs/kubernetes/gpu.md
similarity index 94%
rename from docs/kubernetes.gpu.md
rename to docs/kubernetes/gpu.md
index 4b53e9251..814c5870a 100644
--- a/docs/kubernetes.gpu.md
+++ b/docs/kubernetes/gpu.md
@@ -4,7 +4,7 @@

 Here are the steps to deploy a simple Kubernetes cluster with multi-GPU support:

-1. [Install a Kubernetes cluster][Kubernetes Walkthrough](kubernetes.md) - shows how to create a Kubernetes cluster.
+1. [Install a Kubernetes cluster](deploy.md) - shows how to create a Kubernetes cluster.
 > NOTE: Make sure to configure the agent nodes with vm size `Standard_NC12` or above to utilize the GPUs

 2. Install drivers:
diff --git a/docs/kubernetes/troubleshooting.md b/docs/kubernetes/troubleshooting.md
new file mode 100644
index 000000000..f5367429e
--- /dev/null
+++ b/docs/kubernetes/troubleshooting.md
@@ -0,0 +1,24 @@
+## Troubleshooting
+
+### Scaling up or down
+
+Scaling your cluster up or down requires a different template and different parameters than the initial create. More details are here: [Scale up](../../examples/scale-up/README.md)
+
+If your cluster is not reachable, you can run the following command to check for common failures.
+
+### Misconfigured Service Principal
+
+If your Service Principal is misconfigured, none of the Kubernetes components will come up in a healthy manner.
+You can check to see if this is the problem:
+
+```shell
+ssh -i ~/.ssh/id_rsa USER@MASTERFQDN sudo journalctl -u kubelet | grep --text autorest
+```
+
+If you see output that looks like the following, then you have **not** configured the Service Principal correctly.
+You may need to check to ensure the credentials were provided accurately, and that the configured Service Principal has
+read and **write** permissions to the target Subscription.
+
+`Nov 10 16:35:22 k8s-master-43D6F832-0 docker[3177]: E1110 16:35:22.840688 3201 kubelet_node_status.go:69] Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: autorest#WithErrorUnlessStatusCode: POST https://login.microsoftonline.com/72f988bf-86f1-41af-91ab-2d7cd011db47/oauth2/token?api-version=1.0 failed with 400 Bad Request: StatusCode=400`
+
+See [this documentation](../serviceprincipal.md) on how to create/configure a service principal for an ACS-Engine Kubernetes cluster.
diff --git a/docs/kubernetes/walkthrough.md b/docs/kubernetes/walkthrough.md
new file mode 100644
index 000000000..df9212d39
--- /dev/null
+++ b/docs/kubernetes/walkthrough.md
@@ -0,0 +1,102 @@
+## Walkthrough
+
+Once your Kubernetes cluster has been created you will have a resource group containing:
+
+1. 1 master accessible by SSH on port 22 or kubectl on port 443
+
+2. a set of nodes in an availability set. The nodes can be accessed through a master. See [agent forwarding](../ssh.md#key-management-and-agent-forwarding-with-windows-pageant) for an example of how to do this.
+
+The following image shows the architecture of a container service cluster with 1 master, and 2 agents:
+
+![Image of Kubernetes cluster on azure](../images/kubernetes.png)
+
+In the image above, you can see the following parts:
+
+1. **Master Components** - The master runs the Kubernetes scheduler, api server, and controller manager. Port 443 is exposed for remote management with the kubectl cli.
+2. **Nodes** - the Kubernetes nodes run in an availability set. Azure load balancers are dynamically added to the cluster depending on exposed services.
+3. **Common Components** - All VMs run a kubelet, Docker, and a Proxy.
+4. **Networking** - All VMs are assigned an ip address in the 10.240.0.0/16 network. Each VM is assigned a /24 subnet for their pod CIDR enabling IP per pod. The proxy running on each VM implements the service network 10.0.0.0/16.
+
+All VMs are in the same private VNET and are fully accessible to each other.
+
+## Create your First Kubernetes Service
+
+After completing this walkthrough you will know how to:
+ * access the Kubernetes cluster via SSH,
+ * deploy a simple Docker application and expose it to the world,
+ * find the location of the Kube config file and access the Kubernetes cluster remotely,
+ * use `kubectl exec` to run commands in a container,
+ * and finally access the Kubernetes dashboard.
+
+1. After successfully deploying the template, write down the master FQDNs (Fully Qualified Domain Name).
+   1. If using Powershell or CLI, the output parameter is in the OutputsString section named 'masterFQDN'
+   2. If using Portal, to get the output you need to:
+      1. navigate to "resource group"
+      2. click on the resource group you just created
+      3. then click on "Succeeded" under *last deployment*
+      4. then click on the "Microsoft.Template"
+      5. now you can copy the output FQDNs and sample SSH commands
+
+   ![Image of docker scaling](../images/portal-kubernetes-outputs.png)
+
+2. SSH to the master FQDN obtained in step 1.
+
+3. Explore your nodes and running pods:
+   1. to see a list of your nodes, type `kubectl get nodes`. If you want full detail of the nodes, add `-o yaml` to become `kubectl get nodes -o yaml`.
+   2. to see a list of running pods, type `kubectl get pods --all-namespaces`.
+
+4. Start your first Docker image by typing `kubectl run nginx --image nginx`. This will start the nginx Docker container in a pod on one of the nodes.
+
+5. Type `kubectl get pods -o yaml` to see the full details of the nginx deployment. You can see the host IP and the podIP. The pod IP is assigned from the pod CIDR on the host. Run curl to the pod ip to see the nginx output, eg. `curl 10.244.1.4`
+
+   ![Image of curl to podIP](../images/kubernetes-nginx1.png)
+
+6. The next step is to expose the nginx deployment as a Kubernetes service on the private service network 10.0.0.0/16:
+   1. expose the service with command `kubectl expose deployment nginx --port=80`.
+   2. get the service IP: `kubectl get service`
+   3. run curl to the IP, eg. `curl 10.0.105.199`
+
+   ![Image of curl to service IP](../images/kubernetes-nginx2.png)
+
+7. The final step is to expose the service to the world. This is done by changing the service type from `ClusterIP` to `LoadBalancer`:
+   1. edit the service: `kubectl edit svc/nginx`
+   2. change `type` from `ClusterIP` to `LoadBalancer` and save it. This will now cause Kubernetes to create an Azure Load Balancer with a public IP.
+   3. the change will take about 2-3 minutes. To watch the service change from "pending" to an external ip, type `watch 'kubectl get svc'`
+
+   ![Image of watching the transition from pending to external ip](../images/kubernetes-nginx3.png)
+
+
+8. The next step in this walkthrough is to show you how to remotely manage your Kubernetes cluster. First download kubectl to your machine and put it in your path:
+   * [Windows Kubectl](https://storage.googleapis.com/kubernetes-release/release/v1.6.0/bin/windows/amd64/kubectl.exe)
+   * [OSX Kubectl](https://storage.googleapis.com/kubernetes-release/release/v1.6.0/bin/darwin/amd64/kubectl)
+   * [Linux](https://storage.googleapis.com/kubernetes-release/release/v1.6.0/bin/linux/amd64/kubectl)
+
+9. The Kubernetes master contains the kube config file for remote access under the home directory at ~/.kube/config. Download this file to your machine, set the KUBECONFIG environment variable, and run kubectl to verify you can connect to the cluster:
+   * Windows: use pscp from [putty](http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html). Ensure you have your certificate exposed through [pageant](../ssh.md#key-management-and-agent-forwarding-with-windows-pageant):
+     ```
+     # MASTERFQDN is obtained in step 1
+     pscp -P 22 azureuser@MASTERFQDN:.kube/config .
+     SET KUBECONFIG=%CD%\config
+     kubectl get nodes
+     ```
+   * OS X or Linux:
+     ```
+     # MASTERFQDN is obtained in step 1
+     scp azureuser@MASTERFQDN:.kube/config .
+     export KUBECONFIG=`pwd`/config
+     kubectl get nodes
+     ```
+10. The next step is to show you how to run commands remotely in a Docker container:
+    1. Run `kubectl get pods` to show the name of your nginx pod
+    2. using your pod name, you can run a remote command on your pod, eg. `kubectl exec nginx-701339712-retbj date`
+    3. try running a remote bash session, eg. `kubectl exec nginx-701339712-retbj -it bash`. The following screenshot shows these commands:
+
+   ![Image of remote command execution](../images/kubernetes-remote.png)
+
+11. The final step of this tutorial is to show you the dashboard:
+    1. run `kubectl proxy` to open a local proxy to the cluster
+    2. in your browser, browse to the [dashboard](http://127.0.0.1:8001/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=_all)
+    3. browse around and explore your pods and services.
+
+   ![Image of Kubernetes dashboard](../images/kubernetes-dashboard.png)
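+
+When you are finished experimenting, you can clean up the demo resources (a suggested teardown, not part of the original walkthrough steps):
+
+```shell
+# Delete the service exposing nginx (plus any extras you created, such as
+# the illustrative nginx-public), then the deployment itself.
+kubectl delete service nginx
+kubectl delete deployment nginx
+```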
diff --git a/docs/kubernetes.windows.md b/docs/kubernetes/windows.md
similarity index 90%
rename from docs/kubernetes.windows.md
rename to docs/kubernetes/windows.md
index 7f0486834..5526f9968 100644
--- a/docs/kubernetes.windows.md
+++ b/docs/kubernetes/windows.md
@@ -4,12 +4,12 @@
 
 Here are the steps to deploy a simple Kubernetes cluster with Windows:
 
-1. [install acs-engine](acsengine.md#downloading-and-building-acs-engine)
-2. [generate your ssh key](ssh.md#ssh-key-generation)
-3. [generate your service principal](serviceprincipal.md)
+1. [install acs-engine](../acsengine.md#downloading-and-building-acs-engine)
+2. [generate your ssh key](../ssh.md#ssh-key-generation)
+3. [generate your service principal](../serviceprincipal.md)
 4. edit the [Kubernetes windows example](../examples/windows/kubernetes.json) and fill in the blank strings
-5. [generate the template](acsengine.md#generating-a-template)
-6. [deploy the output azuredeploy.json and azuredeploy.parameters.json](../README.md#deployment-usage)
+5. [generate the template](../acsengine.md#generating-a-template)
+6. [deploy the output azuredeploy.json and azuredeploy.parameters.json](../acsengine.md#deployment-usage)
 7. Temporary workaround when deploying a cluster in a custom VNET with Kubernetes 1.6.0:
    1. After a cluster has been created in step 6, get the id of the route table resource from the Microsoft.Network provider in your resource group.
      The route table resource id is of the format:
@@ -46,13 +46,13 @@ Once your Kubernetes cluster has been created you will have a resource group con
 
 1. 1 master accessible by SSH on port 22 or kubectl on port 443
 
-2. a set of windows and linux nodes. The windows nodes can be accessed through an RDP SSH tunnel via the master node. To do this, follow these [instructions](ssh.md#create-port-80-tunnel-to-the-master), replacing port 80 with 3389. Since your windows machine is already using port 3389, it is recommended to use 3390 to Windows Node 0, 10.240.245.5, 3391 to Windows Node 1, 10.240.245.6, and so on, as shown in the following image (an example tunnel command is sketched below the image):
+2. a set of windows and linux nodes. The windows nodes can be accessed through an RDP SSH tunnel via the master node. To do this, follow these [instructions](../ssh.md#create-port-80-tunnel-to-the-master), replacing port 80 with 3389. Since your windows machine is already using port 3389, it is recommended to use 3390 to Windows Node 0, 10.240.245.5, 3391 to Windows Node 1, 10.240.245.6, and so on, as shown in the following image (an example tunnel command is sketched below the image):
 
-![Image of Windows RDP tunnels](images/rdptunnels.png)
+![Image of Windows RDP tunnels](../images/rdptunnels.png)
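+
+For example, to open the tunnel to Windows Node 0 (a sketch - the user name, master FQDN, local port, and node IP are placeholders to adjust for your own cluster):
+
+```shell
+# Forward local port 3390 through the master to RDP (3389) on the first
+# Windows node, then point your RDP client at localhost:3390.
+ssh -L 3390:10.240.245.5:3389 azureuser@MASTERFQDN
+```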
 
 The following image shows the architecture of a container service cluster with 1 master, and 2 agents:
 
-![Image of Kubernetes cluster on azure with Windows](images/kubernetes-windows.png)
+![Image of Kubernetes cluster on azure with Windows](../images/kubernetes-windows.png)
 
 In the image above, you can see the following parts:
 
@@ -80,7 +80,7 @@ After completing this walkthrough you will know how to:
       4. then click on the "Microsoft.Template"
      5. now you can copy the output FQDNs and sample SSH commands
 
-      ![Image of docker scaling](images/portal-kubernetes-outputs.png)
+      ![Image of docker scaling](../images/portal-kubernetes-outputs.png)
 
 2. SSH to the master FQDN obtained in step 1.
 
@@ -219,7 +219,7 @@ After completing this walkthrough you will know how to:
 
 8. Type `watch kubectl get pods` to watch the deployment of the service, which takes about 10 minutes. Once it is running, type `kubectl get svc`, and for the app named `aspnet-webapi-todo` copy the external address and open it in your web browser. As shown in the following image, the traffic flows from your web browser to the ASP.NET WebAPI frontend and then to the hybrid container.
 
-   ![Image of hybrid traffic flow](images/hybrid-trafficflow.png)
+   ![Image of hybrid traffic flow](../images/hybrid-trafficflow.png)
 
 ## Troubleshooting
diff --git a/docs/serviceprincipal.md b/docs/serviceprincipal.md
index 95e13eded..97d4bdad1 100644
--- a/docs/serviceprincipal.md
+++ b/docs/serviceprincipal.md
@@ -34,10 +34,6 @@ There are several ways to create a Service Principal in Azure Active Directory:
    az vm list-sizes --location westus
    ```
 
-* **With the legacy [Azure XPlat CLI](https://github.com/Azure/azure-xplat-cli)**
-
-  Instructions: ["Use Azure CLI to create a service principal to access resources"](https://azure.microsoft.com/en-us/documentation/articles/resource-group-authenticate-service-principal-cli/)
-
 * **With [PowerShell](https://github.com/Azure/azure-powershell)**
 
   Instructions: ["Use Azure PowerShell to create a service principal to access resources"](https://azure.microsoft.com/en-us/documentation/articles/resource-group-authenticate-service-principal/)
diff --git a/docs/swarm.md b/docs/swarm.md
index 44c56c005..ff28a0c7a 100644
--- a/docs/swarm.md
+++ b/docs/swarm.md
@@ -8,7 +8,7 @@ Here are the steps to deploy a simple Swarm cluster:
 2. [generate your ssh key](ssh.md#ssh-key-generation)
 3. edit the [Swarm example](../examples/swarm.json) and fill in the blank strings
 4. [generate the template](acsengine.md#generating-a-template)
-5. [deploy the output azuredeploy.json and azuredeploy.parameters.json](../README.md#deployment-usage)
+5. [deploy the output azuredeploy.json and azuredeploy.parameters.json](../acsengine.md#deployment-usage)
 
 ## Walkthrough
diff --git a/docs/swarmmode.md b/docs/swarmmode.md
index e3dd07ad6..66dd963db 100644
--- a/docs/swarmmode.md
+++ b/docs/swarmmode.md
@@ -8,7 +8,7 @@ Here are the steps to deploy a simple Swarm Mode cluster:
 2. [generate your ssh key](ssh.md#ssh-key-generation)
 3. edit the [Swarm Mode example](../examples/swarmmode.json) and fill in the blank strings
 4. [generate the template](acsengine.md#generating-a-template)
-5. [deploy the output azuredeploy.json and azuredeploy.parameters.json](../README.md#deployment-usage)
+5. [deploy the output azuredeploy.json and azuredeploy.parameters.json](../acsengine.md#deployment-usage)
 
 ## Walkthrough