Azure Red Hat OpenShift RP
Jim Minter 7ff540a66e
get tenantID from MSI
2019-12-02 19:23:58 -06:00

github.com/jim-minter/rp

Install

  1. Install the following:

    • go 1.12 or later
    • az client
  2. Log in to Azure:

    az login
    
  3. You will need a publicly resolvable DNS zone resource in your Azure subscription. RH ARO engineering: use the osadev.cloud zone in the dns resource group.

  4. You will need an RP AAD application with client secret authentication enabled. RH ARO engineering: use aro-v4-rp-shared and ask for the credentials.

  5. You will need a "first party" AAD application with client certificate authentication enabled. This mimics the first party production AAD application. A suitable key/certificate file can be generated using the following helper utility:

    # Non-RH ARO engineering only
    go run ./hack/genkey -extKeyUsage client "$RP_AAD_APPLICATION_NAME"
    

    RH ARO engineering: use the aro-v4-fp-shared AAD application. Ask for the key/certificate file (aro-v4-fp-shared.pem) to load into your key vault.

  6. You will need to set up the RP role definitions and assignments in your Azure subscription. This mimics the RBAC that ARM sets up. With at least User Access Administrator permissions on your subscription, do:

    # Non-RH ARO engineering only
    FP_SERVICEPRINCIPAL_ID=$(az ad sp list --all --query "[?appId=='$FP_AAD_APPLICATION_ID'].objectId" -o tsv)
    
    az deployment create -l eastus --template-file deploy/development-rbac.json --parameters "fpServicePrincipalId=$FP_SERVICEPRINCIPAL_ID"
    

    RH ARO engineering: the above step has already been done.

  7. You will need your own cluster AAD application with client secret authentication enabled.

    AZURE_CLUSTER_CLIENT_ID=$(az ad app create --display-name user-$USER-v4 --query appId -o tsv)
    az ad sp create --id "$AZURE_CLUSTER_CLIENT_ID"
    AZURE_CLUSTER_CLIENT_SECRET=$(az ad app credential reset --id "$AZURE_CLUSTER_CLIENT_ID" --query password -o tsv)
    
  8. Copy env.example to env, edit the values and source the env file. This file holds (only) the environment variables necessary for the RP to run.

    • AZURE_TENANT_ID: Azure tenant UUID

    • AZURE_SUBSCRIPTION_ID: Azure subscription UUID

    • AZURE_FP_CLIENT_ID: RP "first party" application client UUID

    • AZURE_CLIENT_ID: RP AAD application client UUID

    • AZURE_CLIENT_SECRET: RP AAD application client secret

    • AZURE_CLUSTER_CLIENT_ID: Cluster AAD application client UUID

    • AZURE_CLUSTER_CLIENT_SECRET: Cluster AAD application client secret

    • LOCATION: Azure location where RP and cluster(s) will run (default: eastus)

    • RESOURCEGROUP: Name of a new resource group which will contain the RP resources

    • PULL_SECRET: A cluster pull secret retrieved from Red Hat OpenShift Cluster Manager

    • RP_MODE: Set to development when not in production.

    cp env.example env
    vi env
    . ./env
    
  9. Choose the RP deployment parameters:

    • COSMOSDB_ACCOUNT: Name of a new CosmosDB account
    • DOMAIN: DNS subdomain shared by all clusters (RH: $RESOURCEGROUP.osadev.cloud)
    • KEYVAULT_NAME: Name of a new key vault
    • ADMIN_OBJECT_ID: AAD object ID for key vault admin(s) (RH: az ad group list --query "[?displayName=='Engineering'].objectId" -o tsv)
    • RP_SERVICEPRINCIPAL_ID: AAD object ID for RP principal (RH: az ad sp list --all --query "[?appDisplayName=='aro-v4-rp-shared'].objectId" -o tsv)
    • FP_SERVICEPRINCIPAL_ID: AAD object ID for "first party" service principal (RH: az ad sp list --all --query "[?appDisplayName=='aro-v4-fp-shared'].objectId" -o tsv)
  10. Create the resource group and deploy the RP resources:

    COSMOSDB_ACCOUNT=$RESOURCEGROUP
    DOMAIN=$RESOURCEGROUP.osadev.cloud
    KEYVAULT_NAME=$RESOURCEGROUP
    ADMIN_OBJECT_ID=$(az ad group list --query "[?displayName=='Engineering'].objectId" -o tsv)
    RP_SERVICEPRINCIPAL_ID=$(az ad sp list --all --query "[?appDisplayName=='aro-v4-rp-shared'].objectId" -o tsv)
    FP_SERVICEPRINCIPAL_ID=$(az ad sp list --all --query "[?appDisplayName=='aro-v4-fp-shared'].objectId" -o tsv)
    
    az group create -g "$RESOURCEGROUP" -l "$LOCATION"
    
    az group deployment create -g "$RESOURCEGROUP" --mode complete --template-file deploy/development-rp.json --parameters "location=$LOCATION" "databaseAccountName=$COSMOSDB_ACCOUNT" "domainName=$DOMAIN" "keyvaultName=$KEYVAULT_NAME" "adminObjectId=$ADMIN_OBJECT_ID" "rpServicePrincipalId=$RP_SERVICEPRINCIPAL_ID" "fpServicePrincipalId=$FP_SERVICEPRINCIPAL_ID"
    
  11. Load the application key/certificate into the key vault:

    AZURE_KEY_FILE=aro-v4-fp-shared.pem
    
    az keyvault certificate import --vault-name "$KEYVAULT_NAME" --name azure --file "$AZURE_KEY_FILE"
    
  12. Generate a self-signed serving key/certificate and load it into the key vault:

    TLS_KEY_FILE=localhost.pem
    
    go run ./hack/genkey localhost
    az keyvault certificate import --vault-name "$KEYVAULT_NAME" --name tls --file "$TLS_KEY_FILE"
    
  13. Create a glue record in the parent DNS zone:

    PARENT_DNS_RESOURCEGROUP=dns
    
    az network dns record-set ns create --resource-group "$PARENT_DNS_RESOURCEGROUP" --zone "$(cut -d. -f2- <<<"$DOMAIN")" --name "$(cut -d. -f1 <<<"$DOMAIN")"
    
    for ns in $(az network dns zone show --resource-group "$RESOURCEGROUP" --name "$DOMAIN" --query nameServers -o tsv); do
      az network dns record-set ns add-record --resource-group "$PARENT_DNS_RESOURCEGROUP" --zone "$(cut -d. -f2- <<<"$DOMAIN")" --record-set-name "$(cut -d. -f1 <<<"$DOMAIN")" --nsdname "$ns"
    done
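
The `cut` invocations in step 13 split `$DOMAIN` at its first dot: the first label becomes the NS record-set name and the remainder is the parent zone. As a sketch, the same split can be written with POSIX parameter expansion. The helper names `dns_child_label` and `dns_parent_zone` are made up for illustration and are not part of this repository.

```shell
# Hypothetical helpers (not part of this repository).

# Print the first label of a domain name,
# e.g. "mygroup" for "mygroup.osadev.cloud".
dns_child_label() {
    # %%.* strips the longest suffix that starts with a dot.
    printf '%s\n' "${1%%.*}"
}

# Print everything after the first dot,
# e.g. "osadev.cloud" for "mygroup.osadev.cloud".
dns_parent_zone() {
    # #*. strips the shortest prefix that ends with a dot.
    printf '%s\n' "${1#*.}"
}
```

With these, `--zone "$(dns_parent_zone "$DOMAIN")"` and `--name "$(dns_child_label "$DOMAIN")"` would select the same values as the `cut` calls above.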
    

Running the RP

go run ./cmd/rp
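
Because the RP takes its configuration from the environment, a variable that was never set tends to surface only once the process is running. A minimal sketch of a pre-flight check follows. `require_vars` is a hypothetical helper, and the variable names in the usage comment are copied from the env file description in the install steps; which of them the RP strictly requires is not stated here.

```shell
# Hypothetical pre-flight helper (not part of this repository).
# Prints the name of every listed variable that is unset or empty and
# returns non-zero if at least one is missing.
# Note: it uses the global names "missing", "name" and "value", and it
# does not check whether a variable is exported. Only pass trusted names.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup that also works in plain POSIX sh.
        eval "value=\${$name-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (illustrative variable list):
# require_vars AZURE_SUBSCRIPTION_ID AZURE_FP_CLIENT_ID AZURE_CLIENT_ID \
#     AZURE_CLIENT_SECRET RESOURCEGROUP && go run ./cmd/rp
```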

Useful commands

export VNET_RESOURCEGROUP=$RESOURCEGROUP-vnet
az group create -g "$VNET_RESOURCEGROUP" -l "$LOCATION"
az network vnet create -g "$VNET_RESOURCEGROUP" -n vnet --address-prefixes 10.0.0.0/9
az role assignment create --role Owner --assignee-object-id "$(az ad sp list --all --query "[?appId=='$AZURE_FP_CLIENT_ID'].objectId" -o tsv)" --scope "/subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$VNET_RESOURCEGROUP/providers/Microsoft.Network/virtualNetworks/vnet"
az role assignment create --role Owner --assignee-object-id "$(az ad sp list --all --query "[?appId=='$AZURE_CLUSTER_CLIENT_ID'].objectId" -o tsv)" --scope "/subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$VNET_RESOURCEGROUP/providers/Microsoft.Network/virtualNetworks/vnet"

export CLUSTER=cluster
  • Register a subscription:
curl -k -X PUT "https://localhost:8443/subscriptions/$AZURE_SUBSCRIPTION_ID?api-version=2.0" -H 'Content-Type: application/json' -d '{"state": "Registered", "properties": {"tenantId": "'"$AZURE_TENANT_ID"'"}}'
  • Create a cluster:
az network vnet subnet create -g "$VNET_RESOURCEGROUP" --vnet-name vnet -n "$CLUSTER-master" --address-prefixes 10.$((RANDOM & 127)).$((RANDOM & 255)).0/24
az network vnet subnet create -g "$VNET_RESOURCEGROUP" --vnet-name vnet -n "$CLUSTER-worker" --address-prefixes 10.$((RANDOM & 127)).$((RANDOM & 255)).0/24

envsubst <examples/cluster-v20191231.json | curl -k -X PUT "https://localhost:8443/subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$CLUSTER/providers/Microsoft.RedHatOpenShift/openShiftClusters/$CLUSTER?api-version=2019-12-31-preview" -H 'Content-Type: application/json' -d @-
  • Get a cluster:
curl -k "https://localhost:8443/subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$CLUSTER/providers/Microsoft.RedHatOpenShift/openShiftClusters/$CLUSTER?api-version=2019-12-31-preview"
  • Get a cluster's kubeadmin credentials:
curl -k -X POST "https://localhost:8443/subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$CLUSTER/providers/Microsoft.RedHatOpenShift/openShiftClusters/$CLUSTER/credentials?api-version=2019-12-31-preview" -H 'Content-Type: application/json' -d '{}'
  • List clusters in resource group:
curl -k "https://localhost:8443/subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$CLUSTER/providers/Microsoft.RedHatOpenShift/openShiftClusters?api-version=2019-12-31-preview"
  • List clusters in subscription:
curl -k "https://localhost:8443/subscriptions/$AZURE_SUBSCRIPTION_ID/providers/Microsoft.RedHatOpenShift/openShiftClusters?api-version=2019-12-31-preview"
  • Scale a cluster:
COUNT=4

curl -k -X PATCH "https://localhost:8443/subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$CLUSTER/providers/Microsoft.RedHatOpenShift/openShiftClusters/$CLUSTER?api-version=2019-12-31-preview" -H 'Content-Type: application/json' -d '{"properties": {"workerProfiles": [{"name": "worker", "count": '"$COUNT"'}]}}'
  • Delete a cluster:
curl -k -X DELETE "https://localhost:8443/subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$CLUSTER/providers/Microsoft.RedHatOpenShift/openShiftClusters/$CLUSTER?api-version=2019-12-31-preview"

az network vnet subnet delete -g "$VNET_RESOURCEGROUP" --vnet-name vnet -n "$CLUSTER-master"
az network vnet subnet delete -g "$VNET_RESOURCEGROUP" --vnet-name vnet -n "$CLUSTER-worker"
  • Delete a subscription:
curl -k -X PUT "https://localhost:8443/subscriptions/$AZURE_SUBSCRIPTION_ID?api-version=2.0" -H 'Content-Type: application/json' -d '{"state": "Deleted", "properties": {"tenantId": "'"$AZURE_TENANT_ID"'"}}'
  • List operations:
curl -k "https://localhost:8443/providers/Microsoft.RedHatOpenShift/operations?api-version=2019-12-31-preview"
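
The subnet commands under "Create a cluster" pick a random /24: masking the second octet with 127 keeps it within the 10.0.0.0/9 address space of the vnet created earlier. A sketch of that arithmetic as a function follows. `random_subnet_prefix` is a hypothetical name; like the original commands, it relies on the shell providing `$RANDOM` (bash does) and it does not check whether the prefix is already in use in the vnet.

```shell
# Hypothetical helper (not part of this repository).
# Prints a random /24 prefix inside 10.0.0.0/9:
#   second octet: 0..127 (RANDOM & 127)
#   third octet:  0..255 (RANDOM & 255)
random_subnet_prefix() {
    printf '10.%d.%d.0/24\n' "$((RANDOM & 127))" "$((RANDOM & 255))"
}
```

It could then be used as `--address-prefixes "$(random_subnet_prefix)"`. Two independent draws can collide, so if subnet creation fails with an overlap error, retrying with a fresh prefix is the simplest way out.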

Basic architecture

  • pkg/frontend is intended to become a spec-compliant RP web server. It is backed by CosmosDB. Incoming PUT/DELETE requests are written to the database with a non-terminal (Updating/Deleting) provisioningState.

  • pkg/backend reads documents with non-terminal provisioningStates, asynchronously updates them, and finally updates each document with a terminal provisioningState (Succeeded/Failed). The backend updates the document with a heartbeat; if this fails, the document will be picked up by a different worker.

  • As CosmosDB does not support document patch, care is taken to correctly pass through any fields in the internal model which the reader is unaware of (see github.com/ugorji/go/codec.MissingFielder). This is intended to help in upgrade cases and (in the future) with multiple microservices reading from the database in parallel.

  • Care is taken to correctly use optimistic concurrency to avoid document corruption through concurrent writes (see RetryOnPreconditionFailed).

  • The pkg/api architecture differs somewhat from github.com/openshift/openshift-azure: the intention is to fix the broken merge semantics and try pushing validation into the versioned APIs to improve error reporting.

  • Everything is intended to be crash/restart/upgrade-safe, horizontally scalable, upgradeable...
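
Since the frontend only records a write and the backend completes it asynchronously, a client typically polls the cluster resource until its provisioningState becomes terminal. A rough client-side sketch follows. It assumes `jq` is installed and that the state is returned at `.properties.provisioningState` in the GET response; both are assumptions rather than something this README states. The function names are hypothetical.

```shell
# Hypothetical helpers (not part of this repository).

# Succeeds if the given provisioningState is terminal (Succeeded/Failed).
is_terminal_state() {
    case "$1" in
        Succeeded|Failed) return 0 ;;
        *) return 1 ;;
    esac
}

# Polls the cluster every 10 seconds, up to $1 attempts (default 60).
# Returns 0 on Succeeded, 1 on Failed or when the attempts run out.
# Assumes AZURE_SUBSCRIPTION_ID and CLUSTER are set as in "Useful commands".
wait_for_cluster() {
    attempts=${1:-60}
    url="https://localhost:8443/subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$CLUSTER/providers/Microsoft.RedHatOpenShift/openShiftClusters/$CLUSTER?api-version=2019-12-31-preview"
    while [ "$attempts" -gt 0 ]; do
        # The JSON path below is an assumption about the response shape.
        state=$(curl -sk "$url" | jq -r '.properties.provisioningState')
        echo "provisioningState: $state"
        if is_terminal_state "$state"; then
            if [ "$state" = Succeeded ]; then return 0; fi
            return 1
        fi
        attempts=$((attempts - 1))
        sleep 10
    done
    return 1
}
```

The attempt limit is there because a deleted cluster would presumably stop returning a state at all, and an unbounded loop would then never end.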

Debugging

  • Get an admin kubeconfig:

    hack/get-admin-kubeconfig.sh /subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$CLUSTER/providers/Microsoft.RedHatOpenShift/openShiftClusters/$CLUSTER
    export KUBECONFIG=admin.kubeconfig
    oc version
    
  • SSH to the bootstrap node:

    hack/ssh-bootstrap.sh /subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$CLUSTER/providers/Microsoft.RedHatOpenShift/openShiftClusters/$CLUSTER