a7b89c66c8 | ||
---|---|---|
.. | ||
regions | ||
templates | ||
README.md | ||
acceptance-tests.sh | ||
commander.sh | ||
deploy_utils.py | ||
deployments.py | ||
env.py | ||
requirements.txt | ||
rollouts.py | ||
tasks.py |
README.md
SUMO K8s deployment
Local Requirements:
- Python 3
- Connectivity to SUMO Kubernetes cluster(s)
Release convention
- Releases to the development environment can happen either from the main or from a feature branch.
- Releases to the stage environment happen from the main branch.
- Releases to the production environment happen from the main branch after successful QA testing in the stage environment.
Because there might be a need to deploy while testing merged changes in main,
deployments to production and stage may happen from the production branch.
The production branch maintains an image that can be released in production without further validation.
Deploying SUMO
Setup
-
Move to the SUMO
k8s
directorycd k8s
-
Install needed packages
virtualenv venv pip install -r ./requirements.txt
-
Create symbolic links to secrets for each namespace and region.
e.g.
ln -s ~/sumo-encrypted/k8s/secrets/frankfurt/sumo-stage-secrets.yaml ./regions/frankfurt/stage-secrets.yaml
Files to be created:
- ./regions/frankfurt/stage-secrets.yaml
- ./regions/frankfurt/prod-secrets.yaml
- ./regions/oregon/stage-secrets.yaml
- ./regions/oregon/prod-secrets.yaml
-
(Optional) Create symbolic links to
kubectl
andkubeconfig
to use in each region. Note thatkubectl
should match the version of the cluster, or have a version skew of 1 minor version.e.g.
ln -s ~/bin/kubectl ./regions/frankfurt/kubectl
Files to be created:
- ./regions/frankfurt/kubectl
- ./regions/frankfurt/kubeconfig
- ./regions/oregon/kubectl
- ./regions/oregon/kubeconfig
Configuration for deploying into an EKS cluster
-
Ensure that your
kubectl
binary matches the cluster version (check the cluster version in the table below). -
Authorize yourself into AWS. After making sure you are part of the right LDAP groups, run
$(maws)
to get a short lived AWS token. More on installing maws here. -
Get the
kubeconfig
file runningaws eks update-kubeconfig --name <cluster_name> --region <region> --alias <cluster_name>
adding the cluster name and region. You can find a table with the EKS clusters and regions below. This command will by default append the configuration to~/.kube/config
. After that, you should be able to talk to the cluster and perform operations likekubectl get namespaces
.
Here there is a list with the names of the clusters per region:
Region | Cluster Name | Version |
---|---|---|
eu-central-1 | sumo-eks-eu-central-1 | 1.21 |
us-west-2 | sumo-eks-us-west-2 | 1.21 |
Deploy SUMO with commander (recommended)
Choose a region and env:
| Region | Env | Namespace | Cluster | Version | |---|---|---|---| | frankfurt | stage | sumo-stage | eks | 1.21 | | frankfurt | prod | sumo-prod | eks | 1.21 | | oregon | stage | sumo-stage | eks | 1.21 | | oregon | prod | sumo-prod | eks | 1.21 |
-
Update the settings file with new image tags and settings.
-
Update the settings file with new image tags and settings.
-
Deploy with commander without secrets
./commander.sh deploy <region> <environment> <git sha>
E.g.
./commander.sh deploy frankfurt dev d7be392
-
Deploy with commander with secrets
./commander.sh deploy <region> <environment> <git sha> secrets
E.g.
./commander.sh deploy frankfurt dev d7be392 secrets
Deploying SUMO (low level)
-
Update the settings file with new image tags and settings.
-
If desired, perform a dry-run to see rendered K8s settings:
invoke -f ./regions/frankfurt/dev.yaml deployments.create-celery
invoke -f ./regions/frankfurt/dev.yaml deployments.create-cron
invoke -f ./regions/frankfurt/dev.yaml deployments.create-web
dry-run mode is enabled if you do not specify
--apply
.
- Apply changes to K8s:
invoke -f ./regions/frankfurt/dev.yaml deployments.create-celery --apply
invoke -f ./regions/frankfurt/dev.yaml deployments.create-cron --apply
invoke -f ./regions/frankfurt/dev.yaml deployments.create-web --apply
- Monitor the status of the rollout until it completes:
invoke -f ./regions/frankfurt/dev.yaml rollouts.status-celery
invoke -f ./regions/frankfurt/dev.yaml rollouts.status-cron
invoke -f ./regions/frankfurt/dev.yaml rollouts.status-web
- In an emergency, if the rollout is causing failures, you can roll-back to the previous state.
invoke -f ./regions/frankfurt/dev.yaml rollouts.rollback-celery
invoke -f ./regions/frankfurt/dev.yaml rollouts.rollback-cron
invoke -f ./regions/frankfurt/dev.yaml rollouts.rollback-web
Acceptance Tests
Run basic acceptance tests with
./acceptance-tests.sh <URL>
E.g. ./acceptance-tests.sh https://dev.sumo.mozit.cloud
List of invoke available tasks
$ invoke -l
Available tasks:
deployments.check-environment Ensure that a .yaml file has been specified
deployments.create-all Create web, cron and celery deployments
deployments.create-celery Create or update a SUMO celery deployment
deployments.create-cron Create or update a SUMO cron deployment
deployments.create-service Create or update a SUMO service with ELB
deployments.create-web Create or update a SUMO web deployment
deployments.delete-all Delete existing web, cron and celery deployments
deployments.delete-celery Delete an existing SUMO celery deployment
deployments.delete-cron Delete an existing SUMO cron deployment
deployments.delete-service Delete an existing SUMO service ELB
deployments.delete-web Delete an existing SUMO web deployment
rollouts.check-environment Ensure that a .yaml file has been specified
rollouts.rollback-celery Undo a celery deployment
rollouts.rollback-cron Undo a cron deployment
rollouts.rollback-web Undo a web deployment
rollouts.status-celery Check rollout status of a SUMO web deployment
rollouts.status-cron Check rollout status of a SUMO web deployment
rollouts.status-web Check rollout status of a SUMO web deployment
kubectl client version note
All Kubernetes clusters that serve SUMO are currently deployed on EKS version 1.21.2
When connecting to an older K8s cluster, you may need to download an older version of Kubectl that matches the version of the server.
Newer clients connecting to older K8s servers may display error messages such as:
Error from server (NotFound): the server could not find the requested resource
To determine the K8s server version:
$ kubectl version --short
Client Version: v1.21.5
Server Version: v1.21.2-eks-06eac09
Check the values that are part of Server Version
. The output above shows 1.21
(or v.1.21.2-eks-...
).
Next, download a binary following the instructions located here.
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.21.5/bin/linux/$(uname -m)/kubectl
chmod +x ./kubectl
# it may be useful to rename the binary to include K8s version number:
mv ./kubectl ./kubectl1.21.5
Next, set the KUBECTL_BIN
environment variable to override the name of the kubectl binary:
# next, set the KUBECTL_BIN environment variable to override the name of the kubectl binary:
KUBECTL_BIN=/home/metadave/kubectl1.21.5 invoke -f ./regions/frankfurt/dev.yaml deployments.create-web --apply
The default KUBECTL_BIN value is
kubectl
.
Checking the Celery queue
In order to check the size of an instance's celery queue, open a shell on a pod in that namespace:
kubectl exec -ti sumo-dev-celery-abc -- /bin/bash
Get the name of the queue you want to query:
./manage.py celery inspect active_queues
Open a python shell:
./manage.py shell_plus
Get the url of the redis instance:
In [1]: settings.BROKER_URL
Out[1]: 'redis://hostname:port/db'
Set up the redis client:
import redis
r = redis.Redis(host='hostname', port='port', db=db)
Then query the length of the queue, in this case called 'celery':
In [4]: r.llen('celery')
Out[4]: 44
The last item in the queue can be fetched with:
l = r.lrange('celery', -1, -1)
This must then be parsed and de-pickled:
import pickle, base64, json
pickle.loads(base64.decodestring(json.loads(l[0])['body']))