ApplicationInsights-Kubernetes

Contents

Goal for the user experience
Why container id is optional while pod name is required
Use case analysis
How does it work when the container id wasn't supplied
Provide pod name
Manually setup container id
Algorithm

Both container id and pod name are useful telemetry payloads. For technical reasons, container id is considered optional, while pod name is required.

Both can be detected automatically in most cases; the exception is a Windows-based multi-container pod. See the table below for a quick reference.

This page explains, from a development perspective, why container id is treated as optional while pod name is required. Before diving in, here is the overall goal:

Goal for the user experience

End users can use the following table to quickly check whether manual configuration is needed.

| | Linux Container | Windows Container |
| --- | --- | --- |
| Single Container Pod | No manual settings needed. | No manual settings needed. |
| Multiple Container Pod | No manual settings needed. | Without manual settings, the container id is missing from telemetry. Users can set the `ContainerId` or `ContainerName` environment variable for full telemetry. |

Again, in general, no manual configuration is required. For a Windows-based multi-container pod, however, setting up the container id manually brings the experience on par with the other cases. This page explains why.

Why container id is optional while pod name is required

With the container id, the pod name can be queried. However, the container id is not easy to detect for an application running inside a Windows container.
- Refer to Design for ContainerIdProviders for more details.

The pod name is easier to fetch. It is the key to finding other Kubernetes-related information such as the ReplicaSet or the Deployment.

So the approach is this: when the pod name is detected or provided manually, the container id is inferred on a best-effort basis. If the container id can't be determined, it is left out of the telemetry without breaking the workflow.

To reach the goal of an optional container id, let's break down how the container id is used today:

Use case analysis

Container Id

There are 3 major use cases for container id:
1. As a property on the telemetry that provides container info at the time the telemetry was logged.
2. When the pod name is not provided, to locate the pod the container is running inside.
3. To fetch the container status at the initial stage of the application. Telemetry enhancement only happens after the container becomes Ready.

Pod Name

Pod name by itself is useful in telemetry, and it is also used to find other information such as the Deployment, ReplicaSet, and Node. It can also be used to find the container id when there is only 1 container inside the pod.

How does it work when the container id wasn't supplied

Assuming Pod name is provided:

The easy case first: when there is only 1 container inside the pod, that container's id is used.
When there is more than 1 container inside the pod, setting the `ContainerName` environment variable manually is recommended. Otherwise:
- For telemetry: the properties are left out when the container id is not available.
- For container status: the application waits until any one of the containers is ready before continuing.
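The rules above can be sketched in a few lines. This is only an illustration, not the library's implementation: `ContainerStatus`, `infer_container_id`, and `is_ready_to_enhance` are hypothetical names, and matching a supplied `ContainerName` against the container names reported for the pod is an assumption about how that setting is used.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ContainerStatus:
    """Hypothetical, simplified view of one container status reported for a pod."""
    name: str
    container_id: str
    ready: bool


def infer_container_id(
    statuses: Sequence[ContainerStatus],
    container_name: Optional[str] = None,
) -> Optional[str]:
    """Best-effort container id; None means 'leave it out of the telemetry'."""
    if container_name:
        # Assumption: a manually supplied ContainerName selects one container by name.
        for status in statuses:
            if status.name == container_name:
                return status.container_id
        return None
    if len(statuses) == 1:
        # Single-container pod: the only container is the one we run in.
        return statuses[0].container_id
    # Multi-container pod without a hint: ambiguous, so give up gracefully.
    return None


def is_ready_to_enhance(
    statuses: Sequence[ContainerStatus],
    container_id: Optional[str],
) -> bool:
    """Known id: wait for that container. Unknown id: any ready container is enough."""
    if container_id is not None:
        return any(s.ready for s in statuses if s.container_id == container_id)
    return any(s.ready for s in statuses)
```

Returning `None` instead of raising keeps the container id optional: callers simply omit the property and carry on.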

That means that as long as the pod name is available, the workflow always works. So how is the pod name obtained?

Provide pod name

Compared to getting the container id, getting the pod name is easier. It is set automatically and can be overridden.

According to container information, the pod name is set in the `$HOSTNAME` environment variable, and ApplicationInsights.Kubernetes picks it up from there.

The hostname of a Container is the name of the Pod in which the Container is running. It is available through the hostname command or the gethostname function call in libc.

By default, the pod name is fetched automatically from the `$HOSTNAME` environment variable.
Overriding the value is also supported through the `$APPINSIGHTS_KUBERNETES_POD_NAME` environment variable. The value can be hardcoded, or it can come from the downward API.
- Refer to Use Pod fields as values for environment variables for how to do it. A quick snippet for example:
```
...
env:
- name: APPINSIGHTS_KUBERNETES_POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
...
```
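The lookup order described above (explicit override first, then the hostname) can be illustrated with a short sketch. It is not the library's code: `pod_name_from_environment` is a hypothetical helper, and treating an empty or whitespace-only value as "not set" is an assumption.

```python
import os
from typing import Mapping, Optional

# Environment variable names taken from the text above.
POD_NAME_OVERRIDE = "APPINSIGHTS_KUBERNETES_POD_NAME"
HOSTNAME = "HOSTNAME"


def pod_name_from_environment(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the pod name candidate, preferring the explicit override."""
    env = os.environ if env is None else env
    for key in (POD_NAME_OVERRIDE, HOSTNAME):
        # Assumption: blank values are treated the same as missing ones.
        value = env.get(key, "").strip()
        if value:
            return value
    return None
```

Passing the environment in as a mapping keeps the helper easy to exercise without touching the real process environment.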
Alternatively, when a container id exists but no pod name is provided in the ways described above (or the provided value doesn't match what the cluster returns), the pod name is queried automatically by going over all available pods in the namespace and matching the first one that contains the container id.

So, manually providing a correct container id also completes the workflow. Here's how to provide the container id manually:

Manually setup container id

If the container id is really needed for telemetry, it is possible to set the ~~`ContainerId`~~ `ContainerName` environment variable to provide the value. Refer to how to provide container name manually if you are using a version higher than 6.1.1-beta2 (Recommended). Otherwise, refer to Design for ContainerIdProviders for using `ContainerId`.

Algorithm

Based on the description, here's an overview of the algorithm for locating pod names:
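A minimal sketch of that flow, based only on the description on this page: validate the provided pod name against the pods in the namespace, then fall back to matching by container id. The `Pod` record and `locate_pod` function are hypothetical, and the list of pods is assumed to have already been fetched from the cluster.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Pod:
    """Hypothetical, simplified pod record: a name plus its container ids."""
    name: str
    container_ids: List[str] = field(default_factory=list)


def locate_pod(
    pods: Sequence[Pod],
    candidate_name: Optional[str],
    container_id: Optional[str],
) -> Optional[Pod]:
    """Find the current pod among the pods of the namespace, or None."""
    # Step 1: use the provided name only if the cluster knows a pod by that name.
    if candidate_name:
        for pod in pods:
            if pod.name == candidate_name:
                return pod
    # Step 2: otherwise take the first pod that contains the container id.
    if container_id:
        for pod in pods:
            if container_id in pod.container_ids:
                return pod
    # Neither the name nor the container id identified a pod.
    return None
```

For example, `locate_pod(pods, "stale-name", "id-1")` skips the unknown name and returns the pod whose containers include `id-1`, which mirrors the "provided value doesn't match" case described earlier.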