Kubenurse

Kubenurse is a little service that monitors all network connections in a Kubernetes cluster. Kubenurse measures request durations, records errors and exports those metrics in Prometheus format.

Here's an overview of the checks performed by kubenurse. Each check is exposed as a label value on the duration and error Prometheus metrics.

kubenurse request types

Grafana dashboard

Once the kubenurse pods are up and running and scraped by your metrics agent, you can import the example dashboard to start scrutinizing network latencies and errors.

Grafana overview

Metrics

All performed checks expose metrics that can be used for monitoring and alerting. The /metrics endpoint exposes the following metrics:

| metric name | labels | description |
|---|---|---|
| kubenurse_request_duration | type | (deprecated since v1.13.0) latency histogram for request duration, replaced with the metric below. |
| kubenurse_httpclient_request_duration_seconds | type | latency histogram for request duration, partitioned by request type |
| kubenurse_httpclient_trace_request_duration_seconds | type, event | latency histogram for httpclient trace metric instrumentation, partitioned by request type and httptrace connection events |
| kubenurse_httpclient_requests_total | type, code, method | counter for the total number of http requests, partitioned by HTTP code, method, and request type |
| kubenurse_errors_total | type, event | error counter, partitioned by httptrace event and request type |
| kubenurse_neighbourhood_incoming_checks | n/a | gauge which reports how many unique neighbours have queried the current pod in the last minute |

For metrics partitioned with a type label, it is possible to precisely know which request type increased an error counter, or to compare the latencies of multiple request types, for example compare how your service and ingress latencies differ.

Some event labels include dns_start, got_conn, tls_handshake_done, and more. The details can be seen in the httptrace.go file.
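As a rough illustration of how the type label can be used, the sketch below sums the error counter per request type from a snippet of Prometheus text output. The sample lines and their values are invented for illustration, and the regular expressions only handle the simple label syntax shown here; this is not a full parser for the exposition format.

```python
import re
from collections import defaultdict

# Hypothetical sample of /metrics output; the values are invented for illustration.
SAMPLE = '''\
kubenurse_errors_total{event="dns_start",type="me_ingress"} 3
kubenurse_errors_total{event="got_conn",type="me_ingress"} 1
kubenurse_errors_total{event="tls_handshake_done",type="api_server_dns"} 2
kubenurse_httpclient_requests_total{code="200",method="GET",type="me_service"} 120
'''

# Matches lines of the form: metric_name{label="value",...} number
LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def errors_by_type(text):
    """Sum kubenurse_errors_total samples per value of the type label."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match or match.group("name") != "kubenurse_errors_total":
            continue  # skip comments, blank lines and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        totals[labels.get("type", "")] += float(match.group("value"))
    return dict(totals)


print(errors_by_type(SAMPLE))
```

In practice the same aggregation would normally be done in the metrics backend with a query grouped by the type label; the snippet only shows what the label makes possible.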

Deployment

You can get the Docker image from Docker Hub. The examples directory contains manifests which can be used to deploy kubenurse to the kube-system namespace of your cluster.

Helm deployment

You can also deploy kubenurse with Helm. The chart is available from the repository https://postfinance.github.io/kubenurse/ or in the directory ./helm/kubenurse/. To install kubenurse with Helm, run: helm upgrade [RELEASE_NAME] --install --repo https://postfinance.github.io/kubenurse/ kubenurse

Helm parameters

<details> <summary>helm parameters list</summary>

| Setting | Description | Default |
|---|---|---|
| daemonset.image.repository | The repository name | postfinance/kubenurse |
| daemonset.image.tag | The tag/version of the image | v1.4.0 |
| daemonset.podLabels | Additional labels to be added to the pods of the daemonset | [] |
| daemonset.podAnnotations | Additional annotations to be added to the pods of the daemonset | [] |
| daemonset.podSecurityContext | The security context of the daemonset | {} |
| daemonset.priorityClassName | The priority class name for the daemonset pods | "" |
| daemonset.containerSecurityContext | The security context of the containers within the pods of the daemonset | {} |
| daemonset.containerResources | The container resources of the containers within the pods of the daemonset | {} |
| daemonset.containerImagePullPolicy | The container image pull policy of the pods of the daemonset | IfNotPresent |
| daemonset.tolerations | The tolerations of the daemonset | See Default tolerations below |
| daemonset.dnsConfig | Specifies the DNS parameters of the pods in the daemonset | {} |
| daemonset.volumeMounts | Additional volumeMounts to be added to the pods of the daemonset | [] |
| daemonset.volumes | Additional volumes to be added to the daemonset | [] |
| daemonset.rollingUpdate.maxUnavailable | The maximum number of DaemonSet pods that can be unavailable during the update | 34% |
| daemonset.rollingUpdate.maxSurge | The maximum number of nodes with an existing available DaemonSet pod that can have an updated pod during an update | |
| serviceMonitor.enabled | Adds a ServiceMonitor for use with Prometheus-operator | false |
| serviceMonitor.labels | Additional labels to be added to the ServiceMonitor | {} |
| serviceMonitor.relabelings | Additional relabelings to be added to the endpoint of the ServiceMonitor | [] |
| serviceAccount.name | The name of the service account which is used | Release.Name |
| service.name | The name of service which exposes the kubenurse application | 8080-8080 |
| service.port | The port number of the service | 8080 |
| service.labels | Additional labels to be added to the Service | |
| ingress.enabled | Enable/disable the ingress | true |
| ingress.className | The classname of the ingress controller (e.g. the nginx ingress controller) | nginx |
| ingress.url | The url of the ingress; e.g. kubenurse.westeurope.cloudapp.example.com | dummy-kubenurse.example.com |
| insecure | Sets KUBENURSE_INSECURE environment variable | true |
| allow_unschedulable | Sets KUBENURSE_ALLOW_UNSCHEDULABLE environment variable | false |
| neighbour_filter | Sets KUBENURSE_NEIGHBOUR_FILTER environment variable | app.kubernetes.io/name=kubenurse |
| neighbour_limit | Sets KUBENURSE_NEIGHBOUR_LIMIT environment variable | 10 |
| histogram_buckets | Sets KUBENURSE_HISTOGRAM_BUCKETS environment variable | |
| extra_ca | Sets KUBENURSE_EXTRA_CA environment variable | |
| extra_checks | Sets KUBENURSE_EXTRA_CHECKS environment variable | |
| kubernetes_service_dns | Sets KUBERNETES_SERVICE_DNS environment variable | |
| check_api_server_direct | Sets KUBENURSE_CHECK_API_SERVER_DIRECT environment variable | true |
| check_api_server_dns | Sets KUBENURSE_CHECK_API_SERVER_DNS environment variable | true |
| check_me_ingress | Sets KUBENURSE_CHECK_ME_INGRESS environment variable | true |
| check_me_service | Sets KUBENURSE_CHECK_ME_SERVICE environment variable | true |
| check_neighbourhood | Sets KUBENURSE_CHECK_NEIGHBOURHOOD environment variable | true |
| check_interval | Sets KUBENURSE_CHECK_INTERVAL environment variable | 5s |
| reuse_connections | Sets KUBENURSE_REUSE_CONNECTIONS environment variable | false |
| use_tls | Sets KUBENURSE_USE_TLS environment variable | false |
| cert_file | Sets KUBENURSE_CERT_FILE environment variable | |
| cert_key | Sets KUBENURSE_CERT_KEY environment variable | |

</details>

Configuration

<details> <summary>kubenurse environment variables list</summary>

The KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT variables are injected into the Pod by Kubernetes and should not be defined manually.

The DNS name of the API server can be configured with the KUBERNETES_SERVICE_DNS environment variable, and it defaults to kubernetes.default.svc.cluster.local.

</details>

HTTP Endpoints

The kubenurse service listens for HTTP requests on port 8080 (optionally HTTPS on port 8443) and exposes, among others, the /alive, /alwayshappy and /metrics endpoints.

The /alive endpoint returns a JSON document like the following, with status code 200 if everything is OK and 500 otherwise:

{
  "api_server_direct": "ok",
  "api_server_dns": "ok",
  "me_ingress": "ok",
  "me_service": "ok",
  "hostname": "kubenurse-1234-x2bwx",
  "neighbourhood_state": "ok",
  "neighbourhood": [
   {
    "PodName": "kubenurse-1234-8fh2x",
    "PodIP": "10.10.10.67",
    "HostIP": "10.12.12.66",
    "NodeName": "k8s-66.example.com",
    "Phase": "Running"
   },
   {
    "PodName": "kubenurse-1234-ffjbs",
    "PodIP": "10.10.10.138",
    "HostIP": "10.12.12.89",
    "NodeName": "k8s-89.example.com",
    "Phase": "Running"
   }
  ],
  "headers": {
   "Accept": [
    "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8"
   ],
   "Accept-Encoding": [
    "gzip, deflate, br"
   ],
   ...
  }
}
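To make the structure of this response concrete, here is a small sketch that extracts the failing checks from an /alive payload. It assumes that a healthy check is reported as the string "ok" and that any other value indicates a failure, as the example suggests. The function name and the error text in the sample payload are invented.

```python
import json

# Check keys taken from the example response.
CHECK_KEYS = (
    "api_server_direct",
    "api_server_dns",
    "me_ingress",
    "me_service",
    "neighbourhood_state",
)


def failing_checks(payload):
    """Return {check name: reported value} for every check that is not "ok"."""
    data = json.loads(payload)
    # Assumption: a key that is absent is treated as healthy.
    return {key: data[key] for key in CHECK_KEYS if data.get(key, "ok") != "ok"}


# Hypothetical payload; the error text is made up for illustration.
sample = json.dumps({
    "api_server_direct": "ok",
    "api_server_dns": "ok",
    "me_ingress": "connection refused",
    "me_service": "ok",
    "neighbourhood_state": "ok",
    "neighbourhood": [],
})
print(failing_checks(sample))
```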

Health Checks

By default, the checks described below run every five seconds. The interval can be changed with the KUBENURSE_CHECK_INTERVAL environment variable.

API Server Direct

Checks the /version endpoint of the Kubernetes API Server through the direct link (KUBERNETES_SERVICE_HOST, KUBERNETES_SERVICE_PORT).

Metric type: api_server_direct

API Server DNS

Checks the /version endpoint of the Kubernetes API Server through the Cluster DNS URL https://kubernetes.default.svc:$KUBERNETES_SERVICE_PORT. This also verifies a working kube-dns deployment.

Metric type: api_server_dns
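The two API server checks differ only in how the target address is obtained. The following is a minimal sketch of the URL construction, not kubenurse's actual code: the helper name and the sample values are hypothetical, HTTPS is assumed for the direct link, and the DNS name falls back to the default documented for KUBERNETES_SERVICE_DNS.

```python
def api_server_urls(env):
    """Build the /version URLs for the direct and the DNS-based API server check."""
    host = env["KUBERNETES_SERVICE_HOST"]
    port = env["KUBERNETES_SERVICE_PORT"]
    # Assumption: fall back to the documented default DNS name when unset.
    dns_name = env.get("KUBERNETES_SERVICE_DNS", "kubernetes.default.svc.cluster.local")
    return {
        "api_server_direct": f"https://{host}:{port}/version",
        "api_server_dns": f"https://{dns_name}:{port}/version",
    }


# Sample values as Kubernetes might inject them; invented for illustration.
urls = api_server_urls({
    "KUBERNETES_SERVICE_HOST": "10.96.0.1",
    "KUBERNETES_SERVICE_PORT": "443",
})
for metric_type, url in urls.items():
    print(metric_type, url)
```

Because the second URL depends on name resolution, a failure of api_server_dns while api_server_direct succeeds points at the cluster DNS rather than at the API server itself.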

Me Ingress

Checks whether kubenurse is reachable at the /alwayshappy endpoint behind the ingress. The address is provided by the environment variable KUBENURSE_INGRESS_URL, which could look like https://kubenurse.example.com. This also verifies a correct upstream DNS resolution.

Metric type: me_ingress

Me Service

Checks whether kubenurse is reachable at the /alwayshappy endpoint through the Kubernetes service. The address is provided by the environment variable KUBENURSE_SERVICE_URL, which could look like http://kubenurse.mynamespace.svc:8080. This also verifies a working kube-proxy setup.

Metric type: me_service

Neighbourhood

Checks whether every neighbour kubenurse is reachable at the /alwayshappy endpoint. Neighbours are discovered by querying the kube-apiserver for every Pod in the KUBENURSE_NAMESPACE with the label KUBENURSE_NEIGHBOUR_FILTER. The request is sent directly to the Pod IP (port 8080, or 8443 if TLS is enabled), and the metric type consists of the prefix path_ followed by the hostname of the kubelet on which the neighbour kubenurse runs. Only kubenurses on schedulable nodes are considered neighbours; this can be changed by setting KUBENURSE_ALLOW_UNSCHEDULABLE="true".

Metric type: path_$KUBELET_HOSTNAME
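As an illustration, the sketch below derives the request URL and the metric type for a neighbour entry shaped like the ones in the /alive response. The helper name is hypothetical, and two points are assumptions: that the scheme follows the TLS setting in the same way the port does, and that the NodeName field corresponds to the kubelet hostname used in the metric type.

```python
def neighbour_check(neighbour, use_tls=False):
    """Return (url, metric_type) for one neighbour entry."""
    # Port 8080 for plain HTTP, 8443 when TLS is enabled.
    scheme, port = ("https", 8443) if use_tls else ("http", 8080)
    url = f"{scheme}://{neighbour['PodIP']}:{port}/alwayshappy"
    # Assumption: NodeName stands in for the kubelet hostname.
    metric_type = f"path_{neighbour['NodeName']}"
    return url, metric_type


# Neighbour entry copied from the /alive example response.
neighbour = {
    "PodName": "kubenurse-1234-8fh2x",
    "PodIP": "10.10.10.67",
    "HostIP": "10.12.12.66",
    "NodeName": "k8s-66.example.com",
    "Phase": "Running",
}
print(neighbour_check(neighbour))
print(neighbour_check(neighbour, use_tls=True))
```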

Neighbourhood filtering

The number of neighbourhood checks used to grow as $O(N^2)$, which made kubenurse impractical on large clusters, as documented in issue #55. To combat this, a node filtering feature was implemented: each kubenurse only queries a fixed number of nodes that follow its own node in a list sorted by checksum.

Here's an example with 6 nodes, where each node queries the next 3 nodes:

node filtering drawing

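Thanks to this, every node always queries the same 10 nodes, unless one of those nodes disappears, in which case kubenurse picks the next node in the sorted checksums list. As a result, the total number of checks grows linearly with the number of nodes rather than quadratically, and each node receives a predictable number of incoming queries.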

By default, the neighbourhood filtering is set to 10 nodes, which means that on clusters with more than 10 nodes, each kubenurse will query exactly 10 nodes, as described above.
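The selection can be sketched as follows. This is an illustrative model rather than kubenurse's actual implementation: it assumes SHA-256 as the checksum function and assumes that the sorted list wraps around at its end. The function names and node names are invented.

```python
import hashlib


def checksum(node_name):
    """Checksum of a node name (SHA-256 is an assumption made for this sketch)."""
    return hashlib.sha256(node_name.encode()).hexdigest()


def select_neighbours(own_node, all_nodes, limit=10):
    """Pick the `limit` nodes that follow own_node in the checksum-sorted list."""
    others = sorted((n for n in all_nodes if n != own_node), key=checksum)
    if limit == 0 or limit >= len(others):
        return others  # filtering disabled, or fewer nodes than the limit
    own = checksum(own_node)
    # Find the first node whose checksum is greater than our own;
    # if there is none, wrap around to the start of the list.
    start = next((i for i, n in enumerate(others) if checksum(n) > own), 0)
    rotated = others[start:] + others[:start]
    return rotated[:limit]


# Six hypothetical nodes, each querying the next three.
nodes = [f"node-{i}" for i in range(6)]
for node in nodes:
    print(node, "->", select_neighbours(node, nodes, limit=3))
```

Under these assumptions every node queries exactly three others and is in turn queried by exactly three, and a node that disappears is replaced by the next one in the sorted list without reshuffling the remaining assignments.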

Neighbourhood incoming checks metric

It is possible to check that each node receives the proper number of neighbourhood queries with the kubenurse_neighbourhood_incoming_checks metric. If you have the neighbourhood limit set to e.g. 10, then this metric should be equal to 10 on all nodes, with some variations during a rollout restart.

To bypass the node filtering feature, set the KUBENURSE_NEIGHBOUR_LIMIT environment variable to 0.
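To get a feeling for the effect of the limit, the following back-of-the-envelope sketch compares the number of neighbourhood checks per round with and without filtering. It assumes one kubenurse pod per node and that each node queries min(limit, N - 1) neighbours, with a limit of 0 meaning that all other nodes are queried. The function name is hypothetical.

```python
def checks_per_round(node_count, limit=10):
    """Return (outgoing checks per node, total checks in the cluster) for one round."""
    peers = node_count - 1
    # A limit of 0 disables filtering, so every other node is queried.
    per_node = peers if limit == 0 else min(limit, peers)
    return per_node, per_node * node_count


for n in (5, 50, 500):
    print(n, "filtered:", checks_per_round(n), "unfiltered:", checks_per_round(n, limit=0))
```

With 500 nodes, this model gives 5,000 checks per round with the default limit versus 249,500 without filtering, which is the quadratic growth the filtering feature is meant to avoid.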