Monitoring Kubernetes Service Performance with Grafana and Prometheus

Monitoring the many moving parts of a complex Kubernetes deployment is critical. In this blog post, I walk through a brief example of service performance monitoring in a Kubernetes cluster. The example simulates an increase in service usage that Kubernetes platform operators need to be aware of in order to avoid critical outages.

The setup for this simulation consists of two Kubernetes pods with one instance of a service running in each pod. Grafana and Prometheus monitor the pods. The prepackaged Grafana dashboard reports any increase in CPU usage by the pods.

Prerequisites

In this example, I use a Kubernetes Deployment of an HTTP echo server to act as the Kubernetes service. For more details on how to deploy this server, please refer to the Kubernetes documentation.
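For reference, the hello-minikube service used throughout this post can be created with commands along these lines, assuming the echoserver image from the Kubernetes hello-minikube tutorial (the prerequisites below must be in place first; your image and port may differ):

$ kubectl create deployment hello-minikube --image=k8s.gcr.io/echoserver:1.10
$ kubectl expose deployment hello-minikube --type=NodePort --port=8080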

Before creating a Kubernetes service, make sure you first install and configure the following infrastructure elements:

  • Minikube is a tool that allows you to run Kubernetes on a local machine. My minikube instance runs in the VirtualBox hypervisor, but you can use other hypervisors (see the example commands after this list).
  • Kubectl is a command-line tool for controlling Kubernetes clusters.
  • Prometheus is a metrics collection and alerting system. Grafana is analytics and visualization software.
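For reference, a minikube cluster with the VirtualBox driver can be started and verified as follows (a minimal sketch; the driver and exact output depend on your environment):

$ minikube start --driver=virtualbox
$ minikube status
$ kubectl get nodes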

Deploy Grafana and Prometheus to the Kubernetes Cluster

First, install Helm. Then add the repository of stable charts:

helm repo add stable https://kubernetes-charts.storage.googleapis.com

Install the prometheus-operator:

helm install my-prometheus-operator stable/prometheus-operator
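To confirm that the operator, Prometheus, and Grafana pods are up, list the pods in the default namespace (the exact pod names depend on the Helm release name):

$ kubectl get pods -n default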

Please note that I’m using a single macOS machine for this Kubernetes cluster, but you can configure a similar setup by using different OS environments.

Scale Kubernetes Resources

Let’s create two Kubernetes pods, each running an instance of the hello-minikube service. Start the Kubernetes dashboard, which makes scaling easy:

$ minikube dashboard &
[1] 28998

To replicate the hello-minikube resource, perform the “Scale a resource” action in the Kubernetes dashboard, as shown in Figure 1 below, and create two replicas:

Figure 1: Scale a resource in the Kubernetes dashboard

Alternatively, execute the following kubectl command:

kubectl scale -n default replicaset hello-minikube-64b64df8c9 --replicas=2
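The same replica count can also be set on the Deployment that owns the ReplicaSet, which is generally more robust because it survives ReplicaSet re-creation (assuming the Deployment is named hello-minikube):

kubectl scale -n default deployment hello-minikube --replicas=2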

There are two hello-minikube pods now:

$ kubectl get pod | grep hello-minikube
hello-minikube-64b64df8c9-2s4rh   1/1   Running   9   7d7h
hello-minikube-64b64df8c9-w8wkz   1/1   Running   7   6d3h

 

Start Prometheus and Grafana

The next step is to start using Prometheus and Grafana for performance monitoring of the Kubernetes services.

Now, let’s expose Prometheus and Grafana on ports 9090 and 3000, respectively:

$ kubectl port-forward $(kubectl get pods --selector=app.kubernetes.io/name=grafana --output=jsonpath="{.items..metadata.name}") 3000 &
[2] 29016
$ Forwarding from 127.0.0.1:3000 -> 3000
Forwarding from [::1]:3000 -> 3000

$ kubectl port-forward -n default prometheus-my-prometheus-operator-prometheus-0 9090:9090 &
[3] 29020
$ Forwarding from 127.0.0.1:9090 -> 9090
Forwarding from [::1]:9090 -> 9090

Use the following URLs to access Grafana and Prometheus on the local machine:

http://localhost:3000
http://localhost:9090
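A quick way to confirm that both port forwards work is to query each service’s health endpoint (a sketch; the exact response format depends on the Grafana and Prometheus versions shipped with the chart):

$ curl -s http://localhost:3000/api/health
$ curl -s http://localhost:9090/-/healthy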

Please note that the default credentials for Grafana are:

Login: admin
Password: prom-operator
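If the chart was installed with a different release name, or the password has been customized, the Grafana admin password can be read from the secret created by the chart (the secret name below assumes the my-prometheus-operator release; on macOS you may need base64 -D instead of --decode):

$ kubectl get secret my-prometheus-operator-grafana -o jsonpath="{.data.admin-password}" | base64 --decode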

 

Observing Service Performance

Generating HTTP requests to the hello-minikube service increases CPU usage in the pods where the service runs. That increase can be observed and measured with Grafana and Prometheus.

Use the following minikube command to determine the URL for HTTP requests:

$ minikube service hello-minikube --url
http://192.168.99.100:32086

Let’s generate twenty HTTP GET requests to the hello-minikube service:

$ for i in `seq 1 20`; do curl http://192.168.99.100:32086; done
Hostname: hello-minikube-64b64df8c9-w8wkz
Pod Information:
-no pod information available-
Server values:
server_version=nginx: 1.13.3 - lua: 10008
Request Information:
client_address=172.17.0.1
method=GET
real path=/
query=
request_version=1.1
request_scheme=http
request_uri=http://192.168.99.100:8080/
Request Headers:
accept=*/*
host=192.168.99.100:32086
user-agent=curl/7.64.1
Request Body:
-no body in request-
Hostname: hello-minikube-64b64df8c9-2s4rh
Pod Information:
-no pod information available-
Server values:
server_version=nginx: 1.13.3 - lua: 10008
Request Information:
client_address=172.17.0.1
method=GET
real path=/
query=
request_version=1.1
request_scheme=http
request_uri=http://192.168.99.100:8080/
Request Headers:
accept=*/*
host=192.168.99.100:32086
user-agent=curl/7.64.1
Request Body:
-no body in request-

The HTTP GET requests are evenly distributed between the two hello-minikube pods: hello-minikube-64b64df8c9-w8wkz and hello-minikube-64b64df8c9-2s4rh. On the following charts, you can see two CPU usage spikes for each of the hello-minikube pods.

The spikes are caused by two groups of twenty HTTP GET requests, generated one after another. Each group is initiated by the for-loop with the curl command shown above.
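The prepackaged Grafana dashboard builds its CPU Usage chart from Prometheus cAdvisor metrics. A roughly equivalent query can be issued directly against the Prometheus API through the port forward (a sketch; the exact expression used by the dashboard, and whether the label is pod or pod_name, depends on the chart and Kubernetes versions):

$ curl -sG 'http://localhost:9090/api/v1/query' \
    --data-urlencode 'query=sum(rate(container_cpu_usage_seconds_total{namespace="default",pod=~"hello-minikube.*"}[5m])) by (pod)'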

The dashboard in Figure 2 below shows spikes in CPU usage for the hello-minikube-64b64df8c9-w8wkz pod:

Figure 2: Grafana Compute Resource Dashboard – CPU Usage chart for hello-minikube-64b64df8c9-w8wkz pod

The next chart shows spikes in CPU usage for the hello-minikube-64b64df8c9-2s4rh pod:

Figure 3: Grafana Compute Resource Dashboard – CPU Usage chart for hello-minikube-64b64df8c9-2s4rh pod

Summary

In this blog post, I discussed the simulation of a real-life scenario that may lead to service performance deterioration.

The simulation setup includes a Kubernetes service, implemented as an HTTP server and running in two Kubernetes pods. curl commands generate HTTP requests to the service, which increases CPU usage in its pods. Prometheus and Grafana, running in the same Kubernetes cluster, detect the surge in CPU usage. Monitoring such surges helps operators avoid service performance issues.
