
Requirements

Kubernetes Cluster

For executing benchmarks, access to a Kubernetes cluster is required. We suggest creating a dedicated namespace for executing our benchmarks, for example as shown below. The following services need to be available as well.
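A namespace can be created and set as the default for subsequent commands as follows; the name benchmarking is just a placeholder:

kubectl create namespace benchmarking
kubectl config set-context --current --namespace=benchmarking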

Prometheus

We suggest using the Prometheus Operator and creating a dedicated Prometheus instance for these benchmarks.

If the Prometheus Operator is not already available on your cluster, a convenient way to install it is via the unofficial Prometheus Operator Helm chart. As you may not need an entire cluster monitoring stack, you can use our Helm configuration to install only the operator:

helm install prometheus-operator stable/prometheus-operator -f infrastructure/prometheus/helm-values.yaml

After installation, you need to create a Prometheus instance:

kubectl apply -f infrastructure/prometheus/prometheus.yaml

You might also need to apply the ServiceAccount, ClusterRole and ClusterRoleBinding, depending on your cluster's security policies, for example as sketched below.
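The file names below are assumptions about how these manifests are laid out next to prometheus.yaml; adjust them to the actual files in infrastructure/prometheus/:

kubectl apply -f infrastructure/prometheus/service-account.yaml
kubectl apply -f infrastructure/prometheus/cluster-role.yaml
kubectl apply -f infrastructure/prometheus/cluster-role-binding.yaml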

For the individual benchmarking components to be monitored, ServiceMonitors are used. See the corresponding sections below for how to install them.
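For orientation, a ServiceMonitor is a Prometheus Operator custom resource that declares which Services Prometheus should scrape. The following is only a hypothetical sketch (name, labels, and port are assumptions); the actual manifests ship in the corresponding infrastructure/ directories:

kubectl apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-monitor
  labels:
    app: example            # must match the Prometheus instance's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: example          # selects the Service whose endpoints are scraped
  endpoints:
    - port: metrics         # name of the Service port exposing the metrics
EOF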

Grafana

As with Prometheus, we suggest creating a dedicated Grafana instance. Grafana with our default configuration can be installed with Helm:

helm install grafana stable/grafana -f infrastructure/grafana/values.yaml

The official Grafana Helm Chart repository provides further documentation including a table of configuration options.

We provide ConfigMaps for a Grafana dashboard and a Grafana data source.

Create the ConfigMap for the dashboard:

kubectl apply -f infrastructure/grafana/dashboard-config-map.yaml

Create the ConfigMap for the data source:

kubectl apply -f infrastructure/grafana/prometheus-datasource-config-map.yaml
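To reach the Grafana UI locally, port forwarding can be used. In this sketch, the service name grafana follows from the Helm release name above and port 80 is the chart's default service port; both may differ in your setup:

kubectl port-forward svc/grafana 3000:80

Grafana should then be reachable at http://localhost:3000.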

A Kafka cluster

One possible way to set up a Kafka cluster is via Confluent's Helm Charts. For using these Helm charts in conjunction with the Prometheus Operator (see above), we provide a patch for them. Note that this patch is only required for monitoring and not for the actual benchmark execution and evaluation.

Our patched Confluent Helm Charts

To use our patched Confluent Helm Charts, clone the chart's repository. We also provide a default configuration. If you do not want to deploy 10 Kafka and 3 ZooKeeper instances, alter the configuration file accordingly. To install Confluent's Kafka with this configuration:

helm install my-confluent <path-to-cp-helm-charts> -f infrastructure/kafka/values.yaml
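As a sketch, the whole sequence could look as follows; the placeholder must be replaced with the patched charts repository linked above, and the local directory name is just an example:

git clone <patched-cp-helm-charts-repository> cp-helm-charts
helm install my-confluent ./cp-helm-charts -f infrastructure/kafka/values.yaml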

To let Prometheus scrape Kafka metrics, deploy a ServiceMonitor:

kubectl apply -f infrastructure/kafka/service-monitor.yaml

Other options for Kafka

Other Kafka deployments, for example using Strimzi, should work in a similar way.

The Kafka Lag Exporter

Lightbend's Kafka Lag Exporter can be installed via Helm. We also provide a default configuration. To install it:

helm install kafka-lag-exporter https://github.com/lightbend/kafka-lag-exporter/releases/download/v0.6.0/kafka-lag-exporter-0.6.0.tgz -f infrastructure/kafka-lag-exporter/values.yaml

To let Prometheus scrape Kafka lag metrics, deploy a ServiceMonitor:

kubectl apply -f infrastructure/kafka-lag-exporter/service-monitor.yaml
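To verify that lag metrics arrive, the exporter's metrics endpoint can be inspected directly. The service name below is an assumption derived from the chart's naming scheme; kafka_consumergroup_group_lag is one of the metrics the exporter publishes:

kubectl port-forward svc/kafka-lag-exporter-service 8000:8000 &
curl -s http://localhost:8000/metrics | grep kafka_consumergroup_group_lag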

Python 3.7

For executing benchmarks and analyzing their results, a Python 3.7 installation is required. We suggest using a virtual environment placed in the .venv directory, for example as shown below.
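A minimal way to set this up, assuming python3.7 is on your PATH:

python3.7 -m venv .venv
source .venv/bin/activate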

A set of requirements is needed for the analysis Jupyter notebooks and the execution tool. You can install them with the following command (make sure to be in your virtual environment if you use one):

pip install -r requirements.txt 

Required Manual Adjustments

Depending on your setup, some additional adjustments may be necessary:

  • Change the Kafka and ZooKeeper servers in the Kubernetes deployments (uc1-application etc.) and the run_XX.sh scripts
  • Change Prometheus' URL in lag_analysis.py (see the sketch below for locating the in-cluster address)
  • Change the path to your Python 3.7 virtual environment in the run_XX.sh scripts (to find the venv's bin/activate)
  • Change the name of your Kubernetes namespace in Prometheus' ClusterRoleBinding
  • Please let us know if further adjustments are necessary
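For the Prometheus URL, the in-cluster address of the operator-managed instance can be looked up as sketched below; the label selector matches the Service the Prometheus Operator creates for its instances, and 9090 is Prometheus' default port:

kubectl get svc -l operated-prometheus=true
# the in-cluster URL then typically has the form http://<service>.<namespace>.svc:9090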

Execution

The ./run_loop.sh script is the entry point for all benchmark executions. It has to be called as follows (an example invocation is given after the parameter list):

./run_loop.sh <use-case> <wl-values> <instances> <partitions> <cpu-limit> <memory-limit> <commit-interval> <duration>
  • <use-case>: Stream processing use case to be benchmarked. Has to be one of 1, 2, 3 or 4.
  • <wl-values>: Values for the workload generator to be tested, separated by commas. For example 100000, 200000, 300000.
  • <instances>: Numbers of instances to be benchmarked, separated by commas. For example 1, 2, 3, 4.
  • <partitions>: Number of partitions for Kafka topics. Optional. Default 40.
  • <cpu-limit>: Kubernetes CPU limit. Optional. Default 1000m.
  • <memory-limit>: Kubernetes memory limit. Optional. Default 4Gi.
  • <commit-interval>: Kafka Streams' commit interval in milliseconds. Optional. Default 100.
  • <duration>: Duration in minutes for which each subexperiment should be executed. Optional. Default 5.
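For example, a hypothetical invocation benchmarking use case 1 with three workload sizes and one to four instances, keeping all optional parameters at their defaults, could look like this (quoting the comma-separated lists is an assumption on the safe side, since they contain spaces):

./run_loop.sh 1 "100000, 200000, 300000" "1, 2, 3, 4"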