From 5e0a28410f89ac6da93b8d62fd5637610690db7e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?S=C3=B6ren=20Henning?= <soeren.henning@email.uni-kiel.de>
Date: Mon, 17 Jan 2022 14:07:52 +0100
Subject: [PATCH] Add docs for generic SLO checker

---
 docs/creating-an-execution.md | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/docs/creating-an-execution.md b/docs/creating-an-execution.md
index e70893e7e..263d630ff 100644
--- a/docs/creating-an-execution.md
+++ b/docs/creating-an-execution.md
@@ -58,7 +58,29 @@ As a Benchmark may define multiple supported load and resource types, an Executi
 ## Definition of SLOs
 
 SLOs provide a way to quantify whether a certain load intensity can be handled by a certain amount of provisioned resources.
-An Execution must at least specify one SLO to be checked.
+In Theodolite, SLOs are evaluated by requesting monitoring data from Prometheus and analyzing it in a benchmark-specific way.
+An Execution must define at least one SLO to be checked.
+
+A good choice to get started is defining an SLO of type `generic`:
+
+```yaml
+- sloType: "generic"
+  prometheusUrl: "http://prometheus-operated:9090"
+  offset: 0
+  properties:
+    externalSloUrl: "http://localhost:8082"
+    promQLQuery: "sum by(job) (kafka_streams_stream_task_metrics_dropped_records_total>=0)"
+    warmup: 60 # in seconds
+    queryAggregation: max
+    repetitionAggregation: median
+    operator: lte
+    threshold: 1000
+```
+
+All you have to do is define a [PromQL query](https://prometheus.io/docs/prometheus/latest/querying/basics/) describing which metrics should be requested (`promQLQuery`) and how the resulting time series should be evaluated. With `queryAggregation` you specify how the resulting time series is aggregated to a single value, and `repetitionAggregation` describes how the results of multiple repetitions are aggregated. Possible values are
+`mean`, `median`, `mode`, `sum`, `count`, `max`, `min`, `std`, `var`, `skew`, `kurt` as well as percentiles such as `p99` or `p99.9`. The result of aggregating all repetitions is checked against `threshold`. This check is performed using an `operator`, which specifies whether the result must be less than (`lt`), less than or equal to (`lte`), greater than (`gt`), or greater than or equal to (`gte`) the threshold.
+
+In case you need to evaluate monitoring data in a more flexible fashion, you can also change the value of `externalSloUrl` to point to your custom SLO checker. Have a look at the source code of the [generic SLO checker](https://github.com/cau-se/theodolite/tree/master/slo-checker/generic) to get started.
 
 ## Experimental Setup
 
@@ -72,7 +94,7 @@ The experimental setup can be configured by:
 
 ## Configuration Overrides
 
-In cases where only small modifications of a system under test should be benchmarked, it is not necessarily required to [create a new benchmark](creating-a-benchmark).
+In cases where only small modifications of a system under test should be benchmarked, it is not necessary to [create a new benchmark](creating-a-benchmark).
 Instead, also Executions allow to do small reconfigurations, such as switching on or off a specific Pod scheduler.
 This is done by defining `configOverrides` in the Execution.
 Each override consists of a patcher, defining which Kubernetes resource should be patched in which way, and a value the patcher is applied with.
-- 
GitLab
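
The evaluation that the `generic` SLO documentation in this patch describes (drop the warm-up period, aggregate each repetition's time series with `queryAggregation`, aggregate those per-repetition values with `repetitionAggregation`, then compare the result against `threshold` using `operator`) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the code of Theodolite's SLO checker: the function `check_slo`, the two lookup tables, the `(timestamp, value)` input format, and the choice to measure warm-up from each series' first timestamp are all hypothetical, and only a subset of the documented aggregations is covered.

```python
import operator
import statistics

# Hypothetical lookup tables covering a subset of the documented values.
AGGREGATIONS = {
    "mean": statistics.mean,
    "median": statistics.median,
    "max": max,
    "min": min,
    "sum": sum,
}
OPERATORS = {
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
}


def check_slo(repetitions, warmup, query_aggregation,
              repetition_aggregation, op, threshold):
    """Return True if the SLO is met.

    repetitions: one time series per repetition, each a list of
    (timestamp_in_seconds, value) pairs sorted by timestamp.
    """
    per_repetition = []
    for series in repetitions:
        start = series[0][0]
        # Assumption: warm-up is counted from the first sample of the series.
        values = [v for t, v in series if t - start >= warmup]
        per_repetition.append(AGGREGATIONS[query_aggregation](values))
    # Combine the single values of all repetitions, then apply the operator.
    result = AGGREGATIONS[repetition_aggregation](per_repetition)
    return OPERATORS[op](result, threshold)


# Made-up sample data: three repetitions, sampled every 30 seconds.
repetitions = [
    [(0, 5000), (30, 2000), (60, 900), (90, 950)],
    [(0, 100), (30, 200), (60, 1200), (90, 800)],
    [(0, 0), (30, 10), (60, 500), (90, 700)],
]
slo_met = check_slo(repetitions, warmup=60, query_aggregation="max",
                    repetition_aggregation="median", op="lte", threshold=1000)
print(slo_met)
```

With the settings from the YAML example (`max`, `median`, `lte`, `1000`), the per-repetition maxima after the 60-second warm-up are 950, 1200, and 700. Their median is 950, which is below the threshold, so the check passes even though one repetition exceeded it.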