Compare revisions

Changes are shown as if the source revision was being merged into the target revision.

Commits on Source (312)
Showing
with 3019 additions and 82 deletions
......@@ -91,7 +91,7 @@ test-docs-links:
build-docs-crds:
stage: build
image:
name: ghcr.io/fybrik/crdoc:0.6.1
name: ghcr.io/fybrik/crdoc:0.6.2
entrypoint: [""]
script: /crdoc --resources theodolite/crd/ --template docs/api-reference/crds.tmpl --output docs/api-reference/crds.ref.md
artifacts:
......@@ -160,7 +160,7 @@ test-helm:
# Display initial pods, etc.
- cd helm
- helm dependencies update .
- helm install theodolite . -f preconfigs/minimal.yaml --wait
- helm install theodolite . -f preconfigs/minimal.yaml --wait --timeout 10m0s
- kubectl get nodes -o wide
- kubectl get pods --all-namespaces -o wide
- kubectl get services --all-namespaces -o wide
......@@ -573,6 +573,16 @@ smoketest-uc3-kstreams:
DOCKER_COMPOSE_DIR: "uc3-kstreams"
JAVA_PROJECT_DEPS: "uc3-kstreams,kstreams-commons,uc3-load-generator,load-generator-commons"
smoketest-uc3-flink:
extends: .smoketest-benchmarks
needs:
- deploy-uc3-flink
- deploy-uc3-load-generator
variables:
DOCKER_COMPOSE_DIR: "uc3-flink"
JAVA_PROJECT_DEPS: "uc3-flink,flink-commons,uc3-load-generator,load-generator-commons"
smoketest-uc3-beam-flink:
extends: .smoketest-benchmarks
needs:
......@@ -747,7 +757,7 @@ deploy-theodolite:
test-slo-checker-lag-trend:
stage: test
needs: []
image: python:3.7-slim
image: python:3.8-slim
before_script:
- cd slo-checker/record-lag
script:
......@@ -760,26 +770,10 @@ test-slo-checker-lag-trend:
- when: manual
allow_failure: true
test-slo-checker-dropped-records-kstreams:
stage: test
needs: []
image: python:3.7-slim
before_script:
- cd slo-checker/dropped-records
script:
- pip install -r requirements.txt
- cd app
- python -m unittest
rules:
- changes:
- slo-checker/dropped-records/**/*
- when: manual
allow_failure: true
test-slo-checker-generic:
stage: test
needs: []
image: python:3.7-slim
image: python:3.8-slim
before_script:
- cd slo-checker/generic
script:
......@@ -810,24 +804,6 @@ deploy-slo-checker-lag-trend:
when: manual
allow_failure: true
deploy-slo-checker-dropped-records-kstreams:
stage: deploy
extends:
- .kaniko-push
needs:
- test-slo-checker-dropped-records-kstreams
before_script:
- cd slo-checker/dropped-records
variables:
IMAGE_NAME: theodolite-slo-checker-dropped-records-kstreams
rules:
- changes:
- slo-checker/dropped-records/**/*
if: "$CR_PUBLIC_HOST && $CR_PUBLIC_ORG && $CR_PUBLIC_USER && $CR_PUBLIC_PW"
- if: "$CR_PUBLIC_HOST && $CR_PUBLIC_ORG && $CR_PUBLIC_USER && $CR_PUBLIC_PW"
when: manual
allow_failure: true
deploy-slo-checker-generic:
stage: deploy
extends:
......@@ -855,12 +831,12 @@ deploy-random-scheduler:
- .kaniko-push
needs: []
before_script:
- cd execution/infrastructure/random-scheduler
- cd util/random-scheduler
variables:
IMAGE_NAME: theodolite-random-scheduler
rules:
- changes:
- execution/infrastructure/random-scheduler/**/*
- util/random-scheduler/**/*
if: "$CR_PUBLIC_HOST && $CR_PUBLIC_ORG && $CR_PUBLIC_USER && $CR_PUBLIC_PW"
- if: "$CR_PUBLIC_HOST && $CR_PUBLIC_ORG && $CR_PUBLIC_USER && $CR_PUBLIC_PW"
when: manual
......@@ -873,6 +849,7 @@ deploy-buildimage-docker-compose-jq:
needs: []
variables:
DOCKER_VERSION: 20.10.12
BUILD_ARG_DOCKER_VERSION: $DOCKER_VERSION
IMAGE_NAME: theodolite-build-docker-compose-jq
IMAGE_TAG: $DOCKER_VERSION
before_script:
......
......@@ -8,10 +8,10 @@ authors:
given-names: Wilhelm
orcid: "https://orcid.org/0000-0001-6625-4335"
title: Theodolite
version: "0.8.0"
version: "0.9.0"
repository-code: "https://github.com/cau-se/theodolite"
license: "Apache-2.0"
doi: "10.1016/j.bdr.2021.100209"
doi: "10.1007/s10664-022-10162-1"
preferred-citation:
type: article
authors:
......@@ -21,9 +21,9 @@ preferred-citation:
- family-names: Hasselbring
given-names: Wilhelm
orcid: "https://orcid.org/0000-0001-6625-4335"
doi: "10.1016/j.bdr.2021.100209"
journal: "Big Data Research"
month: 7
title: "Theodolite: Scalability Benchmarking of Distributed Stream Processing Engines in Microservice Architectures"
volume: 25
year: 2021
doi: "10.1007/s10664-022-10162-1"
journal: "Empirical Software Engineering"
month: 8
title: "A Configurable Method for Benchmarking Scalability of Cloud-Native Applications"
volume: 27
year: 2022
![Theodolite](docs/assets/logo/theodolite-horizontal-transparent.svg)
# Theodolite
> A theodolite is a precision optical instrument for measuring angles between designated visible points in the horizontal and vertical planes. -- <cite>[Wikipedia](https://en.wikipedia.org/wiki/Theodolite)</cite>
Theodolite is a framework for benchmarking the horizontal and vertical scalability of stream processing engines. It consists of three modules:
Theodolite is a framework for benchmarking the horizontal and vertical scalability of cloud-native applications in Kubernetes.
## Quickstart
Theodolite runs scalability benchmarks in Kubernetes. Follow our [quickstart guide](https://www.theodolite.rocks/quickstart.html) to get started.
## Documentation
## Theodolite Benchmarking Tool
Documentation on Theodolite itself as well as regarding its benchmarking method can be found on the [Theodolite website](https://www.theodolite.rocks).
Theodolite aims to benchmark scalability of stream processing engines for real use cases. Microservices that apply stream processing techniques are usually deployed in elastic cloud environments. Hence, Theodolite's cloud-native benchmarking framework deploys its components in a cloud environment, orchestrated by Kubernetes. It is recommended to install Theodolite with the package manager Helm. The Theodolite Helm chart along with instructions how to install it can be found in the [`helm`](helm) directory.
## Project Structure
## Theodolite Analysis Tools
* The core of Theodolite is its Kubernetes operator, implemented in Kotlin. The source code can be found in [`theodolite`](theodolite).
* Theodolite's Helm chart and templates are maintained in [`helm`](helm).
* We provide Jupyter notebooks for analyzing and visualizing the results of benchmark executions in [`analysis`](analysis).
* Theodolite comes with 4 application benchmarks, which are based on typical use cases for stream processing within microservices. Implementations of these benchmarks with several state-of-the-art stream processing frameworks as well as corresponding load generators can be found in [`theodolite-benchmarks`](theodolite-benchmarks). This includes both the source code of the implementations and the benchmark definitions for Theodolite in [`theodolite-benchmarks/definitions`](theodolite-benchmarks/definitions).
* The source code of Theodolite's SLO checkers is located in [`slo-checker`](slo-checker).
* The documentation, which is hosted on [theodolite.rocks](https://www.theodolite.rocks), is located in [`docs`](docs).
Theodolite's benchmarking method maps load intensities to the resource amounts that are required for processing them. A plot showing how resource demand evolves with an increasing load allows to draw conclusions about the scalability of a stream processing engine or its deployment. Theodolite provides Jupyter notebooks for creating such plots based on benchmarking results from the execution framework. More information can be found in [Theodolite analysis tool](analysis).
## Contributing
## Theodolite Benchmarks
We are happy to accept any kind of contributions to Theodolite.
This includes reporting any issues you find using Theodolite, bug fixes and improvements as well as integrating your research within the project.
Theodolite comes with 4 application benchmarks, which are based on typical use cases for stream processing within microservices. For each benchmark, a corresponding load generator is provided. Currently, this repository provides benchmark implementations for Apache Kafka Streams and Apache Flink. The benchmark sources can be found in [Thedolite benchmarks](theodolite-benchmarks).
See our website to [start contributing](https://www.theodolite.rocks/development/).
## How to Cite
If you use Theodolite, please cite
> Sören Henning and Wilhelm Hasselbring. (2021). Theodolite: Scalability Benchmarking of Distributed Stream Processing Engines in Microservice Architectures. Big Data Research, Volume 25. DOI: [10.1016/j.bdr.2021.100209](https://doi.org/10.1016/j.bdr.2021.100209). arXiv:[2009.00304](https://arxiv.org/abs/2009.00304).
> Sören Henning and Wilhelm Hasselbring. “A Configurable Method for Benchmarking Scalability of Cloud-Native Applications”. In: *Empirical Software Engineering* 27. 2022. DOI: [10.1007/s10664-022-10162-1](https://doi.org/10.1007/s10664-022-10162-1).
When referring to our stream processing benchmarks, please cite
> Sören Henning and Wilhelm Hasselbring. “Theodolite: Scalability Benchmarking of Distributed Stream Processing Engines in Microservice Architectures”. In: *Big Data Research* 25. 2021. DOI: [10.1016/j.bdr.2021.100209](https://doi.org/10.1016/j.bdr.2021.100209). arXiv:[2009.00304](https://arxiv.org/abs/2009.00304).
See our website for a [list of publications](https://www.theodolite.rocks/publications.html) directly related to Theodolite.
\ No newline at end of file
FROM jupyter/base-notebook
FROM jupyter/base-notebook:python-3.8
COPY . /home/jovyan
......
# Theodolite Build Images
This directory contains some Dockerfiles for images required for Theodolite build infrastructure.
FROM docker:${DOCKER_VERSION:-latest}
ARG DOCKER_VERSION=latest
FROM docker:${DOCKER_VERSION}
RUN apk update && \
apk add jq && \
......
......@@ -5,16 +5,16 @@
"codeRepository": "https://github.com/cau-se/theodolite",
"dateCreated": "2020-03-13",
"datePublished": "2020-07-27",
"dateModified": "2022-07-18",
"dateModified": "2023-07-18",
"downloadUrl": "https://github.com/cau-se/theodolite/releases",
"name": "Theodolite",
"version": "0.8.0",
"version": "0.9.0",
"description": "Theodolite is a framework for benchmarking the horizontal and vertical scalability of cloud-native applications.",
"developmentStatus": "active",
"relatedLink": [
"https://www.theodolite.rocks"
],
"referencePublication": "https://doi.org/10.1016/j.bdr.2021.100209",
"referencePublication": "https://doi.org/10.1007/s10664-022-10162-1",
"programmingLanguage": [
"Kotlin",
"Java",
......@@ -28,10 +28,10 @@
"@type": "Person",
"givenName": "Sören",
"familyName": "Henning",
"email": "soeren.henning@email.uni-kiel.de",
"email": "soeren.henning@jku.at",
"affiliation": {
"@type": "Organization",
"name": "Department of Computer Science, Kiel University"
"name": "JKU/Dynatrace Co-Innovation Lab, LIT CPS Lab, Johannes Kepler University Linz"
}
}
]
......
GEM
remote: https://rubygems.org/
specs:
activesupport (6.0.4.8)
activesupport (6.0.6.1)
concurrent-ruby (~> 1.0, >= 1.0.2)
i18n (>= 0.7, < 2)
minitest (~> 5.1)
......@@ -14,8 +14,8 @@ GEM
execjs
coffee-script-source (1.11.1)
colorator (1.1.0)
commonmarker (0.23.6)
concurrent-ruby (1.1.10)
commonmarker (0.23.9)
concurrent-ruby (1.2.0)
dnsruby (1.61.9)
simpleidn (~> 0.1)
em-websocket (0.5.3)
......@@ -237,9 +237,9 @@ GEM
jekyll (>= 3.5, < 5.0)
jekyll-feed (~> 0.9)
jekyll-seo-tag (~> 2.1)
minitest (5.15.0)
minitest (5.17.0)
multipart-post (2.1.1)
nokogiri (1.13.6-x86_64-linux)
nokogiri (1.14.3-x86_64-linux)
racc (~> 1.4)
octokit (4.22.0)
faraday (>= 0.9)
......@@ -248,7 +248,7 @@ GEM
pathutil (0.16.2)
forwardable-extended (~> 2.6)
public_suffix (4.0.7)
racc (1.6.0)
racc (1.6.2)
rainbow (3.1.1)
rb-fsevent (0.11.1)
rb-inotify (0.10.1)
......@@ -280,7 +280,7 @@ GEM
unf_ext (0.0.8.1)
unicode-display_width (1.8.0)
yell (2.2.2)
zeitwerk (2.5.4)
zeitwerk (2.6.6)
PLATFORMS
x86_64-linux
......
......@@ -39,5 +39,5 @@ crdoc --resources ../theodolite/crd/ --template api-reference/crds.tmpl --outpu
With the following command, crdoc is executed in Docker:
```sh
docker run --rm -v "`pwd`/../theodolite/crd/":/crd -v "`pwd`/api-reference":/api-reference ghcr.io/fybrik/crdoc:0.6.1 --resources /crd/ --template /api-reference/crds.tmpl --output /api-reference/crds.md
docker run --rm -v "`pwd`/../theodolite/crd/":/crd -v "`pwd`/api-reference":/api-reference ghcr.io/fybrik/crdoc:0.6.2 --resources /crd/ --template /api-reference/crds.tmpl --output /api-reference/crds.md
```
......@@ -55,7 +55,7 @@ Resource Types:
<td>true</td>
</tr>
<tr>
<td><b><a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#objectmeta-v1-meta">metadata</a></b></td>
<td><b><a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#objectmeta-v1-meta">metadata</a></b></td>
<td>object</td>
<td>Refer to the Kubernetes API documentation for the fields of the `metadata` field.</td>
<td>true</td>
......@@ -2240,7 +2240,7 @@ Contains the Kafka configuration.
<td>true</td>
</tr>
<tr>
<td><b><a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#objectmeta-v1-meta">metadata</a></b></td>
<td><b><a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#objectmeta-v1-meta">metadata</a></b></td>
<td>object</td>
<td>Refer to the Kubernetes API documentation for the fields of the `metadata` field.</td>
<td>true</td>
......
......@@ -62,7 +62,7 @@ Resource Types:
<td>true</td>
</tr>
<tr>
<td><b><a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#objectmeta-v1-meta">metadata</a></b></td>
<td><b><a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#objectmeta-v1-meta">metadata</a></b></td>
<td>object</td>
<td>Refer to the Kubernetes API documentation for the fields of the `metadata` field.</td>
<td>true</td>
......
......@@ -27,6 +27,18 @@ Patchers can be seen as functions which take a value as input and modify a Kuber
* **properties**:
* loadGenMaxRecords: 150000
* **DataVolumeLoadGeneratorReplicaPatcher**: Takes the total load that should be generated and computes the number of instances needed for this load based on `maxVolume` (instances = (load + maxVolume - 1) / maxVolume), then calculates the load per instance (loadPerInstance = load / instances). The number of instances is set for the load generator and the given variable is set to the load per instance.
* **type**: "DataVolumeLoadGeneratorReplicaPatcher"
* **resource**: "osp-load-generator-deployment.yaml"
* **properties**:
* maxVolume: "50"
* container: "workload-generator"
* variableName: "DATA_VOLUME"
* **ReplicaPatcher**: Modifies the number of replicas of a Kubernetes deployment.
* **type**: "ReplicaPatcher"
* **resource**: "uc1-kstreams-deployment.yaml"
* **EnvVarPatcher**: Modifies the value of an environment variable for a container in a Kubernetes deployment.
* **type**: "EnvVarPatcher"
* **resource**: "uc1-load-generator-deployment.yaml"
......@@ -34,6 +46,14 @@ Patchers can be seen as functions which take a value as input and modify a Kuber
* container: "workload-generator"
* variableName: "NUM_SENSORS"
* **ConfigMapYamlPatcher**: Adds or modifies a key-value pair in a YAML file of a ConfigMap.
* **type**: "ConfigMapYamlPatcher"
* **resource**: "flink-configuration-configmap.yaml"
* **properties**:
* fileName: "flink-conf.yaml"
* variableName: "jobmanager.memory.process.size"
* **value**: "4Gb"
* **NodeSelectorPatcher**: Changes the node selection field in Kubernetes resources.
* **type**: "NodeSelectorPatcher"
* **resource**: "uc1-load-generator-deployment.yaml"
......
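The arithmetic described for the `DataVolumeLoadGeneratorReplicaPatcher` above can be sketched as follows. This is a minimal illustration in Python, not Theodolite's implementation; the function name `split_load` is hypothetical, and integer division is assumed for both formulas.

```python
from typing import Tuple


def split_load(load: int, max_volume: int) -> Tuple[int, int]:
    """Return (instances, load_per_instance) for a total load.

    Hypothetical helper; assumes integer arithmetic for both steps.
    """
    if load <= 0 or max_volume <= 0:
        raise ValueError("load and max_volume must be positive")
    # Ceiling division: the smallest number of instances such that
    # no instance has to generate more than max_volume.
    instances = (load + max_volume - 1) // max_volume
    # Integer division; any remainder is dropped in this sketch.
    load_per_instance = load // instances
    return instances, load_per_instance


if __name__ == "__main__":
    # With maxVolume = 50, a total load of 120 needs 3 instances.
    print(split_load(120, 50))
```

With `maxVolume: "50"` as in the example properties, a total load of 120 would yield 3 instances generating 40 each under these assumptions.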
This diff is collapsed.
......@@ -8,12 +8,13 @@ nav_order: 2
> A theodolite is a precision optical instrument for measuring angles between designated visible points in the horizontal and vertical planes. -- <cite>[Wikipedia](https://en.wikipedia.org/wiki/Theodolite)</cite>
Theodolite is a framework for benchmarking the horizontal and vertical scalability of cloud-native applications.
Inspired by its namesake, Theodolite is a framework for benchmarking the horizontal and vertical scalability of [cloud-native applications](https://github.com/cncf/toc/blob/main/DEFINITION.md).
It relies on Kubernetes, the de-facto standard for orchestrating cloud-native applications, as a platform to define and execute benchmarks.
Theodolite adopts established definitions of scalability in cloud computing for its benchmarking method. It quantifies
scalability by running isolated experiments for different load intensities and provisioned resource amounts, which assess whether specified SLOs are fulfilled. [Two metrics are available](metrics): The demand metric describes how the minimal amount of required resources evolves with increasing load intensities, while the capacity metric describes how the maximal processable load evolves with increasing resources. Hence, both metrics are functions. <!--Example?-->
The terms load, resources and SLOs are consciously kept abstract as Theodolite leaves it to the benchmark designer to define what type of load, resources, and SLOs should be evaluated. For example, horizontal scalability can be benchmarked by varying the amount of Kubernetes Pods, while vertical scalability can be benchmarked by varying CPU and memory constraints of Pods.
To balance statistical grounding and time-efficient benchmark execution, Theodolite comes with different [heuristic for
evaluating the search space](search-strategies) of load and resource combinations. Other configuration options include the number of repetitions, the experiment and warm-up duration, as well as the amount of different load and resource values to be evaluated.
\ No newline at end of file
To balance statistical grounding and time-efficient benchmark execution, Theodolite comes with different [heuristics for evaluating the search space](search-strategies) of load and resource combinations. Other configuration options include the number of repetitions, the experiment and warm-up duration, as well as the number of different load and resource values to be evaluated. Increasing the experiment duration and number of repetitions helps to reduce the variance of the results. However, it also increases the time needed to execute a benchmark.
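The demand and capacity metrics described above can be illustrated with a small sketch. It assumes that experiment results are available as `(load, resources, slo_met)` tuples; this data layout and the function names are hypothetical and not Theodolite's actual result format.

```python
def demand_metric(results):
    """Map each load to the minimal resource amount that fulfilled all SLOs.

    `results` is an iterable of (load, resources, slo_met) tuples
    (an assumed layout for illustration).
    """
    demand = {}
    for load, resources, slo_met in results:
        if slo_met and (load not in demand or resources < demand[load]):
            demand[load] = resources
    return dict(sorted(demand.items()))


def capacity_metric(results):
    """Map each resource amount to the maximal load it processed within the SLOs."""
    capacity = {}
    for load, resources, slo_met in results:
        if slo_met and (resources not in capacity or load > capacity[resources]):
            capacity[resources] = load
    return dict(sorted(capacity.items()))


if __name__ == "__main__":
    experiments = [
        (100, 1, True), (100, 2, True),
        (200, 1, False), (200, 2, True),
        (300, 1, False), (300, 2, False),
    ]
    print(demand_metric(experiments))
    print(capacity_metric(experiments))
```

Loads for which no tested resource amount fulfilled the SLOs simply do not appear in the demand function in this sketch.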
---
title: Creating Benchmarks
has_children: false
has_children: true
nav_order: 5
---
......@@ -70,11 +70,14 @@ spec:
In Theodolite, the system under test (SUT), the load generator as well as additional infrastructure (e.g., a middleware) are described by Kubernetes resources files.
All resources defined for the SUT and the load generator are started and stopped for each SLO experiment, with SUT resources being started before the load generator.
Infrastructure resources live over the entire duration of a benchmark run. They avoid time-consuming recreation of software components like middlewares, but should be used with caution to not let previous SLO experiments influence latte ones.
Infrastructure resources are kept alive throughout the entire duration of a benchmark run. They avoid time-consuming recreation of software components like middlewares, but should be used with caution so that earlier SLO experiments do not influence later ones.
### Resources
The recommended way to link Kubernetes resources files from a Benchmark is to bundle them in one or multiple ConfigMaps and refer to these ConfigMaps from `sut.resources`, `loadGenerator.resources` or `infrastructure.resources`.
**Note:** Theodolite requires that each resources file contains only a single resource (i.e., YAML document).
To create a ConfigMap from all the Kubernetes resources in a directory run:
```sh
......@@ -175,7 +178,8 @@ An Execution must at least define one SLO to be checked.
A good choice to get started is defining an SLO of type `generic`:
```yaml
- sloType: "generic"
- name: droppedRecords
sloType: generic
prometheusUrl: "http://prometheus-operated:9090"
offset: 0
properties:
......@@ -191,6 +195,9 @@ A good choice to get started is defining an SLO of type `generic`:
All you have to do is to define a [PromQL query](https://prometheus.io/docs/prometheus/latest/querying/basics/) describing which metrics should be requested (`promQLQuery`) and how the resulting time series should be evaluated. With `queryAggregation` you specify how the resulting time series is aggregated to a single value and `repetitionAggregation` describes how the results of multiple repetitions are aggregated. Possible values are
`mean`, `median`, `mode`, `sum`, `count`, `max`, `min`, `std`, `var`, `skew`, `kurt` as well as percentiles such as `p99` or `p99.9`. The result of aggregating all repetitions is checked against `threshold`. This check is performed using an `operator`, which describes that the result must be "less than" (`lt`), "less than or equal" (`lte`), "greater than" (`gt`) or "greater than or equal" (`gte`) to the threshold.
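The evaluation steps described above (aggregate each time series, aggregate across repetitions, compare against the threshold) can be sketched as follows. This is a simplified illustration and not the code of the generic SLO checker; the function name `check_slo` is hypothetical and only a few of the listed aggregations are covered.

```python
import operator
import statistics

# Subset of the aggregations named in the text, for illustration only.
AGGREGATIONS = {
    "mean": statistics.mean,
    "median": statistics.median,
    "max": max,
    "min": min,
    "sum": sum,
}

# The four comparison operators named in the text.
OPERATORS = {
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
}


def check_slo(repetitions, query_aggregation, repetition_aggregation, op, threshold):
    """Return True if the aggregated result satisfies `op` against `threshold`.

    `repetitions` is a list of time series, one list of numbers per repetition.
    """
    # Step 1: reduce each repetition's time series to a single value.
    per_repetition = [AGGREGATIONS[query_aggregation](series) for series in repetitions]
    # Step 2: reduce the per-repetition values to one result.
    result = AGGREGATIONS[repetition_aggregation](per_repetition)
    # Step 3: compare the result against the threshold.
    return OPERATORS[op](result, threshold)


if __name__ == "__main__":
    series = [[1, 2, 3], [2, 4, 6]]
    print(check_slo(series, "max", "median", "lte", 5))
```

In this example the per-repetition maxima are 3 and 6, their median is 4.5, and the `lte` check against a threshold of 5 succeeds.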
If you do not want to have a static threshold, you can also define it relative to the tested load with `thresholdRelToLoad` or relative to the tested resource value with `thresholdRelToResources`. For example, setting `thresholdRelToLoad: 0.01` means that in each experiment, the threshold is 1% of the generated load.
Even more complex thresholds can be defined with `thresholdFromExpression`. This field accepts a mathematical expression with two variables `L` and `R` for the load and resources, respectively. The previous example with a threshold of 1% of the generated load can thus also be defined with `thresholdFromExpression: 0.01*L`. For further details of allowed expressions, see the documentation of the underlying [exp4j](https://github.com/fasseg/exp4j) library.
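How the different threshold options relate to each other can be sketched as below. This is an illustration under stated assumptions, not Theodolite's implementation: Theodolite evaluates expressions with the Java library exp4j, whereas this sketch uses a small restricted evaluator supporting only `+`, `-`, `*`, `/` and the variables `L` and `R`. The function names and the precedence between the options are assumptions.

```python
import ast
import operator

# Only basic arithmetic is supported in this sketch.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_expression(expression, load, resources):
    """Evaluate an arithmetic expression over the variables L and R."""
    variables = {"L": load, "R": resources}

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        # Anything else (function calls, attribute access, ...) is rejected.
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    return walk(ast.parse(expression, mode="eval"))


def resolve_threshold(properties, load, resources):
    """Compute the effective threshold for one experiment.

    The order in which the options are checked is an assumption of this sketch.
    """
    if "thresholdFromExpression" in properties:
        return evaluate_expression(properties["thresholdFromExpression"], load, resources)
    if "thresholdRelToLoad" in properties:
        return properties["thresholdRelToLoad"] * load
    if "thresholdRelToResources" in properties:
        return properties["thresholdRelToResources"] * resources
    return properties["threshold"]


if __name__ == "__main__":
    print(resolve_threshold({"thresholdRelToLoad": 0.01}, load=100000, resources=4))
    print(resolve_threshold({"thresholdFromExpression": "0.01*L"}, load=100000, resources=4))
```

Both calls describe the same threshold of 1% of the generated load, once via `thresholdRelToLoad` and once via an expression.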
In case you need to evaluate monitoring data in a more flexible fashion, you can also change the value of `externalSloUrl` to your custom SLO checker. Have a look at the source code of the [generic SLO checker](https://github.com/cau-se/theodolite/tree/main/slo-checker/generic) to get started.
## Kafka Configuration
......
---
title: Development
title: Contributing
has_children: true
nav_order: 10
---
\ No newline at end of file
---
# Contributing
Theodolite is open-source research software. We welcome everyone to contribute to this project.
Contributions are not limited to code. We welcome and recognize all of the following:
* Raising issues, questions and suggestions for using Theodolite
* Fixing bugs or implementing new features
* Improving the documentation
* Using Theodolite as part of your (not necessarily scientific) research
* Reporting on your scalability evaluations with Theodolite
## Start Contributing
If you have bug reports, feature requests, questions or suggestions, you may create a [GitHub issue](https://github.com/cau-se/theodolite/issues) or directly [contact Theodolite's maintainers](../project-info).
You can also create a [GitHub pull request](https://github.com/cau-se/theodolite/pulls) if you have already implemented bug fixes and improvements.
If you would like to get more involved in Theodolite's project development and maintenance, you may contact us as well so that we can set up an account for you on [our internal GitLab](../project-info#project-management).
## Internal Project Structure
Theodolite is organized as a monorepo containing multiple largely independent modules in subdirectories.
See the project's [`README.md`](https://github.com/cau-se/theodolite/blob/main/README.md#project-structure) for an overview of all modules.
Each module directory provides a dedicated `README.md` file describing how to build, test, and package the corresponding module, among other tasks.