Skip to content
Snippets Groups Projects
Commit 2b48d30d authored by Sören Henning's avatar Sören Henning
Browse files

Add docs for load generator

parent d1de7b09
No related branches found
No related tags found
No related merge requests found
Pipeline #6091 passed
---
title: Available Benchmarks
has_children: false
has_children: true
nav_order: 7
---
# Theodolite Benchmarks
Theodolite comes with 4 application benchmarks, which are based on typical use cases for stream processing within microservices. For each benchmark, a corresponding load generator is provided. Currently, Theodolite provides benchmark implementations for Apache Kafka Streams and Apache Flink.
Theodolite comes with 4 application benchmarks, which are based on typical use cases for stream processing within microservices. For each benchmark, a corresponding [load generator](load-generator) is provided. Currently, Theodolite provides benchmark implementations for Apache Kafka Streams and Apache Flink.
Theodolite's benchmarks are based on typical use cases for stream processing within microservices. Specifically, all benchmarks represent some sort of microservice doing Industrial Internet of Things data analytics.
......
---
title: Load Generators
parent: Available Benchmarks
has_children: false
nav_order: 1
---
# Load Generator Framework
Theodolite's benchmarks come with a flexible load generator framework. It is used to create load on the [4 Theodolite benchmarks](#prebuild-container-images), but can also be applied to create [custom load generators](#creating-a-custom-load-generator).
It is particularly designed for scalability: Just spin up multiple instances of the load generator and the instances automatically divide the load to be generated among themselves.
## Prebuild container images
For each benchmark, we provide a [load generator as OCI container image](https://github.com/orgs/cau-se/packages?tab=packages&q=workload-generator). These load generators simulate smart power meters in an industrial facility, which generate measurement records at a fixed rate. Records are published to an Apache Kafka topic (default) or sent as POST requests to an HTTP endpoint.
You can simply run a load generator container, for example, for benchmark UC1 with:
```sh
docker run ghcr.io/cau-se/theodolite-uc1-workload-generator
```
### Message format
Messages generated by the load generators represent a single measurement of [active power](https://en.wikipedia.org/wiki/AC_power#Active,_reactive,_apparent,_and_complex_power_in_sinusoidal_steady-state). The corresponding message type is specified as [`ActivePowerRecords`](https://github.com/cau-se/titan-ccp-common/blob/master/src/main/avro/ActivePower.avdl)
defined with Avro. It consists of an identifier for simulated power sensor, a timestamp in epoch milliseconds and the actual measured (simulated) value in watts.
When sending generated records via Apache Kafka, these records are serialized with the [Confluent Schema Registry](https://docs.confluent.io/platform/current/schema-registry).
If the load generator is configured to send records as HTTP POST requests, records are serialized as JSON according to the following format:
```json
{
"identifier": "sensor-id",
"timestamp": 1645564942000,
"valueInW": 1234.56
}
```
### Configuration
The prebuild container images can be configured with the following environment variables:
| Environment Variable | Description | Default |
|:----|:----|:----|
| `BOOTSTRAP_SERVER` | Address (`hostname:port`) of another load generator instance to form a cluster with. Can also be this instance. | `localhost:5701` |
| `KUBERNETES_DNS_NAME` | Kubernetes service name to discover other load generators to form a cluster with. Must be a fully qualified domain name (FQDN), e.g., something like `<service>.<namespace>.svc.cluster.local`. * Requires `BOOTSTRAP_SERVER` not to be set. | |
| `PORT` | Port used for for coordination among load generator instances. | 5701 |
| `PORT_AUTO_INCREMENT` | If set to true and the specified PORT is already used, use the next higher one. Useful if multiple instances should run on the same host, without configuring each instance individually. | true |
| `CLUSTER_NAME_PREFIX` | Only required if unrelated load generators form a cluster. | theodolite-load-generation |
| `TARGET` | The target system the load generator send messages to. Valid values are: `kafka`, `http`. | `kafka` |
| `KAFKA_BOOTSTRAP_SERVERS` | A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. See [Kafka producer config: `bootstrap.servers`](https://kafka.apache.org/documentation/#producerconfigs_bootstrap.servers) for more information. Only used if Kafka is set as `TARGET`. | `localhost:9092` |
| `KAFKA_INPUT_TOPIC` | Name of the Kafka topic, which should receive the generated messages. Only used if Kafka is set as `TARGET`. | input |
| `SCHEMA_REGISTRY_URL` | URL of the [Confluent Schema Registry](https://docs.confluent.io/platform/current/schema-registry). | `http://localhost:8081` |
| `KAFKA_BATCH_SIZE` | Value for the Kafka producer configuration: [`batch.size`](https://kafka.apache.org/documentation/#producerconfigs_batch.size). Only used if Kafka is set as `TARGET`. | see Kafka producer config: [`batch.size`](https://kafka.apache.org/documentation/#producerconfigs_batch.size) |
| `KAFKA_LINGER_MS` | Value for the Kafka producer configuration: [`linger.ms`](https://kafka.apache.org/documentation/#producerconfigs_linger.ms). Only used if Kafka is set as `TARGET`. | see Kafka producer config: [`linger.ms`](https://kafka.apache.org/documentation/#producerconfigs_linger.ms) |
| `KAFKA_BUFFER_MEMORY` | Value for the Kafka producer configuration: [`buffer.memory`](https://kafka.apache.org/documentation/#producerconfigs_buffer.memory) Only used if Kafka is set as `TARGET`. | see Kafka producer config: [`buffer.memory`](https://kafka.apache.org/documentation/#producerconfigs_buffer.memory) |
| `HTTP_URL` | The URL the load generator should post messages to. Only used if HTTP is set as `TARGET`. | |
| `NUM_SENSORS` | The amount of simulated sensors. | 10 |
| `PERIOD_MS` | The time in milliseconds between generating two messages for the same sensor. With our Theodolite benchmarks, we apply an [open workload model](https://www.usenix.org/legacy/event/nsdi06/tech/full_papers/schroeder/schroeder.pdf) in which new messages are generated at a fixed rate, without considering the think time of the target server nor the time required for generating a message. | 1000 |
| `VALUE` | The constant `valueInW` of an `ActivePowerRecord`. | 10 |
| `THREADS` | Number of worker threads used to generate the load. | 4 |
Please note that there are some additional configuration options for benchmark [UC4's load generator](https://github.com/cau-se/theodolite/blob/master/theodolite-benchmarks/uc4-load-generator/src/main/java/theodolite/uc4/workloadgenerator/LoadGenerator.java).
## Creating a custom load generator
To create a custom load generator, you need to import the [load-generator-commons](https://github.com/cau-se/theodolite/tree/master/theodolite-benchmarks/load-generator-commons) project. You can then create an instance of the `LoadGenerator` object and call its `run` method:
```java
LoadGenerator loadGenerator = new LoadGenerator()
.setClusterConfig(clusterConfig)
.setLoadDefinition(new WorkloadDefinition(
new KeySpace(key_prefix, numSensors),
duration))
.setGeneratorConfig(new LoadGeneratorConfig(
recordGenerator,
recordSender))
.withThreads(threads);
loadGenerator.run();
```
Alternatively, you can also start with a load generator populated with a default configuration or created from environment variables and then adjust the `LoadGenerator` as desired:
```java
LoadGenerator loadGeneratorFromDefaults = LoadGenerator.fromDefaults()
LoadGenerator loadGeneratorFromEnv = LoadGenerator.fromEnvironment();
```
\ No newline at end of file
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment