diff --git a/docs/theodolite-benchmarks/index.md b/docs/theodolite-benchmarks/index.md index 93e617a1fa95f1dd62a8a9545bfdf9c12fd46953..5fcd42255d81c0a2d5cf2107164874075561ae52 100644 --- a/docs/theodolite-benchmarks/index.md +++ b/docs/theodolite-benchmarks/index.md @@ -1,6 +1,6 @@ --- title: Theodolite Benchmarks -has_children: true +has_children: false nav_order: 7 --- @@ -8,4 +8,29 @@ nav_order: 7 Theodolite comes with 4 application benchmarks, which are based on typical use cases for stream processing within microservices. For each benchmark, a corresponding load generator is provided. Currently, Theodolite provides benchmark implementations for Apache Kafka Streams and Apache Flink. -<!-- TODO How to install them--> \ No newline at end of file + +Theodolite's benchmarks are based on typical use cases for stream processing within microservices. Specifically, all benchmarks represent some sort of microservice doing Industrial Internet of Things data analytics. + +## UC1: Database Storage + +A simple, but common use case in event-driven architectures is that events or messages should be stored permanently, for example, in a NoSQL database. + + +## UC2: Downsampling + +Another common use case for stream processing architectures is reducing the amount of events, messages, or measurements by aggregating multiple records within consecutive, non-overlapping time windows. Typical aggregations compute the average, minimum, or maximum of measurements within a time window or +count the occurrence of same events. Such reduced amounts of data are required, for example, to save computing resources or to provide a better user experience (e.g., for data visualizations). 
+When using aggregation windows of fixed size that succeed each other without gaps (called [tumbling windows](https://kafka.apache.org/20/documentation/streams/developer-guide/dsl-api.html#tumbling-time-windows) in many stream processing engines), the (potentially varying) message frequency is reduced to a constant value. +This is also referred to as downsampling. Downsampling allows for applying many machine learning methods that require data of a fixed frequency. + + +## UC3: Time Attribute-Based Aggregation + +A second type of temporal aggregation is aggregating messages that have the same time attribute. Such a time attribute is, for example, the hour of day, day of week, or day in the year. This type of aggregation can be used to compute, for example, an average course over the day, the week, or the year. It allows to demonstrate or discover seasonal patterns in the data. + +## UC4: Hierarchical Aggregation + +For analyzing sensor data, often not only the individual measurements of sensors are of interest, but also aggregated data for +groups of sensors. When monitoring energy consumption in industrial facilities, for example, comparing the total consumption +of machine types often provides better insights than comparing the consumption of all individual machines. Additionally, it may +be necessary to combine groups further into larger groups and adjust these group hierarchies at runtime. diff --git a/docs/theodolite-benchmarks/overview.md b/docs/theodolite-benchmarks/overview.md deleted file mode 100644 index 55f2e4682d327709e6627992c39739cd5cbe51aa..0000000000000000000000000000000000000000 --- a/docs/theodolite-benchmarks/overview.md +++ /dev/null @@ -1,34 +0,0 @@ ---- -title: Benchmark Overview -has_children: false -parent: Theodolite Benchmarks -nav_order: 7 ---- - -# Theodolite Benchmark Overview - -Theodolite's benchmarks are based on typical use cases for stream processing within microservices. 
Specifically, all benchmarks represent some sort of microservice doing Industrial Internet of Things data analytics. - -## UC1: Database Storage - -A simple, but common use case in event-driven architectures is that events or messages should be stored permanently, for example, in a NoSQL database. - - -## UC2: Downsampling - -Another common use case for stream processing architectures is reducing the amount of events, messages, or measurements by aggregating multiple records within consecutive, non-overlapping time windows. Typical aggregations compute the average, minimum, or maximum of measurements within a time window or -count the occurrence of same events. Such reduced amounts of data are required, for example, to save computing resources or to provide a better user experience (e.g., for data visualizations). -When using aggregation windows of fixed size that succeed each other without gaps (called [tumbling windows](https://kafka.apache.org/20/documentation/streams/developer-guide/dsl-api.html#tumbling-time-windows) in many stream processing engines), the (potentially varying) message frequency is reduced to a constant value. -This is also referred to as downsampling. Downsampling allows for applying many machine learning methods that require data of a fixed frequency. - - -## UC3: Time Attribute-Based Aggregation - -A second type of temporal aggregation is aggregating messages that have the same time attribute. Such a time attribute is, for example, the hour of day, day of week, or day in the year. This type of aggregation can be used to compute, for example, an average course over the day, the week, or the year. It allows to demonstrate or discover seasonal patterns in the data. - -## UC4: Hierarchical Aggregation - -For analyzing sensor data, often not only the individual measurements of sensors are of interest, but also aggregated data for -groups of sensors. 
When monitoring energy consumption in industrial facilities, for example, comparing the total consumption -of machine types often provides better insights than comparing the consumption of all individual machines. Additionally, it may -be necessary to combine groups further into larger groups and adjust these group hierarchies at runtime.