As reported by @stu203404, when setting --jobInstance, work is not shared between instances. Instead, every instance starts its own job and processes the data of all Kafka partitions. We still have to find a solution here, so I'm reopening this issue.
I think we finally solved this issue. I reset the use of --jobInstance in !187 (merged). As far as I observed, instances crash if there are more instances than Kafka partitions Samza is reading from. This is acceptable for us, as these instances would remain idle anyway (although it is strange that this behavior is not documented anywhere). We experienced the reported issue when running Kafka and the load generator from our Docker Compose files and the load generator started "too fast". In that case, the load generator created the topic with only one partition. Hence, when running two instances, one crashes. I opened #306 to tackle this issue.
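As a possible workaround until #306 is resolved, the input topic could be created explicitly before the load generator starts, with at least as many partitions as job instances. A minimal sketch using the Kafka AdminClient (topic name, partition count, and broker address are assumptions, not taken from our setup):

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateInputTopic {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    // Assumed broker address from the local Docker Compose setup.
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

    try (AdminClient admin = AdminClient.create(props)) {
      // Create the input topic with at least as many partitions as job
      // instances, so that no instance is left without a partition to read.
      // "input", 4 partitions, and replication factor 1 are placeholders.
      NewTopic inputTopic = new NewTopic("input", 4, (short) 1);
      admin.createTopics(List.of(inputTopic)).all().get();
    }
  }
}
```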
@stu203404 Feel free to close this ticket after verifying that everything works as intended.
Using the Docker Compose files, the same problem applies to the Beam Flink implementation: only as many Flink TaskManagers are working as there are partitions available. Apart from that, the Flink implementation seems to work.