Scalable Stream Processing Using a Queue

In this story, Locust is used to load test a Kafka pipeline. On each API call, Locust produces a record of random data stamped with the current time into Kafka. Faust is the real-time stream processing service; by itself it cannot be scaled, so to process records under the heavy load generated during the Locust test, Celery, a distributed task queue, does the real processing in a scalable, distributed way.
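As a sketch of the payload, each API call could build a record like the following. The field names here are illustrative assumptions, not the repo's actual schema:

```python
import json
import random
import time

def make_record() -> dict:
    # A random value stamped with the current time; field names are
    # hypothetical -- the repo's actual schema may differ.
    return {
        "value": random.random(),
        "ts": time.time(),
    }

def serialize(record: dict) -> bytes:
    # Kafka producers send bytes, so JSON-encode the record.
    return json.dumps(record).encode("utf-8")
```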

The repository for this project can be found here


Bring up Postgres and Kafka:

make pg-kafka

Then start the API service:

make api

Create the streaming service:

make stream
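A minimal sketch of the hand-off inside the stream service: the Faust agent decodes each Kafka message and pushes it onto the queue. The names here (topic, task) are assumptions, and the `enqueue` callable stands in for a Celery task's `.delay()` so the logic runs without a broker:

```python
import json

def handle_message(raw: bytes, enqueue) -> dict:
    # Decode one Kafka message and hand it off to the queue. `enqueue`
    # stands in for a Celery task's .delay(), so this runs without a broker.
    record = json.loads(raw)
    enqueue(record)
    return record

# Inside the real Faust app this would be driven by an agent, e.g.:
#
#   @app.agent(events_topic)          # hypothetical topic name
#   async def consume(stream):
#       async for raw in stream:
#           handle_message(raw, process_record.delay)
```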

Bring up the scalable queue:

docker-compose scale queue=3

This will bring up 3 instances of the Celery processor.
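The worker side can be sketched as a plain function; with Celery installed it would be registered as a task. The broker URL and the task body below are assumptions:

```python
def process_record(record: dict) -> dict:
    # Hypothetical per-record work; the real job does the heavy processing.
    out = dict(record)
    out["processed"] = True
    return out

# With Celery this becomes a distributed task, e.g.:
#
#   from celery import Celery
#   app = Celery("queue", broker="redis://redis:6379/0")  # assumed broker
#   process_record = app.task(process_record)
#
# Each of the scaled `queue` containers then runs a worker process that
# pulls records off the broker and executes process_record on them.
```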

In the next section, Locust will generate asynchronous load on the API while we watch the Celery instances.


make locust
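A locustfile along these lines would drive the API. The endpoint path and wait times are assumptions, and the try/except fallback only keeps the sketch importable without locust installed:

```python
import random
import time

try:
    from locust import HttpUser, task, between
except ImportError:  # stand-ins so the sketch runs without locust installed
    class HttpUser:
        pass

    def task(fn):
        return fn

    def between(a, b):
        return (a, b)

class ProducerUser(HttpUser):
    # Each simulated user waits 0.5-1.5 s between calls (assumed values).
    wait_time = between(0.5, 1.5)

    @task
    def produce(self):
        # POST a random record stamped with the current time; the
        # "/produce" path is a hypothetical API endpoint.
        self.client.post(
            "/produce",
            json={"value": random.random(), "ts": time.time()},
        )
```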

Locust is now up at localhost:8090:

Test parameters

Producing 250 messages per second on average into Kafka.
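The average rate follows from the user count and wait times: roughly, rate ≈ users / mean wait. A quick sanity check, where the user count and wait bounds are assumptions, not values from the repo:

```python
def expected_rps(users: int, min_wait_s: float, max_wait_s: float) -> float:
    # Average request rate when each user sleeps uniformly in
    # [min_wait_s, max_wait_s] between calls (request time ignored).
    mean_wait = (min_wait_s + max_wait_s) / 2.0
    return users / mean_wait

# e.g. 250 users with wait_time=between(0.5, 1.5) -> ~250 requests/s
```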

Five queue instances running at the same time:

Distributed data processing . . .

Doing some data engineering