Scalable Stream Processing Using a Queue

In this story, Locust is used to load test a Kafka pipeline. On each API call, Locust causes a random record stamped with the current time to be produced into Kafka. Faust is the real-time stream processing service, but by itself it cannot be scaled; to process records under the heavy load generated by the Locust test, Celery, a distributed task queue, does the actual processing in a scalable, distributed way.
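As a sketch of the payload, assuming each API call builds a random record stamped with the current time (the field names here are hypothetical, not taken from the repo):

```python
import random
import time
import uuid

def make_record() -> dict:
    """Build one random record stamped with the current time.

    The field names are illustrative; the real service may differ.
    """
    return {
        "id": str(uuid.uuid4()),  # unique key for the Kafka message
        "value": random.random(),  # random payload
        "ts": time.time(),         # current Unix timestamp
    }
```

Each API call would then serialize a record like this and produce it to a Kafka topic.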

The project repo can be found here

Pre-Build

make pg-kafka
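The pg-kafka target presumably brings up Postgres, ZooKeeper, and Kafka via docker-compose; a minimal fragment might look like this (image tags, ports, and credentials are assumptions, not the repo's actual config):

```yaml
services:
  postgres:
    image: postgres:13
    environment:
      POSTGRES_PASSWORD: postgres   # placeholder credential
    ports:
      - "5432:5432"
  zookeeper:
    image: bitnami/zookeeper:latest
    environment:
      ALLOW_ANONYMOUS_LOGIN: "yes"
  kafka:
    image: bitnami/kafka:latest
    depends_on:
      - zookeeper
    environment:
      KAFKA_CFG_ZOOKEEPER_CONNECT: zookeeper:2181
      ALLOW_PLAINTEXT_LISTENER: "yes"
    ports:
      - "9092:9092"
```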

Then the API service:

make api

Create the stream processor:

make stream

Scale the queue up:

docker-compose scale queue=3

This brings up 3 instances of the Celery processor.

In the next section, Locust will generate asynchronous load on the API while we watch the Celery instances.

Locust

make locust

Locust is up on localhost:8090:

Test parameters

Producing 250 messages per second on average into Kafka.
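As a rough back-of-the-envelope check (the user count and wait window are assumptions, not the test's actual settings): Locust's steady-state throughput is about users divided by the average wait time, so for example 50 users with a 0.1–0.3 s wait reach roughly 250 requests, and hence messages, per second:

```python
users = 50                  # concurrent Locust users (assumption)
avg_wait = (0.1 + 0.3) / 2  # average wait time in seconds (assumption)
rate = users / avg_wait     # approximate requests per second
print(rate)                 # → 250.0
```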

Five instances of the queue running at the same time:

Distributed data processing . . .

Doing some data engineering