With the environment set up, it is time to simulate the beer ratings flowing in. As explained, I will start several generators simultaneously. To create some (intended) skew away from the average, the generators share the same structural event definition but differ in their combination of users, beers, and upper and lower rating bounds. Of course, this is also based on my personal preference. Who said my demonstration scenario should be fair?
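To make the skew idea concrete, here is a minimal Python sketch of the principle. It is not the ksql-datagen setup itself: the generator names, users, beers, and rating bounds below are all made up for illustration, and the generators run one after another instead of simultaneously.

```python
import random
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class GeneratorConfig:
    """One simulated generator (all values below are hypothetical)."""
    name: str
    users: List[str]
    beers: List[str]
    min_rating: int
    max_rating: int


def rating_events(cfg: GeneratorConfig, count: int, seed: int) -> Iterator[Dict]:
    """Yield `count` events that share one structure but respect cfg's bounds."""
    rng = random.Random(seed)  # seeded so that a run is repeatable
    for _ in range(count):
        yield {
            "user": rng.choice(cfg.users),
            "beer": rng.choice(cfg.beers),
            "rating": rng.randint(cfg.min_rating, cfg.max_rating),  # inclusive
        }


# Same event structure everywhere; only the value pools and bounds differ.
CONFIGS = [
    GeneratorConfig("fans", ["alice", "bob"], ["tripel", "quadrupel"], 4, 5),
    GeneratorConfig("critics", ["carol"], ["lager"], 1, 2),
    GeneratorConfig("neutral", ["alice", "bob", "carol", "dave"],
                    ["tripel", "quadrupel", "lager", "stout"], 1, 5),
]


def simulate(count_per_generator: int = 100) -> List[Dict]:
    """Collect the events of all generators into a single list."""
    events: List[Dict] = []
    for index, cfg in enumerate(CONFIGS):
        events.extend(rating_events(cfg, count_per_generator, seed=index))
    return events


if __name__ == "__main__":
    all_events = simulate()
    for beer in ("tripel", "lager"):
        ratings = [e["rating"] for e in all_events if e["beer"] == beer]
        print(beer, round(sum(ratings) / len(ratings), 2))
```

Because the hypothetical "fans" generator only ever rates between 4 and 5 and the "critics" generator only between 1 and 2, the per-beer averages drift apart even though every event has exactly the same fields.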
In this blog, I am going to zoom in on KSQL and the opportunities it offers for manipulating streaming data in Kafka using nothing more than SQL-like statements. One of the neat things about the Confluent Kafka platform is that it provides additional utilities on top of the core Kafka tools. One of these utilities is ksql-datagen, which allows users to generate random data based on a simple schema definition in Apache Avro.
For a long time I have been interested in Apache Kafka and its applications. Unfortunately, forced by circumstances, work and other personal endeavours, I had not been able to really dive deeper into the matter until Spring 2019. In April I finally finished the Udemy course “Apache Kafka for Beginners”.
At work, my exposure to Kafka had been limited, as we were (ultimately) publishing messages onto a Kafka topic using Oracle Service Bus. However, this was actually a Java-built integration: we were just pushing the messages onto a JMS queue, and an MDB (message-driven bean) listening on that queue propagated the messages to the Kafka cluster.
After completing the first training I got interested, especially in the role of Kafka in real-time event systems, and I decided to take another course on Kafka Streams. I was a bit disappointed that this specific course focussed quite heavily on Java development, and, as an exception, I decided to abandon the course before completing it. During one of the Kafka Meetups, I found out that Confluent was actually offering a very interesting alternative to programming the Kafka Streams API in Java, viz. KSQL.