Apache Kafka

Using Vertica with Spark-Kafka: Write using Structured Streaming

Reading Time: 3 minutes. In two previous blogs, we explored Vertica and how it can be connected to Apache Spark. The first blog in this mini-series was about reading data from Vertica using Spark and saving that data into Kafka. The next blog explained the reverse flow, i.e. reading data from Kafka and writing it to Vertica, but in batch mode. …
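For orientation, here is a minimal sketch of the flow this post covers: a Structured Streaming read from a Kafka topic, written to Vertica through its JDBC driver inside foreachBatch (Structured Streaming has no built-in Vertica sink). The broker address, topic, table, and connection URL below are placeholder assumptions, not values from the post.

    import org.apache.spark.sql.{DataFrame, SparkSession}

    object KafkaToVerticaStreaming {

      // Writes one micro-batch to Vertica over JDBC (connection details assumed).
      def writeToVertica(batch: DataFrame, batchId: Long): Unit =
        batch.write
          .format("jdbc")
          .option("url", "jdbc:vertica://vertica-host:5433/db") // assumed URL
          .option("dbtable", "public.target_table")             // assumed table
          .option("user", "dbadmin")
          .option("password", "secret")
          .mode("append")
          .save()

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("kafka-to-vertica").getOrCreate()

        // Subscribe to the Kafka topic as an unbounded streaming DataFrame.
        val kafkaDf = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092") // assumed broker
          .option("subscribe", "source-topic")                 // assumed topic
          .load()
          .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

        // There is no built-in Vertica sink, so each micro-batch goes through JDBC.
        val query = kafkaDf.writeStream
          .foreachBatch(writeToVertica _)
          .start()

        query.awaitTermination()
      }
    }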

Using Vertica with Spark-Kafka: Writing

Reading Time: 4 minutes. In the previous blog of this series, we took a glance at the basic definitions of Spark and Vertica. We also did a code overview for reading data from Vertica using Spark as a DataFrame and saving that data into Kafka. In this blog, we will be doing the reverse flow, i.e. reading the data from Kafka as a DataFrame and writing that DataFrame into Vertica. …
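A hedged sketch of the batch variant described here: Spark's Kafka source can read a bounded offset range as a DataFrame, which is then written to Vertica over JDBC. All connection details below are assumed placeholders.

    import org.apache.spark.sql.SparkSession

    object KafkaToVerticaBatch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("kafka-to-vertica-batch").getOrCreate()

        // In batch mode, a bounded offset range of the topic is read as a DataFrame.
        val df = spark.read
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092") // assumed broker
          .option("subscribe", "source-topic")                 // assumed topic
          .option("startingOffsets", "earliest")
          .option("endingOffsets", "latest")
          .load()
          .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

        // Write the DataFrame to Vertica over JDBC (connection details assumed).
        df.write
          .format("jdbc")
          .option("url", "jdbc:vertica://vertica-host:5433/db")
          .option("dbtable", "public.target_table")
          .option("user", "dbadmin")
          .option("password", "secret")
          .mode("append")
          .save()
      }
    }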

Using Vertica with Spark-Kafka: Reading

Reading Time: 4 minutes. We live in a world of Big Data, where the volume of data is huge even when the results we need are small. This is the result of the rapid increase in data collection in the modern world. Such massive data creates the need for tools that can work on these big chunks of data. I am pretty sure that you …

Take a deep dive into Kafka – Producer API

Reading Time: 4 minutes. I am going to start a series of blogs on the Kafka API; this blog is a part of that series. In this blog, we are going to learn about the Producer API. If you are new to Kafka, then I recommend first getting a basic idea of Kafka from the kafka-quickstart guide. There are many reasons an application might …

Build your own Kafka Producer

Reading Time: 2 minutes. “It’s Not Whether You Get Knocked Down, It’s Whether You Get Up.” – Inspirational quote by Vince Lombardi. The Kafka Producer API allows applications to send streams of data to topics in the Kafka cluster. Looking for a way to implement a custom Kafka producer in your project? This blog post gives you an end-to-end solution for implementing this functionality using the Kafka API. Introduction: There …
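As a companion to the post, here is a minimal producer sketch against the standard Kafka clients API; the broker address, topic name, and record contents are assumptions for illustration.

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}

    object SimpleProducer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
          "org.apache.kafka.common.serialization.StringSerializer")
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
          "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        try {
          // send() is asynchronous and returns a Future of the record's metadata.
          val record = new ProducerRecord[String, String]("demo-topic", "key-1", "hello kafka")
          producer.send(record).get() // block for the ack, just for this demo
        } finally {
          producer.close()
        }
      }
    }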

Monitoring Kafka with Prometheus and Grafana

Reading Time: 3 minutes. Kafka monitoring is an operation used to optimize a Kafka deployment. The process is easy and efficient if you apply one of the existing monitoring solutions instead of building your own. Let’s say we use Apache Kafka for message transfer and processing, and we want to monitor it. But before learning the steps for monitoring, let’s first understand the prerequisites. Kafka: It is …
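One common approach (not necessarily the exact one the post uses) is to attach Prometheus's JMX exporter to each broker as a Java agent and point a Prometheus scrape job at the exposed port; every path, port, and file name below is an assumption.

    # Attach the JMX exporter agent before starting the broker (paths and port assumed).
    export KAFKA_OPTS="-javaagent:/opt/jmx_prometheus_javaagent.jar=7071:/opt/kafka-jmx.yml"
    bin/kafka-server-start.sh config/server.properties

    # prometheus.yml: a scrape job pointing at the exporter endpoint (target assumed).
    scrape_configs:
      - job_name: "kafka"
        static_configs:
          - targets: ["localhost:7071"]

Grafana then uses this Prometheus instance as a data source for the dashboards and alerts.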

Flinkathon: Guide to setting up a Local Flink Cluster

Reading Time: 3 minutes. In our previous blog post, Flinkathon: First Step towards Flink’s DataStream API, we created our first streaming application using Apache Flink. It was easy, clean, and concise. However, the real power of Apache Flink is seen on a cluster, where data is processed in a distributed manner, with the advantage of multi-core/multi-memory systems. So, in this blog post, we will see how to set up …
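For reference, a standalone Flink cluster is typically started from the distribution's bin directory; the version, paths, and slot count below are assumptions, not values from the post.

    # Unpack a Flink distribution (version assumed) and start a local cluster.
    cd flink-1.8.0
    # conf/flink-conf.yaml holds the cluster settings, for example:
    #   jobmanager.rpc.address: localhost
    #   taskmanager.numberOfTaskSlots: 2
    ./bin/start-cluster.sh   # starts a local JobManager and TaskManager
    # The web dashboard is served on http://localhost:8081 by default.
    ./bin/stop-cluster.sh    # shuts the cluster down again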

Determine Kafka broker health using a Kafka stream application’s JMX metrics and set up a Grafana alert

Reading Time: 3 minutes. As we all know, Kafka exposes JMX metrics, whether it is a Kafka broker, a connector, or a Kafka application. A few days ago, I got a scenario where I needed to determine Kafka broker health with the help of a Kafka stream application’s JMX metrics. It looks a bit awkward, right? I should use the broker’s JMX metrics to do this, so why am I looking at the application's JMX …
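To make the idea concrete, here is a minimal sketch of polling one JMX attribute from a running stream application over a remote JMX connection. The host, port, MBean name, and attribute are placeholders; substitute whatever metric reflects broker connectivity in your setup.

    import javax.management.ObjectName
    import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}

    object StreamAppJmxProbe {
      def main(args: Array[String]): Unit = {
        // Connect to the stream application's JMX port (host and port assumed).
        val url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi")
        val connector = JMXConnectorFactory.connect(url)
        try {
          val server = connector.getMBeanServerConnection
          // Hypothetical MBean/attribute: a consumer's broker connection count.
          val name = new ObjectName("kafka.consumer:type=consumer-metrics,client-id=my-app-consumer")
          val value = server.getAttribute(name, "connection-count")
          println(s"connection-count = $value")
        } finally {
          connector.close()
        }
      }
    }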

Hawk-Rust Series: Kafka with Rust

Reading Time: 2 minutes. In this post, we will see why we used Kafka for Hawk and how we implemented it. Since Hawk is built in Rust, it will be interesting to learn how one can use Kafka with Rust. If you don’t have any idea what Hawk is, here is a brief overview: Hawk is an image recognition application built in Rust using AWS services. For more details, you …

Knolx: Alpakka – Connecting Kafka & ElasticSearch to Akka Streams

Reading Time: 1 minute. Hi all, Knoldus organized a 30-minute session on 1st March 2019 at 3:30 PM. The topic was Alpakka – Connecting Kafka and ElasticSearch to Akka Streams. Many people joined and enjoyed the session. I am going to share the slides here. Please let me know if you have any questions related to the linked slides or video. The slides of the KnolX are here: …

Flinkathon: First Step towards Flink’s DataStream API

Reading Time: 3 minutes. In our previous blog posts, Flinkathon: Why Flink is better for Stateful Streaming applications? and Flinkathon: What makes Flink better than Kafka Streams?, we saw why Apache Flink is a better choice for streaming applications. In this blog post, we will explore how easy it is to express a streaming application using Apache Flink’s DataStream API. DataStream API: The DataStream API is used to develop regular programs …
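To give a feel for the API ahead of the full post, here is a minimal word-count sketch against Flink's Scala DataStream API; the example data and job name are made up.

    import org.apache.flink.streaming.api.scala._

    object WordCountStream {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // A tiny bounded source, just to show the shape of a DataStream program.
        val text: DataStream[String] = env.fromElements("to be", "or not to be")

        val counts = text
          .flatMap(_.toLowerCase.split("\\s+")) // split lines into words
          .map(word => (word, 1))
          .keyBy(_._1)                          // group by the word
          .sum(1)                               // running count per word

        counts.print()
        env.execute("word-count-stream")
      }
    }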

Flinkathon: What makes Flink better than Kafka Streams?

Reading Time: 2 minutes. Initially, I would like you all to focus on a few questions before comparing the frameworks: 1. Is there any comparison or similarity between Flink and Kafka? 2. What could be better in Flink over Kafka? 3. Is it the problem or a system requirement to use one over the other? Before talking about Flink’s advantages and use cases over Kafka, let’s first understand their …
