Knoldus Blogs

Reactive Java: Understanding Reactive streams

August 8, 2019 | Functional Programming, Java, Reactive Application, Reactive Programming, Streaming, Streaming Solutions | functional programing, Reactive Java, Reactive Programming, Reactive stream specification, technology

Reading Time: 3 minutes. With all the buzz in the programming world about “reactive programming”, a related concept has been introduced: “Reactive Streams”, backed by the idea of backpressure. In this blog, we try to understand what exactly that means. What are Reactive Streams? We are talking here about handling streams of data that need to be handled in an …

Reactive Spring: Define a REST endpoint as a continuous stream

July 22, 2019 | Java, Reactive Programming, Streaming

Reading Time: 2 minutes. In REST APIs, all HTTP requests are stateless. We fire the request and get the response, and that’s it. No state is kept for any HTTP request. The connection between client and server is lost once the transaction ends, so there is one response per request. But sometimes we get a requirement to have a continuous response for a single request. This continuous response …

Using Vertica with Spark-Kafka: Write using Structured Streaming

July 3, 2019 / July 16, 2019 | Apache Kafka, Apache Spark, Big Data and Fast Data, Functional Programming, HDFS, Spark, Streaming, Streaming Solutions, Studio-Scala | Apache Kafka, Apache Spark, DataFrame, Kafka Spark, Spark, Spark SQL, spark sql kafka, Spark Structured Streaming, Spark to Vertica, Streaming, Structured Streaming, Vertica, Write to vertica

Reading Time: 3 minutes. In the two previous blogs, we explored Vertica and how it can be connected to Apache Spark. The first blog in this mini-series was about reading data from Vertica using Spark and saving that data into Kafka. The next blog explained the reverse flow, i.e. reading data from Kafka and writing it to Vertica, but in batch mode, i.e. reading data from Kafka …

Flinkathon: Guide to setting up a Local Flink Cluster

May 12, 2019 | Apache Flink, Apache Kafka, cluster, Flink, Streaming, Studio-Scala | Apache Flink Cluster

Reading Time: 3 minutes. In our previous blog post, Flinkathon: First Step towards Flink’s DataStream API, we created our first streaming application using Apache Flink. It was easy, clean, and concise. However, the real power of Apache Flink is seen on a cluster, where data is processed in a distributed manner, with the advantage of multi-core/multi-memory systems. So, in this blog post, we will see how to set up …

Determine Kafka broker health using Kafka stream application’s JMX metrics and setup Grafana alert

May 10, 2019 / May 17, 2019 | Apache Kafka, Monitoring, Streaming, Studio-DevOps, Studio-Scala

Reading Time: 3 minutes. As we all know, Kafka exposes JMX metrics, whether for Kafka brokers, connectors, or Kafka applications. A few days ago, I ran into a scenario where I needed to determine Kafka broker health with the help of a Kafka stream application’s JMX metrics. It looks a bit awkward, right? I should use the broker’s JMX metrics to do this, so why am I looking at application JMX …

Knolx: Alpakka-Connecting Kafka & ElasticSearch to Akka Streams

May 1, 2019 / June 19, 2019 | Akka, akka-streams, Apache Kafka, Elasticsearch, Streaming, Studio-Scala | akka-streams, alpakka, connector, elasticsearch, kafka, reactive streams, Streams

Reading Time: < 1 minute. Hi all, Knoldus organized a 30-minute session on 1st March 2019 at 3:30 PM. The topic was Alpakka – Connecting Kafka and ElasticSearch to Akka Streams. Many people joined and enjoyed the session. I am going to share the slides here. Please let me know if you have any questions related to the linked slides or video. The slides of the KnolX are here: And …

Flinkathon: First Step towards Flink’s DataStream API

April 20, 2019 | Apache Flink, Apache Kafka, Big Data and Fast Data, Flink, Streaming, Streaming Solutions, Studio-Scala | DataStream API

Reading Time: 3 minutes. In our previous blog posts, Flinkathon: Why Flink is better for Stateful Streaming applications? and Flinkathon: What makes Flink better than Kafka Streams?, we saw why Apache Flink is a better choice for streaming applications. In this blog post, we will explore how easy it is to express a streaming application using Apache Flink’s DataStream API. DataStream API: The DataStream API is used to develop regular programs …

Flinkathon: What makes Flink better than Kafka Streams?

April 16, 2019 | Apache Flink, Apache Kafka, Big Data and Fast Data, cluster, Flink, Streaming, Streaming Solutions, Studio-Scala | Apache Kafka, Flink, Flink Streaming, kafka, Kafka Streaming, Kafka Streams, Stream Processing, Streaming, streaming data

Reading Time: 2 minutes. To begin with, I would like you all to focus on a few questions before comparing the frameworks: 1. Is there any comparison or similarity between Flink and Kafka? 2. What could be better in Flink than in Kafka? 3. Is it the problem or the system requirements that call for using one over the other? Before talking about Flink’s advantages and use cases over Kafka, let’s first understand their …

Kafka: Consumer – Push vs Pull approach

April 7, 2019 | Apache Kafka, Big Data and Fast Data, Streaming, Streaming Solutions | Architecture, Design principles, kafka, Streaming

Reading Time: 2 minutes. Have you ever thought about the push vs pull approach for a system, and which one suits or solves which problem? Another question: why did Kafka choose a pull design over a push design for consumers? Before talking about the Kafka approach, that is, whether the broker should push the data to the consumer or the consumer should pull it from Kafka, let’s first understand both of the approaches, as each one has its …

Working with Project Reactor: Reactive Streams

April 2, 2019 / June 14, 2019 | API's, Java, Reactive Application, Reactive Programming, Streaming, Streaming Solutions, Studio-Scala

Reading Time: 4 minutes. The words “Reactive” and “Streams” often go hand in hand. The streams API of Java 8 is a great tool for making your projects Reactive. But that’s not the only stream you can have. In this blog, I’d like to talk about this awesome project called Project Reactor.

Reactivate your streams with Reactive Streams!!

March 5, 2019 | akka-streams, Reactive Application, Streaming, Streaming Solutions, Studio-Scala | Asynchronous, backpressure, Reactive Manifesto, reactive streams, Streams

Reading Time: 5 minutes. As you all might know by now, one of the hot topics for quite some time has been the streaming of big data. Day after day, we see tons of streaming technologies out there competing with one another. The obvious reason is that processing big volumes of data is not enough. We need real-time processing of data, especially when we need to handle continuously increasing …

Spark Streaming vs. Structured Streaming

February 28, 2019 | Apache Spark, Big Data and Fast Data, Spark, Streaming, Streaming Solutions, Studio-Scala | Apache Spark, Spark Streaming, Spark Structured Streaming, Streaming, Streaming Spark, Structured Streaming

Reading Time: 6 minutes. Fan of Apache Spark? I am too. The reason is simple: interesting APIs to work with, fast and distributed processing, no I/O overhead (unlike MapReduce), fault tolerance, and much more. With this much, you can do a lot in this world of Big Data and Fast Data. From “processing huge chunks of data” to “working on streaming data”, Spark works flawlessly in all. In this …

Is Apache Flink the future of Real-time Streaming?

December 28, 2018 | Apache Flink, Flink, Java, Streaming, Streaming Solutions, Studio-Scala | Flink Streaming, scala, Spark Streaming, Streaming

Reading Time: 5 minutes. In our last blog, we discussed the latest version of Spark, i.e. 2.4, and the new features it has come up with. While trying to come up with various approaches to improve our performance, we got the chance to explore one of the major contenders in the race, Apache Flink. Apache Flink is an open source platform which is a streaming …