streaming data

Sink Connector: The MarkLogic Kafka Connector

Reading Time: 2 minutes The MarkLogic Kafka connector is a sink connector that receives messages from Kafka and writes them to a MarkLogic database. The sink pulls messages from Kafka topics and stores them in MarkLogic as JSON documents. It acquires messages from numerous Kafka brokers and writes them to MarkLogic with no coding required. The connector uses the MarkLogic Data Movement SDK (DMSDK) to store those messages in a Continue Reading

A Quick Demo: Kafka to Flink to Cassandra

Reading Time: 3 minutes Hi Folks!! In this blog, we are going to learn how to integrate Flink with Kafka and Cassandra to build a simple streaming data pipeline. Apache Flink is a framework and distributed processing engine used for stateful computations over unbounded and bounded data streams. Kafka is a scalable, high-performance, low-latency platform that allows reading and writing streams of data like a messaging system. Cassandra: A distributed and wide-column Continue Reading

Flinkathon: What makes Flink better than Kafka Streams?

Reading Time: 2 minutes Initially, I would like you all to focus on a few questions before comparing the frameworks: 1. Is there any comparison or similarity between Flink and Kafka? 2. What could be better in Flink than in Kafka? 3. Is it the problem or the system requirements that lead you to use one over the other? Before talking about Flink’s advantages and use cases over Kafka, let’s first understand their Continue Reading

Real-Time Processing of Data Using Kafka and Spark

Reading Time: 3 minutes Before starting, you should know about Kafka, Spark, and what real-time processing of data is, so let’s begin with a brief introduction. Real-Time Processing – processing data as it arrives, instead of storing the data and then processing it, or processing data that is stored somewhere else. Kafka – Kafka is built for high-throughput transfer of data from one end to another. Continue Reading