Author: Pinku Swargiary

Streaming from Kafka to PostgreSQL through Spark Structured Streaming

Reading Time: 3 minutes

Hello everyone. In this blog we are going to learn how to do Structured Streaming in Spark with Kafka and PostgreSQL on our local system. We will be doing all of this in Scala, so without any further pause, let's begin. Setting up the necessities first: 1. Dependencies: Set up the required dependencies for Scala, Spark, Kafka and PostgreSQL. 2. PostgreSQL setup: Let's start fresh by ... Continue Reading

Kryo Serialization in Spark

Reading Time: 4 minutes

Spark provides two serialization libraries: Java serialization (the default) and Kryo serialization. For faster serialization and deserialization, Spark itself recommends using Kryo serialization in any network-intensive application. So why is it not the default? Why is Kryo not set as the default in Spark? The only reason Kryo is not the default is that it requires custom class registration. Although Kryo is ... Continue Reading