avro

All you need to know about Avro schema

Reading Time: 4 minutes In this post, we are going to dive into the basics of the Avro schema. We will create a sample Avro schema, serialize data to a sample output file, and then read that file back according to the schema. Intro to Avro: Apache Avro is a data serialization system developed by Doug Cutting, the father of Hadoop, that helps with data Continue Reading

Reading Avro files using Apache Flink

Reading Time: 2 minutes In this blog, we will see how to read Avro files using Flink. Before reading the files, let’s get an overview of Flink. There are two types of processing: batch and real-time. Batch processing: processing based on data collected over time. Real-time processing: processing based on immediate data for an instant result. Real-time processing is in demand and Apache Flink is the Continue Reading

kafka with spark

Apache Spark 2.4: Adding a little more Spark to your code

Reading Time: 5 minutes Continuing with the objective of making Spark faster, easier, and smarter, Apache Spark recently shipped the fifth release in its 2.x line, i.e., Spark 2.4. We were lucky enough to experiment with it early in one of our projects. Today we will highlight the major changes in this version that we explored and experienced in that project. In our Continue Reading

Error Registering Avro Schema | Multiple Schemas In One Topic

Reading Time: 4 minutes
org.apache.kafka.common.errors.SerializationException: Error registering Avro schema: {"type":"record","name":"schema1","namespace":"test","fields":[{"name":"Name","type":"string"},{"name":"Age","type":"int"},{"name":"Location","type":"string"}]}
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema being registered is incompatible with an earlier schema; error code: 409
    at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:170)
    at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:188)
    at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:245)
    at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:237)
    at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:232)
    at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:59)
    at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:91)
    at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:72)
    at io.confluent.kafka.formatter.AvroMessageReader.readMessage(AvroMessageReader.java:158)
    at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:57)
    at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)
You might have come across a similar exception while working with Avro schemas. Kafka throws this exception due to a compatibility issue Continue Reading

Avro Communication over TCP Sockets

Reading Time: 2 minutes Storing and transferring objects is a requirement of most applications. But what if machines with incompatible architectures need to communicate? Java serialization won’t work for that. If you are now thinking about a serialization framework, you are right. So, let’s start with one such framework, Apache Avro. What is Avro? Apache Avro is a language-neutral data serialization system. It’s a schema-based system Continue Reading

Apache Avro

Kafka-Avro-Scala-Example

Reading Time: 3 minutes This post will show you how to write and read messages in Apache Avro format to and from Kafka. Instead of working with plain-text messages, though, we will serialize our messages with Avro. That will allow us to send much more complex data structures over the wire. Avro: Apache Avro is a language-neutral data serialization format. Avro data is described by a language-independent schema. Continue Reading
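
To make the idea of schema-described data a little more concrete, here is a minimal sketch (not taken from any of the posts listed here) of how a record could be turned into Avro's binary encoding using only the Python standard library. It assumes a schema modelled on the `schema1` record quoted in the error excerpt earlier on this page, and it handles only the `string` and `int` types; the helper names `encode_long`, `encode_string`, and `encode_record` are hypothetical and are not part of the official Avro library.

```python
import json

# Illustrative schema, modelled on the "schema1" record quoted on this page.
SCHEMA = json.loads("""
{"type": "record", "name": "schema1", "namespace": "test",
 "fields": [{"name": "Name", "type": "string"},
            {"name": "Age", "type": "int"},
            {"name": "Location", "type": "string"}]}
""")


def encode_long(n: int) -> bytes:
    """Encode an int/long as a zigzag, variable-length integer."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes map to small values
    out = bytearray()
    while z & ~0x7F:          # more than 7 bits left: emit a continuation byte
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its UTF-8 byte length followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_record(schema: dict, record: dict) -> bytes:
    """Concatenate the encoded fields in the order the schema declares them."""
    encoders = {"string": encode_string, "int": encode_long}
    out = bytearray()
    for field in schema["fields"]:
        # Only the two primitive types used by this schema are supported here.
        encode = encoders[field["type"]]
        out += encode(record[field["name"]])
    return bytes(out)


if __name__ == "__main__":
    payload = encode_record(SCHEMA, {"Name": "Al", "Age": 30, "Location": "NY"})
    print(payload.hex())
```

Note that the encoded bytes carry no field names or type tags, which is why a reader needs the writer's schema to decode them. For real projects, the official Avro libraries are the better choice, since they also cover unions, arrays, container files, and schema resolution.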