Avro Communication over TCP Sockets

Reading Time: 2 minutes

Storing/Transferring object is a requirement of most applications. What if there is a need for communication between machine having incompatible architecture. Java Serialization won’t work for that. Now, if you are thinking about Serialization Framework then you are right. So, let’s start with one of the Serialization framework Apache Avro.

What is Avro?

Apache Avro is a language-neutral data serialization system. It’s a schema-based system which serializes the data having built-in schema into a compact binary format, post data can be deserialized by any application having the same schema.

In this post, I will demonstrate how to read schema by using parsers library and send/ receive the serialized data over java socket.

Let’s create a new maven project and add avro dependency in pom.xml

<dependency>
 <groupId>org.apache.avro</groupId>
 <artifactId>avro</artifactId>
 <version>1.8.1</version>
</dependency>

Now, create new avro schema file in schema directory of project say it as employee.avsc

{
 "namespace": "example.avro",
 "type": "record",
 "name": "employee",
 "fields": [
  {"name": "Name", "type": "string"},
  {"name": "id", "type": "string"}
 ]
}

here we are considering a schema of employee having name and id.

Instantiate the Schema.Parser class by passing the file path where the schema is stored to its parse method.

Schema schema = new Schema.Parser().parse(new File("src/schema/employee.avsc"));

After Schema parsing we need to create record using GenericData and store data using put method.

GenericRecord empRecord = new GenericData.Record(schema);
empRecord.put("Name", "jatin");
empRecord.put("id", "E001");

Now, serialize the data using GenericDataumWriter class which converts employee record into a byte stream.

DatumWriter datumWriter = new GenericDatumWriter(schema);

To send avro data, we will be using java socket. Following below I have created a socket client on the port 8090 and initialized the socket output stream.

Socket client = new Socket("localhost", 8090);
OutputStream outToServer = client.getOutputStream();

BinaryEncoder encodes the data in binary format and then sends it using flush method of Encoder. Following below I have created an Encoder using EncoderFactory.

EncoderFactory enc=new EncoderFactory();
BinaryEncoder binaryEncoder=enc.binaryEncoder(outToServer,null);
datumWriter.write(empRecord,binaryEncoder);
binaryEncoder.flush();
outToServer.close();

This was all about serializing the avro values and sending it over socket. Now, let’s understand how to decode the serialized data.

First we need to create a java server socket to listen socket client on port 8090.

serverSocket = new ServerSocket(8090);
Socket server = serverSocket.accept();

Now we perform decoding of streamed data on server socket using BinaryDecoder.

Schema schema = new Schema.Parser().parse(new File("src/schema/employee.avsc"));
DatumReader datumReader = new GenericDatumReader(schema);
GenericRecord employee = new GenericData.Record(schema);
InputStream inputStream = server.getInputStream();
DecoderFactory decoderFactory = new DecoderFactory();
BinaryDecoder binaryDecoder = decoderFactory.binaryDecoder(inputStream, null);
datumReader.read(employee, binaryDecoder);

You can find complete code on GitHub.

Hope you find this blog helpful!!

Happy Reading 🙂

References

Avro Documentation

Java Socket Programming

1 thought on “Avro Communication over TCP Sockets2 min read

Comments are closed.

Discover more from Knoldus Blogs

Subscribe now to keep reading and get access to the full archive.

Continue reading