Avro Communication over TCP Sockets


Storing/Transferring object is a requirement of most applications. What if there is a need for communication between machine having incompatible architecture. Java Serialization won’t work for that. Now, if you are thinking about Serialization Framework then you are right. So, let’s start with one of the Serialization framework Apache Avro.

What is Avro?

Apache Avro is a language-neutral data serialization system. It’s a schema-based system which serializes the data having built-in schema into a compact binary format, post data can be deserialized by any application having the same schema.

In this post, I will demonstrate how to read schema by using parsers library and send/ receive the serialized data over java socket.

Let’s create a new maven project and add avro dependency in pom.xml

<dependency>
 <groupId>org.apache.avro</groupId>
 <artifactId>avro</artifactId>
 <version>1.8.1</version>
</dependency>

Now, create new avro schema file in schema directory of project say it as employee.avsc

{
 "namespace": "example.avro",
 "type": "record",
 "name": "employee",
 "fields": [
  {"name": "Name", "type": "string"},
  {"name": "id", "type": "string"}
 ]
}

here we are considering a schema of employee having name and id.

Instantiate the Schema.Parser class by passing the file path where the schema is stored to its parse method.

Schema schema = new Schema.Parser().parse(new File("src/schema/employee.avsc"));

After Schema parsing we need to create record using GenericData and store data using put method.

GenericRecord empRecord = new GenericData.Record(schema);
empRecord.put("Name", "jatin");
empRecord.put("id", "E001");

Now, serialize the data using GenericDataumWriter class which converts employee record into a byte stream.

DatumWriter datumWriter = new GenericDatumWriter(schema);

To send avro data, we will be using java socket. Following below I have created a socket client on the port 8090 and initialized the socket output stream.

Socket client = new Socket("localhost", 8090);
OutputStream outToServer = client.getOutputStream();

BinaryEncoder encodes the data in binary format and then sends it using flush method of Encoder. Following below I have created an Encoder using EncoderFactory.

EncoderFactory enc=new EncoderFactory();
BinaryEncoder binaryEncoder=enc.binaryEncoder(outToServer,null);
datumWriter.write(empRecord,binaryEncoder);
binaryEncoder.flush();
outToServer.close();

This was all about serializing the avro values and sending it over socket. Now, let’s understand how to decode the serialized data.

First we need to create a java server socket to listen socket client on port 8090.

serverSocket = new ServerSocket(8090);
Socket server = serverSocket.accept();

Now we perform decoding of streamed data on server socket using BinaryDecoder.

Schema schema = new Schema.Parser().parse(new File("src/schema/employee.avsc"));
DatumReader datumReader = new GenericDatumReader(schema);
GenericRecord employee = new GenericData.Record(schema);
InputStream inputStream = server.getInputStream();
DecoderFactory decoderFactory = new DecoderFactory();
BinaryDecoder binaryDecoder = decoderFactory.binaryDecoder(inputStream, null);
datumReader.read(employee, binaryDecoder);

You can find complete code on GitHub.

Hope you find this blog helpful!!

Happy Reading 🙂

References

Avro Documentation

Java Socket Programming

Advertisements
This entry was posted in Scala and tagged , . Bookmark the permalink.

One Response to Avro Communication over TCP Sockets

  1. Ashish says:

    Awesome !!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s