Kafka connector with MongoDB

The MongoDB Kafka connector is a Confluent-verified connector that works in both directions: as a data sink, it persists data from Kafka topics into MongoDB, and as a data source, it publishes changes from MongoDB to Kafka topics.

Apache Kafka

Apache Kafka is an open-source publish/subscribe messaging system. It provides a flexible, fault-tolerant, and horizontally scalable way to move data between datastores and applications. A system is fault tolerant if it can continue operating even when some of its components stop working. A system is horizontally scalable if it can handle larger workloads by adding more machines rather than by upgrading a single machine’s hardware.

Kafka Connect

Kafka Connect is a component of Apache Kafka that solves the problem of connecting Apache Kafka to datastores such as MongoDB. It does so by providing the following resources:

  • A fault tolerant runtime for transferring data to and from datastores.
  • A framework for the Apache Kafka community to share solutions for connecting Apache Kafka to different datastores.

The Kafka Connect framework defines an API for developers to write reusable connectors. Connectors enable Kafka Connect deployments to interact with a specific datastore as a data source or a data sink. The MongoDB Kafka Connector is one of these connectors.

Install the MongoDB Kafka Connector

There are two ways to install the connector, depending on your platform:

  • Install the connector on Confluent Platform
  • Install the connector on Apache Kafka

Installing Confluent Hub Client

  1. First, install the Confluent Hub client.

The Confluent Hub client is natively installed as a part of the complete Confluent Platform and located in the /bin directory. If you are using Confluent Community software, you can install the Confluent Hub client separately using the following instructions.

Download and unzip the Confluent Hub tarball.

  1. Open the following link in your browser to download the Confluent Hub tarball, then unzip the downloaded file.

http://client.hub.confluent.io/confluent-hub-client-latest.tar.gz

2. Add the bin directory of the unzipped client to your PATH environment variable so that the command "which confluent-hub" finds confluent-hub.

3. Optional: Verify your installation by typing confluent-hub in your terminal.

4. Your output should look like this:

usage: confluent-hub <command> [ <args> ]

Commands are:
    help      Display help information
    install   install a component from either Confluent Hub or from a local file

See 'confluent-hub help <command>' for more information on a specific command.
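The optional PATH check in step 3 can also be scripted. The following is a minimal sketch that only looks up the command on PATH and does not run it; the function name find_confluent_hub is made up for this example.

```python
import shutil


def find_confluent_hub(command="confluent-hub"):
    """Return the full path of the command if it is on PATH, else None."""
    return shutil.which(command)


if __name__ == "__main__":
    location = find_confluent_hub()
    if location is None:
        print("confluent-hub was not found; check that its bin directory is on PATH")
    else:
        print(f"confluent-hub found at {location}")
```

If the lookup returns nothing, revisit step 2 and confirm that the bin directory of the unzipped client is on your PATH.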

MongoDB Connector 

Confluent Hub CLI installation

Use the Confluent Hub client to install this connector with:

$ confluent-hub install mongodb/kafka-connect-mongodb:1.7.0

Download installation

Alternatively, download the ZIP file and extract it into one of the directories listed in the Connect worker’s plugin.path configuration property. Do this on every installation where Connect will run.

Configure an instance of your connector

Once installed, you can then create a connector configuration file with the connector’s settings, and deploy that to a Connect worker.
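As a rough illustration of what such a configuration file might contain, the sketch below renders a small set of sink connector settings as properties-style text. The connector name, topic, database, collection, and host are placeholders invented for this example, and the helper render_properties is hypothetical; check the connector documentation for the settings your deployment actually needs.

```python
def render_properties(settings):
    """Render a dict as properties-style text, one key=value pair per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Every value below is a placeholder for illustration only.
sink_settings = {
    "name": "mongo-sink-example",
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "topics": "example.topic",
    "connection.uri": "mongodb://mongodb0.example.com:27017",
    "database": "exampleDb",
    "collection": "exampleCollection",
}

if __name__ == "__main__":
    # Print the text that could be saved as a connector properties file.
    print(render_properties(sink_settings), end="")
```

This simple renderer does no escaping, so it is only suitable for values without backslashes or line breaks.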

Install the Connector on Apache Kafka

  1. Locate and download the uber JAR to get all the dependencies required for the connector. Check the reference table to find the uber JAR.
  2. Copy the JAR and any dependencies into the Kafka plugins directory which you can specify in your plugin.path configuration setting (e.g. plugin.path=/usr/local/share/kafka/plugins).
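Step 2 above can be scripted if you prefer. This is a minimal sketch, assuming you have already downloaded the JAR; the function name copy_jar_to_plugins and any paths you pass to it are your own choices, not part of the connector.

```python
import shutil
from pathlib import Path


def copy_jar_to_plugins(jar_path, plugin_dir):
    """Copy a connector JAR into a plugin directory, creating it if needed."""
    target_dir = Path(plugin_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy2 returns the path of the newly created copy.
    return Path(shutil.copy2(jar_path, target_dir))
```

The plugin directory you pass in should be one of the directories listed in your plugin.path setting, otherwise the Connect worker will not find the connector.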

Download a Connector JAR File

You can download the MongoDB Kafka Connector source and JAR files from the following locations:

Kafka Connector GitHub repository (source code): mongodb/mongo-kafka

Maven Central repository (JAR files): mongo-kafka-connect

Connect to MongoDB

Connect the MongoDB Kafka Connector to MongoDB using a connection Uniform Resource Identifier (URI). A connection URI is a string that contains the following information:

  • The address of your MongoDB deployment (required)
  • Connection settings (optional)
  • Authentication settings (optional)
  • Authentication credentials (optional)

The following is an example of a connection URI for a MongoDB replica set:

mongodb://mongodb0.example.com:27017,mongodb1.example.com:27017,mongodb2.example.com:27017/?replicaSet=myRepl
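To show how the parts of this URI fit together, here is a small sketch that assembles the same string from a list of host:port pairs and a replica set name. The helper replica_set_uri is made up for this example and handles only this simple case.

```python
def replica_set_uri(hosts, replica_set):
    """Join host:port pairs into a mongodb:// URI that names the replica set."""
    return f"mongodb://{','.join(hosts)}/?replicaSet={replica_set}"


uri = replica_set_uri(
    [
        "mongodb0.example.com:27017",
        "mongodb1.example.com:27017",
        "mongodb2.example.com:27017",
    ],
    "myRepl",
)
print(uri)
```

The printed value matches the example URI above: the hosts are comma-separated, and the replica set name is passed as the replicaSet option after "/?".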

How to Connect

Specify a connection URI with the following configuration option in both a source and sink connector:

connection.uri=<your connection uri>

How to Configure Your Connection

The MongoDB Kafka Connector uses the MongoDB Java driver to parse your connection URI. The MongoDB Java driver is an artifact that enables Java applications like Kafka Connect to interact with MongoDB.

Authentication

All authentication mechanisms available in the MongoDB Java driver are available in the MongoDB Kafka Connector.

The following is an example of a connection URI that authenticates with MongoDB using SCRAM-SHA-256 authentication:

mongodb://<username>:<password>@<hostname>:<port>/?authSource=<authenticationDb>&authMechanism=SCRAM-SHA-256
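When you fill in this template, keep in mind that characters such as "@", ":", and "/" in a username or password must be percent-encoded, or the URI will not parse as intended. The sketch below shows one possible way to do that; the function name scram_uri and the sample credentials are invented for illustration.

```python
from urllib.parse import quote


def scram_uri(username, password, hostname, port, auth_db):
    """Build a SCRAM-SHA-256 connection URI with percent-encoded credentials."""
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return (
        f"mongodb://{user}:{secret}@{hostname}:{port}"
        f"/?authSource={auth_db}&authMechanism=SCRAM-SHA-256"
    )


# Placeholder credentials chosen to include characters that need encoding.
example = scram_uri("appUser", "p@ss:word/1", "mongodb0.example.com", 27017, "admin")
print(example)
```

Avoid hard-coding real credentials in scripts or configuration files that are checked into version control.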

Conclusion

In this blog, we learned how to install the MongoDB Kafka Connector and how to connect it to MongoDB.

Written by 

Chiranjeev Kumar is a software intern at Knoldus. He is passionate about Java programming. He is recognized as a good team player, a dedicated and responsible professional, and a technology enthusiast. He is a quick learner who is curious about new technologies. His hobbies include listening to music and playing video games.
