Custom DynamoDB Docker Instance

Hey guys, I hope you are all doing well. I am back with another blog on custom Docker instances for databases. In my last blog we saw how to build a custom Docker instance for MySQL. In this blog we will do the same for DynamoDB, so let’s get started.

So just like the scenario in the previous blog, I was working on a project that used DynamoDB as the database because of features like scalability, managed cloud storage etc. I wanted to test some things without messing with the cloud instance, so I thought of making an instance of my own. So what to do?


I started searching over the internet, and here is what I found:

  1. You can run a miniature DynamoDB instance on your local machine, which you can download here. (Great)
  2. The local version gives us a sharedDb option that spares us the region handling and saves all the data in a file named shared-local-instance.db. (Fine)
  3. A GitHub library lets you take a dump of DynamoDB and populate an instance from it as well. (Almost there)
  4. And finally, building the Docker instance. (Bingo)

So let’s get started. Once you have downloaded the tar file from the link above, extract it somewhere (I would suggest a separate directory), and then you can run it simply using:

java -Djava.library.path=DynamoDBLocal_lib -jar DynamoDBLocal.jar -optimizeDbBeforeStartup -sharedDb

The above command runs the local DynamoDB instance, accessible on localhost:8000. Once you populate it with some data, you will start to see the file shared-local-instance.db.
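If you have the AWS CLI installed, a quick way to check that the local instance is up is to list its tables. DynamoDB Local accepts any dummy credentials, so the values below are placeholders:

```shell
# DynamoDB Local does not validate credentials, but the CLI requires some to be set
export AWS_ACCESS_KEY_ID=dummy
export AWS_SECRET_ACCESS_KEY=dummy

# List tables on the local instance; a fresh instance returns an empty TableNames list
aws dynamodb list-tables --endpoint-url http://localhost:8000 --region us-west-1
```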

Dumping the data

Now comes the tricky part: taking the dump from the remote instance. You could obviously write your own utility for that, but thanks to the developers/contributors of the library above, you don’t have to. Simply clone the project and have a look at its documentation to understand how to use the Python script. Let me give you an example that dumps all tables; you can modify the command for different options.

But before that, just make sure that your remote DynamoDB instance is accessible from the machine on which you will run the Python script.

python dynamodump.py -m backup --host dynamoHost --port 8000 -r us-west-1 -s production*
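The `-s` argument conceptually behaves like a shell-style wildcard over table names. A minimal, self-contained illustration of that matching idea, using Python's `fnmatch` and made-up table names:

```python
from fnmatch import fnmatch

# Hypothetical table names on the remote DynamoDB instance
tables = ["production_users", "production_orders", "staging_users"]

# "production*" selects only tables whose names start with "production"
selected = [t for t in tables if fnmatch(t, "production*")]
print(selected)  # ['production_users', 'production_orders']

# "*" matches everything, i.e. dump all tables
print([t for t in tables if fnmatch(t, "*")])
```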

The backup command above dumps the data from the Dynamo host, for AWS region us-west-1, from the tables whose names start with production. You may simply use "*" to dump all tables. While running it, you may encounter failures for missing dependencies, which can be installed beforehand using the requirements.txt file from the project. Just run

pip install -r requirements.txt

and you will be all set. Once the dump is done, you will get a file structure like this; suppose you have two tables in your remote DynamoDB, then the structure would be as follows.

dump
|
+-------TableA
|       |
|       +------data
|       |
|       +------schema.json
|
+-------TableB
        |
        +------data
        |
        +------schema.json

So basically a dump folder is created, containing each table’s data and schema in JSON format.
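As a quick sanity check before restoring, you can walk the dump folder and list which tables were captured. A small sketch of that idea; it builds a fake dump layout first so it is self-contained, and the table names are just placeholders:

```python
import json
import os
import tempfile

def list_dumped_tables(dump_dir):
    """Return the table names that have a schema.json in the dump directory."""
    tables = []
    for name in sorted(os.listdir(dump_dir)):
        schema_path = os.path.join(dump_dir, name, "schema.json")
        if os.path.isfile(schema_path):
            tables.append(name)
    return tables

# Build a fake dump layout for demonstration
root = tempfile.mkdtemp()
dump = os.path.join(root, "dump")
for table in ("TableA", "TableB"):
    os.makedirs(os.path.join(dump, table, "data"))
    with open(os.path.join(dump, table, "schema.json"), "w") as f:
        json.dump({"Table": {"TableName": table}}, f)

print(list_dumped_tables(dump))  # ['TableA', 'TableB']
```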

Now the idea is to take the dump, which we already did, and then use the restore mode to populate the local DynamoDB instance. So just use

python dynamodump.py -m restore --host localhost --port 8000 -r us-west-1 -s "*"

When run, the above command populates the local DynamoDB instance. And voila, we are done! We now have a local DynamoDB instance with the data of the remote one, and we are free to experiment on it.

It is pretty handy now, but what if you want to ship it or run it in a cluster environment? For that I took one more step and dockerized it. I tried for many days to create my own DynamoDB base image but failed, so instead I used a community image, anapsix/alpine-java, which you can pull from Docker Hub. For dockerization we can use the shared-local-instance.db file with the Dockerfile below.

Dockerfile
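The original Dockerfile is embedded as an image in the post; below is a minimal sketch of what it could look like, assuming the extracted DynamoDB Local files (DynamoDBLocal.jar and DynamoDBLocal_lib) and shared-local-instance.db sit next to the Dockerfile. The paths and layout here are assumptions, not the author’s exact file.

```dockerfile
# Community Java base image mentioned above
FROM anapsix/alpine-java

WORKDIR /dynamodb

# DynamoDB Local jar, its native libraries, and the pre-populated data file
COPY DynamoDBLocal.jar DynamoDBLocal.jar
COPY DynamoDBLocal_lib DynamoDBLocal_lib
COPY shared-local-instance.db shared-local-instance.db

EXPOSE 8000

# -sharedDb makes DynamoDB Local read and write shared-local-instance.db
CMD ["java", "-Djava.library.path=DynamoDBLocal_lib", "-jar", "DynamoDBLocal.jar", "-sharedDb", "-dbPath", "."]
```

Once the image is built, something like `docker run -p 8000:8000 <image-name>:<tag>` exposes the pre-populated instance on port 8000.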

Using the above Dockerfile, just use the command

docker build -t <image-name>:<tag> .

to build the Docker image, assuming that the Dockerfile and the db file are in the same directory. Then you are good to use it as a standalone container, or maybe in cluster environments.

So that’s how you do it. Please share the post if you found it useful, and drop any comments and suggestions below. Till then, happy coding 🙂


Written by 

Shubham Verma is a software consultant. He likes to explore new technologies and trends in the IT world. Shubham is familiar with programming languages such as Java, Scala, C, C++, HTML and JavaScript, and he is currently working on reactive technologies like Scala, Akka, Spark and Kafka. His hobbies include playing computer games and watching Hollywood movies.
