MachineX: Image Data Augmentation Using Keras

In this blog, we will focus on image data augmentation using Keras and how to implement it.

Problem

When we work on image classification projects, the images a user supplies can vary in many ways, such as the camera angle, the zoom level and the steadiness of the shot. The model should therefore be trained to make sense of as many of these variations as possible.

One way to do this is to train the model on every variation. But we can’t photograph each training picture from every possible angle, especially when the training set is as large as 10,000 pictures!

This problem can be solved by a technique called Image Data Augmentation, which takes an image, transforms it and saves it in all the forms we specify.

Introduction to Image Augmentation

Image augmentation is a technique used to artificially expand a dataset. It is helpful when we are given a dataset with very few samples. In deep learning this situation is a problem, because the model tends to overfit when it is trained on a limited number of samples.

The parameters generally used to increase the sample count include zoom, shear and rotation, as well as a custom preprocessing_function. With these parameters set, images with the corresponding transformations are generated while the deep learning model trains. In general, image augmentation increases the size of an existing sample set by roughly 3x to 4x.
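To make the idea concrete, here is a minimal NumPy-only sketch of two such transformations applied to a tiny made-up "image". The helper names flip_horizontal and shift_right_nearest are hypothetical and are not part of Keras; Keras applies similar transformations with randomly chosen amounts, whereas this sketch uses fixed ones so the result is easy to follow.

```python
import numpy as np

# A tiny 2x4 single-channel "image"; the values stand in for pixel intensities.
image = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8]])


def flip_horizontal(img):
    """Mirror the image left to right."""
    return img[:, ::-1]


def shift_right_nearest(img, pixels):
    """Shift columns right by `pixels`, filling the gap by repeating the edge column."""
    if pixels <= 0:
        return img.copy()
    # Pad on the left with copies of the first column, then crop to the original width.
    padded = np.pad(img, ((0, 0), (pixels, 0)), mode="edge")
    return padded[:, :img.shape[1]]


# One original sample becomes three samples.
augmented = [image, flip_horizontal(image), shift_right_nearest(image, 1)]

for variant in augmented:
    print(variant.tolist())
```

Each variant still shows the same subject, but the pixel layout differs, which is what lets a small dataset stand in for a larger one.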

Implementation

Let’s start by installing all the necessary libraries:

pip install tensorflow
pip install scipy
pip install numpy
pip install h5py
pip install pyyaml
pip install keras

We have installed scipy, numpy, h5py and pyyaml because they are dependencies of keras, and since keras runs on a tensorflow backend, we need to install that as well. You can read more about installing tensorflow in the official TensorFlow installation guide. We will be using keras to perform image augmentation.

Let’s import keras image preprocessing.

from keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img

Here, ImageDataGenerator is used to specify the parameters, such as rotation, zoom and width shift, that will be used to generate images; more on these later. img_to_array converts the given image to a numpy array that the ImageDataGenerator can work with, and load_img loads the image we want to modify into our program.

datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')

Arguments:

We have used ImageDataGenerator() here to specify the parameters for generating our images. They can be explained as follows:

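rotation_range : range, in degrees, within which images are randomly rotated

width_shift_range , height_shift_range : range for random horizontal and vertical shifts; a value below 1 is treated as a fraction of the total width or height

shear_range : shear intensity (shear angle in counter-clockwise direction; current Keras documentation gives it in degrees, although older versions said radians)

zoom_range : range for random zoom; a single float z means the zoom factor is drawn from [1 - z, 1 + z]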

horizontal_flip : Boolean (True or False). Randomly flip inputs horizontally

fill_mode : One of {“constant”, “nearest”, “reflect” or “wrap”}. Points outside the boundaries of the input are filled according to the given mode
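To see how these settings turn into concrete transformations, the following pure-Python sketch draws one random set of values from the ranges configured above. It is a simplified illustration, not the library's own code: sample_transform_params and its dictionary keys are hypothetical names, and it assumes each value is drawn uniformly and that a single zoom factor is used (Keras draws separate factors for the two axes).

```python
import random


def sample_transform_params(img_height, img_width,
                            rotation_range=40,
                            width_shift_range=0.2,
                            height_shift_range=0.2,
                            shear_range=0.2,
                            zoom_range=0.2,
                            horizontal_flip=True,
                            rng=None):
    """Draw one random set of augmentation values (simplified sketch)."""
    if rng is None:
        rng = random.Random()
    return {
        # Rotation angle anywhere between -rotation_range and +rotation_range.
        "rotation_deg": rng.uniform(-rotation_range, rotation_range),
        # Fractional shifts are scaled by the image size to get pixels.
        "shift_x_px": rng.uniform(-width_shift_range, width_shift_range) * img_width,
        "shift_y_px": rng.uniform(-height_shift_range, height_shift_range) * img_height,
        "shear": rng.uniform(-shear_range, shear_range),
        # A float zoom_range z gives a factor between 1 - z and 1 + z.
        "zoom": rng.uniform(1 - zoom_range, 1 + zoom_range),
        # When enabled, the flip is applied to about half of the images.
        "flip_horizontal": horizontal_flip and rng.random() < 0.5,
    }


params = sample_transform_params(150, 150, rng=random.Random(0))
for name, value in params.items():
    print(name, value)
```

For a 150 x 150 image, width_shift_range=0.2 therefore means a shift of at most 30 pixels in either direction, and zoom_range=0.2 means a zoom factor between 0.8 and 1.2.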

After specifying the parameters and storing them in the datagen variable, we move on to loading our image.

img = load_img('lion.jpg') 

Here I am using an image of a lion; you can simply use your own sample image.

x = img_to_array(img)  # NumPy array with shape (height, width, 3) for an RGB image (channels last)
x = x.reshape((1,) + x.shape)  # add a leading batch axis: shape (1, height, width, 3)

load_img loads the required image. You can use any image you like, but I would recommend one with a face, such as that of a cat, a dog or a human!

Next, we use img_to_array to convert the image to something numerical, in this case a numpy array, which can easily be fed into our flow() function (don’t worry, it is explained later!). We store the converted numpy array in a variable x.

Then, we have to reshape the numpy array, adding a leading axis of size 1. This turns it into a numpy array of rank 4 instead of rank 3, because flow() expects a batch of images with shape (samples, height, width, channels), and here the batch holds a single image. The channels axis is the last one: it has value 1 for grayscale data and 3 for RGB data.
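The reshape step can be checked without loading a real picture. The sketch below uses an array of zeros as a stand-in for the output of img_to_array, assuming a 150 x 150 RGB image in the channels-last layout that the TensorFlow backend uses by default; np.expand_dims is shown as an equivalent way to add the leading axis.

```python
import numpy as np

# Stand-in for img_to_array(img): a 150x150 RGB image, channels last.
x = np.zeros((150, 150, 3), dtype="float32")

# (1,) + x.shape is plain tuple concatenation: (1, 150, 150, 3).
batch = x.reshape((1,) + x.shape)

# Equivalent: insert a new axis of size 1 at position 0.
same_batch = np.expand_dims(x, axis=0)

print(x.shape)      # (150, 150, 3)
print(batch.shape)  # (1, 150, 150, 3)
```

The result is a batch containing one image, and the pixel data itself is unchanged.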

This is my input image (a lion):

Input image

Output

Now that we have our input in the right form, let’s start producing some output.

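# flow() keeps yielding batches forever, so we stop it ourselves.
# Create the 'output' directory before running this loop.
i = 0
for batch in datagen.flow(x, save_to_dir='output', save_prefix='lion', save_format='jpeg'):
    i += 1
    if i > 20:
        break  # stop after 21 batches, i.e. 21 saved images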

We call the datagen.flow() function, which yields a new batch in each iteration. We have given it x, the numpy array for the input image; save_to_dir, the directory in which to save the output; save_prefix, the prefix for the names of the images; and save_format, the image format.
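Because the generator never stops on its own, the loop needs an explicit limit. As an alternative to counting by hand, here is a small sketch that caps the number of batches with itertools.islice. The helpers prepare_output_dir and take_batches are hypothetical names, and endless_batches is only a stand-in for datagen.flow(...) so that the snippet runs without Keras or an image file.

```python
import itertools
import os


def prepare_output_dir(path):
    """Create the output directory if it does not exist yet and return its path."""
    os.makedirs(path, exist_ok=True)
    return path


def take_batches(batch_iterator, count):
    """Return an iterator over at most `count` batches from an endless iterator."""
    return itertools.islice(batch_iterator, count)


def endless_batches():
    """Hypothetical stand-in for datagen.flow(...): yields batch numbers forever."""
    n = 0
    while True:
        yield n
        n += 1


batches = list(take_batches(endless_batches(), 21))
print(len(batches))  # prints 21
```

With the real generator, the same pattern would read take_batches(datagen.flow(x, save_to_dir=prepare_output_dir('output'), save_prefix='lion', save_format='jpeg'), 21), and each batch consumed would save one augmented image.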

This is what our output images will look like:

Notice that each image is a bit different from the others because of the zoom, rotation, width or height shift and so on. This helps the model you are building to recognize a wider variety of images, making it more robust.

So, this is an overview of image augmentation in Keras.

Written by 

Shubham Goyal is a Data Scientist at Knoldus Inc. He is also an artificial intelligence researcher, interested in research on problems across different domains, and a regular contributor to the community through blogs and webinars on machine learning and artificial intelligence. He has also written a few research papers on machine learning, is a conference speaker, and is an official author at Towards Data Science.