In this blog, we will focus on image data augmentation using Keras and how to implement it.

Problem
In image classification projects, the input a user provides can vary in many ways, such as the angle, the zoom level and how steady the camera was when the picture was taken. So we should train our model to accept and make sense of almost all types of input.
This can be done by training the model on all of these possibilities. But we can’t go around retaking the same training picture from every possible angle, especially when the training set is as big as 10,000 pictures!
This can easily be solved by a technique called image data augmentation, which takes an image, transforms it and saves it in all the forms we specify.
Introduction to image augmentation
Image augmentation is a technique used to artificially expand a data-set. It is helpful when we are given a data-set with very few samples. In deep learning this situation is a problem, because the model tends to over-fit when it is trained on a limited number of samples.
The image augmentation parameters generally used to increase the sample count are zoom, shear, rotation, preprocessing_function and so on. With these parameters set, images with the corresponding random transformations are generated while the deep learning model trains. In general, image augmentation increases the existing sample set by roughly 3x to 4x, although the exact factor depends on how many variants are generated per image.
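To make the idea concrete, here is a minimal sketch in plain NumPy of how one image can be turned into several training samples. It does not use keras at all; the helper make_variants and the tiny 3x3 "image" are made up for illustration, and the three transforms are only simple stand-ins for the random ones keras applies.

```python
import numpy as np

# A tiny 3x3 single-channel "image" so the effect of each transform is easy to see.
image = np.arange(9).reshape(3, 3)


def make_variants(img):
    """Return a few simple augmented variants of img (illustrative helper, not a keras API)."""
    return {
        "horizontal_flip": np.fliplr(img),       # mirror left-right
        "rotate_90": np.rot90(img),              # rotate 90 degrees counter-clockwise
        "shift_right": np.roll(img, 1, axis=1),  # shift columns by one, wrapping around
    }


variants = make_variants(image)
dataset = [image] + list(variants.values())
print(len(dataset))  # 4: the original plus three variants
```

With three variants per picture, one original becomes four samples, which is where a growth factor in the 3x to 4x range can come from.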
Implementation
Let’s start by installing all the necessary libraries:
pip install tensorflow
pip install scipy
pip install numpy
pip install h5py
pip install pyyaml
pip install keras
We have installed scipy, numpy, h5py and pyyaml because they are dependencies of keras, and since keras works on a tensorflow backend, we need to install that as well. You can read more about tensorflow installation in the official TensorFlow documentation. We will be using keras to perform image augmentation.
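If you want to confirm that the installation worked before going further, a small check like the one below may help. It is only a sketch using the standard library; the helper missing_packages is a made-up name, and the list assumes the packages from the pip commands above (note that pyyaml is imported as yaml).

```python
import importlib.util

# Import names for the packages installed above; PyYAML is imported as "yaml".
REQUIRED = ["tensorflow", "scipy", "numpy", "h5py", "yaml", "keras"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing packages:", missing if missing else "none")
```

Anything reported as missing can be installed with the matching pip command before moving on.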
Next, let’s import the image preprocessing utilities from keras.
from keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img
Here, ImageDataGenerator is used to specify parameters such as rotation, zoom and width shift that will be used to generate images; more of these are covered later. img_to_array converts the given image to a numpy array that ImageDataGenerator can work with, and load_img loads the image we want to modify into our program.
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
Arguments:
We have used ImageDataGenerator() here to specify the parameters for generating our images, which can be explained as follows:
rotation_range : degree range for random rotations (here, up to 40 degrees in either direction)
width_shift_range , height_shift_range : amount of horizontal and vertical shift; values below 1 are read as a fraction of the total width or height
shear_range : shear intensity (shear angle in counter-clockwise direction, given in degrees in current Keras releases)
zoom_range : range for random zoom; a single float z means zooming by a factor in [1 - z, 1 + z]
horizontal_flip : Boolean (True or False). Randomly flip inputs horizontally
fill_mode : One of {"constant", "nearest", "reflect" or "wrap"}. Points outside the boundaries of the input are filled according to the given mode
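The fill_mode options are easier to picture on a single row of pixels. The sketch below uses numpy.pad as a stand-in, not the keras code itself; the mapping between fill_mode names and numpy.pad modes is my assumption, based on the padding patterns described in the keras documentation.

```python
import numpy as np

row = np.array([1, 2, 3, 4])

# Assumed correspondence between keras fill_mode names and numpy.pad modes.
FILL_MODE_TO_NUMPY = {
    "constant": "constant",  # pad with a fixed value (0 here)
    "nearest": "edge",       # repeat the closest edge pixel
    "reflect": "symmetric",  # mirror the pixels, edge pixel included
    "wrap": "wrap",          # continue from the opposite side
}

# Pad two pixels on each side to mimic the empty area left by a shift or rotation.
padded = {
    name: np.pad(row, 2, mode=np_mode)
    for name, np_mode in FILL_MODE_TO_NUMPY.items()
}
for name, values in padded.items():
    print(f"{name:>8}: {values.tolist()}")
```

With fill_mode='nearest', as used in this post, the empty area left behind by a shift or rotation is filled by stretching the edge pixels outwards.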
After specifying the parameters and storing them in the datagen variable, we move on to loading our image.
img = load_img('lion.jpg')
Here, I am using a lion image; you can simply use your own sample image.
x = img_to_array(img)  # NumPy array with shape (height, width, 3) for an RGB image
x = x.reshape((1,) + x.shape)  # add a leading samples axis: shape (1, height, width, 3)
load_img is used to load the required image. You can use any image you like, but I would recommend an image with a face, like that of a cat, a dog or a human!
Next, we use img_to_array to convert the image to something numerical, in this case a numpy array, which can easily be fed into our flow() function (don’t worry, it is explained later!). We store the converted numpy array in a variable x.
Then we reshape the numpy array, adding a leading axis of size 1. This makes it an array of rank 4 instead of rank 3, because flow() expects a batch of images with shape (samples, height, width, channels), and here we have a single sample. The channels axis should have value 1 for grayscale data and 3 for RGB data.
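The reshape step can be checked with plain NumPy, without loading a real picture. In this sketch a zero-filled array stands in for the output of img_to_array; the 150x150 size and the channels-last layout (height, width, channels) are assumptions made for the example.

```python
import numpy as np

# Stand-in for img_to_array(img): a 150x150 RGB image in channels-last layout.
x = np.zeros((150, 150, 3), dtype="float32")
print(x.shape)  # (150, 150, 3)

# Same reshape as in the post: prepend a samples axis of size 1.
batch = x.reshape((1,) + x.shape)
print(batch.shape)  # (1, 150, 150, 3)

# np.expand_dims gives the same result and may read more clearly.
same = np.expand_dims(x, axis=0)
print(same.shape == batch.shape)  # True

# A grayscale image has a single channel instead of three.
gray = np.zeros((150, 150, 1), dtype="float32")
gray_batch = gray.reshape((1,) + gray.shape)
print(gray_batch.shape)  # (1, 150, 150, 1)
```

The first axis counts the samples, so a single image becomes a batch of one.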
This is my input image (a lion):



Output
Now that our input is in the right form, let’s start producing some output.
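i = 0
# flow() keeps yielding augmented batches, so we break out of the loop ourselves.
# Create the 'output' directory beforehand if it does not exist.
for batch in datagen.flow(x, save_to_dir='output', save_prefix='lion', save_format='jpeg'):
    i += 1
    if i > 20:
        break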
We call the datagen.flow() function and iterate over the batches it yields. We pass it x (the numpy array for the input image), save_to_dir (the directory to save the output in), save_prefix (the prefix for the names of the generated images) and save_format (the image format).
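Because the generated files are written to save_to_dir, it is worth making sure that directory exists before running the loop, and checking afterwards how many files were produced. The sketch below uses only the standard library; count_generated is a hypothetical helper, and I am assuming keras does not create the directory for you.

```python
import os


def count_generated(directory, prefix):
    """Count files in directory whose names start with prefix (hypothetical helper)."""
    return sum(1 for name in os.listdir(directory) if name.startswith(prefix))


# Same directory name as save_to_dir above; exist_ok avoids an error if it already exists.
output_dir = "output"
os.makedirs(output_dir, exist_ok=True)

# Before the generation loop runs this is usually 0; afterwards it should grow.
print(count_generated(output_dir, "lion"))
```

Running the count again after the loop is a quick way to confirm that the augmented images were actually saved.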
This is how our output images look:



Notice that each image is a bit different from the others due to zoom, rotation, width or height shift and so on. This helps the model you are building to recognize a wider variety of images, making it more robust.
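The reason no two outputs match is that a fresh set of random transform values is drawn for every generated image. The sketch below imitates that idea with Python's random module. It is a simplified illustration, not the keras implementation: sample_params is a made-up helper, and the default values simply mirror the datagen settings used in this post.

```python
import random


def sample_params(height, width, rng, rotation_range=40, width_shift_range=0.2,
                  height_shift_range=0.2, zoom_range=0.2):
    """Draw one random set of transform values (illustrative helper, not a keras function)."""
    return {
        "rotation_deg": rng.uniform(-rotation_range, rotation_range),
        "shift_x_px": rng.uniform(-width_shift_range, width_shift_range) * width,
        "shift_y_px": rng.uniform(-height_shift_range, height_shift_range) * height,
        "zoom": rng.uniform(1 - zoom_range, 1 + zoom_range),
        "flip_horizontal": rng.random() < 0.5,
    }


rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(3):
    # Assumed 150x150 image; each call gives a different combination of values.
    print(sample_params(150, 150, rng))
```

For a 150x150 image, a shift range of 0.2 means the picture can move by up to 30 pixels in either direction, and a zoom range of 0.2 means a zoom factor between 0.8 and 1.2.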
So, this is an overview of image augmentation in keras.
Follow the Machinex page for more updates on this topic.