Know the Basics of OpenCV for Image Processing in Python


OpenCV (Open Source Computer Vision Library) is a library used for computer vision and image processing tasks. It has a modular structure and includes several shared and static libraries. OpenCV can be used from languages such as Python, C++, and Java. Some of its applications include edge detection, face detection, object detection, and face recognition.

Using OpenCV, we can process images to extract the desired information from them. In Python, OpenCV works with NumPy (Numerical Python), a highly optimized library for numerical operations: every image is loaded as a NumPy array. OpenCV is easy to install.

To use OpenCV, install it by running the command below at the Anaconda prompt:

conda install -c anaconda opencv

Basics of OpenCV in Python

1. Reading an Image

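# Import the OpenCV and NumPy libraries
import cv2
import numpy as np

# Load an image; the flag 1 (cv2.IMREAD_COLOR) loads it in color
img = cv2.imread(r"dogs.jpg", 1)
# Show the image: 1st argument is the window name, 2nd is the image
cv2.imshow('image', img)
cv2.waitKey(0)           # wait indefinitely for a key press
cv2.destroyAllWindows()  # close all windows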

The imread() function can read the image in different modes, selected by the flag passed as its second argument:

Image form | Description | Flag
cv2.IMREAD_COLOR | Loads a color image (the default) | 1
cv2.IMREAD_GRAYSCALE | Loads the image in grayscale mode | 0
cv2.IMREAD_UNCHANGED | Loads the image as is, including the alpha channel | -1

The cv2.imshow() function takes two parameters: the name of the display window and the image to display.

cv2.waitKey() is a keyboard binding function. Its argument is the time in milliseconds. The function waits for specified milliseconds for any keyboard event. If you press any key in that time, the program continues. If 0 is passed, it waits indefinitely for a key stroke.

cv2.destroyAllWindows() simply destroys all the windows we created. If you want to destroy any specific window, use the function cv2.destroyWindow() where you pass the exact window name as the argument.

2. Get details of image

print("Shape of image : ", img.shape)     # tuple of (rows, columns, channels)
print("Size of image : ", img.size)       # total number of elements (rows * columns * channels)
print("Datatype of image : ", img.dtype)  # data type of the pixel values, e.g. uint8
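Since an OpenCV image is a NumPy array, these attributes can be tried without any image file. The sketch below assumes a synthetic 480x640 array as a stand-in for a loaded image.

```python
import numpy as np

# A synthetic stand-in for a 480x640 color image as OpenCV would load it.
img = np.zeros((480, 640, 3), dtype=np.uint8)

rows, cols, channels = img.shape
print(rows, cols, channels)   # 480 640 3
print(img.size)               # 921600 = 480 * 640 * 3
print(img.dtype)              # uint8

# A grayscale image has no channel axis, so its shape has only two entries.
gray = np.zeros((480, 640), dtype=np.uint8)
print(gray.shape)             # (480, 640)
```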

3. Resize and Rescale image

We resize and rescale images and video files to reduce the computational load. Large media files tend to store a lot of information, and displaying or processing them takes a lot of computation. By resizing, we deliberately discard some of that information.

Usually we downscale the width and height of an image. In the code below, the width and height are each rescaled to 75% of their original values.

import cv2

# Load an image
img = cv2.imread(r"dogs.jpg", 1)

def rescaleFrame(frame, scale=0.75):
    """Rescale an image by the given factor."""
    width = int(frame.shape[1] * scale)    # shape[1] is the number of columns
    height = int(frame.shape[0] * scale)   # shape[0] is the number of rows
    dimensions = (width, height)           # cv2.resize() expects (width, height)
    return cv2.resize(frame, dimensions, interpolation=cv2.INTER_AREA)

frame_resized = rescaleFrame(img)

# Show both images until a key is pressed
cv2.imshow('Original image', img)
cv2.imshow('Resized image', frame_resized)
cv2.waitKey(0)
cv2.destroyAllWindows()

4. Draw shapes and add text to images

import numpy as np
import cv2

img = np.zeros([512, 512, 3], np.uint8)  # create a blank (black) image
img = cv2.rectangle(img, (384, 0), (510, 128), (0, 0, 255), thickness=4)  # draw a red rectangle
font = cv2.FONT_HERSHEY_SIMPLEX  # font face for the text
cv2.putText(img, 'OpenCV', (10, 500), font, 4, (255, 255, 255), 2, cv2.LINE_AA)  # add white text
cv2.imshow("shapes", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

np.zeros() creates an array filled with zeros, which is displayed as a blank (black) image.

1. cv2.rectangle() function takes the below parameters to draw a rectangle.

The image (img), one vertex of the rectangle (point1), the vertex opposite to point1 (point2), the color of the rectangle, and the line thickness. Colors are given in BGR order, so (0, 0, 255) represents red.

2. cv2.putText() function adds text to an image. It takes the below parameters.

The image (img), the text to add, the position of the bottom-left corner of the text, and then the fontFace, fontScale, color, thickness, and lineType of the text.

Similarly, we can also draw a circle, an ellipse, a line, and polylines using cv2.circle(), cv2.ellipse(), cv2.line(), and cv2.polylines().

Essential Functions in OpenCV

In this section, we will look at some of the functions that are commonly applied to image data. They are often used as preprocessing steps when building machine learning and deep learning models.

Image Blurring

Image blurring makes an image less sharp or distinct. It is done by convolving the image with a low-pass filter kernel, which removes noise and smooths the image. Commonly used types of blurring include Gaussian blur, median blur, and bilateral blur. Gaussian blur is the result of blurring an image with a Gaussian function.

Edge Cascading

Canny edge detection is a popular edge detection algorithm. It is usually applied to a blurred image, since blurring suppresses noise that would otherwise show up as spurious edges, and it returns the edges found in the image.

Image Dilating

Image dilation increases the area of the foreground (white) objects. It is often applied after erosion in noise removal, and it helps join broken parts of an object in an image. The code below performs the above operations on an image:

import cv2
import numpy as np

img = cv2.imread(r"dog_1.jpg", 1)
# Blur the image; sigmaX = 0 lets OpenCV derive sigma from the kernel size
blur = cv2.GaussianBlur(img, (5, 5), 0)
# Edge cascade of the original image
cascade = cv2.Canny(img, 125, 175)
# Edge cascade of the blurred image
canny = cv2.Canny(blur, 125, 175)
# Dilate the edges; the kernel size and iterations can be changed
kernel = np.ones((7, 7), np.uint8)
dilated = cv2.dilate(canny, kernel, iterations=4)

cv2.imshow('Original image', img)
cv2.imshow('Edge cascade', cascade)
cv2.imshow('Blurred image Edge cascade', canny)
cv2.imshow('Dilated image', dilated)
cv2.waitKey(0)
cv2.destroyAllWindows()

Image Transformations

Basic image transformation techniques include translation, rotation, flipping, cropping, and resizing of images.


Translation shifts an image along the x and y axes. Using translation you can shift an image up (-y), down (+y), left (-x), right (+x), or any combination of these.


Rotation rotates an image by some angle. OpenCV allows you to specify any point to rotate the image around. Positive angles rotate the image counter-clockwise and negative angles rotate it clockwise.


Flipping mirrors an image. We can flip an image vertically (0), horizontally (1), or both vertically and horizontally (-1). The values 0, 1, and -1 are the flip codes passed to the flip function.

The code below applies translation, rotation, and flipping to an image:

import cv2 as cv
import numpy as np
img = cv.imread(r"dog_1.jpg", 1)
cv.imshow('Dog', img)

# Translation
def translate(img, x, y): # x, y are the shifts in pixels along each axis
    transMat = np.float32([[1,0,x],[0,1,y]]) # translation matrix
    dimensions = (img.shape[1], img.shape[0]) # image width, height 
    return cv.warpAffine(img, transMat, dimensions)
translated = translate(img, -50, 50)
cv.imshow('Translated', translated)

# Rotation
def rotate(img, angle, rotPoint=None):  # angle of rotation in degrees, optional rotation point
    (height,width) = img.shape[:2]
    if rotPoint is None:
        rotPoint = (width//2,height//2) # rotate around center
    rotMat = cv.getRotationMatrix2D(rotPoint, angle, 1.0)
    dimensions = (width,height)
    return cv.warpAffine(img, rotMat, dimensions)
rotated = rotate(img, -45) # clockwise rotation
cv.imshow('Rotated', rotated)
rotated_rotated = rotate(rotated, -90) # a rotated image can be rotated again
cv.imshow('Rotated Rotated', rotated_rotated)

# Flipping
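flip = cv.flip(img, -1) # -1 flips both vertically and horizontally
cv.imshow('Flip', flip)

cv.waitKey(0)           # keep the windows open until a key is pressed
cv.destroyAllWindows()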
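Cropping, also listed among the basic transformations, needs no OpenCV call at all because images are NumPy arrays. A minimal sketch, assuming a synthetic 300x400 image and an arbitrary crop region:

```python
import numpy as np

img = np.zeros((300, 400, 3), dtype=np.uint8)   # stand-in for a loaded image

# Crop rows 50..199 and columns 100..299: img[y1:y2, x1:x2]
cropped = img[50:200, 100:300]
print(cropped.shape)   # (150, 200, 3)

# Slicing returns a view into the original array; copy it if the crop
# should be modified independently of the source image.
cropped_copy = cropped.copy()
cropped_copy[:] = 255
print(img.max())       # 0: the original is untouched
```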


We have learned the basics of working with images in Python using OpenCV. The same operations can also be applied to video data, since OpenCV processes a video as a sequence of image frames.

Keep exploring OpenCV!


Written by 

Working as a Sr. Software Consultant AI/ML at Knoldus. Like exploring more of Data Science and its related technology. Current learning areas are Natural Language Processing, Deep Learning and Artificial Intelligence.
