What is OCR?
OCR stands for optical character recognition. It is a widely used technology for recognizing text inside images, such as scanned documents and photos. OCR converts virtually any kind of image containing written text (typed, handwritten, or printed) into machine-readable text data.
OCR using Pytesseract
Python-tesseract is a wrapper for Google’s Tesseract OCR engine. It is also usable as a standalone invocation script for tesseract, because it can read any image type supported by the Pillow and Leptonica imaging libraries, including jpeg, png, gif, bmp, tiff, and others. When used as a script, Python-tesseract prints the recognized text instead of writing it to a file.
For installation, we only need to install the few Python modules listed below. Because pytesseract is only a wrapper, the Tesseract OCR engine itself must also be installed on the system and be available on the PATH.
OpenCV: pip install opencv-python
Pytesseract: pip install pytesseract
Pillow: pip install pillow
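Before moving on, it can help to confirm that the packages are actually importable. The following is a minimal sketch using only the standard library; the helper name missing_modules and the REQUIRED_MODULES list are illustrative and not part of any of the libraries above.

```python
import importlib.util
import shutil

# Import names differ from the pip package names:
# opencv-python is imported as cv2, pillow is imported as PIL.
REQUIRED_MODULES = ["cv2", "pytesseract", "PIL"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing Python modules:", missing or "none")
    # pytesseract calls the external tesseract executable, so check for it too.
    print("tesseract executable:", shutil.which("tesseract") or "not found on PATH")
```

If the list of missing modules is empty and a path to the tesseract executable is printed, the program in the next section should have everything it needs.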
The process takes only three easy stages. First, we load an image that contains text, either one already saved on the computer or one downloaded via a browser. Next, the image is preprocessed: it is converted to grayscale, noise is removed, and it is binarized. The main objective of the preprocessing phase is to make it as easy as possible for the OCR system to distinguish a character or word from the background. Finally, we run the image through the OCR engine to get the text as a string. Let’s have a look at how to build a simple program for optical character recognition in Python.
import cv2
import pytesseract

# Main function: run Tesseract on the image and return the recognized text
def ocr_main(img):
    text = pytesseract.image_to_string(img)
    return text

# Reading the image from the local directory
img = cv2.imread('sample.jpg')
if img is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("Could not read 'sample.jpg'")

# PREPROCESSING THE IMAGE

# Grayscaling
def get_grayscale(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Noise removal
def remove_noise(image):
    return cv2.medianBlur(image, 3)

# Thresholding
# cv2.threshold returns a tuple (threshold_used, binary_image), so keep only the image.
# With THRESH_OTSU the threshold is chosen automatically and the value passed in (0) is ignored.
def thresholding(image):
    return cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

# Calling preprocessing functions according to user needs
img = get_grayscale(img)
img = thresholding(img)
img = remove_noise(img)

# Using OpenCV to preview the preprocessed image
# cv2.imshow('img', img)
# cv2.waitKey(0)
# cv2.destroyAllWindows()

# Calling the main function to display the result
print(ocr_main(img))
After OCR, the sample image given below has been successfully converted to a string.
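The thresholding step in the program above relies on the cv2.THRESH_OTSU flag, which picks the threshold automatically. To make that idea concrete, here is a minimal pure-Python sketch of Otsu's method, assuming the image is given as a flat list of 8-bit gray values. The function names otsu_threshold and binarize are illustrative, not OpenCV APIs, and OpenCV's own implementation may differ in details.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold for a flat list of gray values in 0..255."""
    # Histogram of gray levels.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels with value <= t (background class)
    sum_bg = 0  # sum of the gray values of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is background from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); Otsu maximizes it.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold, maxval=255):
    """Map values above the threshold to maxval and all others to 0."""
    return [maxval if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    # Four dark pixels (text) and four bright pixels (paper).
    sample = [10, 12, 11, 13, 200, 205, 198, 202]
    t = otsu_threshold(sample)
    print("threshold:", t)
    print("binary:", binarize(sample, t))
```

The threshold lands between the dark and the bright group, so each pixel is pushed to either 0 or the maximum value. This is what makes characters stand out from the background before the image is handed to the OCR engine.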
OCR is a remarkable technology with a lot of potential. Such tools are already quite advanced today, and optical character recognition is likely to keep improving in the future.