As you might know, neural networks are loosely modelled on the behaviour of the human brain, allowing computer programs to recognise patterns and solve common problems in the fields of AI, machine learning, and deep learning.
Neural networks are made up of layers of nodes: an input layer, one or more hidden layers, and an output layer. Each node, or artificial neuron, connects to nodes in the next layer and has an associated weight and threshold. If the output of an individual node is above its threshold value, that node is activated and sends data to the next layer of the network. Otherwise, no data is passed along.
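To make the weight-and-threshold idea concrete, here is a minimal sketch of a single artificial neuron in plain Python. The function name `neuron_output` and the input and weight values are made up for illustration. Real networks usually replace the hard threshold with a smoother activation function such as ReLU.

```python
def neuron_output(inputs, weights, threshold):
    """Return the weighted sum if it exceeds the threshold, else 0.0 (nothing is passed on)."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return weighted_sum if weighted_sum > threshold else 0.0


inputs = [1.0, 0.5, 0.25]   # hypothetical values coming from the previous layer
weights = [0.5, 1.0, 2.0]   # hypothetical learned weights, one per input

active = neuron_output(inputs, weights, threshold=1.0)  # weighted sum 1.5 is above 1.0
silent = neuron_output(inputs, weights, threshold=2.0)  # weighted sum 1.5 is below 2.0
print(active, silent)  # 1.5 0.0
```

With the same inputs, the neuron fires when the threshold is 1.0 and stays silent when it is 2.0.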
Many readers will be familiar with image processing, where we identify the object in an image by building a machine learning model for it. A simple, naive algorithm would look at each pixel of the image and use what was learned from the training set to identify the object.
This works, but it is slow. Imagine a neural network trained on thousands of objects. Every image we try to identify with this network would have to be matched pixel by pixel before it could give a result. So how can we improve on this? Let’s see.
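One way to picture such a naive approach is a nearest-match search that compares the query image with every stored training image, pixel by pixel. This is only a sketch of the idea, not a recommended method. The helper names `pixel_distance` and `classify_naive` and the tiny 3x3 images are invented for illustration.

```python
def pixel_distance(image_a, image_b):
    """Sum of absolute differences between two equally sized grayscale images."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    )


def classify_naive(image, training_set):
    """Return the label of the training image that is closest to `image` pixel by pixel."""
    label, _ = min(training_set, key=lambda item: pixel_distance(image, item[1]))
    return label


# Hypothetical 3x3 training images: a vertical bar and a horizontal bar.
training_set = [
    ("vertical", [[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    ("horizontal", [[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
]
query = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]  # a vertical bar with one pixel missing

predicted = classify_naive(query, training_set)
print(predicted)  # vertical
```

Every prediction touches every pixel of every stored image, so the cost grows with both the image size and the number of examples.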
What is Convolution?
The convolutional neural network, or CNN for short, is a specialised type of neural network model designed for working with two-dimensional image data. Central to the convolutional neural network is the convolutional layer that gives the network its name.
The innovation of convolutional neural networks is the ability to automatically learn a large number of filters in parallel, specific to a training dataset, under the constraints of a specific predictive modelling problem such as image classification. Each filter picks out of the input image the features that matter most for that task.
Alongside convolution there is another important layer that goes with it, known as the pooling layer.
Pooling layers provide an approach to downsampling feature maps by summarising the presence of features in patches of the feature map. In simple terms, pooling is a way of compressing an image (for example, going over the image four pixels at a time and keeping a single value for each group).
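To show what a single filter does, here is a minimal sketch of sliding a 3x3 filter over a small image using plain Python lists. The function name `convolve2d`, the image, and the hand-written vertical-edge filter are illustrative assumptions. In a real CNN the filter values are learned during training, and a library such as TensorFlow performs this step far more efficiently.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the filter with the patch under it and add everything up.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x5 image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
# Hand-written vertical-edge filter; a CNN would learn values like these.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 3, 3], [0, 3, 3]]
```

The feature map is zero where the patch is uniformly dark and non-zero where the patch contains the dark-to-bright edge, which is the kind of feature a learned filter can respond to.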
A pooling layer is a new layer added after the convolutional layer. The pooling layer operates on each feature map separately to create a new set of the same number of pooled feature maps. Pooling involves selecting a pooling operation, such as taking the maximum or the average, which is applied to the feature maps much like a filter. The size of the pooling operation or filter is smaller than the size of the feature map.
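As a small illustration, max pooling with a 2x2 window can be sketched in plain Python as follows. The function name `max_pool2d` and the example values are made up for illustration. Max pooling is only one possible pooling operation.

```python
def max_pool2d(feature_map, size=2):
    """Down-sample by keeping the largest value in each non-overlapping size x size patch."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            patch = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(max(patch))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map produced by a convolutional layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 7]]
```

The 4x4 feature map shrinks to 2x2, so each pooled value summarises four input values.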
Reasons to use Pooling with Convolution:
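- A limitation of the feature map output of convolutional layers is that it records the precise position of features in the input.
- This means that small movements in the position of the feature in the input image will result in a different feature map.
- This can happen with re-cropping, rotation, shifting, and other minor changes to the input image.
- Pooling produces a lower-resolution version of each feature map that keeps the important features but drops the fine positional detail, so the result is less sensitive to such small changes.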
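The effect of a small shift can be seen in a toy example. The two feature maps below are hypothetical and contain the same detected feature one pixel apart. The sketch assumes 2x2 max pooling, and the pooled outputs match only because both positions fall inside the same pooling window. A larger shift would still change the result.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size patch."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical feature maps: the same feature detected one pixel apart.
original = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
shifted = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

maps_differ = original != shifted
pooled_match = max_pool2d(original) == max_pool2d(shifted)
print(maps_differ, pooled_match)  # True True
```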
Now that we have some idea of how a convolutional neural network works, let’s go ahead and see the code for it.
    import tensorflow as tf

    model = tf.keras.models.Sequential([
        # Two convolution + pooling stages extract features from 28x28 grayscale images
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2, 2),
        # Flatten the feature maps and classify them into 10 classes
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
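To see how the image shrinks as it passes through the model above, the layer arithmetic can be traced by hand. The sketch below uses plain Python rather than TensorFlow. It assumes the Keras defaults for these layers: no padding and stride 1 for `Conv2D`, and non-overlapping 2x2 windows for `MaxPooling2D`. The helper names are made up for illustration.

```python
def conv_side(side, kernel=3):
    """Side length after a convolution with no padding and stride 1."""
    return side - kernel + 1


def pool_side(side, pool=2):
    """Side length after non-overlapping pooling (any leftover row or column is dropped)."""
    return side // pool


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


side = 28
side = pool_side(conv_side(side))  # 28 -> 26 -> 13
side = pool_side(conv_side(side))  # 13 -> 11 -> 5
flattened = side * side * 64       # 5 * 5 * 64 values enter the Dense layers

total_params = (
    conv_params(3, 1, 64)             # first Conv2D
    + conv_params(3, 64, 64)          # second Conv2D
    + dense_params(flattened, 128)    # Dense(128)
    + dense_params(128, 10)           # Dense(10)
)
print(flattened, total_params)  # 1600 243786
```

Each convolution trims one pixel from every edge and each pooling layer halves the side length, so the 28x28 input becomes a 5x5 map with 64 channels before it is flattened.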
That is pretty much it for the convolutional neural network. Hopefully this gives you an idea of how it helps with image processing and produces results that generalise well. Let’s keep learning more.