MachineX: The alphabets of Artificial Neural Network – (Part 2)


If you are reading this blog, I assume you have already finished Part 1.

If not, visit the previous blog, The alphabets of Artificial Neural Network, first, and then come back here to learn how a neural network works.

We now have a basic understanding of neural networks, so let’s go deeper.

Let’s understand how neural networks work.

Once you have a dataset and have identified the problem, you can follow the steps below:

1. Pick the network architecture (initialize with random weights)
2. Do a forward pass (forward propagation)
3. Calculate the total error (we need to minimize this error)
4. Backpropagate the error and update the weights (backpropagation)
5. Repeat the process (2-4) for a number of epochs or until the error is minimal.

Step 1: Pick the network architecture

Let’s take a toy dataset (XOR) and pick an architecture with 2 inputs, 2 outputs, and 1 hidden layer of 3 neurons.
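
As a rough sketch, the 2-3-2 architecture with random starting weights could be set up like this in plain Python. The function name `init_network`, the weight range of -1 to 1, and the fixed seed are my own assumptions for illustration, and there are no bias terms, to match the formulas used in this blog.

```python
import random

def init_network(n_inputs=2, n_hidden=3, n_outputs=2, seed=0):
    """Return random weights for a small network without bias terms."""
    rng = random.Random(seed)  # the seed only makes the run repeatable
    # w_hidden[j][i] is the weight from input i to hidden neuron j
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                for _ in range(n_hidden)]
    # w_output[k][j] is the weight from hidden neuron j to output neuron k
    w_output = [[rng.uniform(-1, 1) for _ in range(n_hidden)]
                for _ in range(n_outputs)]
    return w_hidden, w_output

w_hidden, w_output = init_network()
print(len(w_hidden), "hidden neurons with", len(w_hidden[0]), "weights each")
print(len(w_output), "output neurons with", len(w_output[0]), "weights each")
```

Storing one list of weights per neuron keeps the later steps simple: each neuron just pairs its own weights with the values coming from the previous layer.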

Step 2: Forward propagation

This is a simple process: we feed the inputs forward through each layer of the network, and the outputs of one layer become the inputs to the next layer. (For the first layer, the inputs are our data.)

First, we provide the inputs (one example) from our dataset:

Dataset (XOR table):

X1  X2    y
1   1     0
1   0     1
0   1     1
0   0     0

Take the first row, where X1=1 and X2=1:

H1 = Sigmoid(X1*w1 + X2*w2) = 0.5 (assumed, with random weights)

H2, H3 and O1, O2 are computed similarly.
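
Here is a minimal sketch of that forward pass in plain Python. The weight values are hypothetical; I picked the first hidden neuron's weights so that its sum is 0 for X1=1 and X2=1, which gives H1 = 0.5 as in the example. Bias terms are left out because the formula above does not use them.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights):
    """For each neuron: weighted sum of the inputs, then Sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)))
            for neuron_weights in weights]

def forward(x, w_hidden, w_output):
    hidden = layer_forward(x, w_hidden)         # H1, H2, H3
    outputs = layer_forward(hidden, w_output)   # O1, O2
    return hidden, outputs

# Hypothetical weights (not from the blog), one list per neuron
w_hidden = [[0.4, -0.4], [0.2, 0.3], [-0.5, 0.1]]
w_output = [[0.3, -0.2, 0.1], [-0.1, 0.4, 0.2]]

hidden, outputs = forward([1, 1], w_hidden, w_output)
print("H1, H2, H3 =", hidden)
print("O1, O2 =", outputs)
```

Note how the outputs of the hidden layer are passed straight in as the inputs of the output layer, exactly as described above.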

Step 3: Calculate the total error

Assuming random weights and activations (A1, A2, …), we get the error for each neuron.

Here sum = inputs * weights and A = activation(sum), which in our case is Sigmoid(sum).

Our cost function is the one given by Andrew Ng.

Note: we take the partial derivative of the error with respect to the result (using the chain rule from calculus).
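
To make this step concrete, here is a small sketch of an error calculation. Please note the assumptions: I use the squared error, 0.5 * (target - result)^2 summed over the output neurons, purely as an illustrative stand-in, so it is not necessarily the same cost function referred to above. The target [1, 0] is also an assumption: a one-hot encoding in which O1 stands for y=0 and O2 stands for y=1.

```python
def total_error(targets, results):
    """Sum of 0.5 * (target - result)^2 over the output neurons."""
    return sum(0.5 * (t - r) ** 2 for t, r in zip(targets, results))

def error_gradient(targets, results):
    """Partial derivative of the total error w.r.t. each result."""
    return [r - t for t, r in zip(targets, results)]

targets = [1.0, 0.0]   # assumed one-hot target for y = 0
results = [0.5, 0.5]   # outputs O1, O2 from a forward pass

print(total_error(targets, results))     # 0.25
print(error_gradient(targets, results))  # [-0.5, 0.5]
```

The sign of each derivative tells us the direction: a negative value means the result is too low and should go up, a positive value means it should come down.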

Step 4: Back propagation

Don’t worry, it’s easy! Or at least I will make it easy.

The main goal of backpropagation is to update each of the weights in the network so that the predicted output moves closer to the target output, thereby minimizing the error for every output neuron and for the network as a whole.

So far, we have the total error, which is what we want to minimize.

If you know how gradient descent works, the rest is pretty easy. If you don’t, here is my article that talks about gradient descent.

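We need to calculate the following terms:

  1. How much does the total error change with respect to the result (that is, for a change in the result)? We already did this above, when we took the partial derivative with respect to the result.
  2. Next, how much does the result change with respect to its sum (that is, for a change in the sum)?
  3. Finally, how much does the sum change with respect to the weights (that is, for a change in the weights)?

Multiplying these three terms together (the chain rule) tells us how much the total error changes with respect to each weight, and gradient descent uses that value to update the weight.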
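
The three terms can be sketched for a single Sigmoid neuron as follows. This is only an illustration under assumptions: the squared error 0.5 * (target - result)^2 is my choice of error, and the weights, inputs, target and learning rate are made-up values. The last part compares the chain-rule gradient with a numerical estimate as a sanity check.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weighted_sum(weights, inputs):
    return sum(w * x for w, x in zip(weights, inputs))

def error_for(weights, inputs, target):
    result = sigmoid(weighted_sum(weights, inputs))
    return 0.5 * (target - result) ** 2

def gradient(weights, inputs, target):
    result = sigmoid(weighted_sum(weights, inputs))
    d_error_d_result = result - target        # term 1
    d_result_d_sum = result * (1.0 - result)  # term 2: Sigmoid derivative
    # term 3: the sum changes with weight i by exactly input i
    return [d_error_d_result * d_result_d_sum * x for x in inputs]

weights = [0.3, -0.2]   # hypothetical values
inputs = [1.0, 0.0]
target = 1.0
learning_rate = 0.5

grad = gradient(weights, inputs, target)
# Gradient descent update: move each weight against its gradient
new_weights = [w - learning_rate * g for w, g in zip(weights, grad)]

# Numerical estimate of the same gradient (central differences)
eps = 1e-6
numeric = []
for i in range(len(weights)):
    up = list(weights)
    up[i] += eps
    down = list(weights)
    down[i] -= eps
    numeric.append((error_for(up, inputs, target)
                    - error_for(down, inputs, target)) / (2 * eps))

print("chain rule:", grad)
print("numerical: ", numeric)
print("error before:", error_for(weights, inputs, target))
print("error after: ", error_for(new_weights, inputs, target))
```

The second weight gets a zero gradient here because its input is 0: a weight that did not contribute to the sum cannot be blamed for the error.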

Well, that’s it.

Step 5: Repeat the process (2-4) for a number of epochs or until the error is minimal

We repeat the process of feeding the inputs forward (FP) and updating the weights (BP) for a number of epochs, or until we reach the minimum error.

Once the training process is complete, we can make predictions by feeding an input forward through the trained network. That’s it.
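
Putting steps 1 to 5 together, a compact sketch of the whole loop on XOR might look like the following. Several things here are my assumptions rather than anything fixed by this blog: the squared error, the one-hot targets (O1 for y=0, O2 for y=1), the learning rate of 0.5, the 5000 epochs, the seed, and the absence of bias terms. The error should shrink over the epochs, but how closely the final outputs match XOR depends on those choices.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_output):
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    outputs = [sigmoid(sum(w * h for w, h in zip(ws, hidden))) for ws in w_output]
    return hidden, outputs

def train_step(x, target, w_hidden, w_output, lr):
    """One forward pass and one weight update; returns the error before the update."""
    hidden, outputs = forward(x, w_hidden, w_output)
    # (error w.r.t. result) * (result w.r.t. sum) for each output neuron
    delta_out = [(o - t) * o * (1 - o) for o, t in zip(outputs, target)]
    # Send the error back to the hidden neurons through the output weights
    delta_hid = [
        sum(delta_out[k] * w_output[k][j] for k in range(len(w_output)))
        * hidden[j] * (1 - hidden[j])
        for j in range(len(hidden))
    ]
    # (sum w.r.t. weight) is the value that was fed into that weight
    for k in range(len(w_output)):
        for j in range(len(hidden)):
            w_output[k][j] -= lr * delta_out[k] * hidden[j]
    for j in range(len(w_hidden)):
        for i in range(len(x)):
            w_hidden[j][i] -= lr * delta_hid[j] * x[i]
    return sum(0.5 * (t - o) ** 2 for t, o in zip(target, outputs))

# XOR inputs with assumed one-hot targets
data = [([1, 1], [1, 0]), ([1, 0], [0, 1]), ([0, 1], [0, 1]), ([0, 0], [1, 0])]

rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
w_output = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]

errors = []
for epoch in range(5000):
    errors.append(sum(train_step(x, t, w_hidden, w_output, lr=0.5)
                      for x, t in data))

print("error in first epoch:", errors[0])
print("error in last epoch: ", errors[-1])

# Prediction: just feed the input forward through the trained network
predictions = []
for x, _ in data:
    _, outputs = forward(x, w_hidden, w_output)
    predictions.append(outputs)
    print(x, "->", [round(o, 3) for o in outputs])
```

Here the weights are updated after every single example. Updating once per epoch on the summed gradients is another common choice.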

In the next part, I will build the neural network from scratch, which should clear up any doubts remaining from this blog.

Stay tuned!

Written by 

Shubham Goyal is a Data Scientist at Knoldus Inc. He is also an artificial intelligence researcher, interested in research on problems in different domains, and a regular contributor to society through blogs and webinars on machine learning and artificial intelligence. He has also written a few research papers on machine learning. Moreover, he is a conference speaker and an official author at Towards Data Science.