In my last blog we discussed the biological motivation behind the artificial neural network (ANN). This time we will look at the artificial neural network in practice. An ANN uses several layers of neurons to solve a problem. How many layers a given problem needs is a separate topic, and I will write a blog on that soon; in the meantime we can still go ahead with implementing the network and making it learn to solve a problem. As programmers we tend to understand code better and pick things up quickly from it. An ANN could be learned directly by going through the code, but what I have found so far is that knowing the maths behind the algorithm helps more in understanding the process. So before going to the code I will discuss the maths behind it. The image below and all of its description are in terms of a “feedforward network”. There are several network structures in ANN, but let’s start with this one.
As shown in the above image, the network has three layers: the input layer, the hidden layer and the output layer. The input layer has the inputs X1, X2 … Xn. The middle (hidden) layer has the outputs Yh1, Yh2 … Yhn, and the output layer has the outputs Y1, Y2, Y3. Let’s take the targeted outputs as Ŷ1, Ŷ2 … Ŷn. Similarly, there are weights on the connections between neurons, and we name them W11 between X1 and Yh1, W12 between X1 and Yh2, W13 between X1 and Yh3, and so on, with similar names for the output layer neurons. An important thing to note here is that an ANN works on real-valued, discrete-valued and vector-valued inputs.
Just to summarize, we have the terms below, and frankly, if you’re new to neural networks I would recommend going through them:
Inputs = X1, X2, X3
Hidden outputs = Yh1, Yh2, Yh3
Outputs = Y1, Y2, Y3
Targeted outputs = Ŷ1, Ŷ2, Ŷ3
Weights to Yh1 = W11, W21, W31
Weights to Yh2 = W12, W22, W32
Weights to Yh3 = W13, W23, W33
Weights to Y1 = W41, W51, W61
Weights to Y2 = W42, W52, W62
Weights to Y3 = W43, W53, W63
So now we are all set to implement the network, mathematically for now. Every neuron has an activation function, such as f(x) = sigmoid(x). The activation function takes one parameter, so our first step is to create the input for the activation function. We do that by multiplying each input value by its weight and adding up the results, as in the formulas below:
XWh1 = X1.W11 + X2.W21 + X3.W31
XWh2 = X1.W12 + X2.W22 + X3.W32
XWh3 = X1.W13 + X2.W23 + X3.W33
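To make the weighted sums concrete, here is a minimal sketch in Python. The input and weight values are made up purely for illustration, and the helper name `weighted_sums` is hypothetical; only the formulas themselves come from the text above.

```python
# Hypothetical example values; the post does not give concrete numbers.
inputs = [0.5, 0.1, 0.9]  # X1, X2, X3

# hidden_weights[i][j] is the weight from input X(i+1) to hidden neuron Yh(j+1),
# so each row holds the weights leaving one input.
hidden_weights = [
    [0.2, 0.4, 0.6],  # W11, W12, W13
    [0.1, 0.3, 0.5],  # W21, W22, W23
    [0.7, 0.8, 0.9],  # W31, W32, W33
]


def weighted_sums(values, weights):
    """Return one weighted sum per receiving neuron."""
    n_out = len(weights[0])
    return [
        sum(values[i] * weights[i][j] for i in range(len(values)))
        for j in range(n_out)
    ]


xw_hidden = weighted_sums(inputs, hidden_weights)  # [XWh1, XWh2, XWh3]
print(xw_hidden)
```

Each entry of `xw_hidden` is the value that gets passed to the activation function of the corresponding hidden neuron.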
So the outputs that the hidden layer produces are:
Yh1 = sigmoid(XWh1)
Yh2 = sigmoid(XWh2)
Yh3 = sigmoid(XWh3)
These outputs from the hidden layer become the inputs for the output layer, where they are multiplied by the weights of the output layer. With the weight names listed earlier, the weighted sums are:
YhWo1 = Yh1.W41 + Yh2.W51 + Yh3.W61
YhWo2 = Yh1.W42 + Yh2.W52 + Yh3.W62
YhWo3 = Yh1.W43 + Yh2.W53 + Yh3.W63
So the final outputs of the output layer are:
Y1 = sigmoid(YhWo1)
Y2 = sigmoid(YhWo2)
Y3 = sigmoid(YhWo3)
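Putting the two layers together, the whole forward pass can be sketched as below. This is only an illustration in Python, not the implementation this series will build: the function names `layer_output` and `feed_forward` and all the numeric values are assumptions, and sigmoid is written with its standard definition 1 / (1 + e^(-x)).

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(values, weights):
    """Weighted sum for each receiving neuron, followed by sigmoid.

    weights[i][j] is the weight from sending neuron i to receiving neuron j.
    """
    n_out = len(weights[0])
    sums = [
        sum(v * row[j] for v, row in zip(values, weights))
        for j in range(n_out)
    ]
    return [sigmoid(s) for s in sums]


def feed_forward(inputs, hidden_weights, output_weights):
    """Run the inputs through the hidden layer and then the output layer."""
    hidden = layer_output(inputs, hidden_weights)    # Yh1, Yh2, Yh3
    outputs = layer_output(hidden, output_weights)   # Y1, Y2, Y3
    return hidden, outputs


# Hypothetical example values, chosen only for illustration.
inputs = [0.5, 0.1, 0.9]          # X1, X2, X3
hidden_weights = [
    [0.2, 0.4, 0.6],              # W11, W12, W13
    [0.1, 0.3, 0.5],              # W21, W22, W23
    [0.7, 0.8, 0.9],              # W31, W32, W33
]
output_weights = [
    [0.3, 0.5, 0.7],              # W41, W42, W43
    [0.2, 0.4, 0.6],              # W51, W52, W53
    [0.1, 0.8, 0.9],              # W61, W62, W63
]

hidden, outputs = feed_forward(inputs, hidden_weights, output_weights)
print("hidden:", hidden)
print("outputs:", outputs)
```

Because the hidden layer and the output layer do exactly the same thing (weighted sum, then activation), one helper can serve both; only the weights differ.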
If you’re reading about neural networks for the first time, you must be wondering what the sigmoid function is. Well, it is given by the formula below:
sigmoid(x) = 1 / (1 + e^(-x))
We can use different activation functions to solve different problems with an ANN. When to choose which activation function is again a separate topic, which we will try to cover in another blog. Just to give a brief idea about the sigmoid function: it produces an S-shaped curve when plotted on a graph. It accepts any real-valued input, squashes it into the range 0 to 1, and is differentiable everywhere, so we can easily find its gradient, which we will need when training the network.
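To get a feel for the S shape and for how easy the gradient is to compute, here is a small Python sketch. It assumes the standard definition of sigmoid, 1 / (1 + e^(-x)), and uses the well-known identity that its derivative equals sigmoid(x) * (1 - sigmoid(x)); the function name `sigmoid_gradient` and the sample points are chosen only for illustration.

```python
import math


def sigmoid(x):
    """Standard logistic function: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_gradient(x):
    """Derivative of sigmoid, via the identity s'(x) = s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Sample points chosen only for illustration.
for x in [-6, -3, -1, 0, 1, 3, 6]:
    print(f"x = {x:>2}  sigmoid = {sigmoid(x):.4f}  gradient = {sigmoid_gradient(x):.4f}")
```

The printed values rise slowly from near 0, climb fastest around x = 0 and then level off near 1, which is the S-shaped curve. The gradient is largest at x = 0 and shrinks towards both ends.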
And that’s it! If we implement what we have just described, our neural network is ready. The next step would be to train it. But before we go into the details of training, we will look at the implementation in Scala, which is the subject of the next blog.