Getting started with TensorFlow: Writing your first program


In my previous blog, we saw what TensorFlow is and covered some of its terminology. In this blog, we will implement a very basic program in TensorFlow using Python to see it in action.

To import the TensorFlow library, use import tensorflow as tf. The code in this blog uses the TensorFlow 1.x API (tf.Session, tf.placeholder and so on).

The computation in TensorFlow consists of two stages –

  1. Building the computational graph
  2. Running the computational graph

Computational graphs are nothing but the data flow graphs that I mentioned in my previous blog. Each node of the data flow graph represents an operation that contributes to evaluating the TensorFlow computation, hence the name computational graph. In TensorFlow, each node takes zero or more tensors as inputs and produces a tensor as an output.
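To make the two stages concrete, here is a minimal plain-Python sketch of the "build first, run later" idea. It is not how TensorFlow is implemented; the names Constant, Add and run are hypothetical and exist only for this illustration.

```python
# Plain-Python sketch of a tiny computational graph (NOT TensorFlow itself).
# Constant, Add and run are hypothetical names used only for illustration.

class Constant:
    """A node with no inputs that outputs the value it stores."""
    def __init__(self, value):
        self.value = value
        self.inputs = []


class Add:
    """A node that takes two input nodes and outputs their sum."""
    def __init__(self, left, right):
        self.inputs = [left, right]


def run(node):
    """Evaluate a node by first evaluating the nodes it depends on."""
    if isinstance(node, Constant):
        return node.value
    left, right = (run(n) for n in node.inputs)
    return left + right


# Stage 1: build the graph. Nothing is computed yet, 'total' is just a node.
a = Constant(9.0)
b = Constant(19.0)
total = Add(a, b)

# Stage 2: run the graph to obtain an actual number.
result = run(total)
print(result)
```

Printing total directly would only show a node object; the number appears only after run is called, which mirrors the split between building and running described above.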

One type of node is a constant, which takes no inputs and outputs a value it stores internally. Let’s see how to define a constant in TensorFlow.

constantValue1 = tf.constant(9.0, dtype=tf.float32)
constantValue2 = tf.constant(19.0)

print("constantValue1 = %s" % constantValue1)
print("constantValue2 = %s" % constantValue2)

The outputs of print statements will be –

constantValue1 = Tensor("Const:0", shape=(), dtype=float32)
constantValue2 = Tensor("Const_1:0", shape=(), dtype=float32)

Notice that the output wasn’t 9.0 or 19.0 but Tensor objects. This is because we have only built the computational graph and have not run it yet. Before running it, let’s see what the above output means.

The first parameter of the Tensor object is the name of that tensor. The “Const” part of the name is assigned by TensorFlow itself if the programmer does not give one explicitly. The generated name is followed by a “:” and a number, 0 in this case. This number is the index of the tensor among the outputs of the node that produces it: a node can produce multiple tensors as output, and each of them gets its own index. Here there is only one output, so the tensor is assigned 0. If there was one more output, that tensor would have been assigned 1.

The second parameter is the shape of the tensor; I have already talked about the shape of tensors in my previous blog. The third parameter is the data type of the tensor. You can either give it explicitly, as done for the first constant, or let TensorFlow infer it, as done for the second constant.
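As a small illustration of the naming scheme described above, the following plain-Python sketch splits a tensor name into the name of the node that produced it and the output index. The helper split_tensor_name is a hypothetical function written for this post, not part of TensorFlow.

```python
def split_tensor_name(name):
    """Split a tensor name such as 'Const_1:0' into (node name, output index).

    Hypothetical helper for illustration; it only does string handling.
    """
    op_name, _, index = name.rpartition(":")
    return op_name, int(index)


for name in ["Const:0", "Const_1:0", "add:0"]:
    op_name, index = split_tensor_name(name)
    print(f"{name}: produced by node {op_name!r}, output index {index}")
```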

If we want to see 9.0 and 19.0 as output, we have to actually run the computational graph we just built. To do that, we create a Session object and invoke its run method, as done below –

sess = tf.Session()

print(sess.run(constantValue1))
print(sess.run(constantValue2))

The output of the above code will be 9.0 and 19.0.

Now, let’s add these two constants. Adding is an operation, and an operation is just another node in TensorFlow.

addConstants = constantValue1 + constantValue2
print("addConstants =", addConstants)

sumOfConstants = sess.run(addConstants)
print("sum =", sumOfConstants)

The output of the above code is –

addConstants = Tensor("add:0", shape=(), dtype=float32)
sum = 28.0

Here, ‘+’ is just a shorthand for “tf.add()”.

Now, how do we supply our own values to TensorFlow? This is where placeholders come into the picture. A placeholder is a promise to provide a value later. Let’s quickly create two placeholders and perform an operation on them to see them in action.

myValue1 = tf.placeholder(dtype=tf.float32)
myValue2 = tf.placeholder(dtype=tf.float32)

sumOfMyValuesNode = myValue1 + myValue2
sumOfMyValues = sess.run(sumOfMyValuesNode, {myValue1: 5.0, myValue2: 6.0})

print("Sum of myValues =", sumOfMyValues)

Here, myValue1 and myValue2 are both placeholders whose values will be supplied later. Notice that specifying the data type (dtype) is compulsory. The values for the placeholders are supplied when the run method of the session object is invoked, through its feed_dict argument, as done in the above example. So, the output of the above code is

Sum of myValues = 11.0
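The idea of feeding values at run time can be sketched in plain Python as well. This is only an analogy and not TensorFlow's implementation; Placeholder, Add and run are hypothetical names. It shows that the same graph can be run again with different fed values.

```python
# Plain-Python analogy for placeholders and feed_dict (NOT TensorFlow itself).
# Placeholder, Add and run are hypothetical names used only for illustration.

class Placeholder:
    """A node whose value is only known when the graph is run."""
    def __init__(self, name):
        self.name = name


class Add:
    """A node that outputs the sum of its two input nodes."""
    def __init__(self, left, right):
        self.left, self.right = left, right


def run(node, feed_dict):
    """Evaluate a node, looking up placeholder values in feed_dict."""
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise KeyError(f"no value fed for placeholder {node.name!r}")
        return feed_dict[node]
    return run(node.left, feed_dict) + run(node.right, feed_dict)


my_value1 = Placeholder("myValue1")
my_value2 = Placeholder("myValue2")
sum_node = Add(my_value1, my_value2)

# The graph is built once and run twice with different inputs.
first = run(sum_node, {my_value1: 5.0, my_value2: 6.0})
second = run(sum_node, {my_value1: 1.5, my_value2: 2.5})
print(first, second)
```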

But the whole point of machine learning is to have a model with trainable parameters, so that we can train it, optimize it based on the training results and finally arrive at a model that works almost perfectly on real data. So, how do we make parameters trainable in TensorFlow? This is where Variables come to our rescue.

Variables allow us to add trainable parameters to our program. Variables can be defined as follows –

myVariable = tf.Variable(2.0, dtype=tf.float32)

Every variable is given an initial value, 2.0 in this case, and specifying a data type is optional. However, calling tf.Variable only defines the variable; it does not initialize it. To initialize all the variables in a TensorFlow program, you must explicitly call a special operation as follows –

init = tf.global_variables_initializer()
sess.run(init)

It is important to realize that init is a handle to the TensorFlow sub-graph that initializes all the global variables. Until we call sess.run, the variables are uninitialized.

print("myVariable =", sess.run(myVariable))

This prints out myVariable = 2.0

And if we want to change the value of our variable, we can use the assign function as below –

sess.run(tf.assign(myVariable, 10.0))
print(sess.run(myVariable))

which prints 10.0 as output.
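The life cycle of a variable (defined, then initialized, then reassigned) can be mimicked with a small plain-Python class. This is only an analogy under my own assumptions: the Variable class below and its methods are hypothetical and do not reflect TensorFlow's internals.

```python
class Variable:
    """Plain-Python analogy for a variable that must be initialized before use.

    Hypothetical class for illustration; NOT TensorFlow's tf.Variable.
    """
    def __init__(self, initial_value):
        self.initial_value = initial_value  # remembered, but not yet applied
        self.initialized = False
        self.value = None

    def initialize(self):
        """Plays the role of running the initializer operation."""
        self.value = self.initial_value
        self.initialized = True

    def assign(self, new_value):
        """Plays the role of running an assign operation."""
        self.value = new_value
        self.initialized = True

    def read(self):
        if not self.initialized:
            raise RuntimeError("variable read before initialization")
        return self.value


my_variable = Variable(2.0)

# Reading before initialization fails.
try:
    my_variable.read()
    failed_before_init = False
except RuntimeError:
    failed_before_init = True

my_variable.initialize()
after_init = my_variable.read()

my_variable.assign(10.0)
after_assign = my_variable.read()

print(failed_before_init, after_init, after_assign)
```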

Ok, so now that we are clear on the basic terms for writing a TensorFlow program, we will take a very easy example and implement it. We will implement the following model –

y = W * x

We will provide our program with some training data, i.e., some values of x and the desired values of y for those x, calculate the value of W on the basis of the training data, and then provide test data to see how accurate the results are. Since we have taken a very simple model, our accuracy will easily reach 100%. This almost never happens with real, more complex models, but for understanding purposes this will do.

Since we will supply the values for x and y, we will declare them as placeholders, and since the value of W has to be adjusted during training, we will declare it as a variable with some initial value, let’s say 1. So the declarations will go something like this –

W = tf.Variable(1, dtype=tf.float32)
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

Now, we will define our simple model as below –

myModel = W * x

Now, to train the model and get closer to the real relationship, we have to write a loss function and then minimize it. To keep things simple, we will take the sum of squared errors as the loss function. The error is the difference between the result our model produced and the desired value (y). We square the error for each input and add the squares together. Below is the implementation for the same –

delta = myModel - y
squaredDelta = tf.square(delta)
loss = tf.reduce_sum(squaredDelta)
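To see what this loss evaluates to, here is the same arithmetic worked out in plain Python, using x = [1, 2, 3, 4] and y = [10, 20, 30, 40] as sample data. The function name sum_squared_error is a hypothetical helper for this illustration.

```python
def sum_squared_error(w, xs, ys):
    """Sum of squared differences between the model output w * x and the target y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4]
ys = [10, 20, 30, 40]

# With W = 1 the errors are -9, -18, -27 and -36,
# so the loss is 81 + 324 + 729 + 1296 = 2430.
loss_at_1 = sum_squared_error(1.0, xs, ys)

# With W = 10 every prediction matches its target, so the loss is 0.
loss_at_10 = sum_squared_error(10.0, xs, ys)

print(loss_at_1, loss_at_10)
```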

To keep things simple, we will write our own little optimizer, loosely inspired by the idea behind the gradient descent optimizer (if you don’t know about it, don’t worry, just keep reading), to correct the value of W and then test it on some test data.

So, what we will be doing is calculating loss of our model, manipulating the value of W to minimize the loss, checking if loss has decreased or not, and manipulating the value of W further based on the result of loss. The code I’ve written for this optimizer is as below –

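import sys

# Assumes W, x, y, loss and a session (sess) with initialized variables already exist.
# Largest float, so the loop condition holds on the first pass.
oldLoss = sys.float_info.max

# Flags that remember which operation was performed last.
adding = 0
subtracting = 0

def addOne():
    # Increase W by 1.
    sess.run(tf.assign(W, sess.run(W) + 1.0))

def subtractOne():
    # Decrease W by 1.
    sess.run(tf.assign(W, sess.run(W) - 1.0))


while oldLoss > 0:
    currentLoss = sess.run(loss, {x: [1, 2, 3, 4], y: [10, 20, 30, 40]})
    if currentLoss == 0:
        # Perfect fit on the training data, stop.
        break
    elif adding == 0 and subtracting == 0:
        # First iteration: start by increasing W.
        addOne()
        adding = 1
    elif adding == 1 and currentLoss <= oldLoss:
        # Adding helped, keep adding.
        addOne()
        adding = 1
        subtracting = 0
    elif adding == 1 and currentLoss >= oldLoss:
        # Adding made the loss worse, switch to subtracting.
        subtractOne()
        adding = 0
        subtracting = 1
    elif subtracting == 1 and currentLoss <= oldLoss:
        # Subtracting helped, keep subtracting.
        subtractOne()
        subtracting = 1
        adding = 0
    elif subtracting == 1 and currentLoss >= oldLoss:
        # Subtracting made the loss worse, switch to adding.
        addOne()
        subtracting = 0
        adding = 1
    # Remember this loss for the comparison in the next iteration.
    oldLoss = currentLoss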

Please keep in mind that we are certain here that our loss can reach 0, because we have used a simple model. For more complex models, the conditions can be changed appropriately.

In the above code, adding and subtracting are flags that remember which operation was performed last (addition or subtraction). currentLoss stores the value of the loss function computed at the start of the current iteration, and oldLoss stores the value from the previous iteration. The two are compared inside the loop to check whether the last operation decreased or increased the loss, and the next operation is chosen on that basis: we either increase the value of W by 1 or decrease it by 1.

This is just a sample optimizer, which may not work perfectly but is good enough to give you an idea of how TensorFlow works, which was my main objective here. Good optimizers are much more complex and efficient, and many are already implemented in TensorFlow; we will talk about them in future blogs. The code written above is simple to understand once you go through it, and everything used in it has been discussed in this blog.

For input we are giving [1, 2, 3, 4] for x and [10, 20, 30, 40] for y(desired value). So, as we can see, value of W should be 10.0, which we have currently initialized to 1.0. So our model should use the training data supplied to it and convert W from 1.0 to 10.0, and use this W on the test data.

So, to run our program, we have to create a session object, initialize the global variables and invoke its run method on the init handle. These lines must come after W is declared and before the optimizer loop, like below –

init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)

Ok, we are done. To check the value of W, we will put a print statement at the end –

print(sess.run(W))

This should print 10.0 as output when run. This means that the value of W has been changed from 1.0 to 10.0. If we supply some other data to our model to check the value of y, we should always get 10 times whatever value we supply. I put three print statements after the code to check the outputs –

print(sess.run(myModel, {x: 27.0}))
print(sess.run(myModel, {x: 10.0}))
print(sess.run(myModel, {x: 80.0}))

And the outputs I received were –

270.0
100.0
800.0

As expected.

I hope I was able to introduce the concepts to you in a simple and understandable way. This was a very simple example, and I encourage you to go ahead and experiment with it, play around with it, look into optimizers (Gradient Descent Optimizer would be a great start) and try to implement them in TensorFlow. Many optimizers have been implemented in TensorFlow, and I’ll be discussing them in my future blogs. For my next blog, I’ll be using the MNIST dataset of handwritten digits and recognizing them using TensorFlow.

References –

  1. https://www.tensorflow.org
  2. https://www.analyticsvidhya.com/blog/2017/03/introduction-to-gradient-descent-algorithm-along-its-variants/

GitHub repository for the implemented program – https://github.com/akshanshjain95/TensorFlow-Sample-Program

I hope this blog turned out to be helpful for you.
