Boost Factorial Calculation with Spark

Apache Spark is a fast, general-purpose engine for large-scale data processing. It is advertised as running programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.

But is MapReduce-style processing the only task for which Spark can be used? The answer is no. Spark is not only a Big Data processing engine; it is a framework that provides a distributed environment for processing data. This means we can use Spark for many kinds of computation, as long as the work can be split into pieces that are processed independently.

For example, let's take factorial. Calculating the factorial of a large number is cumbersome in most programming languages, and on top of that, the CPU takes a long time to complete the calculation. So, what can be the solution?

Well, Spark can be a solution to this problem. Let's see that in the form of code.

First, we will implement factorial using only Scala, in a tail-recursive way.

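import scala.annotation.tailrec

def factorial(number: Int): BigInt = {
  // The helper carries the running product, so the recursive call is in tail position.
  @tailrec
  def recursiveFactorial(number: Int, accumulator: BigInt): BigInt = {
    if (number == 0) accumulator
    else recursiveFactorial(number - 1, accumulator * number)
  }
  recursiveFactorial(number, 1)
}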

The time taken by the above code to find the factorial of 200000 on my machine (quad-core Intel i5) was about 20.21s.
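The post does not show how the time was measured. As a rough sketch only, a wall-clock measurement could be taken with a small helper like the one below. The names TimingSketch and timed are hypothetical and not part of the original code, and a single run like this ignores JVM warm-up, so the figure it prints is indicative at best.

```scala
object TimingSketch {
  // Hypothetical helper (not from the original post): evaluates `block` once
  // and returns its result together with the elapsed wall-clock seconds.
  def timed[A](block: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = block
    val seconds = (System.nanoTime() - start) / 1e9
    (result, seconds)
  }

  def main(args: Array[String]): Unit = {
    // A small input keeps the demo quick; the post itself used 200000.
    val (value, seconds) = timed((1 to 2000).foldLeft(BigInt(1))(_ * _))
    println(f"2000! has ${value.toString.length} digits, computed in $seconds%.4f s")
  }
}
```

Any factorial implementation can be passed in as the block, for example timed(factorial(200000)).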

Now, let's implement the same function using Spark.

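def factorial(number: Int): BigInt = {
  // 0! is 1; otherwise build the factors 1..number as BigInt values.
  val list = if (number == 0) List(BigInt(1)) else (BigInt(1) to number).toList
  // sparkContext is assumed to be an already-initialised SparkContext.
  val rdd = sparkContext.parallelize(list)
  // Multiply within each partition, then combine the partial products.
  rdd.reduce(_ * _)
}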

The time taken by Spark to find the factorial of 200000 on the same machine was only 5.41s, which is almost 4x faster than using Scala alone.
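Part of this gain likely comes from the way reduce is evaluated: each partition is reduced on its own, and the partial results are merged afterwards. The plain-Scala sketch below imitates that shape without Spark. ChunkedFactorialSketch and the chunks parameter are hypothetical names, and the slices here run one after another, so it illustrates the structure of the computation rather than the speedup.

```scala
object ChunkedFactorialSketch {
  // Splits 1..number into roughly `chunks` slices, multiplies each slice on its
  // own, then multiplies the partial products together. This mirrors the
  // "reduce per partition, then merge" shape; nothing runs in parallel here.
  def factorial(number: Int, chunks: Int = 4): BigInt = {
    require(number >= 0 && chunks > 0, "number must be >= 0 and chunks > 0")
    if (number == 0) BigInt(1)
    else {
      val sliceSize = math.max(1, math.ceil(number.toDouble / chunks).toInt)
      (1 to number)
        .grouped(sliceSize)                                // one slice per "partition"
        .map(slice => slice.foldLeft(BigInt(1))(_ * _))    // partial product of a slice
        .reduce(_ * _)                                     // merge the partial products
    }
  }
}
```

With this shape, each slice multiplies a much smaller running product than the single accumulator in the tail-recursive version, and the large multiplications happen only in the final merge.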

Of course, the calculation time can vary depending on the hardware we are using. Still, we have to admit that Spark not only reduced the calculation time, but also gave us a much cleaner way to code it.


5 Responses to Boost Factorial Calculation with Spark

5. Nirmalya Sengupta (@baatchitweet) says:

I like the blogs that you guys at Knoldus write. I have learnt about many things from them and will keep doing so in the future.