For all these years, we have been galvanized by advancements in the field of computing. Much of that progress has come from the introduction of better and more powerful CPUs. First there was a single CPU whose clock speed was the main focus. Once that approach hit the wall, more cores were added to the CPU. Today, even mobile phones boast quad-core processors. While the CPU (the brain) has been getting plenty of attention, slowly but quietly the GPU (the soul!) has been building up its reputation.
In this post, and hopefully in the series that follows, we would like to explore the power of GPUs and how, at Knoldus, we have harnessed the power of GPUs with Spark, resulting in Spradus, our optimized version of Apache Spark, which combines the power of CPUs and GPUs to give you the best of both worlds.
Before we go there, let us start with some basics.
A CPU consists of a few cores optimized for sequential serial processing, while a GPU has a massively parallel architecture consisting of thousands of smaller, more efficient cores designed to handle multiple tasks simultaneously. So what does this mean?
It means that in application code, sequential processing can keep happening on the CPU, whereas compute-intensive jobs can be handed off to the GPU. See the image below.
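This division of labour can be sketched in plain Python. The snippet below is purely conceptual, not real GPU code: the names `kernel` and `render` are illustrative, and the "GPU-shaped" part is emulated with an ordinary map on the CPU. The point is the shape of the work: sequential control flow stays on the CPU, while the same pure function applied independently to every element of a batch is exactly the kind of job a GPU's many cores can take over.

```python
def kernel(pixel):
    # The same operation applied to every element independently --
    # ideal for a GPU, emulated here with ordinary CPU code.
    return min(255, pixel * 2)

def render(frame):
    # Sequential part (CPU): control flow, deciding what work to do.
    if not frame:
        return []
    # Data-parallel part (GPU-shaped): apply the kernel to the whole batch.
    return [kernel(p) for p in frame]

print(render([10, 100, 200]))  # -> [20, 200, 255]
```

Because each call to `kernel` depends only on its own input, the batch could be split across any number of cores without changing the result.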
So what is a GPU more suited for?
- Computational requirements are large. Real-time rendering requires billions of pixels per second, and each pixel requires hundreds or more operations. GPUs must deliver an enormous amount of compute performance to satisfy the demands of complex real-time applications. This characteristic of the GPU can be applied anywhere we need a large amount of computation.
- Parallelism is substantial. Fortunately, the graphics pipeline is well suited for parallelism. If we need to do things in parallel, the GPU is our friend.
- Throughput is more important than latency. This one is very interesting: GPU implementations of the graphics pipeline prioritize throughput over latency. Let us see what these terms mean:
- Latency is the time required to perform some action or to produce some result. Latency is measured in units of time — hours, minutes, seconds, nanoseconds or clock periods.
- Throughput is the number of such actions executed or results produced per unit of time. This is measured in units of whatever is being produced per unit of time.
- For example, if an iPhone factory in China produces one iPhone in 15 minutes and produces 96 iPhones a day, then the latency is 15 minutes and the throughput is 96 iPhones/day, or 4 iPhones/hour.
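The factory example can be worked through as a short calculation. The figures below come straight from the example above; the three-stage pipeline at the end is an added assumption (hypothetical stage counts) to show how throughput can rise while latency stays the same, which is exactly the trade-off GPUs exploit:

```python
# Figures from the factory example above.
latency_minutes = 15          # time to build one iPhone (latency)
iphones_per_day = 96

throughput_per_hour = iphones_per_day / 24
print(throughput_per_hour)    # -> 4.0 iPhones/hour

# Hypothetical extension: split the same work into 3 pipelined stages
# of 5 minutes each. Each iPhone still takes 15 minutes end to end
# (same latency), but one now rolls off the line every 5 minutes.
stages = 3
stage_minutes = latency_minutes / stages
pipelined_per_hour = 60 / stage_minutes
print(pipelined_per_hour)     # -> 12.0 iPhones/hour
```

Throughput tripled without latency improving at all; a GPU makes the same bargain by keeping thousands of work items in flight at once.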
GPUs are optimized for taking huge batches of data and performing the same operation over and over very quickly.
Architecturally, the CPU is composed of just a few cores with lots of cache memory that can handle a few software threads at a time. In contrast, a GPU is composed of hundreds of cores that can handle thousands of threads simultaneously. The ability of a GPU with 100+ cores to process thousands of threads can accelerate some software by 100x over a CPU alone.
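A back-of-the-envelope sketch shows where a figure like 100x can come from. The core counts below are illustrative assumptions, not real hardware specs, and the model assumes perfectly parallel work (every element processed independently, no transfer overhead), so it is an upper bound rather than a measurement:

```python
# Illustrative workload: the same operation applied to every element.
batch_size = 1_000_000
ops_per_element = 100
total_ops = batch_size * ops_per_element

# Assumed core counts, chosen only for illustration.
cpu_cores = 8
gpu_cores = 800

# With perfectly parallel work, time scales as total_ops / cores,
# so the ideal speedup is simply the ratio of core counts.
speedup = (total_ops / cpu_cores) / (total_ops / gpu_cores)
print(speedup)  # -> 100.0
```

Real speedups are smaller once memory transfers and the sequential parts of the program are accounted for, but for large batches of uniform work the ratio of cores is the right first approximation.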
OK, I guess we have sold the GPU enough. We will dig deeper in the next post. Stay tuned!
Watch this incredible Nvidia demonstration.