fast data analytics

Apache Spark: Read Data from S3 Bucket

Reading Time: 2 minutes Anyone working with Spark is familiar with reading a file from local storage, from a table, or from HDFS. But do you know how tricky it is to read data into Spark from an S3 bucket? This blog gives you a step-by-step guide to reading data from an S3 bucket.

Apache Spark: Repartitioning v/s Coalesce

Reading Time: 3 minutes Does partitioning help you increase or decrease job performance? Spark splits data into partitions, and computation is done in parallel for each partition. It is very important to understand how data is partitioned and when you need to manually modify the partitioning to run Spark applications efficiently. Now, diving into our main topic, i.e. Repartitioning v/s Coalesce. What is Coalesce? The coalesce method reduces the number ...
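To make the contrast in the excerpt above concrete, here is a minimal toy model in plain Python. It is not Spark code and does not reproduce Spark's actual algorithms. It assumes partitions can be pictured as lists of rows, and the function names `coalesce` and `repartition` are hypothetical stand-ins that only mirror the general behavior: coalescing merges whole existing partitions to reduce their number, while repartitioning redistributes every row and can also increase the partition count.

```python
from typing import List

Partitions = List[List[int]]


def coalesce(parts: Partitions, n: int) -> Partitions:
    """Toy coalesce: merge whole neighbouring partitions into at most n groups.

    Rows never leave the group their original partition lands in, which
    models the absence of a full shuffle. Asking for more partitions than
    exist leaves the count unchanged.
    """
    if n >= len(parts):
        return [list(p) for p in parts]
    merged: Partitions = [[] for _ in range(n)]
    for i, part in enumerate(parts):
        # Map partition index i onto one of n contiguous buckets.
        merged[i * n // len(parts)].extend(part)
    return merged


def repartition(parts: Partitions, n: int) -> Partitions:
    """Toy repartition: reassign every row round-robin across n partitions.

    This models a full shuffle (assumption: round-robin is used here only
    for illustration). It can increase or decrease the partition count.
    """
    out: Partitions = [[] for _ in range(n)]
    rows = [row for part in parts for row in part]
    for i, row in enumerate(rows):
        out[i % n].append(row)
    return out


if __name__ == "__main__":
    data = [[1, 2, 3], [4], [5, 6], [7]]
    print(coalesce(data, 2))     # whole partitions merged
    print(repartition(data, 2))  # rows evenly redistributed
```

In this sketch the coalesced result keeps the skew of its inputs, whereas the repartitioned result is evenly balanced at the cost of moving every row.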

Fast Data: The New Age Analytics For Enhanced Customer Experience

Reading Time: 6 minutes Data is evolving in both quality and quantity in today’s enterprises, and in the past few years changes have occurred at a much faster pace. Not long ago, Big Data was considered the next big thing for digital transformation. Technologies like Hadoop and HBase made sense because batch processing of data was the norm. But things are not the same now. By the ...