Big Data Analytics

retail analytics

Real-Time Analytics in Retail: The Key to unlock customer insights

Reading Time: 4 minutes Retail today is in the midst of an exciting revolution in which power has shifted from the retailer to the customer. Customers today have high expectations. They anticipate that companies will meet them where they are and when they want. They respond to experiences that are timely, targeted, and tailored to their specific needs—and reject those that aren’t. But, what’s the best way for businesses to differentiate themselves today?

Apache Spark

Deep Dive into Apache Spark Transformations and Action

Reading Time: 4 minutes In our previous blog on Apache Spark, we briefly discussed what Transformations and Actions are. Now we will go deeper into the topic and understand what they actually are and how they play a vital role in working with Apache Spark. What is Spark RDD? Spark introduces the concept of an RDD (Resilient Distributed Dataset), an immutable, fault-tolerant, distributed collection of objects Continue Reading
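To make the transformation/action distinction in the excerpt above concrete, here is a minimal sketch in plain Python. It does not use the Spark API: `LazyDataset`, `square`, and the sample numbers are hypothetical names invented for illustration. It only mimics the general idea that transformations such as map and filter describe a new immutable dataset without computing anything, while an action such as collect triggers the actual work.

```python
from typing import Any, Callable, Iterable, List


class LazyDataset:
    """Toy stand-in for an RDD (an assumption, not Spark's implementation)."""

    def __init__(self, source: Callable[[], Iterable[Any]]):
        # The dataset stores a recipe for producing its elements, not the elements.
        self._source = source

    # Transformations: return a new dataset and compute nothing yet.
    def map(self, fn: Callable[[Any], Any]) -> "LazyDataset":
        return LazyDataset(lambda: (fn(x) for x in self._source()))

    def filter(self, pred: Callable[[Any], bool]) -> "LazyDataset":
        return LazyDataset(lambda: (x for x in self._source() if pred(x)))

    # Actions: run the whole chain of recipes and return a concrete result.
    def collect(self) -> List[Any]:
        return list(self._source())

    def count(self) -> int:
        return sum(1 for _ in self._source())


calls: List[int] = []  # records every element that square() actually processes


def square(x: int) -> int:
    calls.append(x)
    return x * x


numbers = LazyDataset(lambda: range(1, 6))
even_squares = numbers.map(square).filter(lambda x: x % 2 == 0)

calls_before_action = len(calls)  # nothing has been computed at this point
result = even_squares.collect()   # the action forces evaluation of the chain
```

In this sketch `square` is not called until `collect()` runs, and `numbers` itself is never modified, which mirrors the laziness and immutability the excerpt attributes to RDDs.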

Tale of Apache Spark

Reading Time: 6 minutes Data is being produced extensively in today’s world, and it is going to be generated even more rapidly in the future. 90% of the world’s data was produced in the last two years alone, and it is estimated that in 2020 the world’s total data would reach 45 ZB and the data generated each day would be enough that if we try to store it Continue Reading

Big Data Evolution: Migrating on-premise database to Hadoop

Reading Time: 4 minutes We are now generating massive volumes of data at an accelerated rate. To meet business needs, address changing market dynamics as well as improve decision-making, sophisticated analysis of this data from disparate sources is required. The challenge is how to capture, store and model these massive pools of data effectively in relational databases. Big data is not a fad. We are just at the beginning Continue Reading

Do you really need Spark? Think Again!

Reading Time: 5 minutes With the massive growth in big data technologies today, it is becoming very important to use the right tool for every process. The process can be anything: data ingestion, data processing, data retrieval, data storage, etc. Today we are going to focus on one of those popular big data technologies, Apache Spark. Apache Spark is an open-source distributed general-purpose cluster-computing framework. Spark Continue Reading

Having Issue How To Order Streamed Dataframe ?

Reading Time: 3 minutes A few days ago, I had to perform an aggregation on a streaming DataFrame. The moment I applied groupBy for the aggregation, the data got shuffled. So the question arose: how do I maintain order? Yes, I can use orderBy on a streaming DataFrame in Spark Structured Streaming, but only in complete mode. There is no way to order streaming data in append mode or update mode. I Continue Reading

Tableau: Getting into Tableau Public

Reading Time: 2 minutes Big Data visualization and Business Intelligence have become easy with Tableau: millions or even billions of records can be analyzed in one go, and whether your data is in Excel, CSV, text, or a database, Tableau makes it easy for you. So you have finally made up your mind to generate visualizations using Tableau and want to know how far Tableau can go with visualizations? You are Continue Reading

Spark – IoT : Combining Big Data Analysis with IoT

Reading Time: 3 minutes Welcome back, folks! Time for a new gig! I think the last series, Scala – IOT, was pretty amazing; the overwhelming response from you all pumped up the idea for this new web series, Spark-IOT. So let’s get started. What was the motivation? I have been active in the IoT community here, and I found Continue Reading

Meetup: An Overview of Spark DataFrames with Scala

Reading Time: < 1 minute Knoldus organized a Meetup on Wednesday, 18 Nov 2015, which gave an overview of Spark DataFrames with Scala. Apache Spark is a distributed compute engine for large-scale data processing, and a wide range of organizations use it to process large datasets. Many Spark and Scala enthusiasts attended the session and learned why DataFrames are the best fit for building an application in Spark with Scala Continue Reading

Simplifying Sorting with Spark DataFrames

Reading Time: 2 minutes In our previous blog post, Using Spark DataFrames for Word Count, we saw how easy it has become to code in Spark using DataFrames. It has also made programming in Spark more logical than technical. So, let’s continue our quest to simplify coding in Spark with DataFrames via Sorting. We all know that Sorting has always been an inseparable part of Analytics. Whether it is E-Commerce or Applied Continue Reading

Introduction to Machine Learning with Spark (Clustering)

Reading Time: 2 minutes In this blog, we will learn how to group similar data objects using K-means clustering offered by the Spark Machine Learning Library. Prerequisites: The code example needs only the Spark Shell to execute. What is Clustering? Clustering means grouping data objects into clusters (with no initial class or group defined) on the basis of similarity or natural closeness to each other. The “closeness” Continue Reading
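As a rough illustration of the clustering idea in the excerpt above, here is a minimal K-means sketch in plain Python. It is not the Spark Machine Learning Library's implementation: the function `kmeans`, the sample points, and the choice to seed centroids with the first `k` points are all assumptions made to keep the example deterministic. "Closeness" is taken here to be Euclidean distance, which is one common choice.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point], k: int, max_iter: int = 100) -> Tuple[List[Point], List[int]]:
    """Group points into k clusters by repeatedly assigning and re-averaging."""
    # Assumption: seed with the first k points (real libraries use smarter seeding).
    centroids: List[Point] = [tuple(p) for p in points[:k]]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids: List[Point] = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, labels


# Hypothetical sample data: two visually separate groups of 2-D points.
sample = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(sample, k=2)
```

On this sample the first three points end up in one cluster and the last three in the other, even though no class or group was defined up front.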

Play with Spark: Building Apache Spark with Play Framework – (Part – 2)

Reading Time: 2 minutes Last week, we saw how to build a Simple Spark Application in Play using Scala. Now in this blog we will see how to add Spark’s Twitter Streaming feature in a Play Scala application. Spark Streaming is a powerful tool of Spark. It runs on top of Spark. It gives the ability to process and analyze real-time streaming data (in batches) along with fault-tolerant characteristics Continue Reading
