Category Archives: Spark

Spark Cassandra Connector On Spark-Shell


Using Spark-Cassandra-Connector on Spark Shell Hi All , In this blog we will see how we can execute our spark code on spark shell using Cassandra . This is very efficient at testing or learning time , where we have … Continue reading

Posted in apache spark, big data, Cassandra, Scala, Spark | 2 Comments

Introduction to Structured Streaming


Hello!!  Knoldus had organized half an hour session on Structured Streaming briefing about the API changes, how it is different from the early Stream Computation paradigm (DStreams) and example API demonstration. Hope you will enjoy. Below are the slides and Video … Continue reading

Posted in apache spark, Scala, Spark, Streaming | 1 Comment

Partition-Aware Data Loading in Spark SQL


Data loading, in Spark SQL, means loading data in memory/cache of Spark worker nodes. For which we use to write following code: val connectionProperties = new Properties() connectionProperties.put(“user”, “username”) connectionProperties.put(“password”, “password”) val jdbcDF = spark.read .jdbc(“jdbc:postgresql:dbserver”, “schema.table”, connectionProperties) In here we are … Continue reading

Posted in Scala, Spark | Tagged , , , | 2 Comments

Application compatibility for different Spark versions


Recently spark version 2.1 was released and there is a significant difference between the 2 versions. Spark 1.6 has DataFrame and SparkContext while 2.1 has Dataset and SparkSession. Now the question arises how to write code so that both the versions of … Continue reading

Posted in apache spark, Java, Scala, Spark | Tagged , , , , | 3 Comments

Knoldus Bags the Prestigious Huawei Partner of the Year Award


Knoldus was humbled to receive the prestigious partner of the year award from Huawei at a recently held ceremony in Bangalore, India. It means a lot for us and is a validation of the quality and focus that we put … Continue reading

Posted in Akka, Scala, Spark | Tagged , | 5 Comments

Twitter’s tweets analysis using Lambda Architecture


Hello Folks, In this blog i will explain  twitter’s tweets analysis with lambda architecture. So first we need to understand  what is lambda architecture,about its component and usage. According to Wikipedia, Lambda architecture is a data processing architecture designed to handle … Continue reading

Posted in Akka, akka-http, Apache Kafka, apache spark, Architecture, Batch, big data, Cassandra, Scala, Spark, Streaming | 5 Comments

Short Interview With SMACK Tech Stack !!!


Hello guy’s, today’s we conduct short interview with SMACK about its architecture and there uses. Let’s start with of some introduction. Interviewer: How would you describe your self ? SMACK: I am SMACK (Spark, Mesos, Akka, Cassandra and Kafka) and … Continue reading

Posted in Akka, Apache Kafka, apache spark, big data, Cassandra, Scala, Spark | Tagged , , , , , , , , , , , , | Leave a comment

Tableau: Getting into Tableau Public


Big Data visualization and Business Intelligence got so easy using Tableau, millions and billions of records can be analyzed in just one go whether your data format is excel, csv, text or database, Tableau make it easy for you. So … Continue reading

Posted in apache spark, big data, Scala, Spark, Tableau | Tagged , , , , , , , | Leave a comment

2017 – Year of FAST Data


As we approach 2017, there is a strong focus on Fast Data. This is a combination of data at rest and data in motion and the speed has to be remarkably fast. In the deck that follows, we at Knoldus … Continue reading

Posted in Apache Kafka, Cassandra, Scala, Spark | Tagged | 3 Comments

Finding the Impact of a Tweet using Spark GraphX


Social Network Analysis (SNA), a process of investigating social structures using Networks and Graphs, has become a very hot topic nowadays. Using it, we can answer many questions like: How many connections an individual have ? What is the ability … Continue reading

Posted in apache spark, big data, graph, Scala, Spark | Tagged , , , | 3 Comments