Tag Archives: Spark SQL

What’s new in Apache Spark 2.2


Apache recently released a newer version of Spark, i.e., Apache Spark 2.2. The new version brings improvements as well as new functionality. The major update in this release concerns Structured Streaming. It has been marked as production … Continue reading

Posted in apache spark, big data, Scala, Spark, Streaming | 2 Comments

Partition-Aware Data Loading in Spark SQL


In Spark SQL, data loading means loading data into the memory/cache of the Spark worker nodes. For this, we typically write the following code:

    import java.util.Properties

    val connectionProperties = new Properties()
    connectionProperties.put("user", "username")
    connectionProperties.put("password", "password")
    val jdbcDF = spark.read
      .jdbc("jdbc:postgresql:dbserver", "schema.table", connectionProperties)

Here we are … Continue reading

Posted in Scala, Spark | 7 Comments

Cassandra with Spark 2.0: Building a REST API!


In this tutorial, we will demonstrate how to build a REST service in Spark using Akka-http as a side-kick 😉 and Cassandra as the data store. We have seen the power of Spark earlier and when it is … Continue reading

Posted in Akka, akka-http, apache spark, Cassandra, Scala, scalatest, Spark | 2 Comments

UDF overloading in Spark


UDFs are user-defined functions that are registered with the Hive context so that custom functions can be used in Spark SQL queries. For example, if you want to prepend a string to another string or column, you can create the following … Continue reading

Posted in apache spark, big data, Scala, Spark | 3 Comments

Meetup: An Overview of Spark DataFrames with Scala


Knoldus organized a Meetup on Wednesday, 18 Nov 2015. In this Meetup, we gave an overview of Spark DataFrames with Scala. Apache Spark is a distributed compute engine for large-scale data processing. A wide range of organizations are using it to process large datasets. … Continue reading

Posted in apache spark, big data, Scala, Spark | 4 Comments

Meetup: Introduction to Spark with Scala


Knoldus organized a Meetup on Wednesday, 1 April 2015. In this Meetup, we gave a brief introduction to Spark with Scala. Apache Spark is a fast and general engine for large-scale data processing. A wide range of organizations are using it to process large datasets. Many … Continue reading

Posted in Agile, Scala, Spark | 5 Comments

Play with Spark: Building Spark SQL in a Play Spark Application


In the last post of our Play with Spark! series, we saw how to integrate Spark Streaming into a Play Scala application. In this post, we will see how to add Spark SQL features to a Play Scala application. Spark SQL is a … Continue reading

Posted in Agile, Play Framework, Scala, Spark | 2 Comments