fetching data from different sources using Spark 2.1

Spark: createDataFrame() vs toDF()

Reading Time: 2 minutes There are two ways to create a DataFrame in Spark: toDF() and createDataFrame(). In this blog we will see how to create a DataFrame with each method and what exactly differs between them. toDF(): This method provides a very concise way to create a DataFrame. It can be applied to a sequence of objects. To access Continue Reading

A tour to the Scala Tuples

Reading Time: 3 minutes In Scala, a tuple is a class that gives us a simple way to store heterogeneous items, that is, values of different data types, in the same container. A tuple's purpose is to combine a fixed, finite number of items so that the programmer can pass them around as a whole. A tuple is immutable in Scala. Continue Reading

Tuple in Python

Reading Time: 3 minutes In this blog, we are going to discuss the tuple data structure of Python. A tuple is an immutable collection: we cannot change its elements once it has been assigned. It allows duplicate members, i.e. (1, 1) is allowed. Any set of multiple comma-separated symbols written without brackets defaults to a tuple. >>> x, y = 1, 2 Dictionaries have a method called items that Continue Reading
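
As a rough illustration of the points in this excerpt (immutability, duplicate members, comma-separated packing, and the dictionary items method), here is a minimal sketch. The variable names and the sample dictionary are made up for illustration and do not come from the original post.

```python
# Duplicate members are allowed in a tuple.
point = (1, 1)

# A tuple is immutable: item assignment raises TypeError.
try:
    point[0] = 5
    mutated = True
except TypeError:
    mutated = False

# Comma-separated values without brackets are packed into a tuple,
# and a tuple can be unpacked into separate names.
pair = 1, 2
x, y = pair

# dict.items() yields (key, value) tuples.
# The dictionary below is a hypothetical example.
ages = {"alice": 30, "bob": 25}
items = list(ages.items())
```

Here `mutated` stays `False` because the assignment fails, and every element of `items` is itself a two-element tuple.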

How to flatten nested tuples in scala

Reading Time: 2 minutes In a project I've been working on, I needed to flatten a nested tuple but couldn't come up with a way to do so. Out of curiosity I started googling and came to the following conclusion. As an example, I had a structure similar, though not identical, to the one below: val structureToOperateOn = List(List("a1","a2","a3"), List("b1","b2","b3") Continue Reading

Scala Nuggets: The Tasty Tuples

Reading Time: 2 minutes One of the interesting things that Scala has to offer is a concept called tuples. A tuple is written with a literal syntax: a comma-separated list of items inside parentheses, something like (a1, a2, a3). These "groupings" are instantiated as scala.TupleN instances, where N is the number of items in the tuple. The Scala API defines separate TupleN classes for N between 1 and 22, inclusive. (I Continue Reading