Tag Archives: Big Data

Spark Structured Streaming: A Simple Definition


“Structured Streaming” is a term we hear quite a lot in the Apache Spark ecosystem these days, as it is being promoted as the next big thing in the scalable big data world. Although we all know that Structured Streaming means a stream having … Continue reading

Posted in Scala, Spark, Streaming | 1 Comment

Installing and Running Presto


Hi folks! In my previous blog, I talked about Getting Introduced with Presto. In today’s blog, I will talk about setting up (installing) and running Presto. The basic prerequisites for setting up Presto are: Linux or Mac OS … Continue reading

Posted in big data, database, Scala | Leave a comment

Partition-Aware Data Loading in Spark SQL


Data loading in Spark SQL means loading data into the memory/cache of the Spark worker nodes. For this, we usually write the following code: val connectionProperties = new Properties() connectionProperties.put("user", "username") connectionProperties.put("password", "password") val jdbcDF = spark.read.jdbc("jdbc:postgresql:dbserver", "schema.table", connectionProperties) Here we are … Continue reading

Posted in Scala, Spark | 7 Comments

Short Interview With SMACK Tech Stack!!!


Hello guys, today we conduct a short interview with SMACK about its architecture and its uses. Let’s start with a short introduction. Interviewer: How would you describe yourself? SMACK: I am SMACK (Spark, Mesos, Akka, Cassandra and Kafka) and … Continue reading

Posted in Akka, Apache Kafka, apache spark, big data, Cassandra, Scala, Spark | Leave a comment

Tableau: Getting into Tableau Public


Big Data visualization and Business Intelligence have become easy with Tableau. Millions or even billions of records can be analyzed in one go; whether your data is in Excel, CSV, text, or a database, Tableau makes it easy for you. So … Continue reading

Posted in apache spark, big data, Scala, Spark, Tableau | Leave a comment

Business Intelligence-Data Visualization: Tableau


Spark, Big Data, NoSQL, and Hadoop are some of the most widely used, chart-topping technologies that we frequently use at Knoldus. When these terms are used, one thing comes into the picture: ‘Huge Data, millions/billions of records’. Knoldus developers use … Continue reading

Posted in Scala, Tableau | 2 Comments

Setting Up Multi-Node Hadoop Cluster, just got easy!


In this blog, we are going to embark on the journey of setting up a Hadoop multi-node cluster in a distributed environment. So let’s not waste any time and get started. Here are the steps you need to perform. Prerequisite: … Continue reading

Posted in Architecture, big data, Scala | 7 Comments

Cassandra Data Modeling – Primary, Clustering, Partition, Compound Keys


In this post, we are going to discuss the different keys available in Cassandra. The primary key concept in Cassandra is different from that in relational databases, so it is worth spending time to understand it. Let’s take an example … Continue reading

Posted in Best Practices, big data, Cassandra, database, NoSql, Scala | 4 Comments

Spark – IoT: Combining Big Data Analysis with IoT


Welcome back, folks! Time for a new gig! I think the last series, i.e. Scala – IOT, was pretty amazing; it got an overwhelming response from you all, which pumped up the idea of … Continue reading

Posted in apache spark, IOT, Scala, Spark | 2 Comments

Hive-Metastore: A Basic Introduction


As we know, a database is the most important and powerful part of any organisation. It is an organized collection of data, comprising schemas, tables, relationships, queries, and views. But have you ever thought about these questions … Continue reading

Posted in database, Scala | 1 Comment