Author: Shashikant Tyagi

The ecosystem of Apache Spark

Reading Time: 4 minutes
Apache Spark is a powerful alternative to Hadoop MapReduce, with rich features such as machine learning, real-time stream processing, and graph computation. It is an open-source distributed cluster-computing framework designed to cover a wide range of workloads, including batch applications, iterative algorithms, interactive queries, and streaming. By supporting all these workloads in a single system, it reduces the management burden of ...

The Future of Functional Programming: ZIO

Reading Time: 3 minutes
Many companies are trying to adopt functional programming, or at least some elements of the paradigm, so that they can take the best of both worlds (functional and object-oriented programming). This mindset is slowly changing with the introduction of ZIO. Big companies such as Apple, DHL, and Wix.com are adopting ZIO in their production apps. Let’s dive ...

Things to know about Spark RDD

Reading Time: 3 minutes
What is RDD in Spark? RDD stands for Resilient Distributed Dataset. It is the backbone and the fundamental data structure of Apache Spark: an immutable collection of objects that is computed on different nodes of the cluster. Every dataset in a Spark RDD is logically partitioned across many servers ...
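
The idea of an immutable, logically partitioned collection can be sketched in a few lines of plain Python. This is a toy illustration only and does not use the Spark API: `partition` and `map_partitions` are hypothetical helper names, and each inner tuple merely stands in for the slice of data that one node of a cluster would hold.

```python
def partition(data, num_partitions):
    """Split data round-robin into immutable chunks (tuples)."""
    return tuple(tuple(data[i::num_partitions]) for i in range(num_partitions))


def map_partitions(partitions, fn):
    """Apply fn to every element and return NEW partitions.

    The input partitions are tuples, so they cannot be modified in place;
    every transformation produces a fresh collection instead.
    """
    return tuple(tuple(fn(x) for x in part) for part in partitions)


# Assumption for the sketch: 10 numbers spread over 3 "nodes".
parts = partition(list(range(10)), 3)
squared = map_partitions(parts, lambda x: x * x)
```

Because `parts` is never mutated, it can still be reused after `squared` has been derived from it, which is the property the excerpt refers to as immutability.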

Apache Spark

All Basics of Spark Streaming

Reading Time: 3 minutes
Spark Streaming is one of the most essential parts of the Big Data ecosystem. It is a component of Apache Spark, a project of the Apache Software Foundation, used to process Big Data. It ingests data in real time from sources like Twitter, processes it using functions and algorithms, and pushes the results out to databases and other stores. Spark Streaming extends the core Spark API that ...

Basic things related to Apache Spark

Reading Time: 3 minutes
What is Apache Spark? Apache Spark is an open-source data processing engine that stores and processes data in real time across clusters of computers using simple programming constructs. Spark has consistent, composable APIs and supports multiple languages, including Python, Java, Scala, and R. Developers and data scientists incorporate Spark into their applications to rapidly query, analyze, and transform data at scale.

Everything you need to know about Scala

Reading Time: 3 minutes
What is Scala? Scala stands for Scalable Language. It is a statically typed, multi-paradigm programming language that combines features of functional and object-oriented programming. Scala’s source code is compiled into bytecode and executed by the Java Virtual Machine (JVM). Scala is an immutable-first language: it makes it easy for developers to write code using immutable data. This includes constructs such as ...

Wavelet Trees (A Succinct Data Structure)

Reading Time: 3 minutes
Introduction: The wavelet tree is a relatively new but versatile data structure, offering solutions in many problem domains such as string processing, computational geometry, and data compression. In its basic form it stores a sequence of characters from an alphabet, which enables higher-order entropy compression and supports various fast queries. A wavelet tree is a succinct data structure that recursively partitions a stream into two ...
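
The recursive partitioning can be illustrated with a small Python sketch. The excerpt is cut off before it says how the stream is split, so this assumes the usual textbook construction: at each node the alphabet is halved, and one bit per character records which half it belongs to. The class name `WaveletTree` and its `rank` method are names chosen for this example, and the bits are kept in ordinary Python lists rather than a truly succinct bit vector.

```python
class WaveletTree:
    """Toy wavelet tree supporting rank queries over a sequence."""

    def __init__(self, seq, alphabet=None):
        self.alphabet = sorted(set(seq)) if alphabet is None else alphabet
        if len(self.alphabet) <= 1:
            return  # leaf: every character here is the same symbol
        mid = len(self.alphabet) // 2
        self.left_syms = set(self.alphabet[:mid])
        # Bit 0 sends a character to the left child, bit 1 to the right.
        bits = [0 if ch in self.left_syms else 1 for ch in seq]
        # ones[i] = number of 1-bits among the first i positions.
        self.ones = [0]
        for b in bits:
            self.ones.append(self.ones[-1] + b)
        self.left = WaveletTree(
            [ch for ch in seq if ch in self.left_syms], self.alphabet[:mid])
        self.right = WaveletTree(
            [ch for ch in seq if ch not in self.left_syms], self.alphabet[mid:])

    def rank(self, ch, i):
        """Count occurrences of ch among the first i characters."""
        if len(self.alphabet) <= 1:
            return i if self.alphabet and self.alphabet[0] == ch else 0
        if ch in self.left_syms:
            return self.left.rank(ch, i - self.ones[i])
        return self.right.rank(ch, self.ones[i])


tree = WaveletTree("abracadabra")
count_a = tree.rank("a", 11)  # occurrences of "a" in the whole string
```

Each `rank` call visits one node per tree level, so the number of steps grows with the logarithm of the alphabet size rather than with the length of the sequence.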