Spark: Introduction to Datasets

In my previous blog, Spark: RDD vs DataFrames, I discussed the shortcomings of RDDs and how DataFrames overcome them. Now we’ll look at the shortcomings of DataFrames and how the Dataset API overcomes them. DataFrames: A DataFrame is a distributed collection of data organized into named columns. Conceptually, it is equivalent to relational tables with Continue Reading

Frame Cease-less User-Interactive Concept in Rust Programming

Cease-less is the concept of keeping a program running until the user chooses to terminate it. It is implemented by presenting a list of options from which the user enters a choice. This blog covers building a menu-driven program in the Rust programming language.

Signals Handling Inside Docker Container

Linux supports both POSIX reliable signals and POSIX real-time signals. Signal dispositions: Each signal has a current disposition, which determines how the process behaves when the signal is delivered to it. Linux has different types of signals. The default dispositions of a few signals are listed below:
Term: the default action is to terminate the process.
Ign: the default action is to ignore the signal.
Stop: Continue Reading

Digital Thinking – The Tools

In the previous edition of the Digital Transformation Series, we talked about the forces that are relevant for digital transformation. We saw that Customers, Competition, Data, Innovation and Value play important roles in the success of a digital transformation journey. Let us now look at the tools that are relevant for the journey. Strategic Ideation Tools: These are the tools which allow organisations to come Continue Reading

Scala Zero Hour: Lists

A class for immutable linked lists representing ordered collections of elements of type A. This class comes with two implementing case classes, scala.Nil and scala.::, that implement the abstract members isEmpty, head and tail. This class is optimal for last-in-first-out (LIFO), stack-like access patterns. Given below are a few examples: val myList = List(3, 2, 1) // myList: List[Int] = List(3, 2, 1); val myListwith4 = 4 :: myList // Continue Reading
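As a rough illustration of the stack-like access the excerpt describes, here is a minimal sketch using only the Scala standard library. The object name `ListExamples` and the helper `describe` are hypothetical names chosen for this example; they do not come from the original post.

```scala
object ListExamples {
  // A three-element immutable list.
  val myList: List[Int] = List(3, 2, 1)

  // :: (cons) prepends in constant time and returns a new list;
  // myList itself is left unchanged.
  val myListWith4: List[Int] = 4 :: myList

  // Every non-empty list is a chain of :: cells ending in Nil,
  // so the same list can be built cell by cell.
  val built: List[Int] = 3 :: 2 :: 1 :: Nil

  // Stack-like access: head is the top element, tail is everything below it.
  val top: Int = myListWith4.head
  val rest: List[Int] = myListWith4.tail

  // Pattern matching on the two cases, Nil and ::.
  def describe(xs: List[Int]): String = xs match {
    case Nil    => "empty"
    case h :: t => s"head=$h, tail size=${t.size}"
  }
}
```

Because prepending and taking `head` or `tail` are constant-time, a `List` behaves like a persistent stack; appending at the end or indexing deep into the list is linear in the list length.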

Vault: A secure way to keep your App’s secrets

In this blog, we will discuss Vault. In modern scenarios, we want to secure our system as much as possible. We don’t want to store our secret keys and certificates in the system or in configuration files. We need a place where we can keep our secrets more securely and access them securely whenever we need them. For this, we can use Vault. Vault is the Continue Reading

Akka Stream: Map And MapAsync

In this blog, we will discuss what “map” and “mapAsync” are when used in an Akka stream and how to use them. The difference is highlighted in their signatures: Flow.map takes in a function that returns a type T, while Flow.mapAsync takes in a function that returns a type Future[T]. Let’s take one practical example to understand both. Problem: Suppose we have a user with a userId and Continue Reading

Spark Streaming vs. Structured Streaming

Fan of Apache Spark? I am too. The reason is simple: interesting APIs to work with, fast and distributed processing, less I/O overhead than MapReduce, fault tolerance and much more. With this much, you can do a lot in the world of Big Data and Fast Data. From “processing huge chunks of data” to “working on streaming data”, Spark handles both. In this Continue Reading

Scala Extractors

Futures with Timeout in Scala

You must be wondering: timeouts in Futures, how is that possible? Don’t worry, this is the right place to explore what we mean by Futures with a timeout. I ran into the question of whether we can put timeouts on Futures in Scala without actually blocking on them, and guess what, it is possible. In this blog, we will be talking about Continue Reading
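One common way to get a non-blocking timeout with only the standard library is to race the original Future against a Promise that a scheduler fails after the deadline. The sketch below follows that idea; it is not necessarily the approach the full post takes, and the names `FutureTimeout`, `withTimeout` and `scheduler` are hypothetical.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object FutureTimeout {
  // A single daemon thread used only to fire timeouts; it never runs user code.
  private val scheduler: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "future-timeout-scheduler")
        t.setDaemon(true)
        t
      }
    })

  // Returns a Future that completes with the result of `future`, or fails with
  // a TimeoutException if `future` has not completed within `timeout`.
  // No thread is blocked while waiting.
  def withTimeout[T](future: Future[T], timeout: FiniteDuration)(
      implicit ec: ExecutionContext): Future[T] = {
    val timeoutPromise = Promise[T]()

    // Fail the promise once the deadline passes.
    val task = scheduler.schedule(new Runnable {
      def run(): Unit = {
        timeoutPromise.tryFailure(new TimeoutException(s"Future timed out after $timeout"))
        ()
      }
    }, timeout.toMillis, TimeUnit.MILLISECONDS)

    // If the original future finishes first, the timer is no longer needed.
    future.onComplete(_ => task.cancel(false))

    // Whichever of the two completes first decides the outcome.
    Future.firstCompletedOf(Seq(future, timeoutPromise.future))
  }
}
```

Note that this only stops waiting for the result: the computation behind the original Future is not cancelled and keeps running until it finishes on its own.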

The Twelve-factor app principle with Lagom framework

Well, the Twelve-factor app principle is not new to software development. It was drafted by developers at Heroku and was first presented by Adam Wiggins circa 2011. It is a methodology for building software-as-a-service applications, and its best practices are designed to enable applications to be built with portability and resilience when deployed to the web. You can easily find a lot of blogs Continue Reading

Spark: RDD vs DataFrames

Spark SQL is a Spark module for structured data processing. Unlike the basic Spark RDD API, the interfaces provided by Spark SQL provide Spark with more information about the structure of both the data and the computation being performed. Internally, Spark SQL uses this extra information to perform extra optimizations. One use of Spark SQL is to execute SQL queries. When running SQL from within another Continue Reading

PlantUML (Easy Diagram Creation Tool)

Hello all. In this blog, we are going to explain how we can create different levels of diagrams using PlantUML. Generally, developers communicate with their audience in the following ways: technical text, screenshots of the code, or diagrams. Technical text and screenshots of the code work well for developers but fail when communicating with clients or users, whereas diagrams work well with both Continue Reading
