Database

An Overview of the Stored Procedures in PostgreSQL

Reading Time: 4 minutes As you may know, in all versions up to PostgreSQL 10 it was not possible to create a procedure in PostgreSQL. In PostgreSQL 11, PROCEDURE was added as a new schema object, similar to FUNCTION but without a return value. Over the years many people were eager to have this functionality, and it was finally added in PostgreSQL 11. Traditionally, PostgreSQL has provided all Continue Reading

Apache Spark

Deep Dive into Apache Spark Transformations and Action

Reading Time: 4 minutes In our previous blog on Apache Spark, we briefly discussed what Transformations & Actions are. Now we will go deeper into the topic and understand what they actually are and how they play a vital role in working with Apache Spark. What is Spark RDD? Spark introduces the concept of an RDD (Resilient Distributed Dataset), an immutable, fault-tolerant, distributed collection of objects Continue Reading
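
As a rough illustration of the split between transformations and actions, here is a minimal sketch in plain Python. It is only an analogy and does not use the Spark API: the built-in lazy iterators `map` and `filter` stand in for transformations, and `list` stands in for an action. The names `double`, `calls`, and `large` are hypothetical.

```python
# Analogy only: plain Python iterators, not Spark RDDs.
calls = []  # records every element the pipeline actually processes


def double(x):
    calls.append(x)
    return x * 2


data = [1, 2, 3, 4]

# "Transformations": describe the pipeline lazily; nothing is computed yet.
doubled = map(double, data)
large = filter(lambda v: v > 4, doubled)
ran_before_action = len(calls)  # still 0, no element has been processed

# "Action": asking for a concrete result forces the pipeline to run.
result = list(large)

print(ran_before_action, result)
```

In Spark the same idea applies at cluster scale: transformations only record what should be done to an RDD, and work starts when an action asks for a result.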

Reactive Architecture

Reading Time: 2 minutes Recently I was invited to present a guest lecture for the faculty of engineering colleges at ABES College of Engineering. I chose a trending topic, i.e. Reactive Architecture. We talked about what this buzzword means and why it came into existence, as well as what challenges people were facing and how real-world problems are being solved by using Continue Reading

Tale of Apache Spark

Reading Time: 6 minutes Data is being produced extensively in today’s world, and it is going to be generated even more rapidly in the future. 90% of the world’s data was produced in the last two years alone, and it is estimated that in 2020 the world’s total data will reach 45 ZB and that the data generated each day will be enough that if we try to store it Continue Reading

Streaming data from Cassandra using Alpakka

Reading Time: 7 minutes The Alpakka project is an open-source initiative to implement stream-aware and reactive pipelines using Java and Scala. It is built on top of Akka Streams and is specifically designed to provide a DSL for reactive and stream-oriented programming, with built-in support for backpressure to avoid a flood of data. As a reference, Akka Streams is a Reactive Streams and JDK 9+ compliant implementation and is therefore fully interoperable Continue Reading

Defining your workflow: Why Not Airflow?

Reading Time: 4 minutes What is Apache Airflow? Airflow is a platform to programmatically author, schedule & monitor workflows or data pipelines. These functions are achieved with Directed Acyclic Graphs (DAGs) of tasks. It is open source and still in the incubator stage. It was started in 2014 under the umbrella of Airbnb, and since then it has earned an excellent reputation, with approximately 800 contributors and 13,000 stars on GitHub. Continue Reading
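
To illustrate how a DAG of tasks determines a run order, here is a minimal Python sketch. It uses only the standard library's `graphlib` module (Python 3.9+) and is not Airflow's API; the task names and the `execution_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def execution_order(deps):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Because the graph must be acyclic, a dependency loop is rejected: `TopologicalSorter` raises `graphlib.CycleError` instead of returning an order.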

Can we do joins in MongoDB?

Reading Time: 3 minutes MongoDB is a NoSQL document database designed for ease of development and scaling. The best part about using a relational DBMS is that we can perform a wide range of relational queries on it, and joining different tables is very easy. But when we talk about MongoDB, the way data is stored is quite different from any relational DBMS. How data is Stored Continue Reading

Custom DynamoDB Docker Instance

Reading Time: 3 minutes Hey guys, I hope you all are doing well. I am back with another blog on custom Docker instances for databases. In my last blog we saw how we can have our own custom Docker instance for MySQL. Similarly, in this blog we will look at how we can do the same with DynamoDB, so let’s get started. So just like the scenario in the previous blog, I Continue Reading

Using Vertica with Spark-Kafka: Writing

Reading Time: 4 minutes In the previous blog of this series, we glanced at the basic definitions of Spark and Vertica. We also walked through the code for reading data from Vertica using Spark as a DataFrame and saving the data into Kafka. In this blog we will do the reverse flow, i.e. reading the data from Kafka as a DataFrame and writing that DataFrame into Continue Reading

Using Vertica with Spark-Kafka: Reading

Reading Time: 4 minutes We live in a world of big data, where the volume of data is huge even for small results. This is the result of a rapid increase in data collection in the modern world. This sheer volume of data calls for tools that can work on such large chunks of data. I am pretty sure that you guys Continue Reading

Reactive Java: Different flavors of querying the Couchbase using Spring Web Flux Reactive Couchbase API

Reading Time: 3 minutes As I am exploring Spring WebFlux these days, I got an opportunity to explore the different ways of interacting with Couchbase using reactive APIs. We will not discuss how to build a CRUD application in Spring WebFlux. My main focus will be on the different ways in which we can query Couchbase with the reactive Couchbase API in Spring Web Continue Reading

Delete operation in Dgraph using GRPC

Reading Time: 2 minutes Deletion in Dgraph is an easy operation; we just have to keep a few things in mind before deleting anything. Before I explain how we can delete, I am going to explain the different scenarios, for example: delete an edge, delete a node, delete one value in a list. We will take a scenario where we have a person who has the following Continue Reading