Author: Sumit Agarwal

Apache Beam ParDo Transformations

Reading Time: 2 minutes What is a PCollection? A PCollection represents a distributed data set that your Beam pipeline operates on. The data set can be bounded, meaning it comes from a fixed source like a file, or unbounded, meaning it comes from a continuously updating source via a subscription or other mechanism. Your pipeline typically creates an initial PCollection by reading data from an external data source, but you can also create a PCollection form of Continue Reading

Introduction to InfluxDB

Reading Time: 4 minutes InfluxDB InfluxDB is a time series database that stores and manages data in time series form. What is Time Series? A time series is a collection of observations of well-defined data items obtained through repeated measurements over time. Time series data is a sequence of data points indexed in time order. What is Time Series Database? A time-series database (TSDB) is a database system that Continue Reading

Introduction to Google Cloud Functions

Reading Time: 4 minutes What are Google Cloud Functions? Google Cloud Functions is a Function as a Service (FaaS) that allows engineers and developers to run code without worrying about server management. Cloud Functions scales as needed and integrates with Google Cloud’s operations suite (such as Cloud Logging) out of the box. Functions are useful when you have a task or series of tasks that need to happen in response Continue Reading

How to use GCP Cloud KMS for symmetric keys

Reading Time: 3 minutes Introduction GCP Cloud KMS is a cloud-hosted key management service that lets you manage cryptographic keys for your cloud services the same way you do on-premises. It includes support for encryption, decryption, signing, and verification using a variety of key types and sources, including Cloud HSM for hardware-backed keys. 1. How to enable the Cloud KMS API Sign in to Cloud Console and create a Continue Reading

Terraform Basic Concepts

Reading Time: 4 minutes What is Terraform? According to the official Terraform documentation, Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. Terraform can manage existing and popular service providers as well as custom in-house solutions. It is an open-source IaC (Infrastructure as Code) tool developed by HashiCorp. Terraform creates an execution plan that explains how it will get to the desired state and then implements Continue Reading

What are Java 8 Data Structures and How to use them?

Reading Time: 3 minutes What is a Data Structure? A data structure is a particular way of organizing data in a computer so that it can be used effectively. Basic types of Data Structure in Java Arrays An array is a fixed-size data structure with random access. Fixed-size means that when creating an array we need to specify its size (the number of elements the array can hold). It has a special syntax Continue Reading

Understanding Types of the Rx Java Scheduler

Reading Time: 3 minutes What is a Scheduler in Rx Java? Schedulers are one of the main components in RxJava. They are responsible for performing the operations of an Observable on different threads, and they help to offload time-consuming work onto those threads. How the default thread works in Rx Java Rx is single-threaded by default. This means that an Observable and the chain of operators that we apply to it will Continue Reading

Understanding Types of the Rx Java Subject

Reading Time: 2 minutes What is Subject? A Subject is a sort of bridge or proxy, available in some implementations of ReactiveX, that acts both as an observer and as an Observable. Because it is an observer, it can subscribe to one or more Observables. And because it is an Observable, it can pass through the items it observes by re-emitting them. And it can also Continue Reading

How to search in the Google Data Catalog by tags in Java

Reading Time: 3 minutes What is Data Catalog? Data Catalog is a fully managed, scalable metadata management service in Google Cloud’s Data Analytics. Data Catalog Search scope In Data Catalog, the search scope depends on the user, i.e. search results may differ for users with different permissions. For example, if a user has BigQuery metadata read access to an object, then that object will appear in their Data Catalog search Continue Reading