How to Set Up, Create a Database, and Communicate with MarkLogic

Reading Time: 5 minutes MarkLogic, formally known as MarkLogic Server, is an enterprise NoSQL database with broad support for structured and unstructured data, including JSON, XML, RDF, text, and binary data types. MarkLogic combines the schema flexibility, scalability, and high availability of a NoSQL database with enterprise features. Before looking at how to communicate with it, let's understand the MarkLogic system requirements. Christopher Lindblad founded MarkLogic in 2001, particularly in Continue Reading

Ways to Retrieve Data From PostgreSQL Hstore

Reading Time: 3 minutes PostgreSQL is without a doubt one of the most popular databases on the market. The reasons behind this are its speed, security, and robustness. Another reason for its popularity is its many useful features. In this blog, we are going to discuss one of its important features: hstore. Here we will see how we can retrieve data from an hstore column. Continue Reading

BigQuery: Querying nested arrays

Reading Time: 2 minutes In a previous blog, we saw how BigQuery facilitates efficient data warehouse schema design. BigQuery supports nested and repeated columns. We can use a combination of ARRAY and STRUCT data types to define our schema in BigQuery. This lets us denormalize data efficiently in a single table. In this blog, for the same schema of sales data, we will execute a few DML operations on nested array fields. Schema In Continue Reading

JOOQ overview, setup and code generation

Reading Time: 2 minutes Introduction JOOQ stands for Java Object Oriented Querying. It is a framework built on top of functional programming, which helps increase the readability of the code. The library generates Java classes based on the database tables and various constraints, which lets us create type-safe queries through its API. In this tutorial we are going to set up the SpringBoot and Continue Reading

How to Analyze query performance in MongoDB

Reading Time: 2 minutes Analyzing query performance in MongoDB may become complicated if we do not really know which part should be measured. Fortunately, MongoDB provides a very handy tool that can be used to evaluate query performance: explain("executionStats"). This tool provides some general measurements, such as the number of examined documents and the execution time, that can be used for statistical analysis. The Database and Collection In this easy tutorial, Continue Reading

Spark SQL in Delta Lake 0.7.0

Reading Time: 3 minutes Nowadays Delta Lake is a buzzword in the Big Data world, especially among Spark developers, because it resolves many issues found in the Big Data domain. Delta Lake is an open-source storage layer that brings reliability to data lakes. Delta Lake provides ACID transactions, scalable metadata handling, and unified streaming and batch data processing. It is evolving day by day and adds cool features in every release. Continue Reading

Kafka Streams

Interactive Queries in Apache Kafka

Reading Time: 4 minutes Apache Kafka v0.10 introduced a new feature, the Kafka Streams API, a client library for building applications and microservices where the input and output data are stored in Kafka clusters. Kafka Streams provides state stores, which can be used by stream processing applications to store and query data. Every task in Kafka Streams uses one or more state stores which Continue Reading