Cassandra

Creating Data Pipeline with Spark streaming, Kafka and Cassandra

Reading Time: 3 minutes Hi Folks!! In this blog, we are going to learn how we can integrate Spark Structured Streaming with Kafka and Cassandra to build a simple data pipeline. Spark Structured Streaming is a component of the Apache Spark framework that enables scalable, high-throughput, fault-tolerant processing of data streams. Apache Kafka is a scalable, high-performance, low-latency platform that allows reading and writing streams of data Continue Reading

Data Lake – Build it in Phases

Reading Time: 3 minutes Data Lake – How to build a data lake and the phases involved in building it.

Understanding data persistence in Lagom

Reading Time: 4 minutes When we create any microservice, or in general any service, one of the biggest tasks is managing data persistence. Lagom supports various databases for this. By default, Lagom uses Cassandra to persist data.

Big Data Landscape explained

Reading Time: 5 minutes Big Data has now evolved into a buzzword, and it seems everyone is either working on it or wants to work on it. However, most people associate Big Data with some of the popular tool sets like Hadoop, Spark, and Hive, or with NoSQL databases like Cassandra and HBase. HDFS made Big Data popular as it gave us an option to distribute the data Continue Reading

Streaming data from Cassandra using Alpakka

Reading Time: 7 minutes The Alpakka project is an open-source initiative to implement stream-aware and reactive pipelines using Java and Scala. It is built on top of Akka Streams and is specially designed to provide a DSL for reactive and stream-oriented programming, with built-in support for backpressure to avoid flooding consumers with data. As a reference, Akka Streams is a Reactive Streams and JDK 9+ compliant implementation and is therefore fully interoperable Continue Reading

Tuning consistency with Apache Cassandra

Reading Time: 4 minutes One of the challenges faced by distributed systems is how to keep the replicas consistent with each other. Maintaining consistency requires trading it off against availability and partition tolerance. Fortunately, Apache Cassandra lets us tune this balance according to our needs. In this blog, we are going to see how we can tune consistency levels during reads and writes to achieve faster reads and writes. Before digging more about Continue Reading
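
As a rough illustration of that read/write trade-off, the sketch below applies the widely used rule of thumb that a read is guaranteed to overlap the most recent acknowledged write when the replicas contacted on read (R) plus those contacted on write (W) exceed the replication factor (RF). The function names and the replication factor of 3 are assumptions made for this example and do not come from the post; only a few of Cassandra's consistency levels are modeled.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority for the given replication factor."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must respond for a consistency level (illustrative subset only)."""
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"consistency level not modeled in this sketch: {level}")


def reads_see_latest_write(read_level: str, write_level: str, replication_factor: int) -> bool:
    """True when R + W > RF, i.e. read and write replica sets must overlap."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3  # hypothetical replication factor
    for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        overlap = reads_see_latest_write(read_level, write_level, rf)
        print(f"read={read_level} write={write_level} overlap guaranteed: {overlap}")
```

With a replication factor of 3, QUORUM reads and writes each touch 2 replicas, so the sets overlap; ONE for both is faster but a read may reach a replica that has not yet received the latest write.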

Commit Log: A commitment that Cassandra provides.

Reading Time: 5 minutes Welcome back, everyone. I have been working on Cassandra for quite some time now but never actually got to explore its working in depth. We know that its decentralized nature, as well as its ability to handle such a large volume of writes, makes it really commendable. But how does it manage to be efficient? How is it able to achieve what it is so Continue Reading

Cassandra Data Modeling

Reading Time: 3 minutes The goal of this blog is to explain the basic rules you should keep in mind when designing your schema for Cassandra. If you follow these rules, you’ll get pretty good performance out of the box. Let’s first discuss keys in Cassandra: Primary Key – Made up of a single column. CREATE TABLE blogs ( key text PRIMARY KEY, data text ); Composite Key – Generated Continue Reading

Is Apache Cassandra really the Database you need?

Reading Time: 6 minutes Welcome back, everyone. I have been working with Cassandra for quite some time now. To be honest, it is quite a cool database. Its decentralized nature, as well as its ability to handle such a large volume of writes, is really commendable. But as we know, nothing is perfect, and neither is Cassandra. What I mean by this is that you cannot Continue Reading

Distributed Transactions and Saga Patterns

Reading Time: 6 minutes In a Knolx session organized by Knoldus, we discussed the idea of following Saga Patterns. To make that discussion more accessible, I’d like to share the session through this blog. Service-oriented architecture has given us enough advantages to become a predominant architecture in our industry, but it can’t be all sunshine and rainbows. There are use cases where monoliths are not only Continue Reading

A Simple walk-through to set up a local Cassandra multi-node cluster

Reading Time: 5 minutes In our earlier blogs we have already gone through The basic Introduction to Cassandra and also tried to explore the Cassandra Reads and Writes. Today we will be discussing something other than in-depth theoretical knowledge of Cassandra. In one of our projects, we came across a basic requirement: we needed a local Cassandra cluster for some kind of testing. Continue Reading

A Beginner’s Guide to Deploying a Lagom Service Without ConductR

Reading Time: 2 minutes How to deploy a Lagom Service without ConductR? This question has been asked and answered by many, on different forums. For example, take a look at this question on StackOverflow – Lagom without ConductR? Here the user wants to know whether it is possible to use Lagom in production without ConductR. The best answer that came up was – “Yes, it is Continue Reading