Category Archives: Best Practices

Simple Things You Can Learn From Cassandra Nodetool (Monitor/Manage) For DC/OS


Cassandra's native tool, nodetool, is used for monitoring and managing a Cassandra cluster on DC/OS. Continue reading

Posted in Best Practices, big data, Cassandra, cluster, NoSql | 2 Comments

A Java Lagom service which only consumes from Kafka topic (Subscriber only service)


A subscriber-only service is an application that only consumes and does not produce. We usually see applications that both produce and consume data from a Kafka topic, but sometimes we need to write an application which only consumes data … Continue reading

Posted in Akka, Apache Kafka, Architecture, Best Practices, big data, Functional Programming, github, Java, MessagesAPI, Microservices, Scala | Leave a comment

What to do for overriding the PureConfig behavior in Scala?


PureConfig has its own predefined behavior for reading and writing configuration files, but sometimes we get a tricky requirement that needs some specific behavior, for example when reading the config. It is possible to override the … Continue reading

Posted in Agile, Best Practices, big data, knoldus, Reactive, Scala | 1 Comment

Apache Spark: Spark Union adds up the partitions of input RDDs


Some days back, when I was doing a union of two pair RDDs, I found strange behavior in the number of partitions. The output RDD had a different number of partitions than the input RDDs. For example, suppose rdd1 and rdd2, each … Continue reading

Posted in Agile, apache spark, Best Practices, big data, Scala | Leave a comment

Scala Trait and Mixin – Points to Remember


A trait can be viewed not only as an interface, as in other languages, but also as a class with only a parameterless constructor. Whenever a trait contains some code, the trait is called a mixin. trait Alarm { def trigger(): String } In Scala … Continue reading

Posted in Best Practices, Functional Programming, Scala | Leave a comment

Cassandra Data Modeling – Primary, Clustering, Partition, Compound Keys


In this post we are going to discuss the different keys available in Cassandra. The primary key concept in Cassandra is different from that in relational databases, so it is worth spending time to understand it. Let's take an example … Continue reading

Posted in Best Practices, big data, Cassandra, database, NoSql, Scala | 4 Comments

Meetup: Stream Processing Using Spark & Kafka


Knoldus organized a meetup on Friday, 9 September 2016. The topics covered in this meetup were: an overview of Spark Streaming; fault-tolerance semantics and performance tuning; and Spark Streaming integration with Kafka. The meetup code sample is available here. Real time stream processing … Continue reading

Posted in Apache Kafka, apache spark, Best Practices, big data, Elasticsearch, Scala, Spark, Streaming | 1 Comment

RDF – Basic Building Blocks of Semantic Web


In the first post, we gave a general description of the Semantic Web and how it can be useful. In this post, we will look at RDF, which is its basic building block. RDF is Resource Description Framework … Continue reading

Posted in Best Practices, big data, Scala | Leave a comment

Semantic Web – The lure to a better world


The story of the Semantic Web is not new; however, it is interesting how some things become more and more important with the passage of time. The term was coined by Sir Tim Berners-Lee in May 2001; however, it took … Continue reading

Posted in Best Practices, big data, Scala | 1 Comment

Building Analytics Engine Using Akka, Kafka & Elasticsearch


In this blog, I will share my experience of building a scalable, distributed and fault-tolerant analytics engine using Scala, Akka, Play, Kafka and Elasticsearch. I would like to take you through the journey of building an analytics engine which was primarily … Continue reading

Posted in Akka, akka-http, Amazon, Amazon EC2, Apache Kafka, Architecture, AWS, AWS Services, Batch, Best Practices, big data, Cassandra, database, Elasticsearch, Java, Non-Blocking, NoSql, Reactive, S3, Scala, Streaming, Web | 10 Comments