Studio-Scala

Why Scala.js is preferred language for front-end development

Reading Time: 4 minutes For the last few years, I have been working with Scala on the back end and JavaScript on the front end of web applications. It becomes very painful when I have to refactor a large JavaScript code base written by someone else. There are also many annoying language warts. I tried other front-end frameworks targeting JS, but wanted to use the same platform to go all the way Continue Reading

Ganglia Cluster Monitoring: monitoring spark cluster

Reading Time: 3 minutes Ganglia is a cluster monitoring tool for tracking the health of distributed Spark and Hadoop clusters. You may ask: we already have an Application UI (http://masternode:4040) and a Cluster UI (http://masternode:8080), so why do we need Ganglia? The answer is that the Spark cluster UI and application UI don't provide all the information about our cluster, such as network I/O and the health of every node. Continue Reading

Best Practices for Using Slick on Production

Reading Time: 5 minutes Slick is the most popular library for relational database access in the Scala ecosystem. When we use Slick in production, some questions arise, such as where the mapping tables should be defined, how to join with other tables, and how to write unit tests. Apart from this, there is a lack of clarity on the design guidelines. In this blog post, I am Continue Reading

S3Ninja an Introduction

Reading Time: 2 minutes S3Ninja is an emulator of the S3 API. It provides an environment on your local system for integration-testing file uploads, just as we would do on S3. Currently it supports only the object methods GET, PUT, HEAD, and DELETE. S3Ninja can be used to upload files to our local system instead of S3, in order to write integration tests that may integrate with upload of file Continue Reading

Getting close to Apache Flink, albeit in a Träge manner – 2

Reading Time: 6 minutes From the preceding post in this series: In the last blog, we took a look at Flink’s CountWindow feature. Here’s a quick recap: As a stream of events enters a Flink-based application, we can apply a CountWindow transformation to it (there are many such transformations that Flink offers us; we will meet them as we go). CountWindow allows us to create a Continue Reading

Domain Driven Design with Scala

Reading Time: < 1 minute The benefits of DDD have been elucidated multiple times. At Knoldus, we want to make sure that the quality of the software we develop goes a long way. More than 70% of the cost of software is spent on its maintenance, and hence it becomes absolutely essential that a good amount of time is spent on making the right software right. The Continue Reading

Getting close to Apache Flink, albeit in a Träge manner – 1

Reading Time: 7 minutes Of late, I have begun to read about Apache Flink. Apache Flink (just Flink hereafter) is an ‘open source platform for distributed stream and batch data processing’, to quote from the homepage. What has caught my interest is Flink’s idea that the ability to operate on each unit of data as it streams in gives one the flexibility to decide what constitutes a batch: count of events or events Continue Reading

BlinkDB by Databricks Engineer @ Knoldus

Reading Time: < 1 minute On 24 Nov 2015, Sameer Agarwal, Software Engineer at Databricks, gave us an introduction to BlinkDB at the Meetup organized by Knoldus. It was a great session, quite inspiring and appreciated by all attendees. We are thankful to Sameer. BlinkDB is a massively parallel, approximate query engine for running interactive SQL queries on large volumes of data. It allows users to trade off query accuracy Continue Reading

Meetup: An Overview of Spark DataFrames with Scala

Reading Time: < 1 minute Knoldus organized a Meetup on Wednesday, 18 Nov 2015, which gave an overview of Spark DataFrames with Scala. Apache Spark is a distributed compute engine for large-scale data processing. A wide range of organizations are using it to process large datasets. Many Spark and Scala enthusiasts attended this session and learned why DataFrames are the best fit for building an application in Spark with Scala Continue Reading

MeetUp on “BlinkDB and G-OLA: Supporting Continuous Answers with Error Bars in SparkSQL”

Reading Time: < 1 minute Big datasets are growing exponentially, but our need to get quick interactive responses to our queries remains as important as ever. This talk will feature an overview of various components in BlinkDB and introduce a new generalized online aggregation (G-OLA) paradigm in SparkSQL to incrementally process massive amounts of data on clusters of tens, hundreds or thousands of machines while returning approximate answers. More precisely, this Continue Reading

Script-less Test Automation – Create Automated Test-Cases Automatically

Reading Time: 4 minutes Test automation is the activity of creating test scripts that can run without human intervention on the same UI that the customer and end user would see. These scripts can reduce execution time and cover all aspects of the application. The first generation of test automation tools provided a macro recording facility that ran on the synchronous API rather than the UI. The test Continue Reading

Service Virtualization in Testing

Reading Time: 2 minutes Applications are very important for business today. The development cost and the quality of applications remain challenges. Service Virtualization allows developers, testers and performance teams to work in parallel for faster delivery and higher application quality and reliability. It simulates the behavior of selected components within a composite application to enable end-to-end testing as a whole. Service Virtualization is not a substitute for Continue Reading