AWS

Apache Spark: Handle Corrupt/Bad Records

Reading Time: 3 minutes

Writing ETL jobs often becomes expensive when it comes to handling corrupt records, so ETL pipelines need a good way to deal with them. The larger the pipeline, the more complex it becomes to handle bad records along the way. Corrupt data includes missing information, incomplete information, schema mismatches, and differing formats or data types. Apache Spark: …
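The categories of corrupt data listed above can be illustrated with a small plain-Python sketch. This is not Spark's API and not necessarily the approach the post takes; the schema, field names, and function names are assumptions made up for illustration. It only shows the general idea of separating good records from bad ones and keeping the reason each bad record was rejected.

```python
# Hypothetical schema for illustration: field name -> expected Python type.
EXPECTED_SCHEMA = {"id": int, "name": str, "amount": float}


def find_problems(record, schema=EXPECTED_SCHEMA):
    """Return the reasons a record is bad; an empty list means it is good."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif record[field] is None or record[field] == "":
            problems.append(f"incomplete field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"type mismatch: {field}")
    # Fields the schema does not know about also count as a schema mismatch.
    for field in record:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


def split_records(records, schema=EXPECTED_SCHEMA):
    """Partition records into (good, bad); bad entries keep their reasons."""
    good, bad = [], []
    for record in records:
        problems = find_problems(record, schema)
        if problems:
            bad.append((record, problems))
        else:
            good.append(record)
    return good, bad
```

Keeping the bad records together with their reasons, rather than dropping them silently, makes it possible to inspect or reprocess them later without failing the whole job.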

Amazon EMR

Reading Time: 3 minutes

Businesses worldwide are discovering the power of new big data processing and analytics frameworks like Apache Hadoop and Apache Spark, but they are also discovering some of the challenges of operating these technologies in on-premises data lake environments. They may also have concerns about the future of their current distribution vendor. Common problems of on-premises big data environments include a lack of agility, excessive costs, …

Setting up IAM permissions for AWS Lambda

Reading Time: 3 minutes

About AWS: AWS Lambda is a compute service that lets you run code without managing servers. AWS Lambda executes your code only when needed and scales automatically. You pay only for the compute time you consume; there is no charge when your code is not running. Why IAM? Before we call a Lambda function, we need the correct permissions set up to access it. …
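As a rough illustration of what such a permission can look like, the sketch below builds an IAM policy document that allows invoking a single Lambda function. The function ARN, account ID, and helper name are placeholders, and the policy the post actually sets up may differ; only the general policy shape (`Version`, `Statement`, `Effect`, `Action`, `Resource`) and the `lambda:InvokeFunction` action are standard.

```python
import json


def invoke_policy(function_arn):
    """Build a policy document allowing invocation of one Lambda function."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "lambda:InvokeFunction",
                "Resource": function_arn,
            }
        ],
    }


# Placeholder ARN for illustration only; substitute your own function's ARN.
EXAMPLE_ARN = "arn:aws:lambda:us-east-1:123456789012:function:my-function"

# JSON text that could be attached to the calling user or role.
policy_json = json.dumps(invoke_policy(EXAMPLE_ARN), indent=2)
```

Scoping `Resource` to one function ARN rather than `*` keeps the caller limited to the function it actually needs.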

Getting started with Amazon SNS

Reading Time: 2 minutes

Introduction: The Simple Notification Service (SNS) is a publish/subscribe messaging service. But what does that mean? SNS is centered around topics, and you can think of a topic as a group for collecting messages. Users or endpoints subscribe to a topic, and messages or events are then published to it. When a message is published, all subscribers to …
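The topic model described above can be sketched as a tiny in-memory stand-in. This is not the SNS API; the `Topic` class and its method names are made up for illustration, and real SNS delivers to endpoints such as queues, HTTP URLs, or email rather than to Python callbacks.

```python
class Topic:
    """In-memory stand-in for a pub/sub topic (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, callback):
        """Register a callable that will receive every published message."""
        self._subscribers.append(callback)

    def publish(self, message):
        """Deliver the message to every subscriber; return how many got it."""
        for callback in self._subscribers:
            callback(message)
        return len(self._subscribers)
```

A short usage example: two subscribers register on one topic, and a single publish reaches both.

```python
orders = Topic("orders")
email_inbox, audit_log = [], []
orders.subscribe(email_inbox.append)
orders.subscribe(audit_log.append)
orders.publish("order created")
```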

Getting started with Amazon SQS

Reading Time: 4 minutes

With the continuing growth of microservices and the cloud best practice of designing decoupled systems, it’s important that developers can use a service that handles the delivery of messages between components, and this is where SQS comes in. Amazon SQS (Simple Queue Service) is a fully managed service offered by AWS that works seamlessly with serverless systems, microservices, or any …
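The decoupling idea can be shown with a minimal in-memory queue. This is only a sketch of the concept, not the SQS API: the `MessageQueue` class is hypothetical, and real SQS adds behavior this toy omits, such as visibility timeouts and explicit deletion of received messages.

```python
from collections import deque


class MessageQueue:
    """In-memory stand-in for a message queue (illustrative only)."""

    def __init__(self):
        self._messages = deque()

    def send(self, body):
        """Producer side: enqueue a message and return immediately."""
        self._messages.append(body)

    def receive(self):
        """Consumer side: take the oldest message, or None if the queue is empty."""
        if self._messages:
            return self._messages.popleft()
        return None
```

The producer never calls the consumer directly. It only writes to the queue, so either side can be slow, restarted, or scaled without the other noticing.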

HAWK-Rust Series: Automate Infrastructure using Terraform

Reading Time: 3 minutes

HAWK is a Rust-based image recognition project that implements two-factor authentication, using an RFID card for user identification and an image for user validation. The project uses AWS services, and the whole AWS infrastructure it requires is automated using Terraform (a tool for building, changing, and versioning infrastructure safely and efficiently).

Let’s create your first Grafana dashboard

Reading Time: 4 minutes

In my previous blog, we discussed the setup of Grafana-Graphite for JMX monitoring. Now we will create a first Grafana dashboard, building Grafana queries to visualize JMX metrics stored in Graphite. The Grafana UI runs on http://localhost:3000/ by default, so let’s open that URL in the browser and log in with the default username and password, both admin. After login either …

Deep Dive: AWS AssumeRole using STS API

Reading Time: 4 minutes

Introduction: In this blog post we discuss the AWS AssumeRole approach for obtaining temporary security credentials using STS (Security Token Service), with an end-to-end setup. Temporary security credentials (consisting of an access key ID, a secret access key, and a security token) give you access to an AWS environment for a specified duration. They solve use cases like cross-account access and single sign-on to …
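To make the shape of temporary credentials concrete, here is a small sketch that models the three parts named above plus an expiration time. It does not call STS; the class, the function, and the placeholder values are assumptions for illustration, and a real setup would obtain these values from the AssumeRole call.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TemporaryCredentials:
    """The three credential parts plus the moment they stop being valid."""

    access_key_id: str
    secret_access_key: str
    session_token: str
    expiration: datetime

    def is_expired(self, now=None):
        """True once the current time reaches the expiration time."""
        now = now or datetime.now(timezone.utc)
        return now >= self.expiration


def issue_placeholder_credentials(duration_seconds, now=None):
    """Stand-in for an STS response: placeholder values, real expiry arithmetic."""
    now = now or datetime.now(timezone.utc)
    return TemporaryCredentials(
        access_key_id="EXAMPLE-ACCESS-KEY-ID",
        secret_access_key="EXAMPLE-SECRET-ACCESS-KEY",
        session_token="EXAMPLE-SESSION-TOKEN",
        expiration=now + timedelta(seconds=duration_seconds),
    )
```

Because the credentials carry their own expiration, a client can check `is_expired()` and request fresh ones instead of storing a long-lived secret.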

AWS STS:AssumeRole vs Federation

Reading Time: 2 minutes

Introduction: AssumeRole and Federation are two widely used approaches provided by AWS to facilitate authorization of cloud resources via an Identity Management System (IDMS). In this post, we describe the Federation and AssumeRole approaches and their integration. We also see how they differ and, finally, conclude which one serves better. Federated users: Federated users (external identities) are users you manage outside of AWS for …

Running jmx2graphite as a Java agent to push the JMX metrics into Graphite

Reading Time: 2 minutes

In my previous blog, we discussed how to monitor a Kafka stream application using Grafana and Graphite. In that solution, we used jmx2graphite as a metrics exporter, which takes the metrics from the Jolokia URL, where Jolokia exposes the JMX metrics, and pushes them to Graphite. The problem with this solution is that we need to deploy one jmx2graphite per service. So …

How to create a bucket on Amazon S3 and get security credential keys

Reading Time: 3 minutes

Amazon S3 has a simple web services interface that you can use to store and retrieve any amount of data, at any time, from anywhere on the web. This blog describes how to create buckets on S3, how to get credential keys, and where you should keep them. Creation of a bucket: First of all, you need to sign up for AWS S3; after that …

Automating Infrastructure on AWS Using Terraform

Reading Time: 2 minutes

In this blog, I am going to showcase how to create infrastructure on AWS using Terraform. Let’s have a brief introduction to Terraform before jumping to the particular use case. What is Terraform? Terraform is an infrastructure provisioning tool created by HashiCorp. It allows you to describe your infrastructure as code, creates execution plans that outline exactly what will happen when you run your code, builds …

Deploying a Play Application (Scala) with Ansible on AWS

Reading Time: 2 minutes

In this blog, I will demonstrate how to create an AWS instance and how to deploy your Play application on it using Ansible. First, let’s see how to make an AWS instance:

1. Log in to the AWS console.
2. Click Launch Instance at the top and select the free instance.
3. Select the machine you want to set up.
4. Click Configure Instance Details -> …