AWS

KnolSnow: Load continuous data into Snowflake using Snowpipe

Reading Time: 5 minutes In this blog, we will discuss loading streaming data into a Snowflake table using Snowpipe. Before that, if you haven’t read the previous part of this series, Loading Bulk Data into Snowflake, I suggest you go through it first. Now that we are all set, let’s get started and see what Snowpipe is all about. Introduction Snowpipe is a mechanism provided by Continue Reading

KnolSnow: Loading Data Into Snowflake

Reading Time: 5 minutes This blog covers loading data into Snowflake, and I will explain the various steps involved in the process. So let’s get started. Before moving ahead, you can visit the blog on understanding the basics of the Snowflake Data Warehouse in case you want to refresh your concepts. Now let’s talk about the actual topic for which you clicked on this blog. To Continue Reading

Automate deployment using AWS CodeDeploy

Reading Time: 6 minutes In the CodeDeploy blog series, we are going to write two blogs: the first covers the theory behind CodeDeploy, and the second walks through the full end-to-end automation of application deployment using CodeDeploy and Jenkins. Let’s start. AWS CodeDeploy is a deployment service through which you can easily automate your deployments.

Apache Spark: Handle Corrupt/Bad Records

Reading Time: 3 minutes Writing ETL jobs often becomes very expensive when it comes to handling corrupt records, so ETL pipelines need a good way to deal with them. The larger the ETL pipeline, the more complex it becomes to handle such bad records along the way. Corrupt data includes: missing information, incomplete information, schema mismatch, and differing formats or data types. Apache Spark: Continue Reading

Amazon EMR

Reading Time: 3 minutes Businesses worldwide are discovering the power of new big data processing and analytics frameworks like Apache Hadoop and Apache Spark, but they are also discovering some of the challenges of operating these technologies in on-premises data lake environments. They may also have concerns about the future of their current distribution vendor. Common problems of on-premises big data environments include a lack of agility, excessive costs, Continue Reading

Setting up the IAM permission for AWS Lambda

Reading Time: 3 minutes About AWS AWS Lambda is a compute service that lets you run code without managing servers. AWS Lambda executes your code only when needed and scales automatically. You pay only for the compute time you consume; there is no charge when your code is not running. Why IAM? Before we call a Lambda function, we need the correct permissions set up to access it. Continue Reading

Getting started with Amazon SNS

Reading Time: 2 minutes Introduction The Simple Notification Service (SNS) is a publish/subscribe messaging service. But what does that mean? SNS is centered around topics, and you can think of a topic as a group for collecting messages. Users or endpoints subscribe to a topic, and messages or events are then published to that topic. When a message is published, all subscribers to Continue Reading

Getting started with Amazon SQS

Reading Time: 4 minutes With the continuing growth of microservices and the cloud best practice of designing decoupled systems, it’s important that developers can use a service or system that handles the delivery of messages between components, and this is where SQS comes in. Amazon SQS (Simple Queue Service) is a fully managed service offered by AWS that works seamlessly with serverless systems, microservices or any Continue Reading

HAWK-Rust Series: Automate Infrastructure using Terraform

Reading Time: 3 minutes HAWK is a Rust-based image recognition project that implements two-factor authentication, using an RFID card for user identification and an image for user validation. In this project, we have used AWS services, and the whole AWS infrastructure required by the project is automated using Terraform (a tool for building, changing, and versioning infrastructure safely and efficiently).

Let’s create your first Grafana dashboard

Reading Time: 4 minutes In my previous blog, we discussed the setup of Grafana-Graphite for JMX monitoring. Now we will create our first Grafana dashboard, where we will write Grafana queries to visualize JMX metrics stored in Graphite. As we know, the Grafana UI runs on http://localhost:3000/ by default, so let’s open the URL in the browser and log in with the default username and password, which are admin: admin. After login either Continue Reading

Deep Dive: AWS AssumeRole using STS API

Reading Time: 4 minutes Introduction In this blog post we discuss the AWS AssumeRole approach for obtaining temporary security credentials using STS (Security Token Service), with an end-to-end setup. Temporary security credentials (consisting of an access key ID, a secret access key, and a security token) give you access to an AWS environment for a specified duration. They address use cases like cross-account access and single sign-on to Continue Reading