Introduction To Amazon Athena
Data analysis is a complex process, and there have always been attempts to make it easier. Many analytics tools are available, and Amazon offers one of its own as an AWS service named Amazon Athena. In this blog, we will learn the basic and advanced usage of Athena.
Amazon Athena is an interactive query service that is used to analyze data with complex queries. It is serverless and returns results in very little time. Because it is not a database service that you provision, you pay only for the queries you run. You point Athena at your data in S3, define the required schema, and query the data with standard SQL.
Amazon launched Athena as one of its services on November 20, 2016. As mentioned above, Amazon Athena is a serverless query service that makes it easy to analyze data stored in an Amazon S3 bucket using standard SQL. We can point Amazon Athena at our data stored in Amazon S3 and run queries using standard SQL to get results in seconds.
In Amazon Athena, there is no need to set up infrastructure, and we pay only for the queries that we run. Amazon Athena scales automatically and executes queries in parallel, which gives fast results even with large datasets and complex queries.
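To make "define the schema" concrete, here is a minimal sketch that builds the kind of CREATE EXTERNAL TABLE statement Athena accepts for CSV files in S3. The helper name `build_create_table_ddl`, the database, the table, the columns, and the bucket path are all hypothetical and exist only for illustration; the sketch also assumes comma-delimited files with a single header row.

```python
def build_create_table_ddl(database, table, columns, s3_location):
    """Return a CREATE EXTERNAL TABLE statement for CSV files stored in S3."""
    # Render each (name, type) pair as "`name` type", one column per line.
    column_lines = ",\n".join(f"  `{name}` {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"{column_lines}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'\n"
        # Assumption: every file starts with one header line that should be skipped.
        "TBLPROPERTIES ('skip.header.line.count' = '1')"
    )


# Hypothetical database, table, columns, and bucket.
ddl = build_create_table_ddl(
    "sales_db",
    "orders",
    [("order_id", "string"), ("amount", "double")],
    "s3://example-bucket/orders/",
)
print(ddl)
```

The printed statement can be pasted into the Athena query editor. The table only describes where the files live and how to read them; the data itself stays in S3.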
Difference Between Microsoft SQL Server And AWS Athena
|Features||Microsoft SQL Server||Amazon Athena|
|DEFINITION||Microsoft SQL Server is a database management and analysis system.||Amazon Athena is an interactive query service that makes data analysis easy.|
|USAGE||Used for DML, DCL, DDL, and TCL operations on a database.||Used mainly to query (SELECT) data stored in S3, with a limited set of DDL and DML statements.|
|BENEFITS||1. Reliable and easy to use. 2. High performance. 3. Easy to maintain.||1. Easy to use. 2. High performance. 3. No maintenance is required.|
| || ||2. AWS Glue 3. Amazon S3|
|LIMITATIONS||1. Limited RDS storage. 2. Limited instances.||1. Only a subset of DDL statements is supported. 2. Works with external tables only.|
Features Of Athena
Amazon Athena is one of the best services provided by Amazon. It has many features that make it suitable for data analysis. Let’s take a look at the different features one by one.
Easy to access:
Athena doesn’t require any installation. It can be accessed directly from the AWS Console or through the AWS CLI.
Serverless:
As mentioned above, Athena is serverless, so we do not need to worry about infrastructure, configuration, scaling, or failures. Athena takes care of all of this on its own.
Pay per query:
Athena is cost-effective and charges only for the queries that we run, based on the amount of data that Athena scans per query. We can save a lot if we compress the data, partition it, or store it in a columnar format, because each of these reduces the amount of data scanned.
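A quick back-of-the-envelope calculation shows why compression matters. The sketch below assumes a price of 5 USD per terabyte scanned and a 4:1 compression ratio; both numbers are assumptions for illustration, so check the current Athena pricing page for your Region. The function name `estimate_query_cost` is hypothetical.

```python
PRICE_PER_TB_USD = 5.00    # assumed price per TB scanned; verify for your Region
BYTES_PER_TB = 1024 ** 4


def estimate_query_cost(bytes_scanned, price_per_tb=PRICE_PER_TB_USD):
    """Estimate the cost of one query from the number of bytes it scans."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb


raw_csv_bytes = 1 * BYTES_PER_TB          # a query that scans 1 TB of raw CSV
compressed_bytes = raw_csv_bytes // 4     # hypothetical 4:1 compression ratio

print(estimate_query_cost(raw_csv_bytes))      # 5.0
print(estimate_query_cost(compressed_bytes))   # 1.25
```

Under these assumptions, the same query costs a quarter as much once the data is compressed, simply because Athena reads fewer bytes.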
Fast:
Athena is a very fast analytics tool. It can run complex queries in very little time. It breaks a large query into simpler pieces, runs them in parallel, and then combines the results to give the required output.
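The split, run in parallel, and combine idea can be illustrated with a toy example. This is only a conceptual sketch using Python threads over in-memory lists; it is not how Athena's engine is implemented, and the function names and sample rows are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(partition):
    """Work done independently on one slice of the data."""
    return sum(row["amount"] for row in partition)


def parallel_total(partitions, max_workers=4):
    """Run partial_sum on every partition in parallel, then combine the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(partial_sum, partitions))
    return sum(partials)


# Three hypothetical slices of one data set, for example three files in S3.
partitions = [
    [{"amount": 10}, {"amount": 20}],
    [{"amount": 5}],
    [{"amount": 15}, {"amount": 50}],
]
total = parallel_total(partitions)
print(total)  # 100
```

Each slice is summed on its own, and only the small partial results are combined at the end, which is the same shape as a `SELECT SUM(amount)` spread over many files.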
Secure:
With the help of AWS Identity and Access Management (IAM) policies, Athena gives you control over who can access your data sets. Because the data is stored in S3 buckets, IAM policies let you manage which users can read it.
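As a rough idea of what such a policy looks like, the sketch below builds an IAM policy document as a Python dictionary and prints it as JSON. The function name, bucket, and prefix are hypothetical, and the policy is deliberately incomplete: a real setup typically also needs permissions on the Glue Data Catalog and on the S3 location where query results are written.

```python
import json


def build_athena_read_policy(bucket, prefix):
    """Return a minimal IAM policy document for running queries over one S3 prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RunAthenaQueries",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                ],
                "Resource": "*",
            },
            {
                "Sid": "ReadSourceData",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Only objects under the given prefix are readable.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                "Sid": "ListSourceBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


# Hypothetical bucket and prefix.
policy = build_athena_read_policy("example-bucket", "orders/")
print(json.dumps(policy, indent=2))
```

Scoping the S3 statement to a prefix is the design choice that matters here: a user with this policy can query the `orders/` data but cannot read other objects in the bucket.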
Creating a table in Athena
Step 1 First, create an S3 bucket, then upload a file in JSON or CSV format.
Step 2 Search for Glue (an AWS service) in the console. We will see many options on the left side. Select Crawlers, create a crawler, fill in the required details, and run the crawler.
Step 3 Go to the Tables option. We will see a new table there.
Step 4 In Athena, we can now preview the table.
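The same steps can also be scripted. The sketch below assumes the boto3 library and takes the Glue and Athena clients as parameters, so the logic can be read (and tested) without AWS credentials. The function names, crawler name, role ARN, database, table, and bucket paths are hypothetical placeholders.

```python
import time


def crawl_bucket(glue, crawler_name, role_arn, database, s3_path):
    """Create a Glue crawler over s3_path and start it (steps 2 and 3)."""
    glue.create_crawler(
        Name=crawler_name,
        Role=role_arn,
        DatabaseName=database,
        Targets={"S3Targets": [{"Path": s3_path}]},
    )
    glue.start_crawler(Name=crawler_name)


def preview_table(athena, database, table, output_location, limit=10, poll_seconds=1.0):
    """Run SELECT * ... LIMIT n and return the raw result rows (step 4)."""
    started = athena.start_query_execution(
        QueryString=f'SELECT * FROM "{table}" LIMIT {int(limit)}',
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a final state is reached.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Query {query_id} ended in state {state}")
    return athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]


# Usage with real clients (requires boto3 and AWS credentials):
#   import boto3
#   crawl_bucket(boto3.client("glue"), "orders-crawler",
#                "arn:aws:iam::123456789012:role/ExampleGlueRole",
#                "sales_db", "s3://example-bucket/orders/")
#   rows = preview_table(boto3.client("athena"), "sales_db", "orders",
#                        "s3://example-bucket/athena-results/")
```

The crawler needs some time to finish before the new table shows up, so in practice we would wait for it to complete before calling `preview_table`.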
I have covered all the basics of Athena. You can check the official documentation.
Reference – https://docs.aws.amazon.com/athena/