Amazon

Saving Spark DataFrames on Amazon S3 got Easier !!!

Reading Time: < 1 minute In our previous blog post, Congregating Spark Files on S3, we explained how to upload files saved in a Spark cluster to Amazon S3. Admittedly, the method explained in that post was somewhat complex and hard to apply, and it added a lot of boilerplate to our code. So we started working on simplifying it and finding an easier way to provide a wrapper around Spark ...

Classification using AWS Machine Learning

Reading Time: 4 minutes One of the most common uses of Machine Learning algorithms is classification. Classification comes in a couple of varieties. Binary classification is when we classify a given set of inputs into two classes. If there are more than two classes, it is multiclass classification. AWS ML supports both kinds of classification. In order to use AWS Machine Learning, we downloaded a ...

Congregating Spark files on S3

Reading Time: 2 minutes We all know that Apache Spark is a fast and general engine for large-scale data processing, and it is because of its speed that Spark became one of the most popular frameworks in the world of big data. Working with Spark is a pleasant experience, as it has a simple API for Scala, Java, Python and R. But some tasks in Spark are still tough rows ...

AWS Services: AWS SDK on the Scala with Play Framework

Reading Time: 3 minutes playing-aws-scala: The following blog post and attached code present a simple example of using Amazon Web Services the Scala way with Play Framework and AWScala. In this post, however, I have implemented only the Amazon Simple Storage Service (Amazon S3) functionality. AWScala: AWS SDK on the Scala REPL. AWScala enables Scala developers to easily work with Amazon Web Services in the Scala way. Though AWScala objects basically ...