Reading Time: 2 minutes In this blog, we will learn about the Jaccard Index, one of the metrics used to evaluate a classification ML model. But first, let’s see what evaluation metrics are.
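As a quick preview of the metric named above, here is a minimal sketch of the Jaccard Index for a binary classifier, computed as TP / (TP + FP + FN) for the positive class. The function name `jaccard_index`, the `positive` parameter, and the sample labels are made up for illustration; returning 0.0 when there are no positive labels or predictions at all is an assumed convention, not something the post specifies.

```python
def jaccard_index(y_true, y_pred, positive=1):
    """Jaccard index of the positive class: TP / (TP + FP + FN)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # True positives: label and prediction are both the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive, but the label is not.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: label is positive, but the prediction is not.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    denom = tp + fp + fn
    # Assumed convention: with no positives anywhere, report 0.0.
    return tp / denom if denom else 0.0


# Hypothetical labels, for illustration only.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = jaccard_index(y_true, y_pred)  # TP=2, FP=1, FN=1 -> 2 / 4 = 0.5
```

A score of 1.0 means the predicted positives and the actual positives coincide exactly; a score near 0 means they barely overlap.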
Reading Time: 6 minutes In this blog, we demonstrate how to apply a full ML pipeline to a set of documents, in this case 10 books taken from the internet. So let's start with first things first: what is clustering? Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a … Continue Reading
Reading Time: 2 minutes A machine learning task often involves several steps, such as extracting features from raw data, training models on those features, and running predictions with the trained models. The pipeline API provided by Spark makes it easier to combine and tune multiple ML algorithms in a single workflow. What is in the blog? We will create a sample ML pipeline … Continue Reading