K-Means Clustering

K-Means-Algorithm

Reading Time: 3 minutes Machine Learning has gained popularity in the last couple of years and has seen a rapid rise in usage. It gives a computer the ability to act without being explicitly programmed. Unsupervised learning is a technique for modeling the underlying structure or distribution of the data. It lets us learn more about the data without any pre-assigned labels or scores for the training data.
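To make the idea of learning structure without labels concrete, here is a minimal sketch of K-Means in plain Python. It is an illustration only, not the code from the post: the function names (`k_means`, `squared_distance`, `mean_point`), the sample points, and the choice of random initial centroids with a fixed seed are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iterations=100, seed=0):
    """Alternate between assigning points and recomputing centroids."""
    rng = random.Random(seed)
    # Assumption: start from k distinct points picked at random.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled data: two loose groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
```

No labels are passed in at any point; the grouping comes only from the distances between the points. In practice the result can depend on the initial centroids, which is why a seed is fixed here.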

A sample ML Pipeline for Clustering in Spark

Reading Time: 2 minutes A machine learning task often involves several steps, such as extracting features from raw data, training models on those features, and running predictions with the trained models. The pipeline API provided by Spark makes it easier to combine and tune multiple ML algorithms in a single workflow. What is in the blog? We will create a sample ML pipeline ...
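The chaining idea behind a pipeline can be sketched in a few lines of plain Python. This is not Spark's API and not the pipeline built in the post: Spark's pipelines operate on DataFrames, while the classes below (`SelectColumns`, `MinMaxScale`, `SimplePipeline`) and the sample records are hypothetical stand-ins that only mirror the fit-then-transform flow from one stage to the next.

```python
class SelectColumns:
    """Hypothetical feature-extraction stage: pick numeric fields from raw records."""

    def __init__(self, names):
        self.names = names

    def fit(self, records):
        return self  # nothing to learn for this stage

    def transform(self, records):
        return [[float(r[n]) for n in self.names] for r in records]


class MinMaxScale:
    """Hypothetical scaling stage: learn per-column ranges, then rescale to [0, 1]."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.lows = [min(c) for c in columns]
        # Fall back to 1.0 for constant columns to avoid division by zero.
        self.spans = [(max(c) - min(c)) or 1.0 for c in columns]
        return self

    def transform(self, rows):
        return [
            [(v - lo) / span for v, lo, span in zip(row, self.lows, self.spans)]
            for row in rows
        ]


class SimplePipeline:
    """Fit each stage in order, feeding its output to the next stage."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, data):
        for stage in self.stages:
            data = stage.fit(data).transform(data)
        return self

    def transform(self, data):
        for stage in self.stages:
            data = stage.transform(data)
        return data


# Hypothetical raw records; the "id" field is ignored by the feature stage.
raw = [
    {"id": "a", "height": 150, "weight": 50},
    {"id": "b", "height": 170, "weight": 70},
    {"id": "c", "height": 190, "weight": 90},
]
pipeline = SimplePipeline([SelectColumns(["height", "weight"]), MinMaxScale()]).fit(raw)
scaled = pipeline.transform(raw)
```

A clustering model would be one more stage at the end of the list; keeping every step behind the same two methods is what lets the whole workflow be fitted and reused as a single object.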