MachineX

MachineX: Run ML model prediction faster with Hummingbird

Reading Time: 3 minutes

In this blog, we will see how to make our machine learning model’s predictions faster with Hummingbird, a recently open-sourced library. Nowadays, there are a lot of frameworks for deploying or serving machine learning models in production. As a result, it is a headache for a data scientist to choose between these frameworks, keeping in mind how their model, either Sklearn or LightGBM ...

MachineX: performance metrics for Model Evaluation

Reading Time: 6 minutes

In this blog, we are going to see how to choose the right metrics for model evaluation in different kinds of applications. There are different metric categories based on the ML model/application, and we are going to cover the popular metrics used in the following problems: Classification Metrics (accuracy, precision, recall, F1-score, ROC, AUC); Regression Metrics (MSE, MAE). There are more metrics, like Computer Vision ...

MachineX: Ultimate guide to NLP (Part 1)

Reading Time: 7 minutes

In this blog, we are going to see some basic text operations with NLP that solve different problems. This blog is part of the series Ultimate guide to NLP, which will focus on basic text pre-processing techniques. Some of the major areas that we will be covering in this series of blogs include the following: Text Pre-Processing; Understanding of Text & Feature Engineering ...

MachineX: Boosting performance with XGBoost

Reading Time: 5 minutes

In this blog, we are going to see how XGBoost works and some of its important features with the help of an example. Many of us have heard about tree models and boosting techniques. Let’s put these concepts together and talk about XGBoost, one of the most powerful machine learning algorithms out there. XGBoost stands for eXtreme Gradient Boosting. The name XGBoost, though, actually ...

TensorFlow Quantum: beauty and the beast

Reading Time: 4 minutes

So, we are finally here: after a long wait, we are entering the era of quantum computing. TFQ combines the beauty of TensorFlow with the beast nature of quantum computing. Quantum computing is becoming a technology to watch more closely in 2020. We have seen some recent announcements from Honeywell, Google and others, and it’s worth looking forward to new pieces of hardware coming this year. Now, Google has ...

MachineX: Demystifying Market Basket analysis

Reading Time: 7 minutes

In this blog, we are going to see how we can anticipate customer behavior with Market Basket Analysis by using association rules.

Introduction to Market Basket Analysis

Market Basket Analysis is one of the key techniques used by large retailers to uncover associations between items. It works by looking for combinations of items that occur together frequently in transactions. To put it another way, it ...

MachineX: What is K-Fold Cross Validation?

Reading Time: 3 minutes

In this blog, we are going to explore and learn about K-Fold Cross Validation. K-Fold Cross Validation is a statistical method to evaluate a machine learning model’s performance. So, to understand what K-Fold Cross Validation is, we first need to understand what evaluating a model means, and why we need to do that.
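To make the idea concrete, here is a minimal pure-Python sketch of K-Fold Cross Validation, assuming the common formulation: the data is split into k folds, each fold is held out once as the test set while the rest is used for training, and the k scores are averaged. The helper names `k_fold_indices` and `cross_validate`, and the `fit` and `score` callbacks, are made up for this illustration and are not taken from the post or from any library.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds over range(n_samples)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validate(xs, ys, k, fit, score):
    """Average score(model, x_test, y_test) over k folds.

    fit and score are caller-supplied (hypothetical) callbacks:
    fit(x_train, y_train) returns a model; score(...) returns a number.
    """
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(scores) / len(scores)
```

Every sample lands in the test set exactly once, so the averaged score uses all of the data for evaluation without ever scoring a model on samples it was trained on. In practice the data is usually shuffled before splitting, which this sketch leaves out.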

MachineX: Association Rule Learning with KSAI

Reading Time: 2 minutes

In many of my previous blogs, I have posted about Association Rule Learning, what it’s about and how it is performed. In this blog, we are going to see Association Rule Learning in action, and for this purpose, we are going to use KSAI, a machine learning library written purely in Scala. So, let’s begin.

Adding KSAI to your project

You ...

MachineX: Total Support Tree for Association Rule Generation

Reading Time: 5 minutes

In our previous blogs on Association Rule Learning, we have seen the FP-Tree and the FP-Growth algorithm. We also generated the frequent itemsets using FP-Growth. But a problem arises when we try to mine the association rules out of these frequent itemsets. Generally, the number of frequent itemsets is massive, and running an algorithm on them becomes very memory inefficient. So, to store these ...

MachineX: Frequent Itemset generation with the FP-Growth algorithm

Reading Time: 4 minutes

In our previous blog, MachineX: Understanding FP-Tree construction, we discussed the FP-Tree and its construction. In this blog, we will be discussing the FP-Growth algorithm, which uses the FP-Tree to extract frequent itemsets from the given dataset. FP-Growth is an algorithm that generates frequent itemsets from an FP-Tree by exploring the tree in a bottom-up fashion. We will be picking up the example we used in ...

MachineX: Understanding FP-Tree construction

Reading Time: 5 minutes

In my previous blog, MachineX: Why no one uses apriori algorithm for association rule learning?, we discussed one of the first algorithms in association rule learning, the apriori algorithm. Although it is simple and clear, it has some weaknesses, as discussed in the above-mentioned blog. A significant improvement over the apriori algorithm is the FP-Growth algorithm. To understand how the FP-Growth algorithm helps in finding frequent ...

MachineX: Why no one uses apriori algorithm for association rule learning?

Reading Time: 3 minutes

In my previous blog, MachineX: Two parts of Association Rule Learning, we discussed that there are two parts in performing association rule learning, namely, frequent itemset generation and rule generation. In this blog, we are going to talk about one of the algorithms for frequent itemset generation, viz., the Apriori algorithm.

The Apriori Principle

The Apriori algorithm uses the support measure to eliminate the itemsets with low ...
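As a rough illustration of support-based pruning, here is a minimal Python sketch. It assumes the usual textbook formulation of Apriori, in which an itemset is kept only if its support (the fraction of transactions containing it) meets a minimum threshold, and a larger candidate is considered only if all of its subsets were kept. The function names, the `min_support` parameter, and the sample transactions are made up for this example and do not come from the post; the input is assumed to be a non-empty list of transactions.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]  # size-1 candidates
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        kept = {}
        for candidate in current:
            s = support(candidate, transactions)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        k += 1
        # Join kept itemsets into size-k candidates, then drop any candidate
        # that has an infrequent (k-1)-subset: it cannot be frequent itself.
        joined = {a | b for a in kept for b in kept if len(a | b) == k}
        current = [
            c for c in joined
            if all(frozenset(sub) in kept for sub in combinations(c, k - 1))
        ]
    return frequent


# Hypothetical sample data, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "diapers", "beer"},
]
result = frequent_itemsets(baskets, min_support=0.5)
```

With this sample data, the pair {bread, beer} appears in only two of five baskets and is dropped, so no triple containing both bread and beer is ever counted against the transactions.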