Reading Time: 2 minutes In this blog, we are going to learn about one of the evaluation metrics used for evaluating a classification ML model: the Jaccard Index. But first, let’s see what evaluation metrics are.
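As a quick primer (not part of the post above), here is a minimal sketch of how the Jaccard Index can be computed for a classification problem; the object name `JaccardIndexSketch` and the toy prediction vectors are illustrative assumptions, not code from the blog.

```scala
// Minimal sketch of the Jaccard index as a classification metric (illustrative only).
// For two label sets, J = |A ∩ B| / |A ∪ B|; for binary labels this
// reduces to TP / (TP + FP + FN) for the positive class.

object JaccardIndexSketch {

  // Jaccard similarity between two sets of labels.
  def jaccard[A](predicted: Set[A], actual: Set[A]): Double =
    if (predicted.isEmpty && actual.isEmpty) 1.0
    else (predicted intersect actual).size.toDouble / (predicted union actual).size

  // Binary-classification variant: compare prediction/ground-truth vectors
  // and score the positive class only.
  def jaccardBinary(predicted: Seq[Int], actual: Seq[Int]): Double = {
    val pairs = predicted.zip(actual)
    val tp = pairs.count { case (p, a) => p == 1 && a == 1 }
    val fp = pairs.count { case (p, a) => p == 1 && a == 0 }
    val fn = pairs.count { case (p, a) => p == 0 && a == 1 }
    if (tp + fp + fn == 0) 1.0 else tp.toDouble / (tp + fp + fn)
  }

  def main(args: Array[String]): Unit = {
    val predicted = Seq(1, 0, 1, 1, 0)
    val actual    = Seq(1, 1, 1, 0, 0)
    println(f"Jaccard index: ${jaccardBinary(predicted, actual)}%.2f") // 0.50
  }
}
```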
Reading Time: 3 minutes In this blog, we are going to explore and learn about K-Fold Cross Validation. K-Fold Cross Validation is a statistical method to evaluate a Machine Learning model’s performance. So, to understand what K-Fold Cross Validation is, we first need to understand what evaluating a model means, and why we need to do that.
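Before reading on, the plumbing of K-Fold Cross Validation can be summarised in a short sketch. This is plain Scala with no ML library; `crossValidate`, `trainAndScore` and the dummy scorer are hypothetical names used only to illustrate how each fold takes a turn as the validation set.

```scala
// Minimal sketch of K-Fold Cross Validation (illustrative, no ML library).
// The data indices are split into k folds; each fold is used once as the
// validation set while the remaining k-1 folds form the training set,
// and the k validation scores are averaged.

object KFoldSketch {

  // Split the index range 0 until n into k roughly equal folds.
  def folds(n: Int, k: Int): Seq[Seq[Int]] = {
    val shuffled = scala.util.Random.shuffle((0 until n).toVector)
    shuffled.grouped(math.ceil(n.toDouble / k).toInt).toSeq
  }

  // `trainAndScore` is a placeholder: it should train a model on the training
  // indices and return its score on the validation indices.
  def crossValidate(n: Int, k: Int)(trainAndScore: (Seq[Int], Seq[Int]) => Double): Double = {
    val allFolds = folds(n, k)
    val scores = allFolds.map { validation =>
      val training = allFolds.filterNot(_ == validation).flatten
      trainAndScore(training, validation)
    }
    scores.sum / scores.size // average validation score across the k folds
  }

  def main(args: Array[String]): Unit = {
    // Dummy scorer that just reports the validation-fold size, to show the plumbing.
    val avg = crossValidate(n = 10, k = 5) { (train, valid) => valid.size.toDouble }
    println(avg) // 2.0 for 10 samples and 5 folds
  }
}
```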
Reading Time: 2 minutes In many of my previous blogs, I have posted about Association Rule Learning, what it’s about and how it is performed. In this blog, we are going to use Association Rule Learning to actually see it in action, and for this purpose, we are going to use KSAI, a machine learning library purely written in Scala. So, let’s begin. Adding KSAI to your project You Continue Reading
Reading Time: 5 minutes In our previous blogs on Association Rule Learning, we have seen the FP-Tree and the FP-Growth algorithm. We also generated the frequent itemsets using FP-Growth. But a problem arises when we try to mine the association rules out of these frequent itemsets. Generally, the number of frequent itemsets is massive, and running an algorithm on them becomes very memory-inefficient. So, to store these Continue Reading
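The rule-generation step the excerpt refers to can be sketched as follows. This is an illustrative, simplified version (not the blog’s implementation): it assumes the support counts of a frequent itemset and of all its subsets are already available, and the `Rule` case class and toy counts are made up for the example.

```scala
// Illustrative sketch of rule generation: given a frequent itemset and the
// support counts of its subsets, emit every rule (antecedent => consequent)
// whose confidence clears a threshold.

object RuleGenerationSketch {

  type Itemset = Set[String]

  case class Rule(antecedent: Itemset, consequent: Itemset, confidence: Double)

  // `support` maps an itemset to its support count; here it is assumed to be
  // already computed (e.g. by FP-Growth) for the itemset and all its subsets.
  def rules(frequent: Itemset, support: Map[Itemset, Int], minConfidence: Double): Seq[Rule] =
    frequent.subsets
      .filter(s => s.nonEmpty && s != frequent)        // proper, non-empty antecedents
      .map { antecedent =>
        val confidence = support(frequent).toDouble / support(antecedent)
        Rule(antecedent, frequent -- antecedent, confidence)
      }
      .filter(_.confidence >= minConfidence)
      .toSeq

  def main(args: Array[String]): Unit = {
    val support = Map[Itemset, Int](
      Set("bread", "milk", "butter") -> 3,
      Set("bread", "milk")           -> 4,
      Set("bread", "butter")         -> 3,
      Set("milk", "butter")          -> 3,
      Set("bread")                   -> 6,
      Set("milk")                    -> 5,
      Set("butter")                  -> 4
    )
    rules(Set("bread", "milk", "butter"), support, minConfidence = 0.7)
      .foreach(r => println(s"${r.antecedent.mkString(",")} => ${r.consequent.mkString(",")} (${r.confidence})"))
  }
}
```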
Reading Time: 4 minutes In our previous blog, MachineX: Understanding FP-Tree construction, we discussed the FP-Tree and its construction. In this blog, we will be discussing the FP-Growth algorithm, which uses FP-Tree to extract frequent itemsets in the given dataset. FP-growth is an algorithm that generates frequent itemsets from an FP-tree by exploring the tree in a bottom-up fashion. We will be picking up the example we used in Continue Reading
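As a rough illustration of the structure FP-Growth walks bottom-up, here is a minimal FP-tree node in plain Scala. The `FPNode` class, its `insert` method, and the toy transactions are assumptions for demonstration, not the construction described in the blog.

```scala
// Minimal, illustrative FP-tree node: each node keeps an item, a count, a link
// to its parent and its children, so that prefix paths can be recovered
// bottom-up the way FP-Growth does.

import scala.collection.mutable

class FPNode(val item: Option[String], val parent: Option[FPNode]) {
  var count: Int = 0
  val children: mutable.Map[String, FPNode] = mutable.Map.empty

  // Insert one (already frequency-sorted) transaction below this node, sharing prefixes.
  def insert(transaction: List[String]): Unit = transaction match {
    case Nil => ()
    case head :: tail =>
      val child = children.getOrElseUpdate(head, new FPNode(Some(head), Some(this)))
      child.count += 1
      child.insert(tail)
  }

  // Walk back towards the root to recover the path ending at this node.
  def prefixPath: List[String] =
    parent.toList.flatMap(_.prefixPath) ++ item.toList
}

object FPTreeSketch {
  def main(args: Array[String]): Unit = {
    val root = new FPNode(None, None)
    // Transactions are assumed to be pre-sorted by descending item frequency.
    List(List("f", "c", "a", "m"), List("f", "c", "a", "b"), List("f", "b"))
      .foreach(root.insert)
    println(root.children("f").count) // 3: all three transactions share the "f" prefix
  }
}
```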
Reading Time: 5 minutes In my previous blog, MachineX: Why no one uses apriori algorithm for association rule learning?, we discussed one of the first algorithms in association rule learning, the Apriori algorithm. Even though it is simple and clear, it has some weaknesses, as discussed in the above-mentioned blog. A significant improvement over the Apriori algorithm is the FP-Growth algorithm. To understand how the FP-Growth algorithm helps in finding frequent Continue Reading
Reading Time: 3 minutes In my previous blog, MachineX: Two parts of Association Rule Learning, we discussed that there are two parts to performing association rule learning, namely, frequent itemset generation and rule generation. In this blog, we are going to talk about one of the algorithms for frequent itemset generation, viz., the Apriori algorithm. The Apriori Principle The Apriori algorithm uses the support measure to eliminate the itemsets with low Continue Reading
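The Apriori principle named in the excerpt can be sketched in a few lines: a candidate (k+1)-itemset is pruned unless every one of its k-item subsets is already frequent, and the surviving candidates are then checked against a minimum support count. The `AprioriSketch` object and the toy basket data below are illustrative assumptions, not code from the blog.

```scala
// Illustrative sketch of one Apriori iteration: join frequent k-itemsets,
// prune by the Apriori principle, then keep candidates meeting minimum support.

object AprioriSketch {

  type Itemset = Set[String]

  // Support count: number of transactions containing the itemset.
  def support(itemset: Itemset, transactions: Seq[Itemset]): Int =
    transactions.count(t => itemset.subsetOf(t))

  def nextLevel(frequentK: Set[Itemset], transactions: Seq[Itemset], minSupport: Int): Set[Itemset] = {
    val candidates = for {
      a <- frequentK
      b <- frequentK
      joined = a ++ b
      if joined.size == a.size + 1                            // join step
      if joined.subsets(a.size).forall(frequentK.contains)    // prune step (Apriori principle)
    } yield joined
    candidates.filter(c => support(c, transactions) >= minSupport)
  }

  def main(args: Array[String]): Unit = {
    val transactions: Seq[Itemset] = Seq(
      Set("bread", "milk"),
      Set("bread", "butter", "milk"),
      Set("bread", "butter"),
      Set("milk", "butter")
    )
    val frequent1 = Set(Set("bread"), Set("milk"), Set("butter"))
    println(nextLevel(frequent1, transactions, minSupport = 2))
    // Set(Set(bread, milk), Set(bread, butter), Set(milk, butter))
  }
}
```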
Reading Time: 2 minutes In our previous blog, MachineX: Layman guide to Association Rule Learning, we discussed what Association rule learning is all about. And as you can already tell, with a large dataset, which almost every market has, finding association rules isn’t very easy. For this purpose, we introduced measures of interestingness: support, confidence, and lift. Support tells us how frequent an itemset is in a given dataset, and confidence Continue Reading
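For reference, the three measures of interestingness named above can be computed directly from a list of transactions, as in this hedged sketch (the toy transactions and the rule bread => milk are made up for illustration).

```scala
// Illustrative sketch of support, confidence and lift for a rule X => Y
// over a toy list of market-basket transactions.

object InterestingnessSketch {

  type Itemset = Set[String]

  // support(X) = fraction of transactions containing X
  def support(itemset: Itemset, transactions: Seq[Itemset]): Double =
    transactions.count(t => itemset.subsetOf(t)).toDouble / transactions.size

  // confidence(X => Y) = support(X ∪ Y) / support(X)
  def confidence(x: Itemset, y: Itemset, transactions: Seq[Itemset]): Double =
    support(x ++ y, transactions) / support(x, transactions)

  // lift(X => Y) = confidence(X => Y) / support(Y); > 1 suggests a positive association
  def lift(x: Itemset, y: Itemset, transactions: Seq[Itemset]): Double =
    confidence(x, y, transactions) / support(y, transactions)

  def main(args: Array[String]): Unit = {
    val transactions: Seq[Itemset] = Seq(
      Set("bread", "milk"),
      Set("bread", "butter"),
      Set("bread", "milk", "butter"),
      Set("milk", "butter"),
      Set("bread", "milk")
    )
    val (x, y) = (Set("bread"), Set("milk"))
    println(f"support   : ${support(x ++ y, transactions)}%.2f")  // 0.60
    println(f"confidence: ${confidence(x, y, transactions)}%.2f") // 0.75
    println(f"lift      : ${lift(x, y, transactions)}%.2f")       // 0.94
  }
}
```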
Reading Time: 4 minutes When one has to select the best classifier from all the good options, one is often on the horns of a dilemma. Decision Tree/Random Forest, ANN, KNN, Logistic Regression, etc. are some options that are often used for classification. Each of them has its pros and cons, and when selecting the best one, probably the most important thing to Continue Reading
Reading Time: 4 minutes I hope we understood conditional probabilities and Bayes’ theorem through our previous blog. Now let’s use this understanding to find out more about the Naive Bayes classifier. NAIVE BAYES CLASSIFIER Naive Bayes is a simple technique for constructing classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from some finite set. Naive Bayes Continue Reading
Reading Time: 4 minutes In machine learning, Naive Bayes classifiers are a family of simple “probabilistic classifiers” based on applying Bayes’ theorem with strong (naive) independence assumptions between the features. The Naive Bayes Classifier technique is based on Bayes’ theorem and is particularly suited when the dimensionality of the inputs is high. Despite its simplicity, Naive Bayes can often outperform more sophisticated classification methods. In simple terms, a Naive Bayes classifier assumes that Continue Reading
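To make the idea concrete, here is a minimal, hedged sketch of a categorical Naive Bayes classifier in plain Scala: class priors and per-feature likelihoods (with add-one smoothing) are estimated from a toy dataset, and a label is chosen by maximising the log-posterior under the conditional-independence assumption. The `NaiveBayesSketch` object and the toy data are assumptions for illustration, not code from the blog.

```scala
// Illustrative categorical Naive Bayes: score each class with Bayes' theorem,
// assuming features are conditionally independent given the class.
// Laplace (add-one) smoothing avoids zero probabilities.

object NaiveBayesSketch {

  type Features = Seq[String]

  def train(data: Seq[(Features, String)]): Features => String = {
    val classes = data.map(_._2).distinct
    val priors  = classes.map(c => c -> data.count(_._2 == c).toDouble / data.size).toMap

    // P(feature_i = value | class), with add-one (Laplace) smoothing.
    def likelihood(c: String, index: Int, value: String): Double = {
      val inClass        = data.filter(_._2 == c)
      val matches        = inClass.count(_._1(index) == value)
      val distinctValues = data.map(_._1(index)).distinct.size
      (matches + 1.0) / (inClass.size + distinctValues)
    }

    // Classify by maximising log P(c) + sum_i log P(x_i | c).
    features => classes.maxBy { c =>
      math.log(priors(c)) +
        features.zipWithIndex.map { case (v, i) => math.log(likelihood(c, i, v)) }.sum
    }
  }

  def main(args: Array[String]): Unit = {
    // Toy "play tennis"-style data: (outlook, windy) -> play?
    val data = Seq(
      (Seq("sunny", "no"),    "yes"),
      (Seq("sunny", "yes"),   "no"),
      (Seq("rainy", "yes"),   "no"),
      (Seq("overcast", "no"), "yes"),
      (Seq("rainy", "no"),    "yes")
    )
    val classify = train(data)
    println(classify(Seq("sunny", "no"))) // "yes" on this toy data
  }
}
```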