Machine Learning

MachineX: Total Support Tree for Association Rule Generation

Reading Time: 5 minutes In our previous blogs on Association Rule Learning, we looked at the FP-Tree and the FP-Growth algorithm, and generated frequent itemsets using FP-Growth. But a problem arises when we try to mine association rules out of these frequent itemsets: the number of frequent itemsets is generally massive, and running an algorithm over them directly becomes very memory-inefficient. So, to store these Continue Reading
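As a rough illustration of the rule-generation step the teaser refers to (the itemsets, support counts, and threshold below are hypothetical, not taken from the post), rules can be produced from a frequent itemset by splitting it into antecedent/consequent pairs and keeping those whose confidence clears a threshold:

```python
from itertools import combinations

# Hypothetical support counts for some frequent itemsets.
support = {
    frozenset({"bread"}): 5,
    frozenset({"milk"}): 6,
    frozenset({"bread", "milk"}): 4,
}

def rules_from_itemset(itemset, min_confidence):
    """Yield rules A -> B with confidence = support(itemset) / support(A)."""
    items = frozenset(itemset)
    for size in range(1, len(items)):
        for antecedent in map(frozenset, combinations(items, size)):
            confidence = support[items] / support[antecedent]
            if confidence >= min_confidence:
                yield antecedent, items - antecedent, confidence

for a, b, conf in rules_from_itemset({"bread", "milk"}, 0.7):
    print(set(a), "->", set(b), round(conf, 2))
```

With these counts, bread → milk has confidence 4/5 = 0.8 and survives, while milk → bread (4/6 ≈ 0.67) is dropped; doing this naively for every frequent itemset is exactly the cost the post's total support tree is meant to reduce.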

MachineX: Frequent Itemset generation with the FP-Growth algorithm

Reading Time: 4 minutes In our previous blog, MachineX: Understanding FP-Tree construction, we discussed the FP-Tree and its construction. In this blog, we will be discussing the FP-Growth algorithm, which uses the FP-Tree to extract frequent itemsets from the given dataset. FP-Growth is an algorithm that generates frequent itemsets from an FP-tree by exploring the tree in a bottom-up fashion. We will be picking up the example we used in Continue Reading

MachineX: Understanding FP-Tree construction

Reading Time: 5 minutes In my previous blog, MachineX: Why no one uses apriori algorithm for association rule learning?, we discussed one of the first algorithms in association rule learning, the Apriori algorithm. Although it is simple and clear, it has some weaknesses, as discussed in the above-mentioned blog. A significant improvement over the Apriori algorithm is the FP-Growth algorithm. To understand how the FP-Growth algorithm helps in finding frequent Continue Reading
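The core of FP-Tree construction can be sketched in a few lines. This is a minimal, hypothetical version (the full version in the post also maintains a header table of node links for FP-Growth): count item frequencies in one pass, then insert each transaction with its frequent items sorted by descending frequency, sharing prefixes as tree paths.

```python
from collections import defaultdict

class FPNode:
    """A node in an FP-tree: an item, its count, and child nodes."""
    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_support):
    # Pass 1: count item frequencies and drop infrequent items.
    counts = defaultdict(int)
    for t in transactions:
        for item in t:
            counts[item] += 1
    frequent = {i for i, c in counts.items() if c >= min_support}

    # Pass 2: insert each transaction, items sorted by descending
    # frequency (ties broken alphabetically) so prefixes are shared.
    root = FPNode()
    for t in transactions:
        items = sorted((i for i in t if i in frequent),
                       key=lambda i: (-counts[i], i))
        node = root
        for item in items:
            node = node.children.setdefault(item, FPNode(item))
            node.count += 1
    return root
```

Because common items sort first, transactions with shared frequent prefixes collapse into shared paths, which is what makes the FP-tree a compressed view of the dataset.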

MachineX: Why no one uses apriori algorithm for association rule learning?

Reading Time: 3 minutes In my previous blog, MachineX: Two parts of Association Rule Learning, we discussed that there are two parts to performing association rule learning, namely, frequent itemset generation and rule generation. In this blog, we are going to talk about one of the algorithms for frequent itemset generation, viz., the Apriori algorithm. The Apriori principle: the Apriori algorithm uses the support measure to eliminate itemsets with low Continue Reading
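A minimal sketch of the Apriori idea (the transactions and threshold are made up for illustration): mine itemsets level by level, and keep a size-k candidate only if all of its size-(k-1) subsets are already frequent, since any superset of an infrequent itemset must also be infrequent.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise frequent itemset mining using the Apriori principle."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = {i for t in transactions for i in t}
    frequent = {frozenset({i}) for i in items
                if support(frozenset({i})) >= min_support}
    result = set(frequent)
    k = 2
    while frequent:
        # Join frequent (k-1)-itemsets into k-candidates, then prune
        # any candidate that has an infrequent (k-1)-subset.
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k}
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        frequent = {c for c in candidates if support(c) >= min_support}
        result |= frequent
        k += 1
    return result
```

The support scan over all transactions for every candidate is exactly the repeated-dataset-pass cost that later posts in the series use FP-Growth to avoid.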

MachineX: Two parts of Association Rule Learning

Reading Time: 2 minutes In our previous blog, MachineX: Layman guide to Association Rule Learning, we discussed what Association rule learning is all about. And as you can already tell, with the large datasets that almost every market has, finding association rules isn't very easy. For this purpose, we introduced measures of interestingness, which were support, confidence, and lift. Support tells us how frequent an itemset is in a given dataset, and confidence Continue Reading
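The three measures of interestingness the teaser names can be computed directly from transaction data. A small sketch (the transactions below are hypothetical): support is the fraction of transactions containing both sides of a rule A → B, confidence is support(A ∪ B) / support(A), and lift is confidence / support(B).

```python
def interestingness(transactions, antecedent, consequent):
    """Compute support, confidence and lift for the rule A -> B."""
    transactions = [set(t) for t in transactions]
    n = len(transactions)
    a, b = set(antecedent), set(consequent)
    support_a = sum(1 for t in transactions if a <= t) / n
    support_b = sum(1 for t in transactions if b <= t) / n
    support_ab = sum(1 for t in transactions if a | b <= t) / n
    confidence = support_ab / support_a
    lift = confidence / support_b
    return support_ab, confidence, lift

baskets = [["milk", "bread"], ["milk"], ["bread"], ["milk", "bread"]]
print(interestingness(baskets, ["milk"], ["bread"]))
```

A lift above 1 means buying A makes B more likely than its baseline frequency; a lift below 1 (as in this toy example) means the items co-occur less than chance would suggest.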

MachineX: Choosing Support Vector Machine over other classifiers

Reading Time: 4 minutes When one has to select the best classifier from many good options, one is often on the horns of a dilemma. Decision trees/Random Forests, ANNs, KNN, logistic regression, etc. are some of the options often considered for classification. Each one has its pros and cons, and when selecting the best one, probably the most important thing to Continue Reading

What is Deep Learning?

Reading Time: 4 minutes The term "Deep Learning" has been on fire for the past two decades. Every machine learning enthusiast wants to work on it, and many big companies are already making an impact on the data science field by exploring it, e.g. the Google Brain project from Google or DeepFace from Facebook. The reason is simple; experts say, and I quote, "for most flavors of the old generations of learning algorithms … performance will Continue Reading

MachineX: One more step towards NAIVE BAYES

Reading Time: 4 minutes I hope our previous blog helped us understand conditional probabilities and Bayes' theorem. Now let's use this understanding to find out more about the Naive Bayes classifier. NAIVE BAYES CLASSIFIER: Naive Bayes is a simple technique for constructing classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from some finite set. Naive Bayes Continue Reading

MachineX: Unfolding Mystery Behind NAIVE BAYES CLASSIFIER

Reading Time: 4 minutes In machine learning, Naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features. The Naive Bayes Classifier technique is based on Bayes' theorem and is particularly suited when the dimensionality of the inputs is high. Despite its simplicity, Naive Bayes can often outperform more sophisticated classification methods. In simple terms, a Naive Bayes classifier assumes that Continue Reading
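The independence assumption the teaser describes makes the classifier easy to write down. A minimal sketch for categorical features (the dataset below is invented for illustration; real implementations such as scikit-learn's add more machinery): pick the class maximising log P(class) plus the sum of log P(feature value | class), with add-one (Laplace) smoothing so unseen values don't zero out a class.

```python
import math
from collections import Counter

def naive_bayes_predict(rows, labels, x):
    """Classify x by maximising log P(c) + sum_i log P(x_i | c),
    treating features as conditionally independent given the class."""
    class_counts = Counter(labels)
    n = len(labels)
    best_class, best_score = None, float("-inf")
    for c, count in class_counts.items():
        score = math.log(count / n)  # log prior
        for i, value in enumerate(x):
            # How often feature i takes this value within class c.
            matches = sum(1 for row, label in zip(rows, labels)
                          if label == c and row[i] == value)
            n_values = len({row[i] for row in rows})
            # Laplace smoothing: add 1 to every value's count.
            score += math.log((matches + 1) / (count + n_values))
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```

Working in log space avoids underflow when many small probabilities are multiplied, which matters precisely in the high-dimensional settings the post mentions.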

MachineX: Layman guide to Association Rule Learning

Reading Time: 6 minutes Association rule learning is one of the most common techniques in data mining as well as machine learning. The most common use, which I'm sure you all must be aware of, is the recommendation systems used by various e-shops like Amazon, Flipkart, etc. Association rule learning is a technique to uncover the relationship between various items, elements, or more generally, various variables in a Continue Reading

MachineX: The second dimensionality reduction method

Reading Time: 5 minutes In the previous blog, we went through how more data, or to be precise, more dimensions in the data, create problems like overfitting in classification and regression algorithms. This is known as the "curse of dimensionality". Then we went through the solution to the problem, i.e., dimensionality reduction. We mainly focused on one of the dimensionality reduction methods, called feature selection. In this Continue Reading
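One of the simplest filter-style feature selection methods the teaser alludes to can be sketched as follows (the data and threshold are hypothetical; scikit-learn's `VarianceThreshold` does the same thing in production code): drop columns whose variance is below a threshold, since a near-constant feature carries little information for a classifier.

```python
def select_features(X, threshold):
    """Return column indices whose variance exceeds `threshold`
    (a simple low-variance filter for feature selection)."""
    n = len(X)
    kept = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean = sum(col) / n
        variance = sum((v - mean) ** 2 for v in col) / n
        if variance > threshold:
            kept.append(j)
    return kept
```

Filters like this are cheap because they look at each feature in isolation; wrapper and embedded methods, which the series also discusses, instead evaluate features against the model itself.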

MachineX: When data is a curse to learning

Reading Time: 4 minutes Data and learning are like best friends; perhaps learning is too dependent on data for them to be called just friends. When data overwhelms, learning acts pricey, so it feels more like a girlfriend-boyfriend sort of relationship. Well, don't get confused or bothered by how I am comparing data and learning; it is just my depiction of something called dimensionality reduction in machine learning. On Continue Reading

MachineX: Simplifying Logistic Regression

Reading Time: 3 minutes Logistic regression is one of the most popular machine learning algorithms for binary classification. This is because it is a simple algorithm that performs very well on a wide range of problems. It is used when you know that the data is linearly separable/classifiable and the outcome is binary or dichotomous, but it can be extended to cases where the dependent variable has more than two categories. It Continue Reading
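The binary case the teaser describes fits in a short sketch (learning rate, epoch count, and data here are illustrative choices, not the post's): squash a linear combination of the features through the sigmoid to get a probability, and update the weights by gradient descent on the log-loss.

```python
import math

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit binary logistic regression with stochastic gradient descent.
    Returns the weight vector; the last entry is the bias term."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid
            # Gradient of the log-loss for one sample: (p - y) * x.
            for j, xj in enumerate(xi):
                w[j] -= lr * (p - yi) * xj
            w[-1] -= lr * (p - yi)
    return w

def predict(w, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0
```

Thresholding the sigmoid output at 0.5 gives the dichotomous prediction; extending to more than two categories is typically done with one-vs-rest or a softmax over per-class scores.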
