ML, AI and Data Engineering

MachineX: SVM as Non-Linear Classifiers

In our previous blogs, we built a high-level understanding of SVM and of why to choose SVM over other classifiers. In this blog post, we will look in detail at how to use SVM for complex decision boundaries and build non-linear classifiers with it. The primary method for doing this is using kernels. In linear SVM we find Continue Reading

Protein Structure determination aided by Stochastic Search (Replica Exchange Monte-Carlo Method)

Introduction: Proteins are large molecules that occur in abundance in every living organism. They carry out vital functions such as transporting oxygen, converting the food you eat into energy your body can use, and many more. Proteins are long chains of linked units called amino acids. There are 20 types of amino acids. Proteins fold into different shapes depending on their sequence of amino acids. Continue Reading

Data processing using ML Supervised classification algorithm to find accuracy

Machine learning is an application of artificial intelligence (AI) that gives systems the ability to learn automatically and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves. Types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, algorithms learn from labeled data. After understanding Continue Reading

MachineX: The alphabets of Artificial Neural Network – (Part 2)

If you are reading this blog, we assume you have already finished Part 1. No? Then visit the previous blog, The alphabets of Artificial Neural Network, first and come back here to learn how neural networks work. We have a basic understanding of neural networks, so let’s go deeper and understand how they work. Once you got the Continue Reading

MachineX: The inevitable Principal Component Analysis

In this blog post, we will look at an interesting feature extraction technique in Machine Learning known as Principal Component Analysis (PCA). PCA is one of the most powerful techniques for dimensionality reduction and is widely used in human face recognition. Let’s first understand what dimensionality reduction is. Dimensionality Reduction: As an example, let’s say we have a data set with many features (which is Continue Reading

MachineX: The alphabets of Artificial Neural Network

In this blog, we will talk about neural networks, the basis of deep learning, which has given machine learning an ultra edge in the current AI revolution. Let’s get started! Before diving into deep learning, let’s ask: why deep learning? Well, there are plenty of reasons; a few of them are: Deep learning is more popular than shallow learning once you Continue Reading

MachineX: Cosine Similarity for Item-Based Collaborative Filtering

“A recommender system or a recommendation system (sometimes replacing “system” with a synonym such as platform or engine) is a subclass of information filtering system that seeks to predict the “rating” or “preference” a user would give to an item.” – Wikipedia. In simple terms, a recommender system is a system capable of producing a list of recommendations with respect to an Continue Reading

MachineX: What is K-Fold Cross Validation?

In this blog, we are going to explore and learn about K-Fold Cross Validation. K-Fold Cross Validation is a statistical method for evaluating a Machine Learning model’s performance. So, to understand what K-Fold Cross Validation is, we first need to understand what evaluating a model means and why we need to do it.
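As a rough illustration of the idea, here is a minimal sketch of how the data could be split into K folds, with each fold serving once as the test set while the remaining folds form the training set. The function name `k_fold_indices` is hypothetical and chosen only for this example; it is not taken from the post or from any library, and a real workflow would usually shuffle the data first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) once for each of the k folds.

    Hypothetical helper for illustration only; no shuffling is done.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


# Each sample lands in exactly one test fold.
for train, test in k_fold_indices(10, 3):
    print("train size:", len(train), "test size:", len(test))
```

A model would then be trained and scored once per fold, and the K scores averaged to estimate its performance.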

Spark: Introduction to Datasets

In my previous blog, Spark: RDD vs DataFrames, I discussed the shortcomings of RDDs and how DataFrames overcome them. Now we’ll look at the shortcomings of DataFrames and how the Dataset API can overcome them. DataFrames: A DataFrame is a distributed collection of data organized into named columns. Conceptually, it is equivalent to the relational tables with Continue Reading

Spark Streaming vs. Structured Streaming

Fan of Apache Spark? I am too. The reason is simple: interesting APIs to work with, fast and distributed processing, less disk I/O overhead than MapReduce, fault tolerance, and much more. With this much, you can do a lot in the world of Big Data and Fast Data. From “processing huge chunks of data” to “working on streaming data”, Spark handles it all. In this Continue Reading

Spark: RDD vs DataFrames

Spark SQL is a Spark module for structured data processing. Unlike the basic Spark RDD API, the interfaces provided by Spark SQL provide Spark with more information about the structure of both the data and the computation being performed. Internally, Spark SQL uses this extra information to perform extra optimizations. One use of Spark SQL is to execute SQL queries. When running SQL from within another Continue Reading
