python

MachineX: Run ML model prediction faster with Hummingbird

Reading Time: 3 minutes In this blog, we will see how to make our machine learning model’s predictions faster with Hummingbird, a recently open-sourced library. Nowadays, there are a lot of frameworks for deploying or serving machine learning models in production. As a result, it is a headache for a data scientist to choose between these frameworks, keeping in mind how their model either Sklearn or LightGBM Continue Reading

MachineX: Ultimate guide to NLP (Part 1)

Reading Time: 7 minutes In this blog, we are going to see some basic text operations with NLP that help solve different problems. This blog is part of the series Ultimate guide to NLP, which will focus on basic text pre-processing techniques. Some of the major areas that we will be covering in this series of blogs include the following: Text Pre-Processing Understanding of Text & Feature Engineering Continue Reading

MachineX: Boosting performance with XGBoost

Reading Time: 5 minutes In this blog, we are going to see how XGBoost works and look at some of its important features with the help of an example. Many of us have heard about tree models and boosting techniques. Let’s put these concepts together and talk about XGBoost, one of the most powerful machine learning algorithms out there. XGBoost stands for eXtreme Gradient Boosting. The name XGBoost, though, actually Continue Reading

MachineX: Demystifying Market Basket analysis

Reading Time: 7 minutes In this blog, we are going to see how we can anticipate customer behavior with Market Basket Analysis by using association rules. Introduction to Market Basket Analysis Market Basket Analysis is one of the key techniques used by large retailers to uncover associations between items. It works by looking for combinations of items that occur together frequently in transactions. To put it another way, it Continue Reading
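
The idea of looking for item combinations that occur together frequently can be sketched in a few lines of Python. This is only a minimal illustration and not the method used in the post: the transactions are made up, and `pair_support` is a hypothetical helper that reports the fraction of transactions containing each pair of items.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "milk"},
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort the items so each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


support = pair_support(transactions)

# List the pairs from most to least frequently co-occurring.
for pair, value in sorted(support.items(), key=lambda kv: -kv[1]):
    print(pair, value)
```

Pairs with high support are the natural candidates for association rules; a real analysis would also apply a minimum support threshold and compute further measures such as confidence.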

Python Scripts: An Introduction

Reading Time: 7 minutes Introduction Python is a great, flexible programming language that can be used in many situations. In this tutorial, we will focus primarily on its ability to enhance the Unix/Linux shell environment. Typically in Unix we create “bash” shell scripts, but we can also create shell scripts using Python, and it’s really simple! We can even name our shell scripts with the .sh extension and Continue Reading

Build your first web application using Django

Reading Time: 3 minutes In our previous blog, Introduction to Django, we discussed Django’s features and architecture. In this blog, we will create a web application in Django. To start a new project, go to the folder where you want your project to be and run the command: django-admin startproject django_proj. django-admin is Django’s command-line utility for administrative tasks. manage.py is automatically created in each Django project. manage.py does the Continue Reading

MachineX: k-Nearest Neighbors (KNN) for classification

Reading Time: 4 minutes In this blog, we are going to go through one of the most widely used classification algorithms, KNN (K-Nearest Neighbors). Since I started doing data science, I have observed that most problems end up needing a classification model. The main reason behind this bias is that most analytics problems are based on decision making. For instance, to identify loan applicants as low, Continue Reading

Introduction to Django

Reading Time: 3 minutes In this blog, we are going to talk about Django. Before that, let’s understand what a web framework is and why we need it. A web framework is a software tool that helps us develop applications faster and smarter. It eliminates the need to write a lot of repetitive code and saves time. What is Django? Django is a free, open-source, high-level web framework Continue Reading

MachineX: Data Cleaning in Python

Reading Time: 8 minutes In this blog, we are going to learn how to do data cleaning in Python. Most data scientists spend only 20 percent of their time on actual data analysis and 80 percent of their time finding, cleaning, and reorganizing huge amounts of data, which is an inefficient data strategy. The reason data scientists are hired in the first place is to develop algorithms and Continue Reading

Introduction to NumPy

Reading Time: 4 minutes In this blog, I will walk you through the basics of NumPy. If you want to do machine learning, then knowledge of NumPy is necessary. NumPy, short for Numerical Python, is one of the most widely used Python libraries. It is the most useful library if you are dealing with numbers in Python. NumPy offers much faster execution for numerical work than Python’s standard library. It comes with a Continue Reading

Data Analysis using Python: Pandas

Reading Time: 3 minutes In this blog, I am going to explain pandas, which is an open-source library for data manipulation, analysis, and cleaning. Pandas is a high-level data manipulation tool developed by Wes McKinney. The name Pandas is derived from Panel Data, an econometrics term for multidimensional data. Pandas is built on top of NumPy. Five typical steps in the processing and analysis of Continue Reading