MachineX: Run ML model prediction faster with Hummingbird

In this blog, we will see how to make our machine learning model’s predictions faster with Hummingbird, a recently open-sourced library.

Nowadays there are many frameworks for deploying or serving machine learning models in production. As a result, it is a headache for a data scientist to choose between them while keeping in mind how their model, whether built with scikit-learn, LightGBM, or PyTorch, will perform in a real-world production environment.

Besides the accuracy and performance of a model, prediction speed is also an important concern that these frameworks address.

Hummingbird, a library recently released by a Microsoft research team, offers a strong answer to the question of how to speed up predictions in production.

What is Hummingbird?

As per their documentation:

Hummingbird is a library for compiling trained traditional ML models into tensor computations. Hummingbird allows users to seamlessly leverage neural network frameworks (such as PyTorch) to accelerate traditional ML models.

In simple words, Hummingbird compiles featurization operators and traditional ML models into a small set of tensor operations, which enables efficient computation on both CPUs and hardware accelerators (GPU, TPU). This opens a new way to reduce the infrastructure complexity and model scoring cost of traditional ML.
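
To make the idea of "compiling a model into tensor operations" more concrete, here is a minimal NumPy sketch of how a single decision tree can be evaluated with matrix products and comparisons only. It is loosely modeled on the matrix-multiplication style of tree evaluation and is not Hummingbird's actual code; the toy tree and the matrix names A to E are assumptions made up for illustration.

```python
import numpy as np

# Toy tree (hypothetical, for illustration only):
#   node0: x[0] < 0.5 ? node1 : leaf2
#   node1: x[1] < 0.3 ? leaf0 : leaf1
A = np.array([[1, 0],
              [0, 1]])        # features x internal nodes: feature tested by each node
B = np.array([0.5, 0.3])      # threshold of each internal node
C = np.array([[1,  1, -1],
              [1, -1,  0]])   # nodes x leaves: +1 left subtree, -1 right subtree, 0 unrelated
D = np.array([2, 1, 0])       # per leaf: number of ancestors that reach it via a left branch
E = np.array([[1, 0],
              [0, 1],
              [0, 1]])        # leaves x classes: class stored in each leaf


def predict_tree(X):
    """Evaluate the toy tree for a batch of rows using only matrix operations."""
    T = (X @ A < B).astype(int)    # 1 where a node's test passes
    T = (T @ C == D).astype(int)   # one-hot row selecting the reached leaf
    return (T @ E).argmax(axis=1)  # class of that leaf


X = np.array([[0.2, 0.1], [0.2, 0.9], [0.9, 0.1], [0.9, 0.9]])
print(predict_tree(X))  # [0 1 1 1]
```

Because every step is a dense matrix operation over the whole batch, the same computation can run on any backend that is fast at tensor math, which is the property Hummingbird relies on.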

Why use Hummingbird?

So why should someone choose Hummingbird?
It offers a lot of benefits, such as:

All the current and future optimizations implemented in neural network frameworks.
Native hardware acceleration.
A single platform that supports both traditional and neural network models.
No need for users to re-engineer their models.
A minimal and intuitive syntax.

Put even more simply, you can train your model with scikit-learn, XGBoost, or LightGBM, then use Hummingbird to convert it into a PyTorch model and gain PyTorch’s performance benefits at inference time.

Let’s see this library in action.

Hummingbird Example

Dataset

For this task we use a diagnostic measures dataset to predict the possibility of diabetes.
You can download the data here.

Install the hummingbird library

First, we need to install the Hummingbird library using pip:

!pip install hummingbird-ml

Import libraries

Next, we import all the necessary libraries:

	import numpy as np
	import pandas as pd
	import matplotlib.pyplot as plt
	from sklearn.ensemble import RandomForestClassifier
	from hummingbird.ml import convert
	from sklearn.model_selection import train_test_split

Data pre-processing

After importing the libraries, we load our dataset and split it into 70% training data and 30% test data.

	pima = pd.read_csv("diabetes.csv")
	X = pima.iloc[:,:-1]
	y = pima.iloc[:,-1]
	X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1) # 70% training and 30% test

Model training

Now we create a random forest classifier and fit it on the training data.

	model = RandomForestClassifier(n_estimators=300)
	model.fit(X_train, y_train)

Model prediction time

Now, let’s see how much time our original model takes to make predictions on the test data:

	%%time

	#prediction of labels for test data
	y_pred=model.predict(np.array(X_test))

Output:

CPU times: user 58.7 ms, sys: 0 ns, total: 58.7 ms 
Wall time: 59.5 ms

We can see that our model takes 58.7 ms of CPU time (59.5 ms wall time) to predict labels for the test data.
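
Note that %%time is a Jupyter/IPython cell magic and measures a single run, so the number can vary from one execution to the next. Outside a notebook, a rough equivalent can be sketched with the standard library. The helper name median_runtime_ms and the stand-in workload below are assumptions for illustration, not part of Hummingbird or scikit-learn.

```python
import statistics
import time


def median_runtime_ms(fn, repeats=20):
    """Call fn() `repeats` times and return the median wall-clock time in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload so the sketch runs on its own; in this walkthrough it
# would be something like: lambda: model.predict(np.array(X_test))
elapsed = median_runtime_ms(lambda: sum(range(10_000)))
print(f"median: {elapsed:.3f} ms")
```

Taking the median over several runs makes the comparison between the original and the converted model less sensitive to one-off effects such as warm-up.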

Now we use Hummingbird’s convert function to convert our model to PyTorch, and then move the converted model to the GPU with to('cuda').

	model_torch=convert(model,'pytorch')
	model_torch.to('cuda')
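
The call to('cuda') assumes a CUDA-capable GPU is present. On a machine without one, a small guard like the following sketch could choose the device instead. The helper name pick_device is hypothetical; it only relies on PyTorch's torch.cuda.is_available().

```python
def pick_device():
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # normally present, since Hummingbird depends on it
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
# In this walkthrough the result would replace the hard-coded device:
#   model_torch = convert(model, 'pytorch')
#   model_torch.to(device)
```

On a CPU-only machine the converted model still runs, but the timings will differ from the GPU numbers reported here.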

Now that our Hummingbird-converted model is ready, let’s make predictions with it and compare the timing.

	%%time
	y_pred_torch=model_torch.predict(np.array(X_test))

Output:

CPU times: user 11.8 ms, sys: 0 ns, total: 11.8 ms 
Wall time: 15.2 ms

Wow, our converted model takes 11.8 ms of CPU time, about five times less than the 58.7 ms of the original model (wall time drops from 59.5 ms to 15.2 ms).
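
A speedup is only useful if the converted model still predicts the same labels, so it is worth comparing the two outputs. The sketch below assumes both predictions are label arrays of equal length; the helper name agreement_rate and the sample labels are made up for illustration.

```python
import numpy as np


def agreement_rate(pred_a, pred_b):
    """Fraction of positions where two prediction arrays hold the same label."""
    pred_a, pred_b = np.asarray(pred_a), np.asarray(pred_b)
    if pred_a.shape != pred_b.shape:
        raise ValueError("prediction arrays must have the same shape")
    return float(np.mean(pred_a == pred_b))


# Made-up labels so the sketch runs on its own; in this walkthrough the
# inputs would be y_pred and y_pred_torch.
print(agreement_rate([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```

A value of 1.0 means the converted model reproduces the original predictions exactly on the test set.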

Although Hummingbird is very useful and can save enterprises and data scientists a lot of time and money, it is still under development and has a number of limitations that need more contributions from the community:

It does not support arbitrary user-defined operators or sparse data well.
It does not support text feature extraction.
Support for missing and categorical values is currently under development.
Hummingbird is supported on Python ≥ 3.5.
Hummingbird can only convert ML models to the PyTorch framework and doesn’t support Keras.

Stay tuned, happy learning 🙂
