ML, AI and Data Engineering

Things That Make You Love Machine Learning

Reading Time: 3 minutes This blog gives an overview of Machine Learning. What is a Machine? Put simply, a machine is a functional system made by humans that follows steps defined by the person who made it. The system consists of functional properties, and this model is called a system architecture model. It is used to perform a particular task that reduces human Continue Reading

Cloud Data Loss Prevention (DLP): Part-2

Reading Time: 2 minutes Google Cloud Platform’s Data Loss Prevention API provides a service that helps organizations manage sensitive data, such as a name, email address, telephone number, identification number, or credit card number, including detecting, redacting, masking, and tokenizing such data. This can help organizations comply with regulations such as GDPR and reduce the risk of data exposure and data breaches. In the previous blog Cloud Data Loss Continue Reading

A Simple Guide to OCR using Pytesseract

Reading Time: 2 minutes What is OCR? OCR is an acronym for optical character recognition. It is a widespread technology for recognizing text inside images, such as scanned documents and photos. OCR technology is used to convert virtually any kind of image containing written text (typed, handwritten, or printed) into machine-readable text data. OCR using Pytesseract: Python-tesseract is a wrapper for Google’s Tesseract OCR engine. It can read any Continue Reading

Apache Beam: Side input Pattern

Reading Time: 3 minutes Apache Beam is a unified programming model for defining both batch and streaming data-parallel processing pipelines. It is a modern way of defining data processing pipelines. It offers a rich set of APIs and mechanisms for solving complex use cases. In some use cases, the data pipeline we define is required to use some additional inputs. For example, in streaming analytics applications, it Continue Reading

Fundamentals Of Classification Models Part-2

Reading Time: 3 minutes This article is the continuation of “Fundamentals of Classification Models Part-1”. You should go through that part before learning about Classifier Models. Classifier Models: As discussed in the previous article (“We prepare the data for training the algorithm”), the first step is to pre-process and clean the data. The cleaning we need for this dataset is to change the Continue Reading

Tabula: Scraping Table Data From PDF Files

Reading Time: 3 minutes In this blog, you will learn how to scrape tables from PDF files into a pandas DataFrame. Fetching tables from PDF files is no longer a difficult task; you can do it with a single line of Python. What you will learn: Installing the tabula-py library. Importing the library. Reading a PDF file. Reading a table on a particular page of Continue Reading

Activation Function in Neural Network

Reading Time: 14 minutes An Activation Function decides whether a neuron should be activated or not. This means that it decides whether the neuron’s input to the network is important or not in the process of prediction, using simple mathematical operations. The role of the Activation Function is to derive output from a set of input values fed to a node (or a layer). The primary role of Continue Reading
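As a minimal sketch of the idea in the excerpt above, the snippet below computes a single node's weighted sum and passes it through three common activation functions. The input values, weights, and bias are hypothetical and chosen only for illustration; they do not come from the post.

```python
import math


def sigmoid(z: float) -> float:
    """Squash z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z: float) -> float:
    """Pass positive values through unchanged and clamp negatives to 0."""
    return max(0.0, z)


def tanh(z: float) -> float:
    """Squash z into the open interval (-1, 1)."""
    return math.tanh(z)


def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# Hypothetical values for one node (assumptions, not taken from the post).
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.1

for fn in (sigmoid, relu, tanh):
    print(fn.__name__, neuron_output(inputs, weights, bias, fn))
```

With these values the weighted sum is negative, so ReLU suppresses the node entirely while sigmoid and tanh still return a small signal, which is one way to see how the choice of activation decides whether a node "fires".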

Convolutional Neural Network in TensorFlow

Reading Time: 3 minutes Introduction: As you might know, neural networks reflect the behaviour of the human brain, allowing computer programs to recognise patterns and solve common problems in the fields of AI, machine learning, and deep learning. Neural networks are composed of node layers: an input layer, one or more hidden layers, and an output layer. Each node, or artificial neuron, connects to another and has Continue Reading

Google BigQuery: An Introduction to Big Data Analytics Platform.

Reading Time: 6 minutes Hey folks, today we are going to discuss Google BigQuery, an enterprise data warehouse with built-in machine learning capabilities. Before going to BigQuery, let’s understand what Google Cloud Platform is. Google Cloud Platform is a suite of public cloud computing services offered by Google. The platform includes a range of hosted services for compute, storage, and application development that run on Google hardware. Google Cloud protects your data, applications, Continue Reading

Text Data Vectorization Techniques in Natural Language Processing

Reading Time: 6 minutes Features in Machine Learning algorithms are generally numerical data on which we can easily perform mathematical operations. Machine Learning algorithms cannot work on raw text data; they can only process a numerical representation of the text in the form of a vector (or matrix). To convert textual data into a numerical representation of features, we can use the following text vectorization techniques in Natural Language Processing. Continue Reading
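To make the text-to-numbers step concrete, here is a small sketch of one simple vectorization technique, a bag-of-words count vector, written in plain Python. The two sample sentences and the helper names `build_vocabulary` and `vectorize` are hypothetical and used only for illustration; the excerpt does not say which techniques the full post covers.

```python
from collections import Counter


def build_vocabulary(docs):
    """Map every distinct lowercase token to a column index (sorted order)."""
    tokens = sorted({tok for doc in docs for tok in doc.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def vectorize(doc, vocab):
    """Return one count per vocabulary token, in column-index order."""
    counts = Counter(doc.lower().split())
    return [counts.get(tok, 0) for tok in vocab]


# Hypothetical corpus (an assumption, not taken from the post).
docs = ["the cat sat", "the dog sat on the mat"]
vocab = build_vocabulary(docs)
matrix = [vectorize(doc, vocab) for doc in docs]

print(list(vocab))  # column order of the matrix
for row in matrix:
    print(row)
```

Each document becomes one row of the matrix and each vocabulary word one column, so the text can be fed to an algorithm that expects numerical features.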

DBSCAN Clustering Algorithm

Reading Time: 4 minutes What is Clustering? Clustering, often known as cluster analysis, is an unsupervised machine learning task. Using a clustering algorithm entails providing the algorithm with a large amount of unlabeled data and allowing it to locate whatever groupings in the data it can. The names given to these groups are clusters. A cluster is a collection of data points that are related to one another based Continue Reading
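As a rough sketch of how a density-based algorithm such as DBSCAN finds groupings in unlabeled data, the plain-Python version below grows a cluster from any point that has enough close neighbors and marks isolated points as noise. The sample points and the parameter values `eps` (neighborhood radius) and `min_pts` (minimum neighbors for a core point) are assumptions chosen for illustration, not values from the post.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, idx, eps):
    """Indices of all points within distance eps of points[idx] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
    return labels


# Hypothetical 2-D data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Unlike algorithms that need the number of clusters up front, this approach derives the clusters from local density alone, which is why the isolated point ends up labeled as noise.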

Fundamentals of Classification Models Part-1

Reading Time: 3 minutes Introduction: Supervised Machine Learning algorithms can be broadly classified into Regression and Classification algorithms. In Regression algorithms, we predict continuous output values, but to predict categorical values, we need Classification algorithms. Classification can be performed on structured or unstructured data. The main goal of a classification problem is to identify the category or class to which a Continue Reading

BigQuery Machine Learning using GCP

Reading Time: 4 minutes BigQuery ML enables users to create and execute machine learning models in BigQuery using SQL queries. In this blog we will cover how to create, evaluate, and use machine learning models in BigQuery. Setup and requirements: First, create a profile on the GCP platform and log in using your credentials. Then, in the API option, enable the BigQuery API. Open BigQuery Console: In the Google Cloud Continue Reading