Analytics

Knime: Accessing a REST API with dynamic query param

Reading Time: 3 minutes Nowadays, REST APIs are the most widely used way to share data, and many APIs return a subset of the complete data in the form of pages. Sometimes we need to append multiple query parameters to the URL to get specific, filtered data. In this blog, we will learn how to generate dynamic URLs by adding query parameters and how to fetch the data. The KNIME platform supports Continue Reading
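To make the idea of a dynamic URL concrete outside of KNIME, here is a minimal Python sketch. It is not the workflow from the post: the base URL, the filter names, and the `page`/`size` parameter names are all hypothetical, and a real API may name its paging parameters differently.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.com/items"


def build_url(base_url, params):
    """Append query parameters to base_url, skipping any whose value is None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}?{query}" if query else base_url


def page_urls(base_url, filters, pages, page_size):
    """Build one URL per page, combining fixed filters with paging parameters.

    The parameter names "page" and "size" are assumptions, not a standard.
    """
    return [
        build_url(base_url, {**filters, "page": page, "size": page_size})
        for page in range(1, pages + 1)
    ]


urls = page_urls(BASE_URL, {"category": "books", "sort": None}, pages=3, page_size=50)
for url in urls:
    print(url)
```

Each generated URL can then be passed to whatever HTTP client is in use; unset filters (here `sort`) are simply left out of the query string.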

Data Visualisation In KNIME

Reading Time: 3 minutes KNIME is definitely a dream for data scientists. It makes the work of a data scientist much easier. If you haven’t heard about KNIME, you can find all about it in our blog Knime Analytics Platform: A dream for a data scientist. Continuing on, in this blog we will see how easy it is to create visualizations in KNIME. Continue Reading

Knime Analytics Platform: A dream for a data scientist

Reading Time: 3 minutes In this blog, we are going to see what the KNIME Analytics Platform is and which of its important features make it easy to create an analytics workflow. Introduction to Knime Analytics Platform KNIME is a platform built for powerful analytics on a GUI-based workflow. This means you do not have to know how to code to be able to work using KNIME and derive Continue Reading

Tracking Pixels with Google Tag Manager

Reading Time: 3 minutes Introduction Hi Everyone! This is my first blog, and in it I’ll try to explain how we can use Google Tag Manager to manage and deploy tracking pixels (e.g. Google Analytics) on your website without having to modify the code. Tracking Pixels A tracking pixel is an HTML code snippet that is loaded when a user visits a website. It is useful for tracking user Continue Reading

MachineX: Boosting performance with XGBoost

Reading Time: 5 minutes In this blog, we are going to see how XGBoost works and some of the important features of XGBoost with the help of an example. Many of us have heard about tree models and boosting techniques. Let’s put these concepts together and talk about XGBoost, one of the most powerful machine learning algorithms out there. XGBoost stands for eXtreme Gradient Boosted trees. The name XGBoost, though, actually Continue Reading

Top 7 Data Analytics and Management Trends for 2020

Reading Time: 5 minutes We live in an era of data, as it lies at the heart of digital transformation. And datasets are no longer as simple as before. They have increased in volume, velocity, and complexity and, above all, are coming from multiple sources. Top tech giants like Google, Netflix, Amazon, and others are crunching massive amounts of data on a daily basis to give you a personalized experience. Continue Reading

Data Lake – Build it in Phases

Reading Time: 3 minutes Data Lake: how to build a data lake and which phases are involved in building it.

Apache Spark: Repartitioning v/s Coalesce

Reading Time: 3 minutes Does partitioning help you increase/decrease the job performance? Spark splits data into partitions, and computation is done in parallel for each partition. It is very important to understand how data is partitioned and when you need to manually modify the partitioning to run Spark applications efficiently. Now, diving into our main topic, i.e. Repartitioning v/s Coalesce. What is Coalesce? The coalesce method reduces the number Continue Reading
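As a rough intuition for the difference, here is a toy model in plain Python. It does not use Spark and is not how Spark implements either method; the two function names simply mirror the Spark methods, and the partitions are ordinary lists. The assumption being illustrated is that coalescing merges whole existing partitions, while repartitioning redistributes every record.

```python
def coalesce(partitions, n):
    """Toy coalesce: merge neighbouring partitions until n remain.

    Records stay together with the rest of their original partition,
    which is why no full shuffle is needed. Like the Spark method,
    this toy version never increases the number of partitions.
    """
    if n >= len(partitions):
        return [list(p) for p in partitions]
    merged = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Map old partition i onto one of the n new partitions.
        merged[i * n // len(partitions)].extend(part)
    return merged


def repartition(partitions, n):
    """Toy repartition: deal every record out round-robin (a full shuffle)."""
    out = [[] for _ in range(n)]
    records = [r for p in partitions for r in p]
    for i, record in enumerate(records):
        out[i % n].append(record)
    return out


parts = [[1, 2], [3], [4, 5, 6], [7]]
print(coalesce(parts, 2))     # whole partitions merged, sizes may be uneven
print(repartition(parts, 2))  # records spread evenly, but every record moved
```

In this sketch the coalesced result keeps each original partition intact inside a larger one, whereas the repartitioned result is evenly sized at the cost of touching every record.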

MachineX: Top 10 data Science use cases in Retail

Reading Time: 8 minutes In this blog, we will see some of the data science use cases in the retail industry and how they are transforming the customer experience. We are all aware of the troves of data retail businesses generate on a daily basis. However, this repository of critical data is worthless if it cannot be translated into valuable insights into consumers’ minds or market trends. While all Continue Reading

Big Data Evolution: Migrating on-premise database to Hadoop

Reading Time: 4 minutes We are now generating massive volumes of data at an accelerated rate. To meet business needs, address changing market dynamics as well as improve decision-making, sophisticated analysis of this data from disparate sources is required. The challenge is how to capture, store and model these massive pools of data effectively in relational databases. Big data is not a fad. We are just at the beginning Continue Reading

Getting started with TensorFlow: A Brief Introduction

Reading Time: 3 minutes TensorFlow is an open source software library, provided by Google, mainly for deep learning, machine learning and numerical computation using data flow graphs. Looking at their website, the first definition they have written for TensorFlow goes something like this – TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges Continue Reading