Analytics

Is TensorFlow a good fit for model optimisation?

Reading Time: 2 minutes In the realm of AI, a great deal of attention goes to optimising training. There is far less information out there on optimising models. However, serving models for prediction is where we make money out of ML. Indeed, the cost of serving predictions may be a key factor in the overall return on investment for an ML application. In this Continue Reading

Big Data Analytics: An Introduction

Reading Time: 5 minutes DATA ANALYTICS Data can help businesses better understand their customers, improve their advertising campaigns, personalise their content, and improve their bottom lines. The advantages of data are many, but you can’t access these benefits without the proper data analytics tools and processes. While raw data has a lot of potential, you need data analytics to unlock the power to grow Continue Reading

Loading JSON data into Snowflake

Reading Time: 4 minutes Have you ever faced a use case or scenario where you have to load JSON data into Snowflake? As we know, JSON is one of the most common data formats for storing and exchanging information between systems. JSON is a relatively concise format. If we are implementing a database solution, it is very common to come across a system that provides data in Continue Reading

Spark SQL in Delta Lake 0.7.0

Reading Time: 3 minutes Nowadays Delta Lake is a buzzword in the Big Data world, especially among Spark developers, because it relieves many of the issues found in the Big Data domain. Delta Lake is an open-source storage layer that brings reliability to data lakes. Delta Lake provides ACID transactions, scalable metadata handling, and unifies streaming and batch data processing. It is evolving day by day and adds cool features in every release. Continue Reading

Knime: Accessing a REST API with dynamic query param

Reading Time: 3 minutes Nowadays REST APIs are the most widely used way to share data, and many APIs return a subset of the complete data in the form of pages. Sometimes we need to append multiple query params to the URL to get specific, filtered data. In this blog, we will learn how to generate dynamic URLs by adding query params and get data. Knime platform supports Continue Reading
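
For readers who want to see the idea outside KNIME, here is a minimal Python sketch of building URLs with dynamic query params and paging. It is not the KNIME workflow from the post. The base URL, the filter names, and the `page`/`size` parameter names are hypothetical, and the helpers `build_url` and `page_urls` are invented for illustration only.

```python
from urllib.parse import urlencode


def build_url(base_url, params):
    """Return base_url with params appended as an encoded query string."""
    # Skip parameters whose value is None so optional filters can be omitted.
    cleaned = {key: value for key, value in params.items() if value is not None}
    if not cleaned:
        return base_url
    return f"{base_url}?{urlencode(cleaned)}"


def page_urls(base_url, filters, pages, page_size):
    """Build one URL per page, combining fixed filters with paging params."""
    return [
        build_url(base_url, {**filters, "page": page, "size": page_size})
        for page in range(1, pages + 1)
    ]


# Hypothetical endpoint and filters, used only to show the shape of the result.
urls = page_urls(
    "https://api.example.com/items",
    {"category": "books", "q": "data science"},
    pages=2,
    page_size=50,
)
for url in urls:
    print(url)
```

Keeping the filters in a dictionary means a new query param is just one more key, and `urlencode` takes care of escaping spaces and special characters.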

Data Visualisation In KNIME

Reading Time: 3 minutes KNIME is definitely a dream for data scientists. It makes the work of a data scientist much easier. If you haven’t heard about KNIME, you can find all about it in our blog Knime Analytics Platform: A dream for a data scientist. Continuing on, in this blog we will see how to create visualizations in KNIME and how easy it is to do so. Continue Reading

Knime Analytics Platform: A dream for a data scientist

Reading Time: 3 minutes In this blog, we are going to see what the Knime analytics platform is and which of its features matter for creating an analytics workflow easily. Introduction to Knime Analytics Platform KNIME is a platform built for powerful analytics on a GUI-based workflow. This means you do not have to know how to code to be able to work using KNIME and derive Continue Reading

Tracking Pixels with Google Tag Manager

Reading Time: 3 minutes Introduction Hi Everyone! This is my first blog, and in it I’ll try to explain how we can use Google Tag Manager to manage and deploy tracking pixels (e.g. Google Analytics) on your website without having to modify the code. Tracking Pixels A tracking pixel is an HTML code snippet that is loaded when a user visits a website. It is useful for tracking user Continue Reading

A Little Hands-on with NumPy

Reading Time: 4 minutes NumPy’s built-in methods and concepts like vectorization, broadcasting and indexing allow you to focus on answering questions from your data rather than on how to code those solutions. NumPy handles most of that for you.
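
As a quick taste of those three ideas, here is a minimal sketch. The sales figures and the 10% tax rate are made up for illustration and do not come from the post.

```python
import numpy as np

# Hypothetical data: daily sales for 3 stores over 4 days (one row per store).
sales = np.array([
    [10.0, 12.0, 9.0, 11.0],
    [20.0, 18.0, 22.0, 19.0],
    [5.0, 7.0, 6.0, 8.0],
])

# Vectorization: one expression is applied to every element, with no explicit loop.
sales_with_tax = sales * 1.1

# Broadcasting: the (3, 1) column of per-store means is stretched across
# the 4 columns, so each row is centred on its own mean.
store_means = sales.mean(axis=1, keepdims=True)
centred = sales - store_means

# Boolean indexing: keep only the values above the overall mean.
above_average = sales[sales > sales.mean()]

print(sales_with_tax.shape)
print(centred)
print(above_average)
```

None of the three steps needs a Python loop, which is the point the excerpt makes: you describe the question, and NumPy handles the element-by-element work.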

MachineX: Boosting performance with XGBoost

Reading Time: 5 minutes In this blog, we are going to see how XGBoost works and some of its important features with the help of an example. Many of us have heard about tree models and boosting techniques. Let’s put these concepts together and talk about XGBoost, one of the most powerful machine learning algorithms out there. XGBoost stands for eXtreme Gradient Boosted trees. The name XGBoost, though, actually Continue Reading

Top 7 Data Analytics and Management Trends for 2020

Reading Time: 5 minutes We live in an era of data, as it lies at the heart of digital transformation. Datasets are no longer as simple as before. They have increased in volume, velocity, and complexity and, above all, come from multiple sources. Top tech giants like Google, Netflix, Amazon, and others are crunching massive amounts of data on a daily basis to give you a personalized experience. Continue Reading

Data Lake – Build it in Phases

Reading Time: 3 minutes Data Lake – How to build a data lake and the phases involved in doing so.

Apache Spark: Repartitioning v/s Coalesce

Reading Time: 3 minutes Does partitioning help you increase or decrease job performance? Spark splits data into partitions, and computation is done in parallel for each partition. It is very important to understand how data is partitioned and when you need to manually modify the partitioning to run Spark applications efficiently. Now, diving into our main topic, i.e. Repartitioning v/s Coalesce. What is Coalesce? The coalesce method reduces the number Continue Reading