Author: Chitra Sapkal

Best Way of Optimization: Bucketing in Hive

Reading Time: 4 minutes

Apache Hive is an open-source data warehouse system used to query and analyze large datasets. Data in Apache Hive is organized into three units: tables, partitions, and buckets. What is bucketing in Hive? Bucketing in Hive is the concept of breaking data down into ranges, known as buckets, to give extra structure to the data so it may be

Continue Reading


Apache NiFi – The Ingestion Tool

Reading Time: 3 minutes

What is Apache NiFi? Apache NiFi is open-source software for automating and managing the flow of data between systems, leveraging the concept of Extract, Transform, and Load (ETL). It is a powerful and reliable system for processing and distributing data. Additionally, Apache NiFi has a web-based user interface for the design, control, feedback, and monitoring of dataflows. History of Apache NiFi: Based on

Continue Reading